Notes On fMRI Coursera PDF

fMRI part 1
fMRI
fMRI data structure
a standard fMRI study consists of both structural and functional data
balance spatial with temporal resolution
TR: temporal resolution
T1: structural images: distinguish between tissue types
T2*: functional images: can relate changes in signal to an experimental manipulation
MRI images are usually acquired in axial slices, in either a sequential or an interleaved order
Field of View (FOV): extent of the brain that is inside the image
Slice thickness
Matrix size:
Experiment > subject > sessions > runs per session > volumes
Module 4
Psychological inference
reverse inference
Neuroimaging is the answer. What is the question?
Typical results: statistical map
colored areas indicate reliable non-zero effects

fMRI data are useful only if interpreted in the context of theories
Brain mapping provides forward inference

Forward inference: given an induced physiological state we observe brain activity

probability of observing an activity
Reverse inference: Can we infer a psychological state given the brain activity
fallacious reverse inference

Bayes' rule relates forward and reverse inference:
P(Brain | Psych) = [P(Psych | Brain) * P(Brain)] / P(Psych)
positive predictive value:

P(enjoyment | caudate): the probability that I am enjoying something given that the caudate is activated
P(caudate | enjoyment) = 0.9 (sensitivity)
P(caudate | no enjoyment) = 0.4 (false positive rate, so specificity = 0.6)
P(enjoyment) = 0.2 (baseline rate of enjoyment)
Calculate the positive predictive value using Bayes' rule:
P(enjoyment | caudate) = (0.9 * 0.2) / (0.9 * 0.2 + 0.4 * 0.8) = 0.18 / 0.50 = 0.36
when people make reverse inferences they assume that there is high positive predictive
value (PPV)
For a brain activation to have high PPV it must have high sensitivity and high specificity
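The positive predictive value can be checked with a few lines of arithmetic. The sketch below uses the example numbers from these notes (0.9 for caudate activation given enjoyment, 0.4 for caudate activation given no enjoyment, 0.2 as the base rate of enjoyment); the function name and argument names are illustrative, not part of the course material.

```python
def positive_predictive_value(sensitivity, false_positive_rate, prior):
    """P(state | activation) from Bayes' rule.

    sensitivity         = P(activation | state)
    false_positive_rate = P(activation | no state) = 1 - specificity
    prior               = P(state)
    """
    true_pos = sensitivity * prior
    false_pos = false_positive_rate * (1.0 - prior)
    return true_pos / (true_pos + false_pos)


# Example numbers from the notes (enjoyment and caudate activation).
ppv = positive_predictive_value(sensitivity=0.9, false_positive_rate=0.4, prior=0.2)
print(round(ppv, 2))
```

Even with a sensitivity of 0.9, the low base rate and the modest specificity keep the PPV well below one half, which is why an informal reverse inference can mislead.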
When can reverse inference be done?
strategy 1: Leverage Neuroscience

strategy 2: Quantitative reverse inference
Assess activation of a region

Compute its sensitivity, specificity and positive and negative predictive values

Require testing many tasks, contexts
brain imaging is not good for

estimating effect sizes and predictive accuracy

testing assumptions

comparing evidence for different theories
Basic understanding of fMRI principles

Acquisition and reconstruction
net magnetization M is a vector of two components

longitudinal component: parallel to the magnetic field

transverse component: perpendicular to the magnetic field
in the absence of an external magnetic field there is no net magnetization. the nuclei of
magnetic atoms are randomly oriented
when placed in a strong magnetic field there is a net longitudinal magnetization in the direction
of the field
The nuclei precess about the field at an angular frequency known as the Larmor frequency
A radio-frequency (RF) pulse is used to align the phase and tip over the nuclei.
This causes the longitudinal magnetization to decrease and establishes a transverse magnetization
The nuclei point in a direction orthogonal to the magnetic field
Longitudinal relaxation: The restoration of net magnetization along the longitudinal direction
as spins return to their parallel state
When the RF pulse is removed the longitudinal magnetization grows back to its original size and the transverse component decreases
This creates a signal that is measured by a receiver coil
exponential growth described by time constant T1
Transverse relaxation: loss of net magnetization in the transverse plane due to loss of phase
coherence
Exponential decay described by time constant T2
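The two relaxation processes can be written as simple exponentials. The sketch below assumes the magnetization was fully tipped into the transverse plane at t = 0; the T1 of 1000 ms is of the order quoted for gray matter, and the T2 of 100 ms is only an illustrative assumption.

```python
import numpy as np


def longitudinal_recovery(t_ms, t1_ms, m0=1.0):
    """Longitudinal magnetization growing back toward m0 with time constant T1."""
    return m0 * (1.0 - np.exp(-t_ms / t1_ms))


def transverse_decay(t_ms, t2_ms, m0=1.0):
    """Transverse magnetization decaying from m0 with time constant T2."""
    return m0 * np.exp(-t_ms / t2_ms)


t = np.array([0.0, 500.0, 1000.0, 3000.0])    # time after the RF pulse, in ms
mz = longitudinal_recovery(t, t1_ms=1000.0)   # T1 value assumed for gray matter
mxy = transverse_decay(t, t2_ms=100.0)        # T2 value is an illustrative assumption
print(np.round(mz, 3), np.round(mxy, 3))
```

After one time constant the longitudinal component has recovered about 63% of its equilibrium value, while the transverse component has fallen to about 37%.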
different tissues (gray matter, white matter, CSF) have different T1 values
gray: 1000 msec
white: 600 msec
CSF: 3000 msec
TR: how often we excite the nuclei
TE: how soon after excitation we begin data collection
by altering TR and TE we control which tissue characteristic the image emphasizes
The image represents the density or relaxation time of atoms in a tissue
Proton density: Long TR, short TE
T2 image: long TR, Long TE
T1: short TR short TE
T2*: combined effect of T2 and local inhomogeneities in the magnetic field; used to image brain function
The scanner can eliminate or emphasize the effects of inhomogeneities.
by emphasizing them we obtain the BOLD signal
How can we use the signal to create an image

a scan involves the construction of a 3D image from 2D slices
a brain slice is split into equally sized voxels
$\rho(x, y)$: number of protons in each voxel
magnetic field gradient: we sequentially control the spatial inhomogeneity of the magnetic field,
meaning that we slightly change the field
The measurements that we make are the Fourier transform of the image we would like to
reconstruct
the measurements are acquired in the frequency domain (k-space)
constructing an image of the whole brain is an inverse problem: we need enough measurements at different values of kx and ky so that we can reconstruct the image $\rho(x, y)$
discrete Fourier transforms
the better the spatial resolution, the more k-space measurements I need to make (e.g. for 64 x 64 resolution I need 4096 measurements, meaning that I need to change the values of kx and ky 4096 times)
methods of acquiring the data in k-space
EPI: echo-planar imaging
spiral
the measured k-space data is complex valued
we work with the magnitude of the complex numbers
k-space -> inverse Fourier transform -> image space
K-space
data are acquired in k-space. By applying the inverse Fourier transform we obtain an image
there is no one-to-one relationship between points in the image and points in k-space: each individual point in image space depends on all the points acquired in k-space
low-frequency portions of k-space: image waves with a long period (slow variation)

high-frequency portions: image waves with a short period (rapid variation)
move left or right: change the direction of waves
if we are interested in the relative contribution of the high versus the low frequency waves in
the k-space, we select for one of the two
high frequency parts oscillate quickly and are responsible for detail in the picture. they
represent small structures whose size is on the same order as the voxel size (tissue
boundaries)
low frequency parts (center of k-space) give the contrast. They represent parts of the object
that change in a slow manner
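The relationship between k-space and image space can be tried out on a toy image. The sketch below is only an illustration with NumPy's FFT: the disc "phantom", the 64 x 64 grid and the cut-off radius of 8 are assumptions, and a real scanner samples k-space along a trajectory (e.g. EPI) rather than computing it from a known image.

```python
import numpy as np

n = 64
y, x = np.mgrid[0:n, 0:n]
# Toy "brain slice": a bright disc on a dark background (assumed phantom).
image = (((x - n / 2) ** 2 + (y - n / 2) ** 2) < (n / 4) ** 2).astype(float)

# Forward transform: what the scanner would measure. fftshift puts the
# low frequencies at the center of the array, as k-space is usually drawn.
kspace = np.fft.fftshift(np.fft.fft2(image))
n_measurements = kspace.size  # 64 x 64 grid -> 4096 complex samples

# Inverse transform of the full k-space recovers the image (magnitude taken).
recon = np.abs(np.fft.ifft2(np.fft.ifftshift(kspace)))

# Keep only the center (low frequencies) or only the periphery (high frequencies).
dist = np.sqrt((x - n / 2) ** 2 + (y - n / 2) ** 2)
low_mask = dist <= 8
low_only = np.abs(np.fft.ifft2(np.fft.ifftshift(kspace * low_mask)))
high_only = np.abs(np.fft.ifft2(np.fft.ifftshift(kspace * ~low_mask)))

print(n_measurements)
print(np.allclose(recon, image))
```

Displaying `low_only` shows a blurred disc with the overall contrast intact, while `high_only` shows mostly the edge of the disc, matching the description of the two parts of k-space.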
Physiology, Signal and Noise

BOLD fMRI: ratio of oxygenated to deoxygenated hemoglobin in the blood
it measures the metabolic demands of active neurons
oxyhemoglobin is diamagnetic
deoxyhemoglobin is paramagnetic. it suppresses the MRI signal.
Active neurons -> increased blood flow -> decrease in deoxyhemoglobin -> increased signal in the T2*-weighted image
hemodynamic response function (HRF):
As neuronal activation increases so do the metabolic demands for oxygen. So as oxygen is

extracted from the blood the hemoglobin becomes paramagnetic and therefore T2* decreases:
initial dip: initial decrease in T2* signal
peak BOLD 4-6 s after activation: over-compensation in blood flow dilutes the concentration of deoxyhemoglobin and the BOLD signal increases
post-stimulus undershoot: the BOLD signal then decreases below baseline levels
properties of HRF:
1. small magnitude of signal change
2. response is delayed and slow
3. exact shape of response varies across subjects and regions
how BOLD reflects neuronal activity:
it corresponds closely with the local field potential (LFP)
point spread function of BOLD
but it does not always reflect changes in neuronal activity, e.g. blood steal phenomena
Signal to noise ratio (SNR): the strength of a signal divided by an estimate of noise variability
basic measure of effect size
Contrast-to-noise ratio (CNR): the difference between two signals divided by an estimate of noise variability
ways of calculation
spatial signals: calculated across one image
spatial SNR: mean intensity within the signal area of interest ($\mu_{signal}$) divided by the standard deviation outside the signal area ($\sigma_{noise}$)
spatial CNR: difference in intensity between two tissue types divided by the variability of measurements, $(\mu_1 - \mu_2) / \sigma_{1,2}$
time series measures: calculated at each voxel across time

temporal SNR (or functional SNR): mean signal across time divided by variability (i.e. standard deviation) across time, $\mu / \sigma$

Temporal CNR (or signal sensitivity)
difference in intensity for on vs off states divided by variability
related to sensitivity to task
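The temporal measures can be computed directly from a voxel's time series. The sketch below simulates one voxel with an on/off block structure; the baseline of 100, the task effect of 2 and the unit noise are made-up values, and using the pooled within-condition standard deviation as the noise estimate for the CNR is one possible choice among several.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vols = 200
# 10 volumes off, 10 volumes on, repeated (assumed block structure).
on = np.tile(np.r_[np.zeros(10), np.ones(10)], n_vols // 20).astype(bool)
signal = 100.0 + 2.0 * on + rng.normal(0.0, 1.0, n_vols)

# Temporal SNR: mean over time divided by standard deviation over time.
tsnr = signal.mean() / signal.std(ddof=1)

# Temporal CNR: on-minus-off difference divided by a noise estimate
# (here the pooled within-condition standard deviation).
noise_sd = np.sqrt((signal[on].var(ddof=1) + signal[~on].var(ddof=1)) / 2.0)
tcnr = (signal[on].mean() - signal[~on].mean()) / noise_sd

print(round(tsnr, 1), round(tcnr, 2))
```

Note that the task effect itself inflates the standard deviation used in the tSNR, which is one reason the two measures answer different questions.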
Scaling:
the absolute scaling of BOLD is arbitrary

depends on
1. the field strength
2. acquisition parameters
3. tissue type
implications
non-linearity in BOLD response

refractory effects

saturation: reductions in amplitude of a response as a function of inter-stimulus
intervals
Artifacts and Type of Noise

artifacts appear as

high frequency spikes

image distortions

periodic fluctuations

slow drift across time
how to prevent noise:
1. acquisition
2. Analysis:
Issues to check:

1. coverage

2. RF noise and malformed images

3. Transient gradient artifacts/spikes

4. Ghosting

5. dropout

6. task-correlated movement
Non-signal related components
drift: slow changes in voxel intensity over time. low frequency noise

main reason: scanner instabilities
experimental designs should use high frequencies: quick changes from on to off
motion correction in the preprocessing stages of analysis
but
spin-history artifacts: due to complex interactions with the magnetic field
aliasing: occurs if the sampling rate is too low (TR too long). Periodic signals that occur more rapidly than the sampling rate can capture will often be aliased back to lower frequencies.
to avoid aliasing we must sample at least twice as fast as the fastest frequency in the signal
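The sampling rule can be made concrete with a small calculation. The sketch below computes the Nyquist frequency for a given TR and the apparent frequency of a faster periodic signal; the cardiac rate of 1.1 Hz, the respiration rate of 0.3 Hz and the TR of 2 s are illustrative assumptions.

```python
def nyquist_frequency(tr_s):
    """Highest frequency (Hz) that can be represented when sampling every tr_s seconds."""
    return 0.5 / tr_s


def aliased_frequency(f_signal_hz, tr_s):
    """Apparent frequency (Hz) of a periodic signal sampled once per TR."""
    fs = 1.0 / tr_s
    return abs(f_signal_hz - round(f_signal_hz / fs) * fs)


tr = 2.0                                     # assumed TR in seconds
print(nyquist_frequency(tr))                 # 0.25 Hz
print(round(aliased_frequency(1.1, tr), 3))  # assumed cardiac rate -> appears near 0.1 Hz
print(round(aliased_frequency(0.3, tr), 3))  # assumed respiration rate -> appears near 0.2 Hz
```

With these assumed values both physiological rhythms exceed the Nyquist frequency and fold back into the low-frequency range where task effects typically live.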
Spatial and temporal resolution of BOLD

spatial resolution: below 1 mm
with BOLD fMRI we can collect info on :
functional maps
functional columns
large-scale networks
BOLD point-spread function: related to the microvasculature bed area affected by local neuronal activity and venous/arterial flow contributions
what is the effect of resolution in group analysis: 1-15 mm
why?
artifacts
inter-subject normalization
individual differences in functional anatomy
diffuse modulatory effects
spatial alignment
hyperacuity: the pattern of activity across voxels may contain more info than any one voxel

can sometimes detect functional topography even if voxels are not small enough

to fit within one functional area
even if neurons are randomly intermixed, MVPA may still identify patterns that are differentially associated with each
the classification of neuronal activation for event types is still possible because patterns of activation for events are uncorrelated
hyperalignment: direct inter-subject alignment of brains in a functional space defined by

activity during movie watching
better cross subject matching
Temporal Resolution
limited in fMRI
peak of response: 5-6 s
onset: 2-3 s
Temporal hyperacuity:
if events are averaged we can detect differences in response latency of 100-200 msec
sampling of the whole brain in 100-200 sec
Experimental Design

Block design: Similar events are grouped
E.g. examining brain activation upon presentation of famous vs non famous faces. We present
a block of famous faces and after a while a block of non famous faces
Event-related design: Events are mixed
Rule of thumb: a two-condition block design with 16-20 sec blocks maximizes power
Goal: induce human subjects to do or experience the psychological states you are studying
Considerations:
1. effect of stimulus predictability

Predictability influences psychological state. The predictability of a stimulus influences how fast someone responds to it
2. effect of time on task

subjects should be engaged on the task as much time as possible
for this sometimes a rapid presentation of stimuli is needed
3. Participant strategy

different stimulus configurations afford different strategies, e.g. the Stroop test: compatible trials (the word and the colour are the same) vs incompatible trials.

you cannot block compatible and incompatible trials
4. Temporal precision of psychological manipulations

what we expect subjects to do should fit with what they can do.

E.g. recall happy vs sad memories: people can't switch rapidly back and forth between them
5. consequences of unintended psychological activity

e.g. spatial attentional shifting
subjects shift attention to the fixation cross spontaneously
Kinds of Designs
Design
How many independent variables?
What kinds of measured variables?
Trial Structure
How will events be organized in time?
Blocked / event-related / mixed
rapid / single trial / partial trial
Basic principle: designing a study that is powerful for one purpose vs designing a study for
many purposes
within person variables: manipulated across time .

1. within factors

2. within-levels of factors

within observed variables (e.g. peoples performance)
between-person variables:
between factors

between levels

between observed
Examples
Simple subtraction: compare task A vs task B.

look at negative and neutral pictures

condition A: just look at them

condition B: look and reappraise

Pure insertion principle: processes in complex conditions are simply added on top of those in simpler, baseline conditions. Cognitive subtraction
Problem: the processes might interact with context
individual differences designs: by making correlations with individual differences we can

increase the specificity of inference.

E.g. brain-behavior correlation: correlate brain activity during a stimulus with the level of the psychological state the stimulus is supposed to induce. E.g. brain activity during reappraisal of negative stimuli vs level of psychological reappraisal.
Multiple subtraction: triangulate between event A and brain process by subtracting multiple

kinds of events

If the different conditions that we are subtracting off have different characteristics, we can control for some of the drawbacks of pure insertion.

E.g. compare brain activation for faces with brain activation for objects, and intact faces with scrambled faces. This is to make sure that the FFA is activated when someone sees a face and not other things
process overlap/dissociation

double dissociation: task A activates more than task B in one area and B activates more than A in another area.
separate modifiability: task A activates one area but B does not, and vice versa
factorial design: vary two factors at once and test for dissociations in the areas activated.

e.g task switching experiment:

factor 1: internal switch of attention between objects (2 levels)

factor 2: external switch of attention (2 levels)

2 x 2 factorial space
parametric modulation: manipulating variables in a parametric fashion within persons

enhance specificity

e.g. parametric increase in blood flow as the task complexity increases
Preprocessing of fMRI data
Goals: 1. minimize the influence of data acquisition and physiological artifacts

2. check statistical assumptions and transform data to meet the assumptions

3. standardize the locations of brain regions across subjects to achieve validity and

sensitivity in group analysis
Steps:
Visualisation/ Artifact Removal:

e.g. drift

principal component analysis
Slice Time Correction:

we might sample multiple slices of the brain during each individual repetition time (TR) to construct a brain volume

Typically each slice is sampled at a slightly different time point, e.g. the top of the brain volume might be sampled a second later than the bottom
slice time correction shifts each voxel's time series so that they appear to have been sampled simultaneously
Figure: slices 1, 2, 3 are sampled at different time points, so they have different time curves
How to correct for this?
temporal interpolation: use info from near time points to estimate the amplitude of the MR
signal at the onset of each TR

use a linear, spline or sinc function
phase shift: slide the time course by applying a phase shift to the Fourier transform of the time course
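Temporal interpolation can be sketched with linear interpolation, the simplest of the options listed. The example below resamples one slice's time series from its actual acquisition times to the start of each TR; the TR of 2 s, the 1 s slice offset and the linear-ramp toy signal are assumptions chosen so the result is easy to check by hand.

```python
import numpy as np

tr = 2.0             # assumed repetition time in seconds
n_vols = 6
slice_offset = 1.0   # assumed: this slice is acquired 1 s after the start of each TR

acq_times = np.arange(n_vols) * tr + slice_offset   # when the slice was really sampled
target_times = np.arange(n_vols) * tr               # the onset of each TR

# Toy signal: a linear ramp, so linear interpolation should be exact.
series = 3.0 * acq_times + 10.0

# np.interp does linear interpolation; outside the sampled range it holds
# the edge value, so the very first target time is not truly interpolated.
corrected = np.interp(target_times, acq_times, series)

print(series)
print(corrected)
```

Real data are not linear between samples, which is why spline or sinc interpolation is often preferred in practice.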
Head motion:
when analyzing the time series associated with a voxel we assume that it depicts the same
region of the brain at every point
but head motion might compromise this assumption
we correct it with rigid body transformation

the goal is to find the best possible alignment between an input image

and some target image
it involves 6 variable parameters: 3 translations (along x, y, z) and 3 rotations (6 DOF).
Rigid body transformation (6 DOF): 3 translations and 3 rotations
Similarity (7 DOF): 3 translations, 3 rotations and a single global scaling
Affine (12 DOF): 3 translations, 3 rotations, 3 scalings, 3 shears
Warping
Transformations where the equations relating the coordinates of the images are non-linear
in motion correction the target image is usually the first one
minimize some cost function: e.g. sum of squared differences, mutual information
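A rigid-body transformation can be written as a 4 x 4 homogeneous matrix built from the 6 parameters. The sketch below is a minimal illustration: the rotation order (Rz Ry Rx) and the function names are assumed conventions, and software packages differ in the conventions they use. A sum-of-squared-differences cost is included as the simplest example of a function that registration would minimize.

```python
import numpy as np


def rigid_body_matrix(tx, ty, tz, rx, ry, rz):
    """4x4 homogeneous matrix from 3 translations and 3 rotations (radians).

    The rotation order Rz @ Ry @ Rx is an assumed convention.
    """
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    rot_x = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    rot_y = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    rot_z = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    m = np.eye(4)
    m[:3, :3] = rot_z @ rot_y @ rot_x
    m[:3, 3] = [tx, ty, tz]
    return m


def sum_squared_differences(a, b):
    """Simple cost function for two images of the same modality and shape."""
    return float(np.sum((np.asarray(a) - np.asarray(b)) ** 2))


# Rotate a point 90 degrees about z, then shift it by 1 along x.
m = rigid_body_matrix(1.0, 0.0, 0.0, 0.0, 0.0, np.pi / 2)
point = np.array([1.0, 0.0, 0.0, 1.0])   # homogeneous coordinates
moved = m @ point
print(np.round(moved[:3], 6))
```

Because the rotation block is orthonormal with determinant 1, the transformation preserves distances and angles, which is what "rigid" means here.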
Co-registration
of the functional image with the structural image
simplifies later transformation of the fMRI to a standard coordinate system
the functional and structural images do not have the same signal intensity in the same areas so
they cannot be subtracted
So we have to use at least an affine transformation and a suitable cost function
Warping to atlas template

normalization allows one to stretch, squeeze and warp each brain so that it is the same as
some standard brain
the structural MRI image used in the coregistration is warped onto a template image
Talairach space
based on a single subject , on a single hemisphere
Montreal Neurological Institute (MNI) space: combination of many subjects, but right-handed only
Spatial Filtering (Smoothing)

increases the signal-to-noise ratio and removes artifacts
matched filter theorem: a filter that is matched to the signal will give optimum signal to noise
typically the amount of smoothing is chosen a priori and is independent of the data
so
adaptive smoothing might be an option

Non-stationary spatial Gaussian Markov random field

smoothing varies across space and time
General linear model

two level hierarchical analysis

1. within subject

2. across subjects
do it in stages
i. design specification (model building)
ii. estimation . model combined with real data (1st level)
iii. group analysis (2nd level)
iv. inference about areas activated in the group
The GLM approach treats the data as a linear combination of model functions (predictors) plus
noise (error)
the model functions have known shapes (lines, curve) but unknown amplitudes
simple regression
anova
Multiple regression
mixed effects/hierarchical
time series (e.g. autoregressive)
robust penalized regression (LASSO, Ridge)
non normal errors (logistic regression)
in most cases there is a closed form solution while others require iterative solutions
Simple regression

one predictor, one outcome
Step 1: Specify the model: there is a linear relationship between the predictor and the outcome
Step 2: Estimate: estimate slope and intercept
Step 3: Statistical inference: test the slope and get a p value: how likely is it that I would have observed this slope under the null hypothesis?
Step 4: Scientific interpretation
GLM: one continuous DV
repeated measures design:

paired t test
generalised least squares (iterative): correlated errors, e.g. time series

each measure is not independent

from the previous one
aim of the regression model: calculate $\beta$
we do so by minimizing sum of squared residuals
apply GLM to fMRI

Typically two-level hierarchical analysis

within subjects

between groups
do it in stages
i. design specification (model building)
ii. estimation . model combined with real data (1st level)
iii. group analysis (2nd level)
iv. inference about areas activated in the group
Mass Univariate Analysis

typical model
we construct a separate model for every voxel
Brain activity in each voxel is the outcome (Y)
Stimulus, task and/ or behavior are the predictors
assume voxels are independent
Consider a single voxel in a single subject
block of famous vs non famous faces
design matrix of the block design
intercept: constant: mean level of fMRI signal across time
task regressor: captures the effect of famous vs non-famous faces
$\beta_1$: activation parameter estimate: estimated response amplitude

estimate of how large the difference in activation is between famous and non-famous faces
event related design:
famous and non famous faces have their own beta
hemodynamic delay: BOLD responses are delayed and dispersed relative to neural activity
Assume impulse response in model
Common model: Fixed linear

combination of two gamma functions
How to turn assumed neural responses

into a predictor in a GLM model?
Assume a linear time invariant system: the

neural activity acts as the input or impulse
and the HRF (hemodynamic response)
acts as the impulse response function
x(t) = (v * h)(t), where * denotes convolution
x(t): fMRI signal over time
v(t): stimulus function
h(t): hemodynamic response
stimulus function (blocks or

events )convolved with
hemodynamic response function
linear: same HRF no matter what

came before
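The convolution step can be sketched numerically. The example below builds a double-gamma HRF and convolves it with a block stimulus function to obtain a GLM predictor; the shape parameters (6 and 16) and the undershoot ratio (1/6) are commonly used defaults taken here as assumptions, and the 20 s on/off blocks are an arbitrary example.

```python
import math

import numpy as np


def double_gamma_hrf(t, a1=6.0, a2=16.0, ratio=1.0 / 6.0):
    """Difference of two gamma densities (unit scale); parameters are assumed defaults."""
    t = np.asarray(t, dtype=float)
    peak = t ** (a1 - 1) * np.exp(-t) / math.gamma(a1)
    undershoot = t ** (a2 - 1) * np.exp(-t) / math.gamma(a2)
    return peak - ratio * undershoot


dt = 0.1                                  # time step in seconds
t_hrf = np.arange(320) * dt               # 32 s of HRF
hrf = double_gamma_hrf(t_hrf)
peak_time = t_hrf[np.argmax(hrf)]

# Stimulus function v(t): 20 s on, 20 s off, for 160 s.
n = 1600
samples_per_block = int(round(20.0 / dt))
stimulus = ((np.arange(n) // samples_per_block) % 2 == 0).astype(float)

# Predictor x(t) = (v * h)(t): discrete convolution scaled by dt, cut to length n.
predictor = np.convolve(stimulus, hrf)[:n] * dt

# Design matrix with an intercept column and the task predictor.
design = np.column_stack([np.ones(n), predictor])
print(round(float(peak_time), 1), design.shape)
```

In a real analysis the predictor would then be resampled at the TR before being entered into the design matrix.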
Details of building GLM models

GLM for more than two conditions: we specify the conditions (e.g. A, B, C, D), convolve each of them with an assumed basis function and end up with a design matrix.
multiple predictors and design contrast

block design;
a single predictor that captures the difference
event-related design:
one regressor for each condition
in this case we can assess
the difference between famous and non-famous faces
each one separately
their average
a different linear contrast for each parameter
Contrast: linear combination of GLM parameters

T-contrast: single, planned contrast > t test

Specified by a vector of weights (c) so that $c^T \beta$ is a scalar value

signed: positive and negative values
difference contrast: [0 1 -1] (0 for intercept, 1 for famous, -1 for non-famous)
sum (average) contrast: [0 1 1]
single-event contrast (famous vs rest): [0 1 0], tests only the famous faces against the implicit baseline
Factorial repeated measures design

Apply contrast
linear combination which equals the

parameter estimate for column A plus
B minus C plus D
Rules for T contrast

1. C can be a matrix

columns are applied independently
first column reflects main effect of factor 1
second column: main effect of factor 2
third column: cross-over interaction: tests if the effect of factor 1 depends on the level of factor 2
2. The scaling affects the magnitudes but not the inferences I make (t, p values)
Contrast weights must be the same for all participants, so no missing runs
3. Contrast weights typically sum to 0

0 is the null hypothesis
Exception: I can test the average of one or more conditions against the implicit baseline
In order to construct a linear design matrix we have to assume that:
1. the neural function is correct
2. the HRF is correct
3. the system is linear and time-invariant
Linear Basis Set

How to relax these assumptions
often fixed canonical HRF is used to model responses to neuronal activity
but if this is not the case the power of analysis is reduced
The HRF shape depends both on the vasculature and the time course of neuronal activity
so assuming a fixed HRF is not

usually appropriate
Figure panels:
first picture: off in timing
second picture: missed shape of response, the true response peaks later
bottom left: missed duration
bottom right:
Also HRF varies across brain

regions
Temporal basis functions

model HRF as a linear combination of temporal basis functions, f(t) , such that the overall
response is a sum of three activation parameters * the respective factors
Data fit with three choices of basis set:
FIR model: one predictor per time point
purple predictor: captures what is happening in the first couple of seconds following event onset, and so on
so 4 predictors
Canonical HRF: two predictors for the onsets
3-parameter model: regressors, each of which is one of 3 basis functions convolved with the event onsets
FIR: six regressors for event time
Choosing basis set:

1. accuracy (bias): can the model capture the true response without systematic error?
canonical model of HRF: strong assumptions about shape, more bias
FIR: weak assumptions about shape, so less bias
2. precision (variance): are the model parameters estimated with little error variance?
canonical model: few parameters so high precision
FIR: many parameters, low precision (noisy)
bias/variance tradeoff
So we need a model which will be

1. simple:

few parameters so high precision

2. interpretable: parameters are interpretable measures of neuroscientific interest

3. accurate in ways that count: captures the true response amplitude in the physiological range
Filtering and Nuisance Covariates

model factors associated with known sources of variability but that are not related to
experimental hypothesis need to be included in the GLM
Examples:

signal drift

physiological artifacts (e.g.
respiration)

head motion
Drift
slow changes in voxel-intensity over time
due to scanner instabilities
need to include drift parameters in our model
design matrix has 11 columns:

1 corresponds to task

2. baseline
we want to remove the black line and get to the

red line
Transient gradient artifacts

control for spikes
Physiological Noise
Respiration and heart beat give rise to periodic noise that is often aliased into task frequencies
if the sampling rate is too low (TR too long) there will be problems with aliasing
the sampling rate must be at least twice the highest frequency of the signal you seek to model
Head Motion
Basic motion correction in the pre-processing stages takes into account gross differences bet
How to deal with head motion artifacts?
Nuisance regressors: movement and CSF-related artifacts
Scrubbing: drop images with high movement estimates. treat as missing data
physiologically implausible whole brain activation or deactivation
GLM Estimation
how to estimate $\beta$?
the desired line is the line that makes the sum of the squared residuals e1, e2, e3, ... as small as possible
The least squares criterion:
$Q = (Y - X\beta)^T (Y - X\beta)$
taking the derivative with respect to $\beta$ and setting it to 0 gives us the normal equations:

$X^T X \hat{\beta} = X^T Y$
by solving for $\hat{\beta}$ we get the ordinary least squares (OLS) estimate:
$\hat{\beta} = (X^T X)^{-1} X^T Y$
Properties:
the expected value of $\hat{\beta}$ equals $\beta$ (unbiased): $E[\hat{\beta}] = \beta$
the variance of $\hat{\beta}$ (for independent errors with variance $\sigma^2$): $Var(\hat{\beta}) = \sigma^2 (X^T X)^{-1}$
that means that any other linear unbiased estimator of $\beta$ will have a variance at least as large as that of OLS
so $\hat{\beta}$ is the best linear unbiased estimator (BLUE)
R= residual inducing matrix
Noise Models
V is not typically the identity matrix because fMRI data typically exhibit significant autocorrelation caused by physiological noise and drift that has not been appropriately modeled
typically modeled with AR(p) or ARMA(1,1)
Autoregressive model of order 1

the autocorrelation function (ACF) depends on how close different time points are to one another
for an AR(1) model the autocorrelation is equal to 1 if the lag is 0. With a lag of one time point the autocorrelation is equal to $\phi$, and it decays as $\phi^k$ as the lag k grows
In general the form of the covariance matrix V is unknown, therefore it needs to be estimated.
Estimating V depends on $\beta$ and estimating $\beta$ depends on V, so we need an iterative procedure (assume a value of V, estimate $\beta$, and then re-estimate V)
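The alternation between estimating the regression parameters and estimating V can be sketched for the AR(1) case. The code below is a simplified illustration, not the procedure of any particular software package: the symbol phi for the AR(1) coefficient, the lag-1 residual estimate of phi, the fixed number of iterations and the simulated data are all assumptions.

```python
import numpy as np


def ar1_correlation(n, phi):
    """Correlation matrix V with V[i, j] = phi ** |i - j|."""
    idx = np.arange(n)
    return phi ** np.abs(idx[:, None] - idx[None, :])


def gls(X, y, V):
    """Generalized least squares estimate for error covariance proportional to V."""
    V_inv = np.linalg.inv(V)
    return np.linalg.solve(X.T @ V_inv @ X, X.T @ V_inv @ y)


def iterative_gls(X, y, n_iter=5):
    """Start from OLS (V = I), then alternate: estimate phi from residuals, re-fit."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    phi = 0.0
    for _ in range(n_iter):
        resid = y - X @ beta
        phi = float(resid[1:] @ resid[:-1] / (resid @ resid))  # lag-1 autocorrelation
        beta = gls(X, y, ar1_correlation(len(y), phi))
    return beta, phi


# Simulated voxel: intercept 10, task effect 2, AR(1) noise with phi = 0.4.
rng = np.random.default_rng(1)
n = 200
task = np.tile(np.r_[np.zeros(10), np.ones(10)], n // 20)
X = np.column_stack([np.ones(n), task])
white = rng.standard_normal(n)
noise = np.zeros(n)
for t in range(1, n):
    noise[t] = 0.4 * noise[t - 1] + white[t]
y = X @ np.array([10.0, 2.0]) + noise

beta_hat, phi_hat = iterative_gls(X, y)
print(np.round(beta_hat, 2), round(phi_hat, 2))
```

Inverting the full n x n matrix is fine for a toy series of 200 points; practical implementations usually prewhiten the data and the design instead.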
Iterative Procedure
Inference, Contrast and t-tests
After fitting the GLM we use the estimated parameters to determine whether there is significant
activation present in the voxel.
our estimate is normally

distributed
Contrasts
test whether linear combinations of parameters are significant
0 for $\beta_1$
1 for $\beta_2$
-1 for $\beta_3$
null hypothesis: $\beta_2 = \beta_3$
in order to test the null hypothesis against the non null we use t statistics
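Estimation and a t-contrast can be put together in a short sketch. The example below assumes independent errors (V equal to the identity), uses plain indicator regressors rather than HRF-convolved ones, and simulates the data; the function name and the effect sizes are illustrative only.

```python
import numpy as np


def ols_t_contrast(X, y, c):
    """OLS fit followed by a t statistic for the contrast c' beta (iid errors assumed)."""
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta_hat = xtx_inv @ X.T @ y
    resid = y - X @ beta_hat
    dof = n - p
    sigma2_hat = resid @ resid / dof               # residual variance estimate
    se = np.sqrt(sigma2_hat * (c @ xtx_inv @ c))   # standard error of c' beta_hat
    return beta_hat, (c @ beta_hat) / se, dof


# Simulated voxel: intercept plus famous and non-famous regressors.
rng = np.random.default_rng(0)
n = 120
famous = np.r_[np.ones(40), np.zeros(80)]
nonfamous = np.r_[np.zeros(40), np.ones(40), np.zeros(40)]
X = np.column_stack([np.ones(n), famous, nonfamous])
y = X @ np.array([5.0, 3.0, 1.0]) + rng.standard_normal(n)

c = np.array([0.0, 1.0, -1.0])   # famous minus non-famous
beta_hat, t_stat, dof = ols_t_contrast(X, y, c)
print(np.round(beta_hat, 2), round(float(t_stat), 2), dof)
```

The p value would come from comparing `t_stat` with a t distribution on `dof` degrees of freedom; repeating this at every voxel gives the statistical image.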
if we want to make simultaneous tests of several contrasts at once, c is a contrast matrix
we want to test if $\beta_1$ and $\beta_2$ are both (simultaneously) equal to 0
c is just an indicator of the drift components
if these $\beta$s are simultaneously equal to 0 the drift will not contribute
if we want to examine whether the drift components contribute to the model, we test whether there is a difference between the full model (including the design matrix for the drifts) and the reduced model (excluding the design matrix for the drift)
we test that using F Statistics
So for each voxel a hypothesis test is performed

and the statistics corresponding to that test is
used to create a statistical image over all voxels
image of T statistics across space:
Use GLM to analyze fMRI data
1. construct a model for each voxel of the brain
1. massive univariate approach: separate models are fitted to each voxel of the brain
2. Perform a statistical test to determine whether

there is task related activation in each voxel. test the null hypothesis. Get a statistical image
3. Choose an appropriate threshold for determining statistical significance.
Get a statistical parametric map
How to determine the threshold?

Problems: many false alarms
many hypothesis tests are performed

simultaneously so many test statistics are
inflated due to noise
choosing a threshold means finding a balance between sensitivity (true positive rate) and specificity (true negative rate)
with an $\alpha$ = 0.05 threshold and 100,000 voxels we expect about 5,000 false-positive voxels
Group-level Analysis I
Multi-level Analysis
fMRI experiments are often repeated for:

several runs in the same session

several sessions on the same subject

several subjects
the first level deals with individual subjects
the second level deals with groups of subjects
contrast images between subjects are compared
backwards from group results to individual subject result
the most basic kind of group result is contrast values between task and control. Each dot is a
score for one subject
the spread between the dots reflects the variance of the data or the noise
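The simplest group-level test treats each subject's contrast value as one observation and asks whether the group mean differs from zero. The sketch below is a one-sample t-test written out by hand; the eight contrast values are made up for illustration.

```python
import numpy as np

# One task-minus-control contrast value per subject (made-up numbers).
contrast_values = np.array([0.8, 1.2, 0.3, 1.5, 0.9, -0.2, 1.1, 0.7])

n_subjects = len(contrast_values)
group_mean = contrast_values.mean()
group_sd = contrast_values.std(ddof=1)     # between-subject spread of the dots
standard_error = group_sd / np.sqrt(n_subjects)
t_stat = group_mean / standard_error       # compare with a t distribution, n - 1 dof
dof = n_subjects - 1

print(round(float(group_mean), 4), round(float(t_stat), 2), dof)
```

Because the denominator is the spread between subjects, this test supports generalization to the population the subjects were drawn from.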
Group Analysis II
multi-level models allow different variance

components to be introduced
at each level (within or between subjects)
Mixed-Effect Models
Hierarchical Models
Random-Effects (RFX) Models in

neuroimaging
all have multiple sources of variation
fixed effect model: only one source of

variation
Sources of variation of the signal strength

a). measurement error

b) response magnitude
Fixed Effect: always the same from experiment

to experiments.

sex
drug type
Random effects

subject

word (e.g. choose one positive word and one negative)
if an effect is treated as fixed, error terms in the model do not include variability across levels
we cannot generalize to unobserved levels
so if we treat subjects as fixed effects we cannot generalize beyond the subjects studied
Contrast design matrix: all 0s and 1s
Group Level Analysis III
Design matrix of the second-level

for four subjects
the second level relates the subject-specific parameters contained in $\beta$ to the population parameters $\beta_g$
it is assumed that the first level parameters are randomly sampled from a population of possible
regression parameters
Statistical techniques define the loss function that should be minimized in order to find the
parameters of interest

used techniques: minimize loss function, reduced minimized loss function
algorithms define the manner in which the chosen loss function is minimized
Multiple Comparison Problem in fMRI
Null Hypothesis: no effect

$\beta_1 - \beta_2 = 0$
Test statistic T :measures the compatibility between the null hypothesis and the data
P-value: probability that the test statistic would take a value as or more extreme than actually
observed if Ho is true
P(T>t| Ho)
Fixed threshold: significance level $\alpha$
Threshold $u_\alpha$ controls the false positive rate at level $\alpha = P(T > u_\alpha | H_0)$
The probability that the test statistic lies above that value is equal to $\alpha$
Errors:
1. Type I error: the null hypothesis is true but we mistakenly reject it (false positive)
controlled by the significance level $\alpha$: a small $\alpha$ controls for that
2. Type II error: the null hypothesis is false but we fail to reject it
Power of the test: probability that a hypothesis test will correctly reject a false null hypothesis
Choosing an appropriate threshold
when more than one hypothesis test is performed, the risk of making at least one Type I error is greater than the α of a single test.
the more tests one performs, the greater the likelihood of obtaining a false positive
100,000 voxels tested at a 0.05 threshold will give about 5,000 false positives if no voxel is truly active
choosing a threshold is a balance between sensitivity (true positive rate) and specificity (true negative rate)
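The 5,000 figure can be checked with a small simulation. This is only a sketch: it assumes independent tests and uses the fact that p-values are uniform on [0, 1] when the null hypothesis is true. The function name `count_false_positives` and the seed are illustrative choices, not part of the course material.

```python
import random

def count_false_positives(n_tests, alpha, seed=0):
    """Count null tests whose p-value falls below alpha.

    Under a true null hypothesis the p-value is uniform on [0, 1],
    so each test is a false positive with probability alpha.
    """
    rng = random.Random(seed)
    return sum(1 for _ in range(n_tests) if rng.random() < alpha)

n_voxels = 100_000
alpha = 0.05

expected = n_voxels * alpha                      # expected number of false positives
observed = count_false_positives(n_voxels, alpha)

# Probability of at least one false positive in only 10 independent tests
fwer_10_tests = 1 - (1 - alpha) ** 10

print(expected, observed, round(fwer_10_tests, 3))
```

Even with only 10 independent tests, the chance of at least one false positive is already about 0.40, which is why a per-test α is not enough.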
How to quantify the likelihood of obtaining false positives?
1. Family-Wise Error Rate (FWER)
probability of making any false positives
2. False Discovery Rate (FDR)
controls the proportion of false positives among all rejected tests
Family Wise Error Rate (FWER)
Ti= value of the test statistic at voxel i
the probability of making one or more Type I errors in a family of tests under the null hypothesis
Bonferroni correction
Random field theory
In the FWER hypothesis, H0 states that there is no activation in any of the m voxels
If we reject a single voxel null hypothesis we reject the whole FWER hypothesis
a false positive at any voxel gives a Family-Wise Error (FWE)
we want to control the probability that any of the test statistics in any of the voxels is above
some value u
if it is above u we will reject the null hypothesis in that voxel
Bonferroni correction
m= total number of voxels
if we have 10 hypotheses and we want the probability of any false positive among them to be at most 0.05, then we take the threshold for each test to be 0.005 (0.05/10)
100 x 100 voxels from an iid N(0, 1) distribution
thresholding each voxel at 0.05 gives about 500 false positives
Bonferroni is conservative and leads to a decrease in the power of the test
in general it is not optimal for correlated data
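A minimal sketch of the 100 x 100 example, assuming a one-sided test on independent N(0, 1) voxels. The helper names and the seed are hypothetical; the thresholds come from the standard normal quantile function in Python's `statistics` module.

```python
import random
from statistics import NormalDist

def simulate_null_image(n_voxels, seed=0):
    """Draw iid N(0, 1) values: an image with no true activation."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_voxels)]

def one_sided_threshold(alpha):
    """Threshold u such that P(Z > u) = alpha for standard normal Z."""
    return NormalDist().inv_cdf(1.0 - alpha)

m = 100 * 100          # total number of voxels
alpha = 0.05

image = simulate_null_image(m)
u_uncorrected = one_sided_threshold(alpha)       # about 1.645
u_bonferroni = one_sided_threshold(alpha / m)    # per-voxel level alpha / m

n_uncorrected = sum(z > u_uncorrected for z in image)
n_bonferroni = sum(z > u_bonferroni for z in image)

print(round(u_uncorrected, 3), n_uncorrected)
print(round(u_bonferroni, 3), n_bonferroni)
```

The uncorrected threshold lets through roughly 500 null voxels, while the Bonferroni threshold lets through almost none. The price is a much higher bar for truly active voxels.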
Random Field Theory

allows one to incorporate the correlation into the calculation of the appropriate threshold
it is based on approximating the distribution of the maximum statistic over the whole image
The FWER is the probability of getting a FWE
So we choose the threshold u such that the maximum statistic only exceeds it 100α% of the time
A random field is a set of random variables defined at every point in D-dimensional space
A Gaussian random field
has a Gaussian distribution at every point and for every collection of points
and so it is defined by its mean and covariance
we consider a statistical image to be a lattice representation of a continuous random field
Random field methods are able to:

approximate the upper tail of the maximum distribution which is the part needed to find
the appropriate thresholds and

account for spatial dependence
Euler Characteristic: the property of an image after it has been thresholded

counts the number of blobs minus the number of holes
if we threshold at u = 0.5 we get 27 blobs (white in the picture) and one hole (in one of the bottom blobs)
Further panels of the figure:
Euler characteristic = 2
Euler characteristic = 1
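The blobs-minus-holes count can be sketched for a small 2D image. The assumptions here are mine, not the course's: blobs are 8-connected regions above the threshold, and holes are 4-connected below-threshold regions that do not touch the image border. The toy image and function names are illustrative only.

```python
def count_components(grid, target, diagonal):
    """Count connected regions of cells equal to `target`.

    Returns (all_regions, regions_not_touching_the_border).
    """
    rows, cols = len(grid), len(grid[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if diagonal:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    seen = [[False] * cols for _ in range(rows)]
    total = interior = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if grid[r0][c0] != target or seen[r0][c0]:
                continue
            total += 1
            touches_border = False
            stack = [(r0, c0)]
            seen[r0][c0] = True
            while stack:                      # flood fill one region
                r, c = stack.pop()
                if r in (0, rows - 1) or c in (0, cols - 1):
                    touches_border = True
                for dr, dc in steps:
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < rows and 0 <= cc < cols
                            and grid[rr][cc] == target
                            and not seen[rr][cc]):
                        seen[rr][cc] = True
                        stack.append((rr, cc))
            if not touches_border:
                interior += 1
    return total, interior

def euler_characteristic(stat_image, u):
    """Return (EC, blobs, holes) for the image thresholded at u."""
    mask = [[1 if v > u else 0 for v in row] for row in stat_image]
    blobs, _ = count_components(mask, 1, diagonal=True)
    _, holes = count_components(mask, 0, diagonal=False)
    return blobs - holes, blobs, holes

# Toy image: a ring (one blob with one hole) plus one isolated voxel.
stat_image = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
ec, blobs, holes = euler_characteristic(stat_image, 0.5)
print(ec, blobs, holes)
```

At high thresholds holes disappear and there is at most one blob, so the Euler characteristic behaves like an indicator that the maximum exceeds u. This is why its expected value is useful for approximating the FWER.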
How do we know the expected Euler characteristic?
Properties:
as u increases, FWER decreases
as V (no of voxels we are controlling for) increases FWER increases
as smoothness increases, FWER decreases
RFT assumptions

1. The entire image is either multivariate Gaussian or derived from multivariate Gaussian images
2. The statistical image must be sufficiently smooth to approximate a continuous random field
FWHM smoothness of 3-4 voxel sizes is preferable
False Discovery Rate Correction
FWER: controls the probability of any false positives
FDR controls the proportion of false positives among all rejected tests
Suppose we perform tests on m voxels
V: number of false positives
U, V, T and S are unobservable random variables
R is an observable random variable (the number of rejected tests)
FWER = P(V ≥ 1)
FDR = E(V/R)
V/R is defined to be 0 if R = 0
A procedure controlling for FDR ensures that on average the FDR is no bigger than a prespecified rate q which lies between 0 and 1
anything below the black line is active and anything above is not active
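The notes do not name the procedure behind the black line. A common choice, assumed here, is the Benjamini-Hochberg step-up rule: sort the p-values, find the largest rank k with p_(k) ≤ (k/m)·q, and reject the k smallest. The sketch below uses made-up p-values.

```python
def benjamini_hochberg(p_values, q):
    """Return a list of booleans, True where the test is rejected.

    Assumes the Benjamini-Hochberg step-up rule at FDR level q.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    # Largest rank k (1-based) whose p-value lies on or below the line (k/m)*q
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected

# Hypothetical p-values, deliberately unsorted
p_values = [0.041, 0.001, 0.216, 0.008, 0.039, 0.06, 0.074, 0.205, 0.212, 0.042]
rejected = benjamini_hochberg(p_values, q=0.05)
print([p for p, r in zip(p_values, rejected) if r])

# Step-up behaviour: 0.02 misses its own cutoff (0.0125) but is still
# rejected because a larger p-value falls below the line further along.
step_up = benjamini_hochberg([0.02, 0.021, 0.03, 0.9], q=0.05)
print(step_up)
```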
Pitfalls and Multiple Comparisons
Cluster-level inference

1. Define clusters by an arbitrary threshold u_clus
2. Retain clusters larger than the α-level extent threshold k_α
Better sensitivity, especially to weak, spatially distributed signals
Worse spatial specificity (inference is on clusters of voxels instead of individual voxels)
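The two steps can be sketched on a toy 2D t-map. This is illustrative only: clusters are taken to be 4-connected, and the extent threshold `k` is a hypothetical fixed number, whereas in practice k_α is derived from random field theory or permutation so that the FWER is controlled.

```python
def find_clusters(stat_image, u_clus):
    """Step 1: group voxels above u_clus into 4-connected clusters."""
    rows, cols = len(stat_image), len(stat_image[0])
    seen = set()
    clusters = []
    for r0 in range(rows):
        for c0 in range(cols):
            if stat_image[r0][c0] <= u_clus or (r0, c0) in seen:
                continue
            cluster, stack = [], [(r0, c0)]
            seen.add((r0, c0))
            while stack:
                r, c = stack.pop()
                cluster.append((r, c))
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= rr < rows and 0 <= cc < cols
                            and stat_image[rr][cc] > u_clus
                            and (rr, cc) not in seen):
                        seen.add((rr, cc))
                        stack.append((rr, cc))
            clusters.append(sorted(cluster))
    return clusters

def retain_large_clusters(clusters, k):
    """Step 2: keep only clusters with more than k voxels."""
    return [c for c in clusters if len(c) > k]

# Hypothetical t-map: one 4-voxel cluster and two isolated voxels
t_map = [
    [0.2, 3.5, 3.9, 0.1, 0.0],
    [0.1, 3.2, 4.1, 0.3, 3.4],
    [0.0, 0.2, 0.4, 0.1, 0.2],
    [3.3, 0.1, 0.2, 0.0, 0.1],
]
clusters = find_clusters(t_map, u_clus=3.1)
retained = retain_large_clusters(clusters, k=2)
print(len(clusters), retained)
```

Only the cluster as a whole is declared significant. The procedure says nothing about which of its voxels are truly active.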
Threshold-free cluster enhancement (TFCE)
The integral M (blue area) must be above some threshold
combines information about intensity (how large the t statistics are) and spatial extent (how many voxels there are)
What are people doing?
Uncorrected thresholds: many studies use them, especially when the sample size is small, because correcting for multiple comparisons would decrease the power
Extent threshold: use an arbitrary threshold and retain clusters of k contiguous active voxels (e.g. p<0.001, 10 contiguous voxels)
but this is problematic because false positives form contiguous regions of multiple voxels due to smoothness
the thresholds used are the default ones in each package (!)
in each blob (orange, blue) there is at least one area that is significantly activated, but we do not know which one it is.
False discoveries: most (45-70%) activated maps are not truly active (Wood et al. 2014)
uncorrected thresholds must be set at 0.001
MTI fMRI lecture

in the linear combination the predictors are the different conditions
e.g. x1 is the predicted signal upon presentation of words
x2 is the predicted signal upon presentation of nonsense stimuli
β1 is the predictor estimate, which tells you how much the BOLD signal in one voxel increases in response to sentences
β2 is the predictor estimate that tells you how much the signal increases in response to nonsense words
β1 and β2 are weights, meaning that if a voxel responds more to words than to nonsense stimulation, β1 should be greater than β2
Analysis: the approach that works

1. find the beta weights that best approximate a voxel's signal
the best approximation is the one with the least error
2. compare the beta weights for different predictors (sentences vs nonsense words)
in a GLM we know
the design matrix
the BOLD signal
and we look for the beta weights
we look for the beta weights that give the best approximation, that is, the one that minimizes the sum of the squared errors
in the GLM we also include as predictors six predictors for head motion and predictors for time derivatives, which enable us to shift our approximation one time point back or forward.
that is because not all participants have the same hemodynamic response function
GLM
1. Extract the signal time series from a given voxel
2. Run a GLM (with the signal and the design matrix as input)
3. Compute the sum of squared errors (SSE)
4. Define the contrast and test it (with a t value)
We repeat that for every voxel
significant voxels are those whose t values are unlikely to have arisen by chance
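The four steps can be sketched for one simulated voxel with two predictors. This is a simplified illustration under stated assumptions: the predictors are plain boxcars (no HRF convolution, no motion or derivative regressors), there is no intercept because the simulated baseline is zero, and the true weights 2.0 and 0.5 are invented for the demo.

```python
import math
import random

def fit_glm_two_predictors(x1, x2, y):
    """Least-squares fit of y = b1*x1 + b2*x2 + error via the normal equations."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * v for a, v in zip(x1, y))
    s2y = sum(b * v for b, v in zip(x2, y))
    det = s11 * s22 - s12 * s12
    inv = [[s22 / det, -s12 / det], [-s12 / det, s11 / det]]   # (X'X)^-1
    b1 = inv[0][0] * s1y + inv[0][1] * s2y
    b2 = inv[1][0] * s1y + inv[1][1] * s2y
    sse = sum((v - b1 * a - b2 * b) ** 2 for a, b, v in zip(x1, x2, y))
    return (b1, b2), sse, inv

def contrast_t(betas, sse, inv, contrast, n_scans):
    """t value for a contrast c'beta, with n_scans - 2 degrees of freedom."""
    sigma2 = sse / (n_scans - 2)
    effect = sum(c * b for c, b in zip(contrast, betas))
    c_inv_c = sum(contrast[i] * inv[i][j] * contrast[j]
                  for i in range(2) for j in range(2))
    return effect / math.sqrt(sigma2 * c_inv_c)

# Step 1: a simulated voxel time series (stands in for extracted data)
x1 = ([1] * 10 + [0] * 30) * 2            # condition 1 blocks (e.g. words)
x2 = ([0] * 20 + [1] * 10 + [0] * 10) * 2  # condition 2 blocks (e.g. nonsense)
rng = random.Random(1)
y = [2.0 * a + 0.5 * b + rng.gauss(0.0, 0.5) for a, b in zip(x1, x2)]

# Steps 2-3: fit the GLM and get the SSE
betas, sse, inv = fit_glm_two_predictors(x1, x2, y)

# Step 4: test the contrast condition 1 > condition 2
t_value = contrast_t(betas, sse, inv, contrast=(1, -1), n_scans=len(y))
print([round(b, 2) for b in betas], round(sse, 2), round(t_value, 2))
```

In a whole-brain analysis the same design matrix is reused and only `y` changes from voxel to voxel.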
RFX using Summary statistics

fixed effect: models the mean
random effects: models the variance. random university effect
1. a different GLM for each subject, or multiple subject-separable GLMs
2. a contrast vector for each subject
3. feed the contrast vector to the GLM
subjects should have identical design matrices
balanced design matrices at the first level
fewer voxels are deemed active compared to an average-effect analysis. Why? Because it takes into account between-subject variability
Study random effects with:
1. Summary statistics: we take the sample mean for each subject
2. Maximum likelihood estimators: we take the observed mean
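A minimal sketch of the summary-statistics approach at one voxel: each subject contributes a single first-level contrast estimate, and the second level is a one-sample t-test across subjects. The eight contrast values are invented for illustration.

```python
import math
from statistics import mean, stdev

def random_effects_t(subject_contrasts):
    """One-sample t-test on per-subject contrast estimates.

    The standard error uses the between-subject standard deviation,
    so the result generalizes to the population of subjects.
    """
    n = len(subject_contrasts)
    group_mean = mean(subject_contrasts)
    standard_error = stdev(subject_contrasts) / math.sqrt(n)
    return group_mean, group_mean / standard_error, n - 1

# Hypothetical first-level contrast estimates for 8 subjects
subject_contrasts = [0.8, 1.2, 0.5, 1.5, 1.0, 0.4, 1.1, 0.9]
group_mean, t_value, dof = random_effects_t(subject_contrasts)
print(round(group_mean, 3), round(t_value, 2), dof)
```

The degrees of freedom depend on the number of subjects, not the number of scans. Together with the between-subject variance in the denominator, this is why fewer voxels survive than in a fixed-effects analysis.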
Smoothing:
simple smoothing by mean: replace the values of 10 voxels by their mean value
with a Gaussian kernel: weighted average for each voxel, in relation to its neighbors
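Gaussian-kernel smoothing can be sketched in one dimension. The assumptions are mine: the kernel is specified by its FWHM in voxels, truncated at a fixed radius, and renormalized at the edges of the signal. Real packages smooth in 3D, but the weighting idea is the same.

```python
import math

def gaussian_kernel(fwhm, radius):
    """Normalized Gaussian weights for offsets -radius..radius (in voxels)."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))   # FWHM to sigma
    weights = [math.exp(-0.5 * (d / sigma) ** 2)
               for d in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, kernel):
    """Replace each voxel by a weighted average of its neighbors."""
    radius = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = norm = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < len(signal):      # ignore neighbors outside the image
                acc += w * signal[j]
                norm += w
        out.append(acc / norm)
    return out

signal = [0, 0, 0, 0, 10, 0, 0, 0, 0]     # a single "active" voxel
kernel = gaussian_kernel(fwhm=3.0, radius=3)
smoothed = smooth(signal, kernel)
print([round(v, 2) for v in smoothed])

# Smoothing by the mean is the same operation with equal weights
mean_smoothed = smooth(signal, [1 / 3] * 3)
```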
RPV image: shows resels per voxel = smoothness of the data at each voxel
Bonferroni: uses the number of independent voxels
Random Field: uses the number of independent resels (resolution elements)
Convolution with the HRF:
canonical HRF: same for all brain regions
temporal basis functions: different HRFs for different brain regions
the weights of the basis functions are determined by the parameter estimates
in the GLM framework the convolution of the stimulus function with each basis function is a
column in the design matrix
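A sketch of how one such column is built, using a single basis function. The double-gamma shape and its parameters (peak shape 6, undershoot shape 16, ratio 1/6) are assumed here as a typical canonical-style HRF and are not taken from the notes. The TR of 2 s and the event onsets are also made up.

```python
import math

def double_gamma_hrf(t, peak_shape=6.0, undershoot_shape=16.0, ratio=1.0 / 6.0):
    """Assumed canonical-style HRF: a gamma peak minus a scaled gamma undershoot."""
    if t <= 0:
        return 0.0
    peak = t ** (peak_shape - 1) * math.exp(-t) / math.gamma(peak_shape)
    under = (t ** (undershoot_shape - 1) * math.exp(-t)
             / math.gamma(undershoot_shape))
    return peak - ratio * under

def convolve(stimulus, kernel):
    """Discrete convolution, truncated to the length of the stimulus."""
    n = len(stimulus)
    out = [0.0] * n
    for i, s in enumerate(stimulus):
        if s == 0:
            continue
        for k, h in enumerate(kernel):
            if i + k < n:
                out[i + k] += s * h
    return out

tr = 2.0                                   # seconds per scan (assumed)
n_scans = 40
hrf = [double_gamma_hrf(k * tr) for k in range(16)]   # HRF sampled over 0-30 s

stimulus = [0] * n_scans                   # stimulus function: 1 at event onsets
for onset_scan in (2, 20):
    stimulus[onset_scan] = 1

regressor = convolve(stimulus, hrf)        # one column of the design matrix
print([round(v, 3) for v in regressor[:12]])
```

With several basis functions (for example the canonical HRF plus its temporal derivative), the same stimulus function is convolved with each one, giving one column per basis function.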
Finite Impulse response: one HRF for each event type in each voxel
Methods of convolution of HRF:

1. canonical HRF: fixed HRF

2. canonical HRF+ temporal derivative

3. canonical HRF+ temporal derivative+ dispersion derivative

4. FIR

5. smooth FIR

6. Inverse Logit Model
parametric modulation: a stimulus property is varied parametrically across repetitions, and this might be reflected in the neuronal response. Parametric modulators are used to model trial-to-trial variation
we account for it by including an additional regressor in the design matrix
Temporal autocorrelation:
nearby time points are positively correlated because of
physiological noise
drift
Modeling of noise:
AR(p)
ARMA(1,1)
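As an illustration of the simplest case, AR(p) with p = 1, the sketch below simulates noise where each time point equals phi times the previous one plus white noise, then estimates the lag-1 autocorrelation. The value phi = 0.4, the series length and the seeds are arbitrary choices for the demo.

```python
import random

def simulate_ar1(n, phi, sigma=1.0, seed=0):
    """Simulate e[t] = phi * e[t-1] + w[t], with w[t] ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    series = [rng.gauss(0.0, sigma)]
    for _ in range(n - 1):
        series.append(phi * series[-1] + rng.gauss(0.0, sigma))
    return series

def lag1_autocorrelation(series):
    """Sample correlation between each time point and the next one."""
    m = sum(series) / len(series)
    num = sum((a - m) * (b - m) for a, b in zip(series[:-1], series[1:]))
    den = sum((a - m) ** 2 for a in series)
    return num / den

ar_noise = simulate_ar1(n=2000, phi=0.4)
white_noise = simulate_ar1(n=2000, phi=0.0, seed=1)

rho_ar = lag1_autocorrelation(ar_noise)        # close to 0.4
rho_white = lag1_autocorrelation(white_noise)  # close to 0
print(round(rho_ar, 2), round(rho_white, 2))
```

Ignoring this correlation in the GLM makes the error variance estimate too optimistic, which inflates the t values.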
