
Mixed effects modelling for cognitive science: A tutorial

Michael A. Lawrence

Tutorial structure
- A simple Gaussian mixed effects model
- Slightly-less-simple Gaussian MEMs
- Binomial MEMs
- Generalized Additive MEMs
- Using MEMs for inference
- Models, parameter estimation & quantifying predictive success

A simple Gaussian model


- Data: Simple RT measured across multiple trials within each of multiple Ss
- A naive (non-MEM) model: the RT value on a given trial for a given S = Grand Mean + error, where error ~ N(0, σ_e)

A simple Gaussian MEM


- To account for variability across Ss in mean RT, incorporate a new term representing each S's deviation from the grand mean:
- The RT value on a given trial for a given S = Grand Mean + Sdev + error, where error ~ N(0, σ_e) and Sdev ~ N(0, σ_S)

The model as an algorithm


1. Set the grand mean to some value -> GM
2. For subject i: determine subject i's deviation from GM by sampling a value from N(0, σ_S) -> Sdev_i
3. For each trial j: determine the error for this trial by sampling a value from N(0, σ_e) -> e_ij
4. RT_ij = GM + Sdev_i + e_ij
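
A minimal sketch of this generative algorithm in R; the parameter values and sample sizes are hypothetical, since the slides specify none:

```r
set.seed(1)
n_subjects <- 20
n_trials   <- 100
GM  <- 400   # grand mean RT (ms); hypothetical value
s_S <- 50    # SD of subjects' deviations from GM
s_e <- 30    # SD of trial-level error

# One deviation per subject, then one error per trial:
Sdev <- rnorm(n_subjects, 0, s_S)
dat  <- expand.grid(subject = factor(1:n_subjects), trial = 1:n_trials)
dat$RT <- GM + Sdev[dat$subject] + rnorm(nrow(dat), 0, s_e)
```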

Parameters in the simple MEM


- 1 constant & 2 variances:
  - Grand mean (GM)
  - Error variance (σ_e²)
  - Variance of Ss' deviations from GM (σ_S²)

Fitting the model to a given set of data involves estimating the value of these 3 parameters.
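
For example, a minimal sketch of such a fit, assuming the lme4 package (the slides name no software) and the simulated `dat` from above:

```r
library(lme4)

# "(1 | subject)" estimates the variance of per-subject deviations
# from the intercept (the grand mean):
fit <- lmer(RT ~ 1 + (1 | subject), data = dat)
summary(fit)  # reports the estimated GM, sigma_S^2, and sigma_e^2
```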

Slightly more interesting data


- Data: Simple RT measured across multiple trials within each of multiple Ss in each of 2 conditions
- Multiple MEMs are possible; they are presented below in order of increasing complexity...

Constant condition effect


- The RT value on a given trial for a given S in a given condition = GM + Sdev + Condition effect + error
- Parameters: 2 constants (GM, CE) and 2 variances (σ_e², σ_S²)

By-Ss varying condition effect


- The RT value on a given trial for a given S in a given condition = GM + Sdev + Condition effect + Sdev_CE + error
- Parameters: 2 constants (GM, CE) and 3 variances (σ_e², σ_S², σ_SCE²)

By-Ss varying condition effect


- With 2 variances associated with Ss, we must also specify their covariance (ρ_S-SCE):
  - the correlation between Ss' deviations from the grand mean and Ss' deviations from the overall condition effect
  - i.e. the correlation between Ss' means and Ss' condition effects

Possible MEMs
- No by-Ss condition effect variance: GM, CE, σ_e², σ_S²
- By-Ss condition effect variance, fixing ρ_S-SCE at zero: GM, CE, σ_e², σ_S², σ_SCE²
- By-Ss condition effect variance, estimating ρ_S-SCE: GM, CE, σ_e², σ_S², σ_SCE², ρ_S-SCE
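
A minimal sketch of these three models in lme4 syntax (the slides show no code; the `condition` column is assumed):

```r
# condition is assumed to be a two-level factor; code it +/- 0.5 so
# that lme4's "||" syntax below fully decorrelates the terms:
dat$cond <- ifelse(dat$condition == levels(dat$condition)[1], -0.5, 0.5)

# No by-Ss condition effect variance:
m_a <- lmer(RT ~ cond + (1 | subject), data = dat)

# By-Ss condition effect variance, correlation fixed at zero:
m_b <- lmer(RT ~ cond + (cond || subject), data = dat)

# By-Ss condition effect variance, correlation estimated:
m_c <- lmer(RT ~ cond + (cond | subject), data = dat)
```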

2 predictor variables
- Data: RT measured across multiple trials within each of multiple Ss in each of 2 conditions crossed with 2 sets of instructions

2 predictor variables
- Most complicated model (see the sketch below):
  - Constants: GM, CE, IE, CE:IE
  - Variances: σ_e², plus by-Ss variances σ_S², σ_SCE², σ_SIE², σ_SCE:IE²
  - Covariances: all six pairwise correlations among the four by-Ss effects (S, SCE, SIE, SCE:IE), i.e. a full 4×4 by-Ss covariance matrix
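
A sketch of this maximal model in lme4 syntax, again with assumed variable names and numeric coding of both predictors:

```r
# Numeric +/- 0.5 coding of the assumed two-level instruction factor:
dat$instr <- ifelse(dat$instruction == levels(dat$instruction)[1], -0.5, 0.5)

# "(cond * instr | subject)" estimates the full 4x4 by-subject
# covariance matrix over intercept, cond, instr, and cond:instr:
m_max <- lmer(RT ~ cond * instr + (cond * instr | subject), data = dat)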

Benefits of MEMs so far

- No assumption of sphericity; covariance matrices are explicitly estimated
- Permit models to account for variance at the level of the raw data:
  - more power to evaluate effects
  - more power to evaluate correlations
  - more accuracy in evaluating correlations (shared error variance)

Additional benefits

- Multiple random effects
  - random effect = a variable over which non-error variance manifests, but you are uninterested in the specific effect of a specific level
  - e.g. σ_S², σ_SCE², σ_Item²
  - permits more powerful inference by accounting for even more variance in the data (see the sketch below)
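
A sketch of crossed subject and item random effects in lme4 syntax (the `item` column is assumed):

```r
# Subjects and items as crossed random effects; each gets its own
# deviation from the grand mean, and subjects also vary in the
# condition effect:
m_items <- lmer(RT ~ cond + (cond | subject) + (1 | item), data = dat)
```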

More benefits: binomial data

- Dixon (2008) & Jaeger (2008): proportions are evil
- MEMs can account for the binomial nature of binomial data: replace Gaussian error (σ_e) with random binomial sampling

The binomial model as an algorithm


1. Set the grand mean to some value on a continuous (-∞, ∞) scale of bias -> GM
2. For subject i: determine subject i's deviation from GM by sampling a value from N(0, σ_S) -> Sdev_i
3. For each trial j: sample from a binomial (1/0) distribution with p(1) = inv.logit(GM + Sdev_i)
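
A minimal sketch of this algorithm in R; base R's plogis() plays the role of inv.logit, and all parameter values are hypothetical:

```r
GM   <- 0.5                              # grand mean on the logit (bias) scale
s_S  <- 1                                # SD of subjects' deviations
Sdev <- rnorm(n_subjects, 0, s_S)
p1   <- plogis(GM + Sdev[dat$subject])   # per-trial probability of a "1"
dat$response <- rbinom(nrow(dat), 1, p1)

# Fitting mirrors the Gaussian case, with glmer() and a binomial
# family replacing the Gaussian error term:
fit_bin <- glmer(response ~ 1 + (1 | subject), data = dat, family = binomial)
```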

Exciting features of binomial MEMs


- One-step application of Signal Detection Theory

[Figure: SDT illustration: two overlapping probability density curves along a "Signal-ness" axis]

Exciting features of binomial MEMs

- One-step application of Signal Detection Theory
  - Dissociate effects on response bias vs effects on discriminability
  - Simply analyze response (not accuracy) and include reality as a predictor variable (see the sketch below):
    - effects that do not include the reality variable reflect effects on response bias
    - effects that include the reality variable reflect effects on discriminability
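
A sketch of this one-step SDT analysis as a binomial MEM; the variable names are assumptions, and the probit link (a common choice for SDT, though not named in the slides) puts estimates on the classic d' scale:

```r
# reality codes whether a signal was actually present on each trial:
dat$reality <- ifelse(dat$signal_present, 0.5, -0.5)

fit_sdt <- glmer(
  response ~ reality * cond + (reality * cond | subject),
  data = dat, family = binomial(link = "probit")
)
# Terms without reality (intercept, cond) reflect response bias;
# terms with reality (reality, reality:cond) reflect discriminability.
```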

Exciting features of binomial MEMs


- One-step analysis of psychometric function features
  - ex. temporal order judgements (TOJs)

[Figure: TOJ psychometric function: probability of saying "Square first" (0 to 1) as a function of SOA (s)]

Exciting features of binomial MEMs


- One-step analysis of psychometric function features
  - ex. temporal order judgements
  - explicitly model binomial responses as forming a psychometric function parameterized by bias and sensitivity
  - explicitly evaluate effects of manipulations on bias and sensitivity independently (see the sketch below)
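
A sketch of a TOJ psychometric function as a binomial MEM, with assumed variable names:

```r
# The soa slope indexes sensitivity; the intercept indexes bias
# (the point of subjective simultaneity); interactions with cond
# test condition effects on each independently:
fit_toj <- glmer(
  said_square_first ~ soa * cond + (soa * cond | subject),
  data = dat, family = binomial
)
```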

Exciting features of binomial MEMs


- One-step analysis of psychometric function features
  - ex. temporal order judgements
  - ex. speed-accuracy trade-off (SAT) curves, etc.

Super exciting extension of MEMs


- The problem of continuous data:
  - Treat it as linear: excellent power to detect linear effects; poor power to detect non-linear effects
  - Treat it as categorical: mediocre power to detect non-linear effects; mediocre power to detect linear effects

Super exciting extension of MEMs


- Solution to continuous data: Generalized Additive Modelling (GAM)
  - GAM flexibly models both linear and non-linear effects
  - GAM balances functional complexity by optimizing prediction of future data
  - GAMEM (aka GAMM) simply extends GAM to hierarchical designs (see the sketch below)
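
A sketch of a GAMM, assuming the mgcv package (the slides name no software) and hypothetical variable names:

```r
library(mgcv)

# s(soa, by = condition) fits a separate smooth of SOA within each
# condition; the bs = "fs" term adds per-subject "factor smooths"
# (subject must be a factor), playing the role of random effects:
fit_gamm <- gam(
  error ~ condition + s(soa, by = condition) + s(soa, subject, bs = "fs"),
  data = dat, family = binomial
)
```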

[Figure: GAMM fits from Lawrence & Klein (in press, JEP: General): response time (ms) and log-odds of error as smooth functions of SOA (ms), for Contingent vs Noncontingent panels and +dB vs 0dB intensity]

An intriguing application

- Resolving the problem of RT data:
  - RT data are not Gaussian
  - the shape of RT distributions is known to be affected by experimental manipulations
- Label each RT from each participant in each condition with its percentile, then add percentile as a predictor variable that can interact with condition (see the sketch below)
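
A sketch of the percentile-labelling step in base R, with assumed column names:

```r
# Within each subject x condition cell, convert each RT to its
# empirical percentile:
dat$pctile <- ave(
  dat$RT, dat$subject, dat$condition,
  FUN = function(x) rank(x) / length(x)
)
# pctile can then enter a GAMM smooth, e.g. s(pctile, by = condition)
```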

[Figures: RT (ms) by quantile for congruent vs incongruent flanker trials; flanker effect (ms) by quantile; distribution shape (ms) by flanker congruency]

Using MEMs for scientific inference

- p-values are hard to compute in MEMs
- p-values are not what we think they are (Haller & Krauss, 2002)
- Likelihood ratios are what we want: metrics of evidence (Johansson, 2011)

The likelihood ratio

[Figure: likelihood illustration: a Gaussian probability density curve over "Height", evaluated at the observed data points]

The likelihood ratio


- Compute p(data | model 1) ∝ L(model 1 | data)
- Compute p(data | model 2) ∝ L(model 2 | data)
- Compute the ratio of these likelihoods
  - most easily computed on a log-transformed scale
  - most nicely represented on the log-base-2 scale: bits of evidence for/against one model relative to another (see the sketch below)
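
Given two fitted models m1 and m2 (e.g. lme4 fits like those above), a one-line sketch of the log-base-2 ratio in R:

```r
# Bits of evidence for m2 over m1:
bits <- (as.numeric(logLik(m2)) - as.numeric(logLik(m1))) / log(2)
```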

Correcting for complexity


- Problem: the models have parameters that are estimated from the data, and the models differ in the number of parameters
  - Model 1 had 2 parameters: mean & variance
  - Model 2 had 3 parameters: 2 means & 1 variance
- Model 2 may be fitting better simply because it has more flexibility to spuriously fit error

Correcting for complexity


- Gold-standard solution: cross-validation

The likelihood ratio

[Figure: likelihood illustration revisited: density curves over "Height" with data points progressively held out, as in cross-validation]

Correcting for complexity


- Gold-standard solution: cross-validation
  - ensures that model comparison focuses on models' ability to predict new data (i.e. the aim of science)
- Easier approximation: Akaike's Information Criterion (AIC)
  - asymptotically equivalent to CV (see the sketch below)
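
A sketch of the AIC-based correction in R, for two fitted models m1 and m2 (as in the inference examples that follow); since AIC = 2k - 2*logLik, a rescaled AIC difference yields complexity-corrected bits of evidence:

```r
AIC(m1, m2)  # side-by-side AIC for both models

# Complexity-corrected bits of evidence for m2 over m1:
bits_corrected <- (AIC(m1) - AIC(m2)) / (2 * log(2))
```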

MEMs for inference


- Model 1: RT ~ GM + Sdev + error
- Model 2: RT ~ GM + Sdev + error + CE
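
A sketch of this comparison in lme4 syntax; REML = FALSE requests maximum-likelihood fits, which are required when the compared models differ in fixed effects:

```r
m1 <- lmer(RT ~ 1 + (1 | subject), data = dat, REML = FALSE)
m2 <- lmer(RT ~ cond + (1 | subject), data = dat, REML = FALSE)
anova(m1, m2)  # likelihood ratio plus AIC for both models
```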

MEMs for inference


- Model 1: RT ~ GM + Sdev + error + CE + IE
- Model 2: RT ~ GM + Sdev + error + CE + IE + CE:IE
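
The same recipe isolates the interaction; a sketch with the assumed numeric codings from earlier:

```r
m1 <- lmer(RT ~ cond + instr + (1 | subject), data = dat, REML = FALSE)
m2 <- lmer(RT ~ cond * instr + (1 | subject), data = dat, REML = FALSE)
anova(m1, m2)
```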
