Tom Gijssels | 26 Mar 19:15

Conflicting p-values from pvals.fnc

Dear R-langers,

I'm trying to run a mixed-effects model using the lmer() function and have run into some issues in interpreting the p-values generated by pvals.fnc(). The design is a between-subjects design with two fixed effects (condition and block, each with two levels) and one random effect (subject). Additionally, I have a set of weights that I want to include.

When looking at the pvals.fnc() output, there appears to be a large discrepancy between the pMCMC values and the t-statistic p-values. Whereas one of the main effects and the interaction are far from significant judging by the pMCMC values, they are highly significant when looking at the t-statistic p-values (e.g. Condition: pMCMC = 0.2294, Pr(>|t|) = 0.0000; Condition*Block: pMCMC = 0.3296, Pr(>|t|) = 0.0000). I have read that the t-statistic-based p-values are less conservative, but the difference between these two values seems really extreme.

Below is some code that simulates the model and the data. The original data set has two specific characteristics that might influence the results, so I tried to simulate those characteristics in the mock data: 1) there are fewer observations in block A than in block B; and 2) the weights for observations in block A are generally lower than those for block B. Running this code reproduces the original observation of conflicting pMCMC and t-test p-values. However, when the weights argument is excluded from the lmer model, these values seem to converge, suggesting that the weights specification might be underlying these problems.

In short, my question is whether anyone knows why these values diverge and what I could do to address this issue.

Many thanks in advance!
Tom

library(lme4)
library(languageR)  # provides pvals.fnc()

# Unbalanced design: 20 observations in block a, 200 in block b
block <- as.factor(c(rep('a', times = 20), rep('b', times = 200)))
condition <- as.factor(c(rep(c('x', 'y'), each = 10), rep(c('x', 'y'), each = 100)))
contrasts(block) <- c(-0.5, 0.5)
contrasts(condition) <- c(-0.5, 0.5)
subject <- c(rep(1:4, each = 5), rep(1:4, each = 50))

# Effect sizes and noise
intercept <- 100
block.me <- 20
condition.me <- 30
err <- rnorm(length(block), sd = 20)

# Lower weights in block a than in block b
weights <- c(rep(1, times = 20), rep(10, times = 200))

y <- intercept + ifelse(block == 'a', block.me, 0) +
  ifelse(condition == 'x', condition.me, 0) +
  ifelse(block == 'a' & condition == 'x', 30, 0) +
  (subject * 10) + err

fm.1 <- lmer(y ~ block * condition + (1 | as.factor(subject)), weights = weights, REML = FALSE)
fm.1.mcmc <- pvals.fnc(fm.1, addPlot = FALSE)

Sverre Stausland | 19 Mar 17:37

Simpler model with random slopes

Hi all,

this question is ultimately based on Florian's lecture1 slides here:
http://hlplab.wordpress.com/2010/05/10/mini-womm/

I'm doing a mixed model logistic regression, with random intercepts
for items and random slopes for items with respect to the fixed effect
Indep2 (cf. slide 85):

(a) glmer(formula = Dep ~ 1 + (1 | Item) + (0 + Indep2 | Item) +
Indep1 + Indep2, data = my.data, family = binomial(link = "logit"))

As per slide 88, I can also reduce the random effects to (1 + Indep2 | Item):

(b) glmer(formula = Dep ~ 1 + (1 + Indep2 | Item) + Indep1 + Indep2,
data = my.data, family = binomial(link = "logit"))

It's not exactly clear to me what (1 + Indep2 | Item) does, since the
output of both (a) and (b) includes random intercepts for items and
random slopes for items by Indep2. At the same time, model (a) and (b)
differ in their exact estimates.

I would appreciate if someone could explain what the difference
between model (a) and (b) is.

Thanks
Sverre
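
A minimal sketch of what the two random-effects specifications in (a) and (b) assume, using base R only and made-up values (`sd_int`, `sd_slope`, `rho`, and `n_items` are illustrative assumptions, not estimates from any model in this thread). As far as the lme4 formula syntax goes, (a) with a numeric Indep2 treats the by-item intercepts and slopes as uncorrelated, whereas (b) additionally estimates the correlation between them, which would account for the slightly different estimates. If Indep2 is a factor, `(0 + Indep2 | Item)` behaves differently, so the sketch covers only the numeric case.

```r
set.seed(42)
n_items  <- 2000   # hypothetical number of items
sd_int   <- 1.0    # assumed SD of by-item intercepts
sd_slope <- 0.5    # assumed SD of by-item slopes
rho      <- -0.6   # assumed intercept-slope correlation, used in (b) only

# (a)-style structure: intercepts and slopes drawn independently
re_a <- cbind(intercept = rnorm(n_items, 0, sd_int),
              slope     = rnorm(n_items, 0, sd_slope))

# (b)-style structure: intercepts and slopes drawn from a joint distribution
Sigma <- matrix(c(sd_int^2,                rho * sd_int * sd_slope,
                  rho * sd_int * sd_slope, sd_slope^2), nrow = 2)
z    <- matrix(rnorm(2 * n_items), ncol = 2)
re_b <- z %*% chol(Sigma)   # rows now have covariance Sigma
colnames(re_b) <- c("intercept", "slope")

cor_a <- cor(re_a)[1, 2]    # close to 0
cor_b <- cor(re_b)[1, 2]    # close to rho
print(round(c(uncorrelated = cor_a, correlated = cor_b), 2))
```

Model (b) therefore has one more parameter than (a), and the two can be compared with a likelihood-ratio test if the question is whether the correlation is needed.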

Goldberg, Ariel M | 20 Feb 21:15

Questions about reporting mixed-effects results

Dear all,

I have a few questions about how to report the results of mixed-effects analyses for publication. I have
been perusing the Jaeger & Kuperman presentation but a few questions remain.  

I have been asked by the reviewers to include a full regression table, which I take to comprise coefficient
estimates, MCMC-based confidence intervals and MCMC-based p-value estimations.
- Should the model that I use to report these values contain uncentered predictors, centered predictors,
or centered and scaled predictors?
- A few of my models involve random intercepts, and I believe that pvals.fnc() is not currently defined for
models with random intercepts.  Do you have any suggestions for how I should report these models?

My models contain many control variables and only one or two variables that I am actually concerned with.  As
such, I have not worried about multicollinearity among the control variables.  I suppose I should just
state this somewhere to facilitate the interpretation of the regression tables?

Lastly, is there any way to do a power analysis for mixed-effects models?  One reviewer asked whether this
was possible and noted that there may be rough approximations such as "the t-approximation to the
coefficient-wise test".

Thank you!
Ariel

T. Florian Jaeger | 28 Jan 19:27

negative deviances (again)

Hi,


I am writing because I'm trying to run an lmer model and I keep getting negative deviances (and positive log-likelihoods, etc.). I've reinstalled R and updated all packages:

platform       x86_64-pc-mingw32            
arch           x86_64                       
os             mingw32                      
system         x86_64, mingw32              
status                                      
major          2                            
minor          14.1                         
year           2011                         
month          12                           
day            22                           
svn rev        57956                        
language       R                            
version.string R version 2.14.1 (2011-12-22)

lme4 version 0.999375-42

But the problem persists. It's not due to the data set. For example, I've rerun a simple model from Harald's languageR library (on the data set lexdec):

library(languageR)
data(lexdec)
lmer(RT ~ Frequency + (1 | Subject), lexdec)

Linear mixed model fit by REML 
Formula: RT ~ Frequency + (1 | Subject) 
   Data: lexdec 
    AIC    BIC logLik deviance REMLdev
 -858.4 -836.8  433.2   -880.9  -866.4
[snip]

Linear mixed model fit by REML 
Formula: RT ~ Frequency + Trial + (1 | Subject) 
   Data: lexdec 
    AIC  BIC logLik deviance REMLdev
 -846.1 -819  428.1   -887.2  -856.1
[snip]

The likelihood seems to develop in the expected (inverted) direction and so does the deviance estimate for the maximum REML model (REMLdev). Has anyone on this list been able to resolve this? I saw this behavior for version 2.13.1, but it persists after updating to 2.14.1. Do you get the same? I thought this had been resolved.

Florian
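
One possible reading, sketched below under the assumption that the response is on a log scale with a small residual standard deviation (as RT in lexdec appears to be): the log-likelihood of a continuous response is a sum of log densities, and a normal density exceeds 1 whenever the standard deviation is small enough, so a positive log-likelihood and a negative deviance need not indicate a bug. The simulated values (mean 6.4, sd 0.1) are illustrative assumptions, and `lm()` stands in for `lmer()` to keep the example self-contained.

```r
set.seed(1)
# Simulated log-RT-like response with a small spread (assumed values)
log_rt <- rnorm(200, mean = 6.4, sd = 0.1)

# A normal density at its mean is greater than 1 when sd is small
peak_density <- dnorm(0, mean = 0, sd = 0.1)

fit <- lm(log_rt ~ 1)
ll  <- as.numeric(logLik(fit))  # sum of log densities at the ML estimates
dev <- -2 * ll                  # deviance on the -2 * log-likelihood scale

print(c(peak_density = peak_density, logLik = ll, deviance = dev))
```

Under this reading, only differences in deviance (or AIC/BIC) between models fitted to the same data are interpretable, and those are unaffected by the sign.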

Angel Tabullo | 17 Jan 16:07

questions about logit mixed model with R

(I am resending this message because it seems it was not sent properly the last time.)

In my experiment, subjects were exposed to artificial languages with different word orders (two of them frequent among world languages: SOV, SVO; and two of them infrequent: VSO, OSV). After training, subjects had to classify new sentences as "correct" or "incorrect", according to what they had learned. Sentences could either be correct, contain a syntax violation, or contain a semantic violation (a mismatch between a scene and the sentence describing it). Dependent variables were response latency and accuracy (right or wrong answer). I'm trying to analyze the accuracy data (1 = right answer, 0 = wrong answer) using a mixed logit model with "word order" (OSV, SVO, SOV, VSO) and "type of sentence" (correct, semantic violation, syntax violation) as fixed factors, and subject as a random factor. Word order is a between-subjects variable, while type of sentence is a repeated-measures factor.

My questions are:

1) In order to contrast each level of each factor with all the others, as well as their interactions, should I run different models changing the reference category? Does this mean I should run 4 x 3 = 12 models?
2) Would it be correct to compare interaction levels with post hoc Tukey contrasts (for instance: OSV correct vs. OSV semantic violation, SVO correct vs. OSV correct, and so on)?
3) How do I interpret a significant interaction? For instance:

ModeloAngel = lmer(respuest=="1" ~ 1 + grupo * tipoF + (1|sujeto), data=DatosAngel, family="binomial") 

Fixed effects:
                       Estimate Std. Error z value Pr(>|z|)    
(Intercept)             1.79585    0.19196   9.356  < 2e-16 ***
grupoOSV                0.25816    0.26740   0.965   0.3343    
grupoSOV                0.70875    0.29315   2.418   0.0156 *  
grupoSVO                0.59607    0.26769   2.227   0.0260 *  
tipoFVsemanti          -1.01756    0.14765  -6.892 5.51e-12 ***
tipoFVsintact          -1.46088    0.14566 -10.029  < 2e-16 ***
grupoOSV:tipoFVsemanti -0.29214    0.20841  -1.402   0.1610    
grupoSOV:tipoFVsemanti -0.39714    0.23265  -1.707   0.0878 .  
grupoSVO:tipoFVsemanti  0.03181    0.21459   0.148   0.8821    
grupoOSV:tipoFVsintact  0.83284    0.21107   3.946 7.95e-05 ***
grupoSOV:tipoFVsintact  0.42079    0.23408   1.798   0.0722 .  
grupoSVO:tipoFVsintact  0.16667    0.21136   0.789   0.4304    

If the reference levels are VSO and "correct", does this mean that performance of OSV in syntax-violation trials is better than that of VSO in syntax-violation trials? Or does it mean that OSV performance on syntax violations is better than VSO performance on "correct" trials?
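
A short worked calculation on question 3, assuming the default treatment coding with VSO and "correct" as reference levels. It uses only the coefficients printed in the table above; the variable names are chosen here for illustration. Under that coding, the interaction coefficient grupoOSV:tipoFVsintact is a difference of differences on the log-odds scale: how much smaller the syntax-violation penalty is for OSV than for VSO. The direct OSV vs. VSO comparison within syntax-violation trials is the sum of grupoOSV and the interaction term.

```r
# Coefficients copied from the fixed-effects table (log-odds scale)
b0    <-  1.79585   # (Intercept): VSO, correct sentences
b_osv <-  0.25816   # grupoOSV
b_syn <- -1.46088   # tipoFVsintact
b_int <-  0.83284   # grupoOSV:tipoFVsintact

# Predicted log-odds per cell (for a subject with random intercept 0)
logit <- c(VSO_correct = b0,
           VSO_syntax  = b0 + b_syn,
           OSV_correct = b0 + b_osv,
           OSV_syntax  = b0 + b_osv + b_syn + b_int)

# Interaction = (syntax effect in OSV) - (syntax effect in VSO)
diff_of_diffs <- unname((logit["OSV_syntax"] - logit["OSV_correct"]) -
                        (logit["VSO_syntax"] - logit["VSO_correct"]))

# OSV vs. VSO within syntax-violation trials = grupoOSV + interaction
osv_vs_vso_syntax <- unname(logit["OSV_syntax"] - logit["VSO_syntax"])

print(round(plogis(logit), 3))  # the same cells as predicted accuracy
print(c(diff_of_diffs = diff_of_diffs, osv_vs_vso_syntax = osv_vs_vso_syntax))
```

So neither reading in the question matches the coefficient exactly: it compares the size of the syntax-violation effect across the two groups, not two cells directly.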

Helene Kreysa | 3 Jan 10:37

Bielefeld Mixed Models Workshop 2012

Dear colleagues, 
we would like to alert you to our upcoming workshop, which we believe may be interesting for many of the users of this list:


*** BiMM 2012: Bielefeld Mixed Models Workshop ***

Mixed-effects models are a powerful tool for the statistical analysis and modelling of psycho-/linguistic data and are increasingly used in experimental and corpus-oriented work. The Bielefeld Mixed Models Workshop (BiMM 2012) offers an opportunity to gain insight into the application of mixed-effects models. Focus will be placed on combining theoretical background with practical data analysis (using R). Participants can look forward to lively discussions with and between experts about how to use, report and interpret these methods.
The workshop is targeted at students and researchers from psycho-/linguistics and related disciplines working or willing to work with mixed-effects models. No prior knowledge of these models is required. However, participants should be familiar with general statistics and R.

Place and dates: 
Bielefeld University, Germany 
*** 22-24 February 2012 ***

Topics: 
-Random effects models 
-Hierarchical Models 
-ANOVA vs. Mixed models 
-Model validation 
-Unbalanced designs 
-Outliers and data transformation, correlated predictors 
-Interpretation, visualisation and reporting

Experts: 
-Hugo Quené (Utrecht University) 
-Dale Barr (Glasgow University) 
-Holger Mitterer (Max Planck Institute for Psycholinguistics, Nijmegen) 
-Shravan Vasishth (Potsdam University) 
-Marco van de Ven (Radboud Universität)

Registration: 
To request a place, please send an email with a short description of your work and a motivation for participating in the workshop to BiMM2012-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org by 
*** 22 January 2011 ***. 
The workshop fee (50 EUR) should be paid after receiving confirmation from the organisers.

For further information please see http://www.spectrum.uni-bielefeld.de/BiMM2012/ 

With best regards,
the organizers: 
Annett Jorschick 
Helene Kreysa 
Zofia Malisz 
Andreas Windmann
Marcin Włodarczak 


*********************************************************************************************
Helene Kreysa
Post-doctoral researcher
Language and Cognition Group

Cognitive Interaction Technology (CITEC)
Room H1-132 
Morgenbreede 39
33615 Bielefeld
Germany

tel: +49 (0)521 106 12248 (office)





Xiao He | 29 Nov 07:50

Re: Collinearity and centering multi-level (more than 2 levels) fixed predictors

Hi,

I think part of my concern is Type I and Type II errors. From my
experience with my own data, Type II error is a big concern, as I've
encountered cases where reducing collinearity changes test results
from non-significant to highly significant.

A tutorial that Dr. Jaeger created on mixed-effects models does mention
that Type I error does not increase much in the presence of
collinearity. But if I recall correctly, there is an example in the
tutorial using the lexdec dataset where the presence of extremely high
collinearity causes an (?)overestimate of significance.

So in essence, without being able to reduce collinearity, I am afraid
of not being able to detect significances when there are in fact
significant differences (or vice versa), and interpreting coefficient
estimates is also problematic.

Thanks.

Best,
Xiao

Xiao He | 28 Nov 00:57

Collinearity and centering multi-level (more than 2 levels) fixed predictors

Dear r-lang users,

I have a set of binary data from a 2 by 3 design study. I centered the
two-level predictor ('LHcenter': local & LD) but did not center the
three-level predictor ('cond': A, B, & C). As you can see below in the
triangular matrix towards the end of the model output, there is
substantial collinearity; the absolute values of some of the
correlations are above 0.6.

---8<----------------------------------------------------------------------------------------------------------------------------
>glmer(true ~ cond * LHcenter + (1 | subject) + (1 | items), data=offlineTarget, family=binomial)

Generalized linear mixed model fit by the Laplace approximation
Formula: true ~ cond * LHcenter + (1 | subject) + (1 | items)
   Data: offlineTarget
   AIC   BIC   logLik deviance
  747.8 785.9 -365.9    731.8

Random effects:
 Groups   Name        Variance Std.Dev.
 items    (Intercept) 0.13838  0.37199
 subject  (Intercept) 1.44652  1.20271
Number of obs: 864, groups: items, 30; subject, 29

Fixed effects:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)    -0.09775    0.30932  -0.316  0.75199
condB           0.62445    0.25534   2.446  0.01446 *
condC           0.70552    0.27286   2.586  0.00972 **
LHcenter        4.86821    0.42482  11.459  < 2e-16 ***
condB:LHcenter -2.31392    0.51162  -4.523  6.1e-06 ***
condC:LHcenter -1.00381    0.54643  -1.837  0.06621 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) condB  condC  LHcntr cnB:LH
condB       -0.526
condC       -0.491  0.607
LHcenter    -0.085  0.130  0.118
cndB:LHcntr  0.070 -0.053 -0.093 -0.801
cndC:LHcntr  0.066 -0.093  0.033 -0.735  0.607
---8<----------------------------------------------------------------------------------------------------------------------------

I also tested whether the interaction is significant by comparing the
full model with one without the interaction term:

glmer(true ~ cond * LHcenter + (1 | subject) + (1 | items),
data=offlineTarget, family=binomial)
glmer(true ~ cond  +  LHcenter + (1 | subject) + (1 | items),
data=offlineTarget, family=binomial)

And I observed a significant difference, suggesting that there is a
significant interaction.

So I proceeded to conduct planned comparisons. To do so, I created a
new predictor in the table object by merging the two factors (a 3
level factor and a 2 level factor) together, resulting in a new factor
named 'posthoc_cond' with 6 levels: Local_A, Local_B, Local_C, LD_A,
LD_B, & LD_C

I conducted glmer WITHOUT centering the 6-level fixed predictor 'posthoc_cond'.

posthoc_result = glmer(true~posthoc_cond + (1|subject) + (1|item),
data=offlineTarget, family="binomial")

and then used glht() from 'multcomp' to conduct pairwise comparisons.

The problem with posthoc_result, again, is high collinearity (see below)
---8<----------------------------------------------------------------------------------------------------------------------------
Correlation of Fixed Effects:
            (Intr) pst_A_ p_B_LD pst_B_ p_C_LD
psthc_cndA_ -0.865
psthc_cB_LD -0.835  0.712
psthc_cndB_ -0.944  0.896  0.734
psthc_cC_LD -0.928  0.810  0.779  0.896
psthc_cndC_ -0.891  0.925  0.710  0.914  0.834
---8<----------------------------------------------------------------------------------------------------------------------------

So my question is, in my case, what can be done to reduce
collinearity. It seems that centering those multi-level predictors is
not applicable in my case since I am interested in whether different
levels of the fixed predictors have different means. Centering these
multilevel predictors would not allow me to test that.

Thank you in advance for your help!!

Best,
Xiao
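
A small base R sketch of one point that may be relevant to this question, under the assumption that a plain linear model behaves like the mixed model in this respect. The "Correlation of Fixed Effects" table describes correlations between the coefficient estimates, and with a multi-level factor those correlations depend heavily on how the factor is coded. The data, effect sizes, and names below are simulated assumptions; `lm()` stands in for `glmer()`.

```r
set.seed(1)
# Balanced three-level factor with made-up group means
d <- data.frame(cond = factor(rep(c("A", "B", "C"), each = 40)))
d$y <- rnorm(nrow(d), mean = c(A = 0, B = 0.6, C = 0.7)[as.character(d$cond)])

# Default treatment coding: every coefficient is relative to level A
fit_trt <- lm(y ~ cond, data = d)

# Helmert coding: orthogonal contrasts for the same factor
contrasts(d$cond) <- contr.helmert(3)
fit_hel <- lm(y ~ cond, data = d)

cor_trt <- cov2cor(vcov(fit_trt))  # correlations between estimates
cor_hel <- cov2cor(vcov(fit_hel))

print(round(cor_trt, 3))  # sizeable off-diagonal correlations
print(round(cor_hel, 3))  # off-diagonal correlations are zero

# Both codings describe the same fitted cell means
same_fit <- isTRUE(all.equal(fitted(fit_trt), fitted(fit_hel)))
```

In this sketch the two fits make identical predictions, so the correlations under treatment coding come from sharing a reference level, not from a problem in the data. Whether the same holds for the 6-level `posthoc_cond` model would need to be checked on the actual data.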

Sunfa Kim | 19 Nov 14:07

Re: ling-r-lang-L Digest, Vol 2, Issue 49

Dear Reinhold,

Thank you very much for your response. Please let me ask two more questions.

I started with the following Full Model.

> priming_Full_Model.lmer<-lmer(RT~cPrimeType*cCondition*cPresOrder+clogLength+cLogPresOrder+(1+cPrimeType*cCondition*cPresOrder+clogLength+cLogPresOrder|Subject)+(1+cPrimeType*cCondition*cPresOrder+clogLength+cLogPresOrder|Item),data=priming)

But the Full Model did not work. I received the following warning message.

Warning message:
In mer_finalize(ans) : iteration limit reached without convergence (9)


In order to examine variances associated with each random effect, I used the following function.

> summary(priming_Full_Model.lmer)


Then, I started removing random effects one by one (I started with the random effect with the "smallest" variance) until the model finally worked in R.

<<Question 1>>
You recommended "(1) Start with a full RE model but without correlation parameters."
How can I fit such a model (i.e., "a model without correlation parameters")?

Do I add "corr = FALSE" somewhere in the command line?

<<Question 2>>
You also recommended "(2) Test for which REs you have reliable evidence for between-subject (between-item) variance components (drop-1 tests)."
How can I carry out this test? Is there a specific function for it?

Thank you very much,

Sincerely yours,
Sunfa
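
A sketch of how steps (1) and (2) might look in lme4, using the `sleepstudy` data that ships with the package as a stand-in for the priming data (so `Reaction`, `Days`, and `Subject` are not the poster's variables). To my knowledge lmer() has no `corr = FALSE` argument. Instead, the correlation parameter is dropped by writing the intercept and each slope as separate random-effect terms. A drop-1 test is then an ordinary likelihood-ratio comparison via anova() between models that differ in one variance component.

```r
library(lme4)

# Random intercept only
m_int <- lmer(Reaction ~ Days + (1 | Subject),
              data = sleepstudy, REML = FALSE)

# Intercept and slope variances, no correlation parameter
m_zcp <- lmer(Reaction ~ Days + (1 | Subject) + (0 + Days | Subject),
              data = sleepstudy, REML = FALSE)

# Same variance components plus the intercept-slope correlation
m_cor <- lmer(Reaction ~ Days + (1 + Days | Subject),
              data = sleepstudy, REML = FALSE)

# Drop-1 test for the by-subject slope variance
lrt_slope <- anova(m_int, m_zcp)
# Test of the correlation parameter, once the variances are kept
lrt_corr  <- anova(m_zcp, m_cor)

print(lrt_slope)
print(lrt_corr)
```

Recent lme4 versions also accept the double-bar shorthand `(1 + Days || Subject)` for numeric predictors, which should be equivalent to the two separate terms. Likelihood-ratio tests on variance components are generally considered conservative because the null value lies on the boundary of the parameter space.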

On Sat, Nov 19, 2011 at 7:17 AM, <ling-r-lang-l-request-xGejAJT2w6xeRyQZnrz35w@public.gmane.org> wrote:

Message: 1
Date: Sat, 19 Nov 2011 13:01:06 +0100
From: Reinhold Kliegl <kliegl <at> uni-potsdam.de>
Subject: [R-lang] Re: Figuring out maximum random effects in
       mixed-effect    regression models
To: Reinhold Kliegl <kliegl <at> uni-potsdam.de>

Sorry for the typos ...

in (4):  "if you have NO evidence ..."
In (5): "Test whether (4) fits significantly better than (3)."

I add:
(6) If you have specific hypotheses about correlation parameters, you may be able to test just those correlation parameters and keep the rest at zero.

All tests are LRTs--with their own problems, but usually not so severe with our kind of data, I think. Finally, note that correlation parameters are linked to  the design (or model) matrix. They depend on what you pick as the intercept; they are NOT invariant to transformations of covariates, linear or non-linear.

Reinhold Kliegl

----
Reinhold Kliegl
http://read.psych.uni-potsdam.de
http://www.dlexdb.de/

On 19.11.2011, at 12:34, Reinhold Kliegl wrote:

> I would recommend a slightly different approach.
>
> (1) Start with a full RE model but without correlation parameters.
> (2) Test for which REs you have reliable evidence for between-subject (between-item) variance components (drop-1 tests)
> (3) Remove the non-significant variance components
> (4) Estimate the correlation parameters only for the remaining model. (It does not make much sense to me to estimate a correlation parameter if you have evidence for reliable variance for one of the contributing components.)
> (5)  Test whether (4) is significantly better than (4). It is actually not trivial to estimate reliable correlation parameters in our usual psycholinguistic or psychological experiments.
>
> Reinhold Kliegl

Sunfa Kim | 19 Nov 12:14

Figuring out maximum random effects in mixed-effect regression models

To whom it may concern:

I would like to use a mixed-effect regression approach to examine the adaptation effect (i.e., learning) during sentence comprehension.

Right now, I am trying to figure out the maximum random effects.

One of the models produced the following summary
>summary(priming.lmer)

 

Random effects:
 Groups   Name                             Variance Std.Dev. Corr
 Subject  (Intercept)                      19867.54 140.952
          cPrimeType                         367.20  19.162  -1.000
          cCondition                        5558.49  74.555  -1.000
          clogLength                        8128.49  90.158   0.954
          cLogPresOrder                     3033.62  55.078  -0.216
          cPrimeType:cCondition            17606.58 132.690   0.164
          cPrimeType:cCondition:cPresOrder   351.92  18.759   0.142
 Item     (Intercept)                       2870.65  53.578
          cCondition                        3391.10  58.233   0.668
          cCondition:cPrimeType             1224.53  34.993  -0.652
 Residual                                  31654.56 177.917

Baayen, Davidson, and Bates (2008) noted that "the high correlation of the intercept and slope for the subject random effects (-1.00) indicates that the model has been overparameterized" (p. 395).

So, I inspected the correlations and identified three high correlations from the table. Therefore, I simplified the model by removing the by-subject adjustments to the slopes of cPrimeType, cCondition, and clogLength.

The simplified model produced the following summary.
>summary(rev_priming.lmer)

 

Random effects:
 Groups   Name                             Variance Std.Dev. Corr
 Subject  (Intercept)                      19443.07 139.438
          cLogPresOrder                     2832.97  53.226  -0.341
          cPrimeType:cCondition             1043.35  32.301  -0.133
          cPrimeType:cCondition:cPresOrder   204.57  14.303   0.364
 Item     (Intercept)                       2935.82  54.183
          cCondition                        3545.04  59.540   0.740
          cCondition:cPrimeType             1878.62  43.343  -0.656
 Residual                                  36990.34 192.329

Now, the correlation values seem to be in the range of acceptable values (i.e., not too high).

Finally, in order to verify that the simpler model is justified, I carried out a likelihood ratio test.

> anova(priming.lmer, rev_priming.lmer)

 

                 Df   AIC   BIC logLik  Chisq Chi Df Pr(>Chisq)
priming.lmer     27 22598 22744 -11272
rev_priming.lmer 45 22474 22718 -11192 159.35     18  < 2.2e-16 ***

---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

My understanding is that this likelihood ratio test indicates that the removal of the by-subject adjustments to the slopes is NOT justified.

My question is which model should I choose to report my results? priming.lmer or rev_priming.lmer?

Thank you very much for your suggestions on this matter in advance!

Best,
Sunfa
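
For reference, the likelihood-ratio statistic in the anova() output above can be retraced by hand in base R. This is only a sketch of the arithmetic, using the numbers as printed (the log-likelihoods are rounded in the output, so the hand-computed statistic differs slightly from the reported 159.35). The variable names are chosen here for illustration.

```r
# Values copied from the anova() table
loglik_small <- -11272   # model with 27 parameters
loglik_large <- -11192   # model with 45 parameters
df_small     <- 27
df_large     <- 45

# LRT statistic: twice the difference in log-likelihood
chisq_rounded  <- 2 * (loglik_large - loglik_small)  # from rounded logLik values
chisq_reported <- 159.35                             # as printed by anova()

# Degrees of freedom: difference in number of parameters
df_diff <- df_large - df_small

# Upper-tail probability of the chi-squared reference distribution
p_value <- pchisq(chisq_reported, df = df_diff, lower.tail = FALSE)

print(c(chisq_rounded = chisq_rounded, df_diff = df_diff, p_value = p_value))
```

A p-value this small says that the model with 45 parameters fits reliably better than the one with 27, which is the sense in which the simplification is not supported by the test.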

Jason Kahn | 10 Nov 22:49

Helmert (or really any) contrasts, log-likelihood, and specificity

Hello everyone,


I'm trying to set up a model with a three-way contrast: between linguistic, non-linguistic, and control. I'd like to compare all three to each other, but for the moment, I'll focus on one set of Helmert contrasts:

  [,1] [,2]
l   -1   -1
n    1   -1
c    0    2

This should compare 1) linguistic to non-linguistic, and 2) 2 times control to the average of linguistic and non-linguistic. So far so good.

My "baseline" model includes some control predictors, and more importantly, random intercepts and slopes for subject and item. The slopes are by "condition," and in this case I've used the contrasts, as follows:

baseline<-lmer(logRT~(1+contr1+contr2|subject)+(1+contr1+contr2|item)+controlpredictors)

This, in order to specify the "maximal" structure allowed/justified by the data. So far still so good (right?).

Then I add the fixed effect contrasts:

ofInterest<-update(baseline,.~.+contr1+contr2)

The fixed effect output of this type of model indicates that, for one set of RT's, the t-value for both contrasts "should be" significant. The output for an identical model for a different set of RT's indicates that one (but not the other) "should be" significant.

To get a usable significance, I have to use log-likelihood tests, because pvals.fnc() won't work with random slopes (or is it just covariances? Regardless...). But because my model comparison includes both contr1 and contr2, the test isn't going to spit out individual significance values for each, t-values or not. Is there some way to tell when contr1 is significant, independent of the significance of contr2, in a model like the one I've described? Can I rely on those t-values at all?

I'm happy to provide more information if it would clarify anything I've muddied.

Jason
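
A base R sketch of what the two contrast columns above estimate, and of a single-parameter model comparison. The data are simulated with made-up condition means, and `lm()` stands in for `lmer()`, so the random-effects structure in the post is not represented. With this coding, the contr1 coefficient is half the difference between non-linguistic and linguistic, and the contr2 coefficient is one third of the difference between control and the average of the other two.

```r
set.seed(7)
cond <- factor(rep(c("l", "n", "c"), each = 30), levels = c("l", "n", "c"))
# Simulated log RTs with assumed condition means
y <- rnorm(90, mean = c(l = 6.0, n = 6.2, c = 6.5)[as.character(cond)], sd = 0.3)

# Contrast matrix from the post: rows are conditions, columns are contrasts
helm <- rbind(l = c(-1, -1), n = c(1, -1), c = c(0, 2))
contr1 <- unname(helm[as.character(cond), 1])
contr2 <- unname(helm[as.character(cond), 2])

full      <- lm(y ~ contr1 + contr2)
no_contr1 <- lm(y ~ contr2)   # drops only contr1

m <- sapply(split(y, cond), mean)   # observed condition means
est1 <- unname(coef(full)["contr1"])
est2 <- unname(coef(full)["contr2"])
by_hand1 <- unname((m["n"] - m["l"]) / 2)
by_hand2 <- unname((m["c"] - (m["l"] + m["n"]) / 2) / 3)

print(c(est1 = est1, by_hand1 = by_hand1, est2 = est2, by_hand2 = by_hand2))

# Test of contr1 alone: compare models that differ only in that term
print(anova(no_contr1, full))
```

For the mixed model, the analogous comparison would be between `ofInterest` and a model with only contr1 removed from the fixed effects, for example something like `update(ofInterest, . ~ . - contr1)`, keeping the random slopes unchanged. Whether that comparison is well behaved with the random slope for contr1 still in the model is a judgment call.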
