Calculating the repeatability of effects from an lmer model



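I just came across this paper, which describes how to compute the repeatability (a.k.a. reliability, a.k.a. intraclass correlation) of a measurement via mixed effects modelling. The R code would be: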

#load lme4, which provides lmer() and VarCorr()
library(lme4)

#fit the model
fit = lmer(dv~(1|unit),data=my_data)

#obtain the variance estimates
vc = VarCorr(fit)
residual_var = attr(vc,'sc')^2
intercept_var = attr(vc$unit,'stddev')[1]^2 #VarCorr entries are named after the grouping factor, here 'unit'

#compute the unadjusted repeatability
R = intercept_var/(intercept_var+residual_var)

#compute n0, the repeatability adjustment
n = as.data.frame(table(my_data$unit))
k = nrow(n)
N = sum(n$Freq)
n0 = (N-(sum(n$Freq^2)/N))/(k-1)

#compute the adjusted repeatability
Rn = R/(R+(1-R)/n0)
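
As a sanity check on the adjustment step, here is a minimal base-R sketch that wraps the two formulas above in functions and applies them to a toy balanced design. The function names `n0_adjust` and `extrapolated_repeatability`, and the toy data, are hypothetical and only for illustration. In a balanced design, n0 should reduce to the number of observations per unit.

```r
# n0 from the vector of unit labels (same formula as above)
n0_adjust <- function(unit) {
  counts <- table(unit)          # observations per unit
  k <- length(counts)            # number of units
  N <- sum(counts)               # total number of observations
  (N - sum(counts^2) / N) / (k - 1)
}

# Repeatability of a mean of n0 measurements, given the
# repeatability R of a single measurement (same formula as above)
extrapolated_repeatability <- function(R, n0) {
  R / (R + (1 - R) / n0)
}

# Toy balanced design: 3 units with 4 observations each (made-up labels)
toy_unit <- rep(c("a", "b", "c"), each = 4)
n0_balanced <- n0_adjust(toy_unit)

# With R = 0.5 for single measurements, the mean of n0 measurements
# is more repeatable than a single measurement
Rn_toy <- extrapolated_repeatability(0.5, n0_balanced)
print(c(n0 = n0_balanced, Rn = Rn_toy))
```

Here n0 comes out as 4, the per-unit sample size, and Rn is 0.5/(0.5 + 0.5/4) = 0.8.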

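I believe this approach can also be used to compute the reliability of effects (i.e. the sum contrast effect of a variable with 2 levels), as in: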

#make sure the effect variable has sum contrasts
contrasts(my_data$iv) = contr.sum

#fit the model
fit = lmer(dv~(iv|unit)+iv,data=my_data)

#obtain the variance estimates
vc = VarCorr(fit)
residual_var = attr(vc,'sc')^2
effect_var = attr(vc$unit,'stddev')[2]^2 #second entry is the random slope for iv

#compute the unadjusted repeatability
R = effect_var/(effect_var+residual_var)

#compute n0, the repeatability adjustment
n = as.data.frame(table(my_data$unit,my_data$iv))
k = nrow(n)
N = sum(n$Freq)
n0 = (N-(sum(n$Freq^2)/N))/(k-1)

#compute the adjusted repeatability
Rn = R/(R+(1-R)/n0)

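Three questions:

  1. Do the above computations for obtaining a point estimate of the repeatability of an effect make sense?
  2. When I have multiple variables whose repeatability I want to estimate, adding them all to the same fit (e.g. lmer(dv~(iv1+iv2|unit)+iv1+iv2)) seems to yield higher repeatability estimates than creating a separate model for each effect. This makes sense computationally to me, as inclusion of multiple effects will tend to decrease the residual variance, but I'm not positive that the resulting repeatability estimates are valid. Are they?
  3. The paper cited above suggests that likelihood profiling can help me obtain confidence intervals for the repeatability estimates, but as far as I can tell, confint(profile(fit)) only provides intervals for the intercept and effect variances, whereas I would additionally need the interval for the residual variance to compute the interval for the repeatability, no?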

Answers:



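I think I can answer your questions at least concerning the unadjusted repeatability estimates, i.e., the classical intraclass correlations (ICCs). As for the "adjusted" repeatability estimates, I skimmed the paper you linked and didn't really see where the formula that you apply can be found in the paper. Based on the mathematical expression, it appears to be the repeatability of mean scores (rather than individual scores). In any case, this does not seem to be a critical part of your question, so I will ignore it.

(1.) Do the above computations for obtaining a point estimate of the repeatability of an effect make sense?

Yes, the expression you propose does make sense, but a slight modification to your proposed formula is necessary. Below I show how one could derive your proposed repeatability coefficient. I hope this both clarifies the conceptual meaning of the coefficient and shows why a slight modification is desirable.

To start, let's first take the repeatability coefficient in your first case and clarify what it means and where it comes from. Understanding this will help us to understand the more complicated second case.

Random intercepts only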

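The linear mixed model for the $i$th response in the $j$th group is

$$y_{ij} = \beta_0 + u_{0j} + e_{ij},$$

where the random intercepts $u_{0j}$ have variance $\sigma^2_{u_0}$ and the residuals $e_{ij}$ have variance $\sigma^2_e$.

Now, the correlation between two random variables $x$ and $y$ is defined as

$$\mathrm{corr} = \frac{\mathrm{cov}(x, y)}{\sqrt{\mathrm{var}(x)\,\mathrm{var}(y)}}.$$

The expression for the ICC / repeatability coefficient then comes from letting the two random variables $x$ and $y$ be two observations drawn from the same $j$ group,

$$\mathrm{ICC} = \frac{\mathrm{cov}(\beta_0 + u_{0j} + e_{i_1 j},\ \beta_0 + u_{0j} + e_{i_2 j})}{\sqrt{\mathrm{var}(\beta_0 + u_{0j} + e_{i_1 j})\,\mathrm{var}(\beta_0 + u_{0j} + e_{i_2 j})}},$$

and if you simplify this using the definitions given above and the properties of variances/covariances (a process which I will not show here, unless you or others would prefer that I did), you end up with

$$\mathrm{ICC} = \frac{\sigma^2_{u_0}}{\sigma^2_{u_0} + \sigma^2_e}.$$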
What this means is that the ICC or "unadjusted repeatability coefficient" in this case has a simple interpretation as the expected correlation between a pair of observations from the same cluster (net of the fixed effects, which in this case is just the grand mean). The fact that the ICC is also interpretable as a proportion of variance in this case is coincidental; that interpretation is not true in general for more complicated ICCs. The interpretation as some sort of correlation is what is primary.
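
The correlation interpretation can be checked numerically. The following is a minimal base-R simulation sketch, not a fit of any real data: the variance values, grand mean, and number of clusters are arbitrary assumptions chosen for illustration. It draws two observations per cluster from the random-intercept model and compares their empirical correlation with the theoretical ICC.

```r
set.seed(1)

n_clusters <- 20000  # assumed number of clusters (large, to reduce noise)
sigma2_u0  <- 4      # assumed random-intercept variance
sigma2_e   <- 1      # assumed residual variance
beta0      <- 10     # assumed grand mean

# One random intercept per cluster, shared by both observations
u0 <- rnorm(n_clusters, mean = 0, sd = sqrt(sigma2_u0))

# Two observations per cluster with independent residuals
y1 <- beta0 + u0 + rnorm(n_clusters, sd = sqrt(sigma2_e))
y2 <- beta0 + u0 + rnorm(n_clusters, sd = sqrt(sigma2_e))

icc_theory    <- sigma2_u0 / (sigma2_u0 + sigma2_e)
icc_empirical <- cor(y1, y2)

print(c(theory = icc_theory, empirical = icc_empirical))
```

With these assumed values the theoretical ICC is 4/(4+1) = 0.8, and the empirical correlation should land close to it, up to simulation noise.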

Random intercepts and random slopes

Now for the second case, we have to first clarify what precisely is meant by "the reliability of effects (i.e. sum contrast effect of a variable with 2 levels)" -- your words.

First we lay out the model. The mixed model for the $i$th response in the $j$th group under the $k$th level of a contrast-coded predictor $x$ is

$$y_{ijk} = \beta_0 + \beta_1 x_k + u_{0j} + u_{1j} x_k + e_{ijk},$$

where the random intercepts have variance $\sigma^2_{u_0}$, the random slopes have variance $\sigma^2_{u_1}$, the random intercepts and slopes have covariance $\sigma_{u_{01}}$, and the residuals $e_{ijk}$ have variance $\sigma^2_e$.

So what is the "repeatability of an effect" under this model? I think a good candidate definition is that it is the expected correlation between two pairs of difference scores computed within the same j cluster, but across different pairs of observations i.

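So the pair of difference scores in question would be (remember that we assumed $x$ is contrast coded so that $|x_1| = |x_2| = x$):

$$\begin{aligned}
y_{i_1 j k_2} - y_{i_1 j k_1} &= (\beta_0 - \beta_0) + \beta_1 (x_{k_2} - x_{k_1}) + (u_{0j} - u_{0j}) + u_{1j} (x_{k_2} - x_{k_1}) + (e_{i_1 j k_2} - e_{i_1 j k_1}) \\
&= 2x\beta_1 + 2x u_{1j} + e_{i_1 j k_2} - e_{i_1 j k_1}
\end{aligned}$$

and

$$y_{i_2 j k_2} - y_{i_2 j k_1} = 2x\beta_1 + 2x u_{1j} + e_{i_2 j k_2} - e_{i_2 j k_1}.$$

Plugging these into the correlation formula gives us

$$\mathrm{ICC} = \frac{\mathrm{cov}(2x\beta_1 + 2x u_{1j} + e_{i_1 j k_2} - e_{i_1 j k_1},\ 2x\beta_1 + 2x u_{1j} + e_{i_2 j k_2} - e_{i_2 j k_1})}{\sqrt{\mathrm{var}(2x\beta_1 + 2x u_{1j} + e_{i_1 j k_2} - e_{i_1 j k_1})\,\mathrm{var}(2x\beta_1 + 2x u_{1j} + e_{i_2 j k_2} - e_{i_2 j k_1})}},$$

which simplifies down to

$$\mathrm{ICC} = \frac{2x^2\sigma^2_{u_1}}{2x^2\sigma^2_{u_1} + \sigma^2_e}.$$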
Notice that the ICC is technically a function of x! However, in this case x can only take 2 possible values, and the ICC is identical at both of these values.
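
For readers who want the intermediate step, here is a sketch of the simplification. It assumes, as is standard for this model, that the residuals are independent of each other and of the random slopes; the fixed term $2x\beta_1$ is a constant and drops out of both the covariance and the variances.

```latex
\begin{aligned}
\mathrm{cov}(\cdot,\cdot) &= \mathrm{cov}(2x u_{1j},\, 2x u_{1j}) = 4x^2\sigma^2_{u_1}, \\
\mathrm{var}(\cdot) &= \mathrm{var}(2x u_{1j}) + \mathrm{var}(e_{i j k_2}) + \mathrm{var}(e_{i j k_1})
  = 4x^2\sigma^2_{u_1} + 2\sigma^2_e, \\
\mathrm{ICC} &= \frac{4x^2\sigma^2_{u_1}}{4x^2\sigma^2_{u_1} + 2\sigma^2_e}
  = \frac{2x^2\sigma^2_{u_1}}{2x^2\sigma^2_{u_1} + \sigma^2_e}.
\end{aligned}
```

Both difference scores have the same variance, so the square root in the denominator reduces to that single variance.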

As you can see, this is very similar to the repeatability coefficient that you proposed in your question. The only difference is that the random slope variance must be appropriately scaled if the expression is to be interpreted as an ICC or "unadjusted repeatability coefficient." The expression that you wrote works in the special case where the $x$ predictor is coded $\pm\frac{1}{\sqrt{2}}$, but not in general. With the contr.sum coding used in your code, a two-level factor is coded $\pm 1$, so $x = 1$ and the slope variance should be multiplied by 2.
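
The effect of the coding can be illustrated with a short base-R sketch. The helper name `effect_icc` and all parameter values are hypothetical, and the random intercepts and slopes are simulated as uncorrelated for simplicity (the intercepts cancel in the difference scores anyway). It simulates two independent pairs of observations per cluster under ±1 coding and compares the correlation of the difference scores with the scaled and unscaled formulas.

```r
# ICC of difference scores for a two-level predictor coded +x / -x
effect_icc <- function(slope_var, resid_var, x) {
  2 * x^2 * slope_var / (2 * x^2 * slope_var + resid_var)
}

set.seed(2)
n_clusters <- 20000   # assumed number of clusters
sigma2_u1  <- 2       # assumed random-slope variance
sigma2_e   <- 1       # assumed residual variance
beta0 <- 10; beta1 <- 0.5   # assumed fixed effects
x <- 1                # contr.sum coding of a two-level factor is +1 / -1

u0 <- rnorm(n_clusters, sd = 1.5)              # random intercepts
u1 <- rnorm(n_clusters, sd = sqrt(sigma2_u1))  # random slopes

# One observation per cluster at predictor value xk, with a fresh residual
sim_obs <- function(xk) {
  beta0 + beta1 * xk + u0 + u1 * xk + rnorm(n_clusters, sd = sqrt(sigma2_e))
}

# Two difference scores per cluster, from different pairs of observations
d1 <- sim_obs(+x) - sim_obs(-x)
d2 <- sim_obs(+x) - sim_obs(-x)

icc_empirical <- cor(d1, d2)
icc_scaled    <- effect_icc(sigma2_u1, sigma2_e, x)  # 4/5 = 0.8
icc_unscaled  <- sigma2_u1 / (sigma2_u1 + sigma2_e)  # 2/3, the unscaled version

print(c(empirical = icc_empirical, scaled = icc_scaled, unscaled = icc_unscaled))
```

Under these assumptions the empirical correlation should track the scaled value of 0.8 rather than the unscaled 2/3. Calling `effect_icc(sigma2_u1, sigma2_e, 1/sqrt(2))` reproduces the unscaled value, which is the special case mentioned above.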

(2.) When I have multiple variables whose repeatability I want to estimate, adding them all to the same fit (e.g. lmer(dv~(iv1+iv2|unit)+iv1+iv2)) seems to yield higher repeatability estimates than creating a separate model for each effect. This makes sense computationally to me, as inclusion of multiple effects will tend to decrease the residual variance, but I'm not positive that the resulting repeatability estimates are valid. Are they?

I believe that working through a similar derivation as presented above for a model with multiple predictors with their own random slopes would show that the repeatability coefficient above would still be valid, except for the added complication that the difference scores we are conceptually interested in would now have a slightly different definition: namely, we are interested in the expected correlation of the differences between adjusted means after controlling for the other predictors in the model.

If the other predictors are orthogonal to the predictor of interest (as in, e.g., a balanced experiment), I would think the ICC / repeatability coefficient elaborated above should work without any modification. If they are not orthogonal then you would need to modify the formula to take account of this, which could get complicated, but hopefully my answer has given some hints about what that might look like.


You are right, Jake. The adjusted ICC refers to section VII. EXTRAPOLATED REPEATABILITY AND HERITABILITY in the linked paper. The authors write: "It is important to distinguish between the repeatability of individual measurements R and the repeatability of measurement means Rn."
Gabra