Prediction with random effects using mgcv gam



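I am interested in modeling total fish catch, using gam in mgcv to model simple random effects for individual vessels (which make repeated trips over time in the fishery). I have 98 subjects, so I thought I would use gam rather than gamm to model the random effects. My model is: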

modelGOM <- gam(TotalFish ~ factor(SetYear) + factor(SetMonth) + factor(TimePeriod) +     
s(SST) + s(VesselID, bs = "re", by = dum) + s(Distance, by = TimePeriod) + 
offset(log(HooksSet)), data = GOM, family = tw(), method = "REML")

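I have coded the random effect with bs = "re" and by = dum (I read that this would allow me to predict with the vessel effect either at its predicted value or at zero). "dum" is a vector of 1s.

The model runs, but I am having trouble predicting. I picked one of the vessels for the predictions (Vessel21) and used average values for everything except the predictor of interest (Distance).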

grid.bin.GOM_A_Swordfish <- data.frame("Distance"=seq(min(GOM$Distance),max(GOM$Distance),length = 100),
                             "SetYear" = '2006',
                             "SetMonth" = '6',
                             "TimePeriod" = 'A',
                             "SST" = mean(GOM$SST),
                             "VesselID" = 'Vessel21', 
                             "dum" = '0', #to predict without vessel effect
                             "HooksSet" = mean(GOM$HooksSet))

pred_GOM_A_Swordfish <- predict(modelGOM, grid.bin.GOM_A_Swordfish, type = "response", 
se = T)

The error I get is:

Error in Predict.matrix.tprs.smooth(object, dk$data) : 
    NA/NaN/Inf in foreign function call (arg 1)
    In addition: Warning message:
    In Ops.factor(xx, object$shift[i]) : - not meaningful for factors

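I think the error is raised because VesselID is a factor, but I am using it in a smooth for the random effect.

I have been able to predict successfully from a gam without the simple random effect (bs = "re").

Can you offer any advice on how to predict from this model without using the VesselID term (while still including it in the model)?

Thanks!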

Answers:



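As of version 1.8.8 of mgcv, predict.gam has an exclude argument that allows terms in the model, including random effects, to be zeroed out when predicting, without the dummy trick suggested previously.

  • predict.gam and predict.bam now accept an 'exclude' argument allowing terms (e.g. random effects) to be zeroed for prediction. For efficiency, smooth terms not in terms or in exclude are no longer evaluated, and are instead set to zero or not returned. See ?predict.gam.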
library("mgcv")
require("nlme")
dum <- rep(1,18)
b1 <- gam(travel ~ s(Rail, bs="re", by=dum), data=Rail, method="REML")
b2 <- gam(travel ~ s(Rail, bs="re"), data=Rail, method="REML")

head(predict(b1, newdata = cbind(Rail, dum = dum)))    # ranefs on
head(predict(b1, newdata = cbind(Rail, dum = 0)))      # ranefs off
head(predict(b2, newdata = Rail, exclude = "s(Rail)")) # ranefs off, no dummy

> head(predict(b1, newdata = cbind(Rail, dum = dum)))    # ranefs on
       1        2        3        4        5        6 
54.10852 54.10852 54.10852 31.96909 31.96909 31.96909  
> head(predict(b1, newdata = cbind(Rail, dum = 0)))      # ranefs off
   1    2    3    4    5    6 
66.5 66.5 66.5 66.5 66.5 66.5
> head(predict(b2, newdata = Rail, exclude = "s(Rail)")) # ranefs off, no dummy
   1    2    3    4    5    6 
66.5 66.5 66.5 66.5 66.5 66.5

Older approach

Simon Wood has used the following simple example to check that this is working:

library("mgcv")
require("nlme")
dum <- rep(1,18)
b <- gam(travel ~ s(Rail, bs="re", by=dum), data=Rail, method="REML")
predict(b, newdata=data.frame(Rail="1", dum=0)) ## r.e. "turned off"
predict(b, newdata=data.frame(Rail="1", dum=1)) ## prediction with r.e

This works for me. Likewise:

dum <- rep(1, NROW(na.omit(Orthodont)))
m <- gam(distance ~ s(age, bs = "re", by = dum) + Sex, data = Orthodont)
predict(m, data.frame(age = 8, Sex = "Female", dum = 1))
predict(m, data.frame(age = 8, Sex = "Female", dum = 0))

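also works.

So I would check that the data you are supplying in newdata is what you think it is, because the problem may not lie with VesselID. The error comes from the function that the predict() calls in the examples above would also have invoked, and Rail is a factor in the first example.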


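Thanks for the examples, Gavin! While working through them, I found the problem. You are correct: the error was in the newdata data frame. Once I removed the quotes around the "0" for the variable "dum", I was able to predict without any errors. A rookie mistake, but I had been working on it all day, convinced that the VesselID factor smooth was the problem. Many thanks!
Megan, 2015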
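The fix described in this comment can be sketched on the Rail example from the answer, since the original GOM data are not available here. This is only an illustration: the names nd_quoted and nd_numeric are made up, and it does not attempt to reproduce the exact error message from the question. It simply shows that a quoted "0" yields a non-numeric column (character in R >= 4.0.0, a factor by default in older versions), while the by variable was numeric when the model was fitted.

```r
library("mgcv")
library("nlme")  # provides the Rail data set

dum <- rep(1, nrow(Rail))
b <- gam(travel ~ s(Rail, bs = "re", by = dum), data = Rail, method = "REML")

# Hypothetical newdata frames: same values, different column types for dum
nd_quoted  <- data.frame(Rail = "1", dum = "0")  # quoted, so not numeric
nd_numeric <- data.frame(Rail = "1", dum = 0)    # numeric, matching the fitted data

# Inspect the column types before calling predict()
class(nd_quoted$dum)
class(nd_numeric$dum)

# Guard against the mistake, then predict with the random effect switched off
stopifnot(is.numeric(nd_numeric$dum))
p_off <- predict(b, newdata = nd_numeric)
p_off
```

Running str() on newdata before predict() is a cheap way to catch this kind of type mismatch.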

How do I specify more than one random effect to exclude with exclude? I tried using c(), but it doesn't seem to work.
Stefano,

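Using a vector of terms to exclude works for me: exclude = c("s(x0)", "s(x2)"), say, from the model b<-gam(y~s(x0)+s(I(x1^2))+s(x2)+offset(x3),data=dat) in the ?predict.gam examples. You need to specify the strings in the vector passed to exclude using the notation shown for each smooth term in the output of summary().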
Gavin Simpson,
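As a rough sketch of excluding several terms at once, the following uses data simulated with mgcv's gamSim(). The model formula is a simplified stand-in (an assumption, not the exact ?predict.gam model), and the object names are illustrative. The smooth labels are read from the fitted object so the strings passed to exclude match what summary() shows.

```r
library("mgcv")

set.seed(0)
dat <- gamSim(1, n = 200, scale = 2, verbose = FALSE)  # simulated example data
b <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML")

# Labels of the smooth terms, as they appear in summary(b)
smooth_labels <- sapply(b$smooth, function(sm) sm$label)
smooth_labels

# Link-scale predictions with two smooths zeroed out
p_excl <- predict(b, newdata = dat, exclude = c("s(x0)", "s(x2)"))

# Cross-check: intercept plus the contributions of the remaining smooths
p_terms <- predict(b, newdata = dat, type = "terms")
p_manual <- attr(p_terms, "constant") + p_terms[, "s(x1)"] + p_terms[, "s(x3)"]
all.equal(unname(p_excl), unname(p_manual))
```

If a label is misspelled (for example "s(X0)" instead of "s(x0)"), the term would simply not be matched, which is one plausible reason a c() vector might appear not to work.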
Licensed under cc by-sa 3.0 with attribution required.