Which model to use: "top" model or "simple" model?

posts related to the RMark library, which may not be of general interest to users of 'classic' MARK

Which model to use: "top" model or "simple" model?

Postby luke.eberhart » Fri Aug 28, 2015 10:34 am

Hi Phidot team,

Maybe this is an obvious question, but I'd like some help and understanding.

I'm interested in estimating sex-specific daily survival of plover chicks. I will use these parameters in a projection matrix model, and so my main goal with this analysis is simply to estimate the average daily survival of males and females and their respective variances (which I would use in a stochastic matrix model down the line).

I'm using a CJS model and running the analysis through RMark. I've found that p is best described by an interaction between year and age^2 (i.e., p(~Year * Age^2)), which is what I expected. (Just for your information, "Age" is expressed in days in my analysis.) As for survival, I found that the top model shows an interaction between Sex, Age^2, and Year (i.e., Phi(~Sex * Age^2 * Year)). The second best model's DeltaAICc is 3.35, suggesting that this top model is pretty solid.

So, getting back to what my goal is... I simply want an estimate of daily survival for male chicks and female chicks and their respective variance distributions. One side of me thinks that I should use the reals of a simple model that only has "Sex" as a factor explaining variation in survival (i.e., Phi(~Sex), which had a Delta AICc of 117.71). But the other side suspects that I should somehow incorporate the Age^2 and Year variables so that the Sex estimates are more precise (specifically the variances are more precise)...I have a feeling the latter is the correct route, but I'd like your opinion and help.

Let's say that the latter is the proper route to take....so, when I look at the reals of Phi(~Sex * Age^2 * Year)), there of over 500...one survival estimate for every factor combination (e.g., one real for female chicks 3 days old in year 2006, one real for female chicks 7 days old in 2010, etc...). If I wanted to use the "top model" to generate only two values describing sex-specific survival, would I simply take the average of all male reals in the Phi(~Sex * Age^2 * Year) model and the average of all female reals in the Phi(~Sex * Age^2 * Year) model? Or is there a more "proper" way to do this?...I can imagine that calculating the error distributions must be more complicated than simply taking the averages of all the male standard errors, and the average of all the female standard errors.

Thanks in advance for your help and support!

Cheers,
Luke
luke.eberhart
 
Posts: 14
Joined: Mon Jul 13, 2015 9:51 am

Re: Which model to use: "top" model or "simple" model?

Postby jlaake » Fri Aug 28, 2015 4:12 pm

Luke-

What is the reason for constructing the projection matrix? Here are some random disjointed thoughts as you may not get much from your question.

1) If it is to mimic the population you sampled then I think you should use the best model but possibly estimate a process variance to incorporate the annual effects. However, one thing that wasn't clear to me is whether Year in the model is a numeric variable (trend) or a factor variable. You would want it to be a factor variable to estimate a process variance. Can you fully explain the three-way interaction sex, age and year?

2) If it is to explore population projections in general then I think you should use the sex model with annual variation to estimate process variance in dsr.

That's my 2 cents. I guess it all depends on the critters biology and what you are interested. If you are only interested in the fraction of nests that survive from laying to fledging then maybe what you should compute is a product of those survivals to get a single survival from laying to fledging. If you are interested in the within year survival then you'll need to include the age effect.

Regardless, you don't want to use the set of real estimates for your data. You want to predict the values for a defined set of covariates Age from 0 to whatever for each year and sex. In RMark Use covariate.predictions if Age is an individual covariate.

--jeff
jlaake
 
Posts: 1480
Joined: Fri May 12, 2006 12:50 pm
Location: Escondido, CA

Re: Which model to use: "top" model or "simple" model?

Postby luke.eberhart » Sat Aug 29, 2015 11:55 am

Hi Jeff,

Thanks for you help on this.

What is the reason for constructing the projection matrix?


My aim is to investigate the sex- and stage-specific origins of biased adult sex ratios (ASR) in the light of mating system and parental care. Snowy plovers are polygamous with uni-parental care (males brood) and previous works have shown there to be a male-biased ASR, which is what one would expect given their breeding behaviour (i.e., unequal mating opportunities between males and females facilitate the evolution of polygamy and uni-parental care). To assess what could cause a biased ASR I will use a two-sex projection matrix model built from sex- and stage-specific vital rates (i.e., survival and fecundity) to estimate the sex ratio of adults at the stable stage distribution. Then I will employ various perturbation analyses (e.g., elasticity and life stage simulation analysis) to determine at which stage and in which sex does the biased ASR originate from. Thus I need a distribution of each vital rate that represents the environmental stochasticity. I hope to use these distributions to project stochastic projections and estimate confidence intervals around the ASR of the deterministic matrix model.

The two-sex matrix model comprises of two stages for each sex: 1) 1st-years, and 2) adults. The 1st-year stage is composed of two "lower-level elements": chick survival (a 25 day period from hatch to fledge, thus it will be calculated as daily-chick-survival^25) and fledgling survival (an 11 month period from fledge until the subsequent breeding season at which individuals enter the adult stage class and can breed); the product of these two probabilities is thus the survival from hatch until the first breeding season. The adult stage is simply annual survival from one breeding season to the next. As you know, the transition intervals of a projection matrix must be equal (i.e., one-year in my case), which is why I've had to put chick and fledgling survival into one stage, yet I'm breaking the 1st year stage into these two elements so that I can explore the contributions of sex-specific chick or fledgling survival on ASR. Fecundity is defined as the average number of chicks hatched per individual per year (also sex-specific). I'm estimating fledgling and adult survival rates in a separate CJS analysis due to the difference in the encounter intervals (i.e., chick survival analysis has daily encounter intervals, whereas the fledgling and adult analysis has annual encounter intervals).

1) If it is to mimic the population you sampled then I think you should use the best model but possibly estimate a process variance to incorporate the annual effects.


Okay, this sounds like what I want to do (right?...based on my aims described above). If it is what I want to do, how do I take the top model (i.e., Phi(~Sex * Age^2 * Year)p(~Year * Age^2)) and estimate a process variance that incorporates annual effects? I remember that it is possible to estimate process variance in the MARK GUI, but can I do this in RMark? Remember, in the end I want only two distributions: one describing environmental variation in female chick survival and one describing environmental variation in male chick survival. In other words, I don't want an estimate for each year (or can I simply take the average of all years after estimating process variance?). As a side note, I only have 7 years of data, and so maybe I'm not able to use the process variance methods anyway since Evan has noted that they don't work well for datasets with less than 10 years of data...here: (http://www.phidot.org/forum/viewtopic.php?f=21&t=2878&p=9160&hilit=process+variance+Rmark&sid=ad251d4c790ea844a646e369cfa7c681#p9161)

However, one thing that wasn't clear to me is whether Year in the model is a numeric variable (trend) or a factor variable. You would want it to be a factor variable to estimate a process variance.


I modelled Year as a factor and Age as a covariate.

Can you fully explain the three-way interaction sex, age and year?


The three-way interaction make sense to me: 1) I expect that survival may differ between male and female chicks because of the evolutionary reasoning described above, 2) I expect that age will strongly describe variation in daily survival because it is expected that older chicks have better chances to, for example, escape predators or combat infections, and 3) plovers live in a very stochastic environment and thus there is large annual variation in chick survival. The interaction is therefore logical because I could imagine a situation in which there is a catastrophic year and males chicks survive better than females, but only at young ages.

Here is my code and stats of the top model (note: "Quadratic" is the "Age^2" variable described above and is numeric):
Code: Select all
# AIC model table, with p structure removed to stop line overflow
# (best structure of p was: p(~Year * Quadratic))
Ceuta_chicks_survival$model.table[,-c(2,3)] 
                                  Phi npar     AICc  DeltaAICc       weight Deviance
24          ~Sex * Quadratic * Year   42 13325.28   0.000000 7.683110e-01 10210.92
30               ~Sex * Time * Year   42 13328.64   3.354000 1.436234e-01 10214.27
22          ~Sex + Quadratic + Year   23 13329.75   4.469463 8.222531e-02 10253.91
28               ~Sex + Time + Year   23 13335.08   9.793463 5.740023e-03 10259.23
15     ~Day_of_Season * Sex * Cubic   22 13344.66  19.377506 4.761735e-05 13300.45
17 ~Day_of_Season * Sex * Quadratic   22 13345.96  20.678506 2.484601e-05 13301.75
7        ~Day_of_Season + Quadratic   17 13347.69  22.411157 1.044759e-05 13313.57
10 ~Day_of_Season + Sex + Quadratic   18 13349.11  23.828970 5.142118e-06 13312.97
14       ~Day_of_Season * Quadratic   18 13349.54  24.254970 4.155634e-06 13313.40
5            ~Day_of_Season + Cubic   17 13350.72  25.442157 2.295319e-06 13316.60
12            ~Day_of_Season + Time   17 13351.65  26.366157 1.446102e-06 13317.52
13           ~Day_of_Season * Cubic   18 13352.09  26.805970 1.160632e-06 13315.95
8      ~Day_of_Season + Sex + Cubic   18 13352.12  26.838970 1.141639e-06 13315.98
18      ~Day_of_Season * Sex * Time   22 13353.06  27.778506 7.136926e-07 13308.85
11      ~Day_of_Season + Sex + Time   18 13353.09  27.803970 7.046635e-07 13316.95
19            ~Day_of_Season * Time   18 13353.55  28.266970 5.590396e-07 13317.41
20                       ~Quadratic   16 13374.43  49.143171 1.637696e-11 10312.70
23                 ~Sex * Quadratic   18 13375.10  49.813970 1.171038e-11 10309.34
21                 ~Sex + Quadratic   17 13376.19  50.905157 6.786139e-12 10312.44
26                            ~Time   16 13378.57  53.287171 2.062410e-12 10316.84
1                            ~Cubic   16 13378.71  53.429171 1.921056e-12 10316.98
3                      ~Sex * Cubic   18 13379.07  53.786970 1.606368e-12 10313.31
29                      ~Sex * Time   18 13379.64  54.359970 1.206201e-12 10313.88
27                      ~Sex + Time   17 13380.34  55.059157 8.503410e-13 10316.60
2                      ~Sex + Cubic   17 13380.46  55.176157 8.020232e-13 10316.72
31                            ~Year   21 13392.96  67.676378 1.548097e-15 10321.15
6                    ~Day_of_Season   16 13393.64  68.355171 1.102555e-15 13361.53
16             ~Day_of_Season * Sex   18 13393.82  68.537970 1.006250e-15 13357.68
32                      ~Sex + Year   22 13394.86  69.572506 0.000000e+00 10321.03
9              ~Day_of_Season + Sex   17 13395.09  69.809157 0.000000e+00 13360.97
33                      ~Sex * Year   28 13396.89  71.606714 0.000000e+00 10310.94
4                                ~1   15 13441.23 115.945011 0.000000e+00 10381.51
25                             ~Sex   16 13443.00 117.715171 0.000000e+00 10381.27

# Define top model as "Sex_Quad_Year"
Sex_Quad_Year <- Ceuta_chicks_survival[[24]]

# Check model structure
Sex_Quad_Year$model.name
[1] "Phi(~Sex * Quadratic * Year)p(~Year * Quadratic)"

# Reveal variable structure of top model
# Sex is factor, Year is factor, Age is numeric (transformed to Quadratic in the model)
str(Sex_Quad_Year$design.data$Phi)
'data.frame':   33345 obs. of  14 variables:
 $ par.index  : int  1 2 3 4 5 6 7 8 9 10 ...
 $ model.index: num  1 2 3 4 5 6 7 8 9 10 ...
 $ group      : Factor w/ 95 levels "FA02006","MA02006",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ cohort     : Factor w/ 26 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ age        : Factor w/ 26 levels "0","1","2","3",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ time       : Factor w/ 26 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ occ.cohort : num  1 1 1 1 1 1 1 1 1 1 ...
 $ Cohort     : num  0 0 0 0 0 0 0 0 0 0 ...
 $ Age        : num  0 1 2 3 4 5 6 7 8 9 ...
 $ Time       : num  0 1 2 3 4 5 6 7 8 9 ...
 $ Sex        : Factor w/ 2 levels "F","M": 1 1 1 1 1 1 1 1 1 1 ...
 $ Care_site  : Factor w/ 7 levels "A","B","C","D",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ CF         : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
 $ Year       : Factor w/ 7 levels "2006","2007",..: 1 1 1 1 1 1 1 1 1 1 ...

# Reveal betas
Sex_Quad_Year$results$beta
                                 estimate           se           lcl           ucl
Phi:(Intercept)              2.2307285000 0.2056190000  1.827715e+00  2.6337418000
Phi:SexM                    -0.3193578000 0.2757177000 -8.597645e-01  0.2210489000
Phi:Quadratic                0.0043496000 0.0017315000  9.558578e-04  0.0077433000
Phi:Year2007                 1.0240777000 0.3075534000  4.212730e-01  1.6268824000
Phi:Year2008                -0.3014357000 0.3558866000 -9.989734e-01  0.3961021000
Phi:Year2009                 1.0747388000 0.3504767000  3.878044e-01  1.7616732000
Phi:Year2010                 0.1529104000 0.2717350000 -3.796903e-01  0.6855111000
Phi:Year2011                 0.4275811000 0.4930861000 -5.388676e-01  1.3940298000
Phi:Year2012                 1.0684765000 0.9014459000 -6.983575e-01  2.8353106000
Phi:SexM:Quadratic           0.0038898000 0.0031714000 -2.326100e-03  0.0101058000
Phi:SexM:Year2007            0.2712880000 0.4388717000 -5.889006e-01  1.1314766000
Phi:SexM:Year2008            1.0247808000 0.5231202000 -5.348494e-04  2.0500964000
Phi:SexM:Year2009           -0.1093902000 0.4699184000 -1.030430e+00  0.8116499000
Phi:SexM:Year2010            0.4945368000 0.4290885000 -3.464767e-01  1.3355503000
Phi:SexM:Year2011           -0.1452010000 0.6607602000 -1.440291e+00  1.1498891000
Phi:SexM:Year2012            0.4263633000 1.5597777000 -2.630801e+00  3.4835276000
Phi:Quadratic:Year2007      -0.0021759000 0.0023351000 -6.752700e-03  0.0024009000
Phi:Quadratic:Year2008       0.0041421000 0.0050173000 -5.691800e-03  0.0139761000
Phi:Quadratic:Year2009      -0.0050761000 0.0021661000 -9.321700e-03 -0.0008304867
Phi:Quadratic:Year2010      -0.0004677159 0.0022324000 -4.843300e-03  0.0039079000
Phi:Quadratic:Year2011       0.0359784000 0.0224855000 -8.093300e-03  0.0800501000
Phi:Quadratic:Year2012      -0.0397396000 0.0171118000 -7.327870e-02 -0.0062004000
Phi:SexM:Quadratic:Year2007  0.0003468555 0.0045491000 -8.569400e-03  0.0092631000
Phi:SexM:Quadratic:Year2008 -0.0048174000 0.0078546000 -2.021240e-02  0.0105775000
Phi:SexM:Quadratic:Year2009 -0.0006174531 0.0039282000 -8.316700e-03  0.0070818000
Phi:SexM:Quadratic:Year2010 -0.0054629000 0.0038924000 -1.309190e-02  0.0021662000
Phi:SexM:Quadratic:Year2011 -0.0112460000 0.0316339000 -7.324840e-02  0.0507565000
Phi:SexM:Quadratic:Year2012  0.0108462000 0.0268837000 -4.184590e-02  0.0635382000
p:(Intercept)               -0.1090783000 0.0811522000 -2.681365e-01  0.0499799000
p:Year2007                   0.1146948000 0.0973359000 -7.608350e-02  0.3054731000
p:Year2008                  -0.7968537000 0.1481832000 -1.087293e+00 -0.5064147000
p:Year2009                   0.9697138000 0.1188300000  7.368070e-01  1.2026206000
p:Year2010                  -0.1746124000 0.1147843000 -3.995897e-01  0.0503648000
p:Year2011                   0.5492895000 0.1229418000  3.083236e-01  0.7902553000
p:Year2012                   0.4151020000 0.3183475000 -2.088592e-01  1.0390631000
p:Quadratic                 -0.0001798633 0.0003280965 -8.229325e-04  0.0004632060
p:Year2007:Quadratic        -0.0004217112 0.0003895933 -1.185300e-03  0.0003418916
p:Year2008:Quadratic         0.0007554197 0.0005983817 -4.174084e-04  0.0019282000
p:Year2009:Quadratic        -0.0011032000 0.0005049107 -2.092800e-03 -0.0001135537
p:Year2010:Quadratic         0.0007851228 0.0004508544 -9.855196e-05  0.0016688000
p:Year2011:Quadratic        -0.0011706000 0.0004596049 -2.071400e-03 -0.0002697411
p:Year2012:Quadratic        -0.0063454000 0.0118501000 -2.957160e-02  0.0168809000


You may note that the Phi SexM beta is not significant, but I'm not really interested in if it is significant or not. Instead I'm interested in the variance distributions of each sex, which I can then plug into the projection matrix.

Thanks Jeff!

Cheers,
Luke
luke.eberhart
 
Posts: 14
Joined: Mon Jul 13, 2015 9:51 am

Re: Which model to use: "top" model or "simple" model?

Postby luke.eberhart » Sat Aug 29, 2015 2:05 pm

I tried to estimate the process variance of each sex using the dipper data and creating a model similar to mine (i.e., three-way interaction with Sex*Year*Covariate...the covariate in this case is a randomly assigned "weight" variable, but it represents the "Quadratic" covariate in my models).

Code: Select all
# import dipper data
data(dipper)

# set seed to make this reproducible
set.seed(42)

# create a random variable called "weight" which represents a covariate in the model
dipper$weight=rnorm(294)

# process data
dipper.proc=process.data(dipper,model="CJS",groups=c("sex"))

# create design data
dipper.ddl=make.design.data(dipper.proc)

# run the Phi(sex*time*weight)p(.) model
dipper_sex_time_weight <- mark(dipper.proc,dipper.ddl, model.parameters=list(Phi=list(formula=~sex*time*weight)),output=F)

# check AICc statistic
dipper_sex_time_weight$results$AICc

# get reals and vcv matrix of Phi from the model
zz=get.real(dipper_sex_time_weight,"Phi",vcv=TRUE)

# extract the female estimates
zf=zz$estimates$estimate[1:6]

#extract the male estimates
zm=zz$estimates$estimate[22:27]

# extract the female portion of the vcv matrix
vcvf=zz$vcv.real[c(1:6),c(1:6)]

# extract the male portion of the vcv matrix
vcvm=zz$vcv.real[c(7:12),c(7:12)]

# calculate the variance components of female survival
varc_f=var.components(zf,design=matrix(rep(1,length(zf)),ncol=1),vcvf)

# calculate the variance components of male survival
varc_m=var.components(zm,design=matrix(rep(1,length(zm)),ncol=1),vcvm)

# extract the design data of females
df_f=dipper_sex_time_weight$design.data$Phi[c(1:21),]

# extract the design data of males
df_m=dipper_sex_time_weight$design.data$Phi[c(22:42),]

# estimate time-specific female survival corrected for process variance
shrinkest_f=data.frame(time=1:6,value=varc_f$betarand$estimate)

# estimate time-specific male survival corrected for process variance
shrinkest_m=data.frame(time=1:6,value=varc_m$betarand$estimate)

# merge the female design data with the corrected time-specific estimates
df_f=merge(df_f,shrinkest_f,by="time")

# merge the male design data with the corrected time-specific estimates
df_m=merge(df_m,shrinkest_m,by="time")

# bind the female and male data together by row
df <- rbind(df_f, df_m)

# rerun the model but with fixed parameters
dipper_sex_time_weight2=mark(dipper.proc,dipper.ddl,
    model.parameters=list(Phi=list(formula=~sex*time*weight,
    fixed=list(index=df$par.index,value=df$value))),adjust=FALSE)

# determine the number of parameters in the corrected model
npar=dipper_sex_time_weight2$results$npar+varc_f$GTrace+varc_m$GTrace

# calculate the new adjusted AICc
dipper_sex_time_weight2$results$lnl+2*(npar + (npar*(npar+1))/(dipper_sex_time_weight2$results$n-npar-1))

Okay, so here is my main question then:

If I wanted estimates of the sex-specific distributions for survival of dippers while incorporating annual and covariate effects, would I use the following distributions?:

Code: Select all
# Corrected estimate of female survival and standard error
varc_f$beta
   Estimate         SE
0.5491448 0.03727932

# Corrected estimate of male survival and standard error
varc_m$beta
   Estimate         SE
0.5723419 0.03571197

Would these distributions then be appropriate for my matrix model?

When I compare them to the simple Phi(sex)p(.) model the estimates are slightly different, suggesting that they are "corrected", and thus the "proper" estimates to use...but this is based only my naive intuition:

Code: Select all
# run simple Phi(sex)p(.) model
dipper_sex <- mark(dipper.proc,dipper.ddl, model.parameters=list(Phi=list(formula=~sex)),output=F)

# extract survival distributions of females and males
dipper_sex$results$real[grepl("Phi", row.names(dipper_sex$results$real)),]
                      estimate        se       lcl       ucl fixed    note
Phi gFemale c1 a0 t1 0.5507352 0.0345707 0.4824541 0.6171564             
Phi gMale c1 a0 t1   0.5702637 0.0353294 0.5000911 0.6377218         

Thanks for the help!

Cheers,
Luke
luke.eberhart
 
Posts: 14
Joined: Mon Jul 13, 2015 9:51 am

Re: Which model to use: "top" model or "simple" model?

Postby luke.eberhart » Sat Aug 29, 2015 4:46 pm

Okay, so it looks like the sigmasqr is the process variance, not the SE of the Beta (which contains both sampling and process variance). I found the section 6.14 of the MARK book really helpful.

I've tried applying my dipper scripts to my plover data. However, I get the following error:
Code: Select all
Error in uniroot(mom.sigCI, lower = lower, upper = upper, vcv = vcv, theta = theta,  :
  f() values at end points not of opposite sign

Any ideas how to proceed?

Cheers,
Luke
luke.eberhart
 
Posts: 14
Joined: Mon Jul 13, 2015 9:51 am

Re: Which model to use: "top" model or "simple" model?

Postby jlaake » Mon Aug 31, 2015 10:47 am

I've not read through all of the messages over the weekend. It it possible there is no solution with a positive process variance. However, before giving up you may want to try var.components.reml.

--jeff
jlaake
 
Posts: 1480
Joined: Fri May 12, 2006 12:50 pm
Location: Escondido, CA


Return to RMark

Who is online

Users browsing this forum: No registered users and 1 guest