Help with var.components

posts related to the RMark library, which may not be of general interest to users of 'classic' MARK

Help with var.components

Postby bootzies » Wed Dec 03, 2014 3:17 am

Hello list.

I have fitted a relatively complex multistate model, with occasion (annual) covariates. My dataset has 30 encounter periods and >5000 individuals. I am now trying to estimate the process variance so that I can go on to use the survival parameter estimates in a stochastic population model. I am getting pretty confused about the best way to do this for my model and was hoping you may be able to help.

Firstly, I am simply trying to get a single (mean) value for the estimate of process variance for the full 30-year period. I am trying to follow the examples in ?var.components, where it says that if you want a mean estimate you need to use a column matrix of 1’s for the design matrix. I have done this, but I am getting the error: ”Error in var.components(z, design = matrix(rep(1, length(z)), ncol = 1), : Number of rows of vcv matrix must match length of theta vector”.

My theta (vector of parameter estimates) for the entire model output has the length 5220; however the associated vcv matrix is 298 x 298, and this seems to be generating the error message. Is this mismatch in row number something to do with the inclusion of multiple states / covariates in the model?

Secondly I would like the mean process variance over the full time period, but for each of my two age groups (age is just a grouping variable that only represents age at mark). I am unsure how to subset my theta and match up the appropriate parts of the vcv matrix (considering these have different numbers of rows – it doesn’t seem that the link is straight forward).

Thirdly I would like to get a single estimate for process variance for each age group, but over a shorter sub-period within my dataset (I want to compare this sub-period with the full data period, so I can see whether there is a difference in the process variance). I am not sure how to tell the var.components function to only include specific years of model estimates in the calculation.

Finally, I need to do all of this for model averaged outputs (averages from 3 models) – I am starting off just doing this for a single model as I wanted to start simple!

I also wanted to check whether it is actually correct for me to be using tools such as var.components and var.components.reml on a model with covariates?

Note that I have some fixed parameter estimates in my models. As I understand it, specifying parameters as fixed during model fitting means that the betas are set to 0; but estimates are still produced by the function ‘get.real’. Should I set the real estimates of these fixed parameters to ‘NA’ before applying the var.components function?

For your info, the structure of my object zz (the real parameter estimates from a single model) is below:
str(zz)
List of 2
$ estimates:'data.frame': 5220 obs. of 23 variables:
..$ all.diff.index: num [1:5220] 1 2 3 4 5 6 7 8 9 10 ...
..$ par.index : int [1:5220] 1 2 3 4 5 6 1 7 8 9 ...
..$ estimate : num [1:5220] 0.954 0.962 0.965 0.962 0.968 ...
..$ se : num [1:5220] 0.00978 0.00802 0.00754 0.0079 0.00735 ...
..$ lcl : num [1:5220] 0.931 0.942 0.947 0.944 0.95 ...
..$ ucl : num [1:5220] 0.97 0.975 0.977 0.975 0.979 ...
..$ group : Factor w/ 2 levels "Adult","Juvenile": 1 1 1 1 1 1 1 1 1 1 ...
..$ cohort : Factor w/ 29 levels "1982","1983",..: 1 1 1 1 1 1 1 1 1 1 ...
..$ age : Factor w/ 29 levels "0","1","2","3",..: 1 2 3 4 5 6 7 8 9 10 ...
..$ time : Factor w/ 29 levels "1982","1983",..: 1 2 3 4 5 6 7 8 9 10 ...
..$ occ : num [1:5220] 1 2 3 4 5 6 7 8 9 10 ...
..$ occ.cohort : num [1:5220] 1 1 1 1 1 1 1 1 1 1 ...
..$ stratum : Factor w/ 6 levels "A","B","C","D",..: 1 1 1 1 1 1 1 1 1 1 ...
..$ Cohort : num [1:5220] 0 0 0 0 0 0 0 0 0 0 ...
..$ Age : num [1:5220] 0 1 2 3 4 5 6 7 8 9 ...
..$ Time : num [1:5220] 0 1 2 3 4 5 6 7 8 9 ...
..$ Age_grp : Factor w/ 2 levels "Adult","Juvenile": 1 1 1 1 1 1 1 1 1 1 ...
..$ A : num [1:5220] 1 1 1 1 1 1 1 1 1 1 ...
..$ B : num [1:5220] 0 0 0 0 0 0 0 0 0 0 ...
..$ C : num [1:5220] 0 0 0 0 0 0 0 0 0 0 ...
..$ D : num [1:5220] 0 0 0 0 0 0 0 0 0 0 ...
..$ E : num [1:5220] 0 0 0 0 0 0 0 0 0 0 ...
..$ F : num [1:5220] 0 0 0 0 0 0 0 0 0 0 ...
$ vcv.real : num [1:298, 1:298] 9.56e-05 7.39e-05 6.37e-05 7.17e-05 5.74e-05 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:298] "1" "2" "3" "4" ...
.. ..$ : chr [1:298] "1" "2" "3" "4" ...
bootzies
 
Posts: 21
Joined: Wed Jan 29, 2014 2:31 am

Re: Help with var.components

Postby cooch » Wed Dec 03, 2014 8:53 am

bootzies wrote:
Secondly I would like the mean process variance over the full time period, but for each of my two age groups (age is just a grouping variable that only represents age at mark). I am unsure how to subset my theta and match up the appropriate parts of the vcv matrix (considering these have different numbers of rows – it doesn’t seem that the link is straight forward).


Difficult to do with the moments-baseed VC procedures in MARK (which is what RMark is using). More straightforward with MCMC, where you can specify the various multivariate hyperdistribtuions you want to calculate mean and variance over.

Thirdly I would like to get a single estimate for process variance for each age group, but over a shorter sub-period within my dataset (I want to compare this sub-period with the full data period, so I can see whether there is a difference in the process variance). I am not sure how to tell the var.components function to only include specific years of model estimates in the calculation.


Via the design matrix. See example D.4.3 in Appendix D of the MARK book (which you should read in its entirety before doing much with process variance estimation).

Finally, I need to do all of this for model averaged outputs (averages from 3 models) – I am starting off just doing this for a single model as I wanted to start simple!


No. The random effects model you're fitting is in fact a model-averaged model of sorts, since it is intermediate between time-varying and time-invariant fixed effects models. See section D.5 in Appendix D.

I also wanted to check whether it is actually correct for me to be using tools such as var.components and var.components.reml on a model with covariates?


Depends on the type of covariate....generally, no.

Note that I have some fixed parameter estimates in my models. As I understand it, specifying parameters as fixed during model fitting means that the betas are set to 0; but estimates are still produced by the function ‘get.real’. Should I set the real estimates of these fixed parameters to ‘NA’ before applying the var.components function?


Fixing parameters probably violates some of the key assumptions of the process variance estimation. I'd suggest not doing it.
cooch
 
Posts: 1654
Joined: Thu May 15, 2003 4:11 pm
Location: Cornell University

Re: Help with var.components

Postby cooch » Wed Dec 03, 2014 8:54 am

Should also add that if any of the 'groups' over which you're trying to estimate process variance have <10 occasions, you should probably just stop now. The method(s) don't work well unless you have long-ish time series.
cooch
 
Posts: 1654
Joined: Thu May 15, 2003 4:11 pm
Location: Cornell University


Return to RMark

Who is online

Users browsing this forum: No registered users and 2 guests