Advice when deviance=0 with missing observations


Post by jbauder » Fri Mar 02, 2012 5:27 pm

This post is similar to another very recent post on missing data, but I have a question about a slightly different aspect of the topic. I have a data set where one population was not sampled during one year, so roughly 1/3 of my 2,600 encounter histories have a single missing value (coded with a "."). As discussed in an earlier post (viewtopic.php?f=1&t=2099), my data are too sparse to simultaneously look for effects of age, population, and sex, so I am doing separate CJS analyses for each factor. When looking for age effects (five groups), the deviances for all my models are zero (except for the phi(.)p(.) model). I examined the deviance residual plots and saw several points between -5 and -13, resulting in a highly asymmetrical plot. I examined the residual output, and these extreme values are associated with encounter histories containing missing observations and with encounter histories that are identical except that they lack the missing observation (e.g., 01001.0 and 0100100).
Since there are extreme residuals, is it even appropriate to proceed with the analysis and use the AIC values and parameter estimates from these models with deviance=0? Or do extreme residuals indicate that I have a fundamental flaw in model structure because of these missing values?
Would an acceptable course of action be to truncate the data to exclude these missing values? I have a 14-year data set and year 12 was the year with the missing data. Because captures and recaptures were very low in years 12-14, I feel like I could truncate my data without losing much information.
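
The truncation idea can be illustrated with a minimal sketch. It assumes the encounter histories are available as bare strings (no frequency columns or trailing semicolons, unlike a full .inp file); the function name and the 7-occasion toy histories are hypothetical and only mirror the example histories mentioned above.

```python
def truncate_histories(histories, n_keep):
    """Keep only the first n_keep occasions of each encounter history.

    Histories with no capture left after truncation are dropped,
    because an animal first caught after the cut-off contributes
    nothing to a CJS analysis of the shortened study.
    """
    kept = []
    for history in histories:
        shortened = history[:n_keep]
        if "1" in shortened:
            kept.append(shortened)
    return kept


# Toy data: occasion 6 is the unsampled occasion (coded ".").
example = ["01001.0", "0100100", "00000.1"]
print(truncate_histories(example, 5))
```

With the cut-off placed before the unsampled occasion, the first two histories become identical and the third is dropped because its only capture falls after the cut-off.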

Re: Advice when deviance=0 with missing observations

Post by abreton » Mon Mar 12, 2012 2:02 pm

Some thoughts but possibly no answers(!).

"I have a 14-year data set and year 12 was the year with the missing data. Because captures and recaptures were very low in years 12-14, I feel like I could truncate my data without losing much information."


Given that the last few years provided few recaptures, the option you mentioned here shouldn't be excluded. However, an alternative sounds better to me, at least after thinking about it for just a few minutes: instead of using the "." notation in the encounter histories for this group/occasion, use "0" and then fix the associated recapture parameter (p) to zero. This is done from the Run Numerical Optimization window by clicking on the Fix Parameters option. This solution would properly account for the missing data, and it might result in a more 'comforting' residual plot. My suspicion, speculating here, is that the procedure for populating the residual plot is not handling the "." properly. Again, just speculation. My suggestion would remove the issue altogether...you should get a sensible plot if the "." was the problem.
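
The recoding step could be sketched as follows. This is only an illustration under stated assumptions: histories are bare strings for the one group that was not sampled, and the function name is hypothetical. Fixing p to zero still has to be done in MARK itself; the sketch only reports which occasions need it, and how those occasions map onto parameter indices depends on the model's parameter structure.

```python
def recode_missing(histories, missing=".", replacement="0"):
    """Replace the missing-occasion code with "0".

    Returns the recoded histories plus the 1-based occasions that
    carried the missing code, i.e. the occasions whose recapture
    probability (p) should be fixed to zero for this group.
    """
    recoded = [h.replace(missing, replacement) for h in histories]
    occasions = sorted(
        {i + 1 for h in histories for i, ch in enumerate(h) if ch == missing}
    )
    return recoded, occasions


# Toy histories for the unsampled group only.
recoded, fix_p_at = recode_missing(["01001.0", "0100100"])
print(recoded)   # histories with "." replaced by "0"
print(fix_p_at)  # occasions where p should be fixed to zero
```

Applying this to animals from groups that were sampled on that occasion would be wrong, since their zeros are genuine non-detections.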

"my data are too sparse to simultaneously look for effects of age, population, and sex, so I am doing separate CJS analyses for each factor."


If that's the case, i.e., you can't (for example) fit all possible combinations of these factors in a suite of models and use (in part) summed model weights to assess relative variable importance, then I might consider going the variance components route. Basically, you fit a time-dependent survival model without any other effects, and then use the variance components option to estimate the variance among the survival parameters 'without' constraints (e.g., age effects). Then, using the same procedure and model, you fit age, population (group?) and sex as covariates...one at a time...in each case noting the percent reduction in variance relative to the model without covariates. In the end, you'll know how much of the variance in survival is explained by each of your covariates. I like it much better than separate analyses for age, pop and sex. See the variance components chapter in the MARK manual for more details...
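
The "percent reduction in variance" bookkeeping is simple arithmetic and might look like the sketch below. The function name and the numbers are hypothetical; the variance estimates themselves would come from MARK's variance components output, which this sketch does not reproduce.

```python
def percent_variance_explained(var_without, var_with):
    """Percent reduction in process variance after adding a covariate.

    var_without: variance among survival estimates, no covariate.
    var_with:    variance remaining once the covariate is fitted.
    """
    if var_without <= 0:
        raise ValueError("variance without covariates must be positive")
    return 100.0 * (var_without - var_with) / var_without


# Made-up variance estimates, for illustration only.
baseline = 4.0
with_age = 3.0
print(percent_variance_explained(baseline, with_age))
```

Repeating the calculation for age, population and sex in turn gives one percentage per covariate, which can then be compared directly.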

"When looking for age effects (five groups), the deviances for all my models are zero (except for the phi(.)p(.) model)."


This is a flag on the field; it suggests a problem with the data, model structure, etc. As you likely know, 'deviance' is relative to the saturated model, that is, a model that fits the data perfectly. So a model with deviance 0 is equivalent to the saturated model...unless you have infinite data this is very...unbelievable. I suspect that the phi(.)p(.) model converges successfully because there are no problems, so its deviance is accurate. For all the other models I suspect a problem...
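
For reference, the definition of deviance used in the argument above can be written out as follows; the notation is chosen here for illustration and is not taken from the MARK output.

```latex
\[
D \;=\; -2\ln L\bigl(\hat{\theta}_{\text{model}}\bigr)
  \;-\; \Bigl(-2\ln L\bigl(\hat{\theta}_{\text{saturated}}\bigr)\Bigr)
\]
```

A deviance of exactly zero therefore means the fitted model attains the same likelihood as the saturated model, which is why it is implausible for a constrained model fitted to sparse data.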

I hope some of this turns out to be helpful, good luck.

andre

Re: Advice when deviance=0 with missing observations

Post by jbauder » Mon Mar 12, 2012 6:15 pm

Hi Andre

Thanks for your response. I actually tried to account for my missing data by using zeros and then fixing recapture probability to zero. Unfortunately, I have to go back to my reasoning for the three separate analyses. When I first started the analysis, I had 42 groups (3 dens or populations, 2 sexes, 7 age classes). Even when I ran a single-factor model like phi(den)p(den) or phi(sex)p(sex) (or a phi(.)p(.) model), my deviance residual plots were very skewed, with a tight cluster of points around or right above zero and many residuals >4. When I looked at the residuals themselves, I had a lot of cells with expected and observed values around zero. I then tried condensing the age classes down to 5, but whenever I had groups for multiple factors (e.g., 15 groups for the 3 dens and 5 age classes) I still had the same issue with my deviance residual plots. I found that if I went down to no more than 5 groups (just the 3 dens or just the 5 age classes) in my .inp file, my deviance residual plots finally looked normal. So it doesn't sound like I can use the variance route you suggested. It just seems my data are too sparse to have more than a handful of groups. This highlights the importance of increasing your recapture rates.
For the den analysis, I did try using zeros to represent the missing data (which was from den3 in 2008) and then fixing p in that year to zero, and my models appeared to run fine. However, for the other .inp files, because I can't include groups for den (because of the residual plot), I don't have a den-specific parameter to fix to zero.
Thanks for taking the time to respond. If you have any other thoughts, or if I am still missing something, please let me know.

Javan

