GOF disagreements

questions concerning analysis/theory using program MARK

GOF disagreements

Postby JoeEM » Mon Oct 03, 2005 4:04 pm

I am analyzing mist-net recapture data for Central American birds, currently a data set with two sites (groups) and 6 years. My c-hat estimates for phi(g*t)p(g*t) vary widely depending on the estimation method used. For example, dividing model deviance by d.f. gives c-hat = 5.02, dividing model deviance by the average of 100 bootstrap deviances gives 2.60, while the median c-hat method gives 1.69.
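For concreteness, here is a rough sketch of the arithmetic behind the first two estimates (Python, with placeholder numbers standing in for values read off MARK output - not my actual results; the median c-hat procedure is MARK's own simulation routine and isn't reproduced here):

```python
import numpy as np

# Placeholder values standing in for numbers read off MARK output;
# none of these are the actual results quoted above.
model_deviance = 251.0        # deviance of the general model, e.g. phi(g*t) p(g*t)
deviance_df = 50              # deviance degrees of freedom reported by MARK

# Deviances of the 100 bootstrap replicates (simulated here purely as a stand-in).
rng = np.random.default_rng(1)
bootstrap_deviances = rng.chisquare(df=96, size=100)

# Method 1: model deviance divided by deviance d.f.
c_hat_df = model_deviance / deviance_df

# Method 2: observed deviance divided by the mean of the bootstrap deviances
c_hat_boot = model_deviance / bootstrap_deviances.mean()

print(f"c-hat (deviance/df):         {c_hat_df:.2f}")
print(f"c-hat (deviance/mean boot):  {c_hat_boot:.2f}")
```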

First question: which am I to believe and implement in subsequent analysis? Second, why the wide range of results, and does this indicate a poor fit of the model to my data?

One possible explanation may be that of my 361 total captures, only 47 (13%) were caught more than once (many dispersing juveniles at both sites).

Thanks for the feedback,

Joe
JoeEM
 
Posts: 4
Joined: Thu Sep 29, 2005 6:31 pm

Re: GOF disagreements

Postby bmitchel » Mon Oct 03, 2005 6:16 pm

Joe -

You should probably expect to get different answers with different GOF tests, since tests can vary widely in bias and precision. From what I've read, the first approach (divide model deviance by d.f.) is considered biased and uninformative for mark-recapture models (I think Evan Cooch covers this in his GOF chapter). Your two other methods are both probably fine, although some might argue for more bootstraps (500 or 1,000). Your results (2.6 for bootstrap and 1.7 for median c-hat) are different but not dramatically so (given that these GOF tests are fairly imprecise estimates of fit). The conservative approach would be to take the higher value and use that as your estimate of c-hat. However, I think Gary White has argued that the deviance statistic is also biased, and he has suggested that the median c-hat is a better approach (see the MARK help files). I have not seen any documentation on the median c-hat approach beyond what is written in the MARK help files, so I have not been using it much.

My personal opinion from running a lot of bootstrap GOF simulations is that estimating c-hat is extremely imprecise; I have seen c-hats for simulated data (known to have c-hat = 1) that range up to 4 or 5. In my own research, I tend not to make any adjustment to c-hat unless I have bootstrap GOF results indicating lack of fit (e.g., a bootstrap deviance statistic that is greater than 90% of the bootstrap deviances). In that case, I will adjust c-hat as long as the estimate is less than 5 or so (a larger c-hat estimate is a good sign that the lack of fit is due to poor model structure rather than overdispersion). Others may disagree....
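To make that rule of thumb concrete, here is a rough sketch (Python, with simulated stand-in deviances rather than MARK output) of how I think about the bootstrap p-value and whether to apply the resulting c-hat; the cutoffs are the ones I described above, not anything built into MARK:

```python
import numpy as np

def bootstrap_gof_decision(observed_deviance, bootstrap_deviances,
                           alpha=0.10, max_usable_chat=5.0):
    """Sketch of the decision rule described above (not MARK code).

    The p-value is the proportion of bootstrap deviances at or above the
    observed deviance; c-hat is the observed deviance over the mean
    bootstrap deviance.
    """
    boot = np.asarray(bootstrap_deviances, dtype=float)
    p_value = np.mean(boot >= observed_deviance)
    c_hat = observed_deviance / boot.mean()

    if p_value >= alpha:
        return p_value, 1.0    # no evidence of lack of fit: leave c-hat at 1
    if c_hat > max_usable_chat:
        # A very large c-hat suggests structural problems, not just overdispersion
        return p_value, None
    return p_value, c_hat

# Example with made-up deviances
rng = np.random.default_rng(0)
boot_devs = rng.chisquare(df=90, size=1000)
print(bootstrap_gof_decision(observed_deviance=140.0, bootstrap_deviances=boot_devs))
```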

Brian

Brian R. Mitchell
Postdoctoral Research Associate
University of Vermont
bmitchel
 
Posts: 28
Joined: Thu Dec 09, 2004 9:57 am

Re: GOF disagreements

Postby cooch » Mon Oct 03, 2005 7:07 pm

bmitchel wrote:You should probably expect to get different answers with different GOF tests, since tests can vary widely in bias and precision. From what I've read, the first approach (divide model deviance by d.f.) is considered biased and uninformative for mark-recapture models (I think Evan Cooch covers this in his GOF chapter). Your two other methods are both probably fine, although some might argue for more bootstraps (500 or 1,000). Your results (2.6 for bootstrap and 1.7 for median c-hat) are different but not dramatically so (given that these GOF tests are fairly imprecise estimates of fit). The conservative approach would be to take the higher value and use that as your estimate of c-hat. However, I think Gary White has argued that the deviance statistic is also biased, and he has suggested that the median c-hat is a better approach (see the MARK help files). I have not seen any documentation on the median c-hat approach beyond what is written in the MARK help files, so I have not been using it much.


All of the various c-hat estimation procedures available in MARK are documented in Chapter 5 - including a fair bit on the median c-hat. Of the various tests that are available, (i) RELEASE is still preferred for 'adequate' CMR data sets (meaning, without so much sparseness that it causes pooling problems in RELEASE), and (ii) median c-hat for everything else (median c-hat is still a work in progress, but results using it have been very promising).


My personal opinion from running a lot of bootstrap GOF simulations is that estimating c-hat is extremely imprecise; I have seen c-hats for simulated data (known to have c-hat = 1) that range up to 4 or 5.


This is a good point - and is also discussed in some detail in Chapter 5 - see pp. 33-35 in that chapter - especially the figure on p. 34, which is a graphical representation of what Brian refers to. Remember, we are estimating c based on a single data set, which we consider as one realization of an underlying probabilistic process. Even if the true c is one, the estimated c-hat could be quite different from one. What we need (and some folks are working on it) is a robust way to estimate both c-hat and the SE of that estimate. Stay tuned...
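As a quick illustration (a toy binomial example in Python, not a CMR model), this is the sort of spread you can see in an estimated c-hat from a single data set even when the fitted model is exactly correct, i.e., the true c is 1:

```python
import numpy as np

# Toy illustration: sparse binomial data, fitted with the true (correct) model,
# yet the chi-square/d.f. estimate of c varies a lot from one realization to the next.
rng = np.random.default_rng(42)
n_trials, p_true = 5, 0.3            # small samples => sparse data
n_groups, n_replicates = 20, 2000

c_hats = []
for _ in range(n_replicates):
    y = rng.binomial(n_trials, p_true, size=n_groups)
    p_hat = y.sum() / (n_groups * n_trials)      # MLE under the correct model
    expected = n_trials * p_hat
    variance = n_trials * p_hat * (1 - p_hat)
    pearson_x2 = np.sum((y - expected) ** 2 / variance)
    c_hats.append(pearson_x2 / (n_groups - 1))   # chi-square / d.f.

c_hats = np.array(c_hats)
print(f"mean c-hat: {c_hats.mean():.2f}")        # close to 1 on average...
print(f"95% range:  {np.quantile(c_hats, [0.025, 0.975]).round(2)}")  # ...but wide
```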

Until then, there are some recommendations in Chapter 5 on how to proceed.
cooch
 
Posts: 1654
Joined: Thu May 15, 2003 4:11 pm
Location: Cornell University

GOF Disagreements and Overparameterization

Postby JoeEM » Thu Oct 06, 2005 2:17 pm

Thanks for the advice so far. I have a further question about how to proceed when "sparseness" occurs. Here's the sequence of events that has me befuddled:

1. I started with a data set with 2 sites (6 capture occasions, 716 individuals, 23 unique capture histories). RELEASE indicated a TEST 3 failure for the second group. I suspected age-specific differences in survivorship, since I believe I have dispersing juveniles in my sample.

2. I reconstructed the data set with 2 age classes and 2 sites, built a TSM model, and found (by LRT comparison and AIC rank) that (a) including 2 age classes significantly improves the model, and (b) including 2 sites does not. However, I then ran RELEASE, which showed insufficient data for all but 3 of the TEST 2 and TEST 3 group tests.

Thus, it appears that RELEASE is telling me that age-class differences violate GOF for the first data set, but that explicitly treating age class then renders my data too sparse to test GOF. The next logical step, I suppose, would be to combine the two sites, but they are vegetatively different and I am loath to simply merge those data. Furthermore, I'd like to analyze data for other species for which I have even less data.

Does the problem I am experiencing fall into the category of sparseness of my data set? Under these circumstances, can I still estimate Phi if RELEASE cannot test GOF? Or is the proper interpretation that I cannot use MARK on these data? Are there other options for estimating and comparing survival under the limitations present in my data?

Thanks for the help
JoeEM
 
Posts: 4
Joined: Thu Sep 29, 2005 6:31 pm

Re: GOF Disagreements and Overparameterization

Postby cooch » Thu Oct 06, 2005 5:07 pm

Read item (1) of the list of recommendations on p. 5 of the GOF chapter. It states specifically that in the process of GOF testing, you should

identify a general model that, when you fit it to your data, has few or no estimability problems (i.e., all of the parameters seem adequately estimated). Note that this general model may not be a fully time-dependent model - especially if your data set is rather sparse...


I'd be willing to bet your data set is sufficiently sparse that trying to fit a fully time-dependent model is a waste of time - a clue might be how many (if any) of your parameters are actually correctly estimated (you can look at the output from either MARK or RELEASE). Note that RELEASE fits a fully time-dependent model to your data. Such a model may not be appropriate if your data can't support it. You might have to use a reduced-parameter model as your starting point - if so, then you don't (can't, in fact) use RELEASE, but some other approach (say, median c-hat) that is described in Chapter 5. So, try fitting simpler models than the fully time-dependent one - either by constraining the estimates to be functions of covariates of interest, or even by using time-invariant 'dot' models for some parameters.

If your data are really sparse, and you have lots of occasions, you might have to accept that there are real limits to what you can do.
cooch
 
Posts: 1654
Joined: Thu May 15, 2003 4:11 pm
Location: Cornell University

GOF and sparseness, final questions

Postby JoeEM » Thu Oct 06, 2005 6:21 pm

Thanks a ton for the advice. I understand your comments, and feel I am making progress in grasping the elements at play here. I agree that for many of my species I will have insufficient data. I submit the following remarks as, hopefully, my final effort to settle some unresolved uncertainties.

As you suggest, I am now (and have been) fitting as my general model a time-independent model, by setting up the PIM to match a TSM structure, with no time-dependence for adults, p(.) for the recapture parameter, and 2 sites. I did realize that RELEASE uses a time-dependent model, but I thought the summaries of TEST 2 and TEST 3 (e.g., "Group 1 3.sm") pooled across time intervals (although those lines don't indicate whether the data were sufficient or not). Others have remarked that this pooling is a very tricky statistical issue.

It sounds like the key now is seeing if the model "actually correctly estimates" the parameters, since I can use median c-hat to generate c-hat estimates. Can I still rely on bootstrapping to provide a p-value for my GOF (by the rank of the observed deviance), or are there other ways of determining whether my parameters are correctly estimated? Should I avoid dividing the observed deviance by the deviance DF to estimate c-hat (a method which produces results 2-3 times larger than median c-hat)? Because median c-hat is still a "work in progress," should I rely instead on dividing the observed deviance by the mean of the simulated deviances?

For example, looking at the results of the above general model, LRT comparisons indicate that including age classes improves the model over a reduced model. Median c-hat returns 1.54, while dividing the observed deviance by the mean simulated deviance returns c-hat = 1.68. The parameter estimates seem "pretty reasonable" and the SE for each is not "too large." Based on our discussion, I am inclined to accept the model fit, test sensitivity to a range of c-hats, plug in the higher c-hat estimate (1.68), and let 'er rip. Does that sound reasonable?
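In case it helps to show what I mean by "test sensitivity to a range of c-hats," here is the kind of check I have in mind (Python, with made-up -2lnL values, parameter counts, and a placeholder effective sample size - not my actual results), using the usual quasi-likelihood AICc formula from the Burnham & Anderson / MARK literature:

```python
def qaicc(neg2loglik, k, n_eff, c_hat):
    """QAICc: -2lnL scaled by c-hat, plus the usual small-sample penalty."""
    return neg2loglik / c_hat + 2 * k + (2 * k * (k + 1)) / (n_eff - k - 1)

# Made-up -2lnL values and parameter counts for two hypothetical candidate models,
# just to show how the ranking behaves as c-hat is varied.
models = {"phi(age) p(.)": (820.4, 4), "phi(.) p(.)": (831.9, 2)}
n_eff = 700   # placeholder effective sample size (use the value MARK reports)

for c_hat in (1.0, 1.54, 1.68, 2.0):
    ranked = sorted(models.items(),
                    key=lambda m: qaicc(m[1][0], m[1][1], n_eff, c_hat))
    print(f"c-hat = {c_hat:4.2f}: " +
          ", ".join(f"{name} QAICc={qaicc(v[0], v[1], n_eff, c_hat):.1f}"
                    for name, v in ranked))
```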
JoeEM
 
Posts: 4
Joined: Thu Sep 29, 2005 6:31 pm

