Dear MARK-Forum,
I’m struggling with the interpretation of the GOF and model analysis and would very much appreciate some hints about how to proceed!
We monitored winter activity and habitat use of 400 individually tagged fish (PIT-tags) in a small, shallow stream by weekly mobile tracking (remote detection). The fish belong to three groups comprising different species and age-classes and were all tagged the same day (single release cohort).
The return rate (percentage of all tagged individuals that were re-encountered) varied considerably over the 26 tracking occasions and between the different groups (see graph on flickr: http://www.flickr.com/photos/feli2013/6685215881/). Therefore, I’d like to analyze the effects of methodological (e.g. tag size) and environmental factors (e.g. ice formation, individual characteristics) on apparent survival and recapture probability.
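To make the definition of return rate concrete, here is a minimal sketch of the calculation per group and tracking occasion. The encounter histories and group labels are made up for illustration, and the layout (first character is the release occasion) is an assumption, not my actual input file.

```python
# Hypothetical encounter histories: one string per fish. The first character
# is the release occasion (always "1"); the rest are tracking occasions.
histories = {
    "A": ["1101", "1010", "1000"],
    "B": ["1110", "1100"],
}

def return_rates(group_histories):
    """Percentage of all tagged fish in a group detected at each tracking occasion."""
    n_tagged = len(group_histories)
    n_tracking = len(group_histories[0]) - 1  # exclude the release occasion
    rates = []
    for t in range(1, n_tracking + 1):
        detected = sum(h[t] == "1" for h in group_histories)
        rates.append(100.0 * detected / n_tagged)
    return rates

rates_by_group = {g: return_rates(h) for g, h in histories.items()}
```

Note that the denominator is always the number of fish tagged, not the number still alive, which is why return rate mixes survival and detection.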
I ran a time-dependent CJS model {Phi(t) p(g*t)}, in which all but 1 of the 104 parameters were estimable, which led me to assume (in accordance with page 39, chap. 5) that the model structure was appropriate. However, the deviance was quite large (4220.94), and the deviance residuals showed a highly asymmetrical distribution (see graph on flickr: http://www.flickr.com/photos/feli2013/6685215711/). This pattern was consistent across other model structures, e.g. {Phi(g*t) p(g*t) PIM} or {Phi(t) p(t) PIM} or {Phi(g*ICE) p(g*ICE) }.
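As a quick check on where the 104 parameters come from, here is a minimal sketch of the count for {Phi(t) p(g*t)}. It assumes that the 27 encounter occasions consist of the release occasion plus the 26 tracking occasions, and that there are three groups.

```python
n_occasions = 27               # assumed: release occasion + 26 tracking occasions
n_groups = 3
n_intervals = n_occasions - 1  # intervals between successive occasions

n_phi = n_intervals            # Phi(t): one survival parameter per interval, shared by all groups
n_p = n_groups * n_intervals   # p(g*t): one recapture parameter per group and tracking occasion
n_total = n_phi + n_p
```

Under these assumptions the total is 26 + 78 = 104, which matches the number of parameters in the model.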
Based on the text on page 34/35 in Chapter 5 I interpreted the residual plot as follows:
a) There is a trend in my data (“structural problem”), as the residuals do not scatter randomly above and below the 0-line but are largely positive. (BUT: this conclusion contradicts my earlier one, where I considered the model structure to be OK because the model had no estimability problems.)
b) I have some extra-binomial variation as the residuals are not close to the 0-line, but rather large (outside the dashed lines).
I used two different methods of GOF-assessment, with variable success:
1) Median c-hat (with 1300 simulations): I ran it several times, narrowing the estimation range each time. From the simulation results exported to Excel, I calculated c-hat to be around 1.139. However, I ran into problems with the MARK output (see graph on flickr: http://www.flickr.com/photos/feli2013/6685229869/). The position of the 50% value is clearly visible (around 1.139), but the values MARK reports for c-hat and SE do not match it and do not make sense.
2) RELEASE: TEST 3 in particular produced invalid results because of insufficient counts in the contingency tables. In accordance with page 36 in chapter 5, I assume this is related to the sparseness and structure of my data: a single release cohort, a modest sample size (n = 400), many encounter occasions (n = 27) and quite a high recapture rate at the beginning of the study.
If I adjust c-hat stepwise (as proposed on page 5-39), the ranking of the models I have fitted so far does not change.
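To illustrate what the stepwise c-hat adjustment does to model ranking, here is a minimal sketch using the standard QAICc formula, QAICc = -2lnL/c-hat + 2K + 2K(K+1)/(n - K - 1). The model names, -2lnL values, parameter counts and effective sample size below are hypothetical and not taken from my MARK output.

```python
def qaicc(neg2_log_lik, k, n_eff, c_hat):
    """Quasi-likelihood AICc for a model with k parameters and effective sample size n_eff."""
    return neg2_log_lik / c_hat + 2 * k + (2 * k * (k + 1)) / (n_eff - k - 1)

# Hypothetical candidate set: name -> (-2 log-likelihood, number of parameters)
models = {
    "model_1": (5000.0, 104),
    "model_2": (5100.0, 30),
}
N_EFF = 2000  # hypothetical effective sample size

def rank_models(candidates, c_hat, n_eff=N_EFF):
    """Return model names ordered from lowest (best) to highest QAICc."""
    scores = {name: qaicc(l, k, n_eff, c_hat) for name, (l, k) in candidates.items()}
    return sorted(scores, key=scores.get)

# Step c-hat upwards from 1.0 and see whether the ranking changes.
rankings = {c: rank_models(models, c) for c in (1.0, 1.05, 1.10, 1.139)}
```

Increasing c-hat shrinks the likelihood term while the penalty for the number of parameters stays the same, so larger c-hat values favour simpler models. If the ranking is the same at every step, as in my case, model selection is not sensitive to the exact c-hat value within that range.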
Based on the residual plot, I believe my model has a lack of fit, and I would like to identify why. Is this due to data sparseness, i.e. can sparse data lead to structural and/or overdispersion problems?
I have difficulty seeing my next steps. Do I simply not have adequate data to be analyzed with MARK? Does anybody with more experience have some thoughts about this?
Many thanks for any feedback! Ideas for alternative approaches to analyzing return rates, apparent survival and recapture probabilities are also very welcome!
Best regards and have a good day, christine