www.phidot.org

by **jlaufenb** » Thu Mar 10, 2011 11:37 am

I am analyzing CMR data using Full Closed models with heterogeneity via RMark and have discovered an issue with calculating a single capture probability across mixtures from model averaged estimates. The problem is linked to the estimate of pi. In my study, there is a small proportion of the population for which p is relatively high and a large proportion for which p is relatively low. My model set includes various mixture models, and all produce similar estimates of the mixture proportions. For some of those models, the estimation procedure converges to and reports, in this order, an estimate of piA reflecting the larger proportion, estimates of pA reflecting low p, and estimates of pB reflecting high p. For other mixture models, the reported estimate of piA reflects the smaller proportion, followed by pA reflecting high p and pB reflecting low p. When these estimates are model averaged, the end result differs from the case where the estimates of piA for all models reflect the same proportion of the population. Below is a simple example of what happens in each scenario:

CASE 1

| Parameter | Estimate | Weight | Weighted estimate | Model averaged estimate |
|---|---|---|---|---|
| piA, Model A | 0.9 | 0.7 | 0.63 | 0.66 |
| pA, Model A | 0.05 | 0.7 | 0.035 | 0.11 |
| pB, Model A | 0.25 | 0.7 | 0.175 | 0.19 |
| piA, Model B | 0.1 | 0.3 | 0.03 | |
| pA, Model B | 0.25 | 0.3 | 0.075 | |
| pB, Model B | 0.05 | 0.3 | 0.015 | |

CASE 2

| Parameter | Estimate | Weight | Weighted estimate | Model averaged estimate |
|---|---|---|---|---|
| piA, Model A | 0.9 | 0.7 | 0.63 | 0.9 |
| pA, Model A | 0.05 | 0.7 | 0.035 | 0.05 |
| pB, Model A | 0.25 | 0.7 | 0.175 | 0.25 |
| piA, Model B | 0.9 | 0.3 | 0.27 | |
| pA, Model B | 0.05 | 0.3 | 0.015 | |
| pB, Model B | 0.25 | 0.3 | 0.075 | |

As you can see, the model averaged estimates are quite different, so the single estimate for p across mixtures, computed as (piA*pA)+((1-piA)*pB), is 0.137 for case 1 and 0.07 for case 2. Has anyone else run into this problem? Is there any way in MARK to ensure that the estimate of pi for each model reflects the same proportion?
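
The arithmetic in the two cases can be checked with a few lines of plain Python (not RMark). This is only a sketch: the estimates and weights are the ones from the example, the helper names `model_average` and `mean_p` are made up for illustration, and the mean p across mixtures is taken as piA*pA + (1-piA)*pB.

```python
# Sketch of the worked example: model average each parameter first,
# then compute the mean p across mixtures from the averaged values.

def model_average(estimates, weights):
    """Weighted sum of per-model estimates (weights sum to 1)."""
    return sum(e * w for e, w in zip(estimates, weights))

def mean_p(pi_a, p_a, p_b):
    """Mean capture probability across two mixtures."""
    return pi_a * p_a + (1.0 - pi_a) * p_b

weights = [0.7, 0.3]  # Model A, Model B

# Case 1: the mixture labels are switched between models.
case1 = {"piA": [0.9, 0.1], "pA": [0.05, 0.25], "pB": [0.25, 0.05]}
# Case 2: both models report the mixtures in the same order.
case2 = {"piA": [0.9, 0.9], "pA": [0.05, 0.05], "pB": [0.25, 0.25]}

def averaged_then_mean(case):
    avg = {name: model_average(vals, weights) for name, vals in case.items()}
    return mean_p(avg["piA"], avg["pA"], avg["pB"])

case1_p = averaged_then_mean(case1)
case2_p = averaged_then_mean(case2)
print(round(case1_p, 3), round(case2_p, 3))  # 0.137 0.07
```

The two models describe the same population in both cases; only the labelling of the mixtures differs, yet the averaged result changes.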

by **jlaake** » Thu Mar 10, 2011 1:30 pm

Not 100% certain but the clogit link may do that for you. It seems to be designed to do that. Unfortunately, I haven't put it in RMark so you would have to do it in MARK.

--jeff

by **cooch** » Thu Mar 10, 2011 1:49 pm

jlaufenb wrote: As you can see, the model averaged estimates are quite different, so the single estimate for p across mixtures, computed as (piA*pA)+((1-piA)*pB), is 0.137 for case 1 and 0.07 for case 2. Has anyone else run into this problem? Is there any way in MARK to ensure that the estimate of pi for each model reflects the same proportion?

From Chapter 14,

"So, you do an analysis using a closed population heterogeneity abundance model, and derive an estimate of Pi. Perhaps you've built several such models, and have a model averaged estimate of Pi. So, what do you 'say' about this estimate of Pi?

Easy answer - generally nothing. The estimate of Pi is based on fitting a finite mixture model, with a number (typically small) of discrete states. When we simulated such data (above), we used a discrete simulation approach - we simply imagined a population where 40% of the individuals had one particular detection probability, and 60% had a different encounter probability. In that case, because the distribution of individuals in the simulated population was in fact discrete, then the real estimate of Pi reflected the true generating parameter. However, if in fact the variation in detection probability was (say) continuous, then in fact the estimate of Pi reflects a 'best estimate' as to where a discrete 'breakpoint' might be (breaking the data into a set of discrete, finite mixtures). Such an estimate is not, generally, interpretable. Our general advice is to avoid post hoc story-telling with respect to Pi, no matter how tempting (or satisfying) the story might seem."

So, in short, are you sure that Pi has any defensible biological meaning? In the vast majority of cases where I've seen finite mixture models used, Pi is essentially uninterpretable. I'd say that was a bigger issue (potentially) than putting constraints on things.

by **jlaufenb** » Thu Mar 10, 2011 5:46 pm

Thanks for the replies, guys. I have some ideas about what might be driving pi, but I have no interest in post-hoc story-telling and don't intend to say anything about pi. My study is about the implications of sampling intensity for reliably estimating N, and how high capture probabilities must be to effectively account for heterogeneity. That is why I want to calculate a single p across mixtures from the model averaged estimates: to look at the relationship between p and support for heterogeneity models across various sampling intensities. If estimates of pi converge at opposite mixtures for different models, the mean p estimate across mixtures is higher than if all estimates of pi converge at the same mixture. I think it would be misleading to report that larger mean p. Jeff, this problem has arisen in the analysis that you have helped me with in the past. Given the number of replicate datasets and the number of models, it is impossible for me to sort through the output for each model and check what the estimates of pi are. Is there any chance that clogit will be implemented in RMark soon?

by **jlaake** » Thu Mar 10, 2011 6:07 pm

No plans on putting clogit into RMark. It is like mlogit and not simple to add because the links aren't independent of parameters. If I understand what you said, may I suggest that what you really want to do is to model average the average detection probability, rather than model averaging the parameters and then computing the average detection probability. That should handle any variation in the ordering of pi. You should be able to do this by getting the estimates from each model, computing the average probability for each model and its variance (with the delta method), and then using model.average in RMark via the list approach. See ?model.average.list

--jeff
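
One way to read this suggestion, sketched in plain Python rather than RMark: compute the mean p and its delta-method variance within each model, then model average that derived quantity. Everything below is illustrative. The point estimates come from the example earlier in the thread, the variance-covariance values are invented, `mean_p_and_var` and `model_average` are hypothetical helpers (the latter is not RMark's `model.average`), and the unconditional SE uses one common Burnham and Anderson form that may not match what RMark reports.

```python
import math

def mean_p_and_var(pi_a, p_a, p_b, vcov):
    """Mean p across two mixtures and its delta-method variance.

    vcov: 3x3 variance-covariance matrix of (piA, pA, pB) on the real scale.
    """
    est = pi_a * p_a + (1.0 - pi_a) * p_b
    # Partial derivatives of est with respect to piA, pA and pB.
    grad = [p_a - p_b, pi_a, 1.0 - pi_a]
    var = sum(grad[i] * vcov[i][j] * grad[j]
              for i in range(3) for j in range(3))
    return est, var

def model_average(estimates, variances, weights):
    """Weighted mean plus an unconditional SE (assumed form, see lead-in)."""
    avg = sum(w * e for w, e in zip(weights, estimates))
    se = sum(w * math.sqrt(v + (e - avg) ** 2)
             for w, e, v in zip(weights, estimates, variances))
    return avg, se

# Made-up variances; Model B is Model A with the mixture labels switched.
vcov_a = [[0.0025, 0.0, 0.0], [0.0, 0.0001, 0.0], [0.0, 0.0, 0.0016]]
vcov_b = [[0.0025, 0.0, 0.0], [0.0, 0.0016, 0.0], [0.0, 0.0, 0.0001]]

est_a, var_a = mean_p_and_var(0.9, 0.05, 0.25, vcov_a)
est_b, var_b = mean_p_and_var(0.1, 0.25, 0.05, vcov_b)

avg, se = model_average([est_a, est_b], [var_a, var_b], [0.7, 0.3])
print(round(est_a, 3), round(est_b, 3), round(avg, 3))  # 0.07 0.07 0.07
```

Because the mean p is computed inside each model, switching the mixture labels leaves it unchanged, so the model averaged value no longer depends on which mixture a model happens to call A.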

by **vankatwy** » Wed Apr 17, 2013 3:44 pm

jlaake wrote: You should be able to do this by getting the estimates from each model, computing the average probability for each model and its variance (with the delta method), and then using model.average in RMark via the list approach.

Hi Jeff,
I am interested in finding the average capture probability and found your comment useful. But when working with a time varying model I end up averaging the model averages of p for each occasion and that enters the murky waters of Simpson's Paradox. So is it valid for me to average the p estimates from each model (a time varying, behavioural, and constant model) then model average the p estimates? (My goal here is to find one value of p per input.)

Thanks in advance for any advice you may have.
-Kristin

by **egc** » Wed Apr 17, 2013 6:19 pm

I'll leave this topic in place for now - but it is verging into 'RMark' land.

by **murray.efford** » Thu Apr 18, 2013 12:25 am

Jeff's suggestion seems eminently reasonable if you think of the mixture model as a 3-parameter random effect with a particular mean, variance, and skewness. Averaging the other parameters doesn't make sense to me as they work together, essentially Evan's point from The Book.

Kristin - I wonder if you can be more specific about your aim. By 'input' you mean...? If p varies by time and behaviour, rather than the mean it seems more useful to consider the overall probability an individual is detected at least once i.e. 1 - prod(1-p_t) where p_t is the probability a naive animal is caught at time t (potentially the pi-weighted mean of the mixture classes) and the product is over t.

Hoping to understand this better myself.
Murray

by **vankatwy** » Thu Apr 18, 2013 1:42 pm

Hi Murray,
Sorry for my lack of clarity. I am trying to find one p estimate for each grid sampled (therefore I want one p estimate for each grid .INP file I load into MARK...though I am transitioning to RMark but don't know the equivalent terminology). I have 3 models: constant, behavioural and time varying. I am concerned when the time or behaviour models are the top AIC models because that indicates it is oversimplifying to estimate just one p value for the grid. So your suggestion to find the probability an individual is detected at least once is useful, but I don't quite follow the equation you suggest. Also I am using Huggins without mixtures (so no pi estimate).
Thanks for your input,
Kristin

by **cooch** » Thu Apr 18, 2013 1:49 pm

vankatwy wrote:Hi Murray,
Sorry for my lack of clarity. I am trying to find one p estimate for each grid sampled (therefore I want one p estimate for each grid .INP file I load into MARK...though I am transitioning to RMark but don't know the equivalent terminology). I have 3 models: constant, behavioural and time varying. I am concerned when the time or behaviour models are the top AIC models because that indicates it is oversimplifying to estimate just one p value for the grid. So your suggestion to find the probability an individual is detected at least once is useful, but I don't quite follow the equation you suggest. Also I am using Huggins without mixtures (so no pi estimate).
Thanks for your input,
Kristin

Let p1 = probability of encounter at occasion 1. Let p2 = probability of encounter at occasion 2. Let p3 = probability of encounter at occasion 3. Thus, probability of being missed at occasion 1 is (1-p1). Probability of being missed at occasion 2 is (1-p2). Probability of being missed at occasion 3 is (1-p3).

Thus, the probability of being missed at all 3 occasions is (1-p1)(1-p2)(1-p3).

And, thus, the probability of being encountered at least once is

1-[(1-p1)(1-p2)(1-p3)]

Or, written more generally,

1-prod[1-p(i)]
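
For anyone who prefers to see it as code, here is a minimal Python sketch of the same formula. The occasion-specific p values are made up for illustration, and `prob_detected_at_least_once` is a hypothetical helper name, not a MARK or RMark function.

```python
import math

def prob_detected_at_least_once(p_by_occasion):
    """1 - prod(1 - p_i) over occasions."""
    prob_missed_every_time = math.prod(1.0 - p for p in p_by_occasion)
    return 1.0 - prob_missed_every_time

# Hypothetical encounter probabilities for 3 occasions (time-varying model).
p_time = [0.2, 0.3, 0.1]
p_star_time = prob_detected_at_least_once(p_time)

# Constant model: the same p on every occasion.
p_star_const = prob_detected_at_least_once([0.2] * 3)

print(round(p_star_time, 3), round(p_star_const, 3))  # 0.496 0.488
```

This gives one number per grid even when p varies by occasion, without averaging the occasion-specific estimates.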

Model averaging capture probabilities for mixture models
