Page 1 of 1

Abundance estimate with CJS

PostPosted: Tue Mar 21, 2017 8:32 am
by SoConfused
Hello,

I have a CJS model with an individual covariate for phi (weight at first capture). I'm trying to use the model to estimate abundance + bootstrapped CIs, following this post https://rpubs.com/ogimenez/201392.
I ran into a problem where the estimated abundance is outside of the 95% CIs, so I started digging deeper, which brought these 2 questions:

1) When I run the base model multiple times (no resampling), I get slightly different estimates for phi and p. It looks like there's a random process somewhere; is that correct? Setting seed did not make the results reproducible...
2) In the post I'm referencing, the author runs bootstrapping by resampling pseudo$ch after pseudo.proc is created. Why this way and not either resample pseudo.proc$data$ch, or resample pseudo$ch and then create pseudo.proc?

Re: Abundance estimate with CJS

PostPosted: Tue Mar 21, 2017 12:35 pm
by jlaake
You'll have to ask the author. This is not really an appropriate question for this forum. Unless the ch is changing the estimate will not change, so check your code. There is no random process in the MLE fitting although different starting values can cause very small changes (<1e-4) in the estimate.

Re: Abundance estimate with CJS

PostPosted: Tue Mar 21, 2017 2:08 pm
by SoConfused
Thank you for getting back to me. I will ask the author about question 2. Regarding question 1 - the code is as simple as they come, and I still get the difference in results...

Code: Select all
data1.analysis=function(){
Phi.weight.Country1=list(formula=~NotCanada + IsCanada:Weight)
p.t.x.Country = list(formula=~ Country*time)
cml=create.model.list("CJS")
mark.wrapper(cml,data=data1.proc, ddl=data1.ddl, output=FALSE, retry = 0, delete = FALSE)

data1.results <- data1.analysis()
data1.results[[1]]$results$real


I ran the last 2 rows twice in a row, with no other changes, and get (slightly) different results. Are these the small difference you mention? Would the starting values change between runs, if unspecified? For example, for phi:

run 1
estimate se lcl ucl
Phi g1 0.9699172 0.0699937 0.2264403 0.9997185
Phi g2 0.8193801 0.0639776 0.6603515 0.9136821

run 2
estimate se lcl ucl
Phi g1 0.9699304 0.0699929 0.2261803 0.9997192
Phi g2 0.8193858 0.0639788 0.6603519 0.9136880

Re: Abundance estimate with CJS

PostPosted: Tue Mar 21, 2017 3:04 pm
by jlaake
Yes these are the very small differences I was mentioning. It may be due to singular parameters like in this example where I used the dipper data with your code. I do get very small differences in the estimates. You would have to ask Dr. Gary White about this because it is happening in mark.exe and not in RMark.

Code: Select all
data(dipper)
data1.proc=process.data(dipper)
data1.ddl=make.design.data(data1.proc)

data1.analysis=function(){
Phi.weight.Country1=list(formula=~time)
p.t.x.Country = list(formula=~time)
cml=create.model.list("CJS")
mark.wrapper(cml,data=data1.proc, ddl=data1.ddl,  retry = 0, delete = FALSE)
}
data1.results <- data1.analysis()
data1.results[[1]]$results$real$estimate

Re: Abundance estimate with CJS

PostPosted: Tue Mar 21, 2017 4:26 pm
by gwhite
You can expect very small differences in the results when you have specified that the mark.exe code use parallel processing. The reason is that the order in which the encounter histories are processed can affect the result. Because the encounter histories are processed spread over the available threads, and this allocation is pretty much random due to other tasks, you can get different results. The cause of these differences is because of the summing of the -2log likelihood values. If you sum a "small" value to a "large" value, the effect of the small value may be completely lost. In contrast, summing a series of small values can result in a large enough value that when added to a large value, some effect is retained. In other words, the order in which values are added together can affect the result slightly -- a common problem with large calculations.

If you don't want to see these minor differences, then specify the calculation to be done with a single thread.

I first realized this issue when doing a workshop shortly after I got the parallel processing option to work. Members of the class were getting slightly different results and brought it to my attention. The most visible effect is the computation of the SE of unidentifiable parameters. Some cases you end up with a zero SE, in others a large positive value resulting from dividing by near zero. Very disturbing until I realized the cause.

Gary

Re: Abundance estimate with CJS

PostPosted: Wed Mar 22, 2017 9:30 am
by SoConfused
Thank you for that! I'm not sure how to restrict it to a single thread - google tells me that I would have had to use mark.wrapper.parallel() for parallel processing, but I'm just using mark.wrapper(). Is there an internal command that sets parallel = FALSE somewhere that I haven't found?

Re: Abundance estimate with CJS

PostPosted: Wed Mar 22, 2017 10:07 am
by cooch
SoConfused wrote:Thank you for that! I'm not sure how to restrict it to a single thread - google tells me that I would have had to use mark.wrapper.parallel() for parallel processing, but I'm just using mark.wrapper(). Is there an internal command that sets parallel = FALSE somewhere that I haven't found?


From the command line, when running MARK, you set the threads using -threads option. For example,

mark -threads=-3 i=dipper.inp o=dipper.lst

Here, -threads=-3 means use 3 threads less than maximum. So, if your machine has max 4 threads, the above would run MARK using only 1 thread (4-3=1). If your machine had max 8 threads, you'd use -threads=-7, and so on.

How you accomplish this with RMark is something Jeff would have to answer.

I'll restrict further comment on the 'utility' of deriving abundance estimates from an open CJS model in the first place, and make the point that whatever small differences you see in running using multiple cores will be trivial relative to the uncertainty in the abundance estimates.

Re: Abundance estimate with CJS

PostPosted: Wed Mar 22, 2017 10:49 am
by SoConfused
I'll restrict further comment on the 'utility' of deriving abundance estimates from an open CJS model in the first place


I tried running the POPAN model for my data, but I seemed to have run into a problem (and posted a question here http://www.phidot.org/forum/viewtopic.php?f=21&t=3440). It seems to be due to including an individual covariate (weight at release) in the survival estimates? Do I have to choose whether I want to estimate survival well (and use a covariate with CJS) or get an abundance estimate (and use POPAN with no covariate)?

Re: Abundance estimate with CJS

PostPosted: Wed Mar 22, 2017 11:10 am
by jlaake
Use argument threads=1 in call to mark or mark.wrapper.