Advice on Simulated Encounter History Generation

questions concerning analysis/theory using program MARK

Postby mtreglia » Tue Apr 14, 2009 1:47 am

Dear all,

First, I apologize for this long-winded post.

I am mostly seeking to confirm that I am doing things correctly for a simulation. I am aware of the assumptions I am making for this closed-population model, but I want to make sure my methodology makes sense.

The scenario:
Basically, I have used a Huggins CC model to assess detection probability of a known population (lizards surveyed in enclosures within their natural habitat), and it is really low (<0.1) for both males and females. I know these low detection probabilities are of limited use, but my goal is to simulate encounter histories for all individuals in populations of size "x" for two and three occasions. I will then use those encounter histories to generate hypothetical population estimates for these "known" population sizes with a simple Lincoln-Petersen calculation, to show how much we typically underestimate population sizes of this species in this area. I am perfectly fine with MARK simply outputting the N(hat) for a number of iterations of these simulations.

What I have done:
I am simulating a Huggins CC model, simply p(g)=c(g) where g = sex. I set up 2 groups and 2 encounter occasions, input the respective detection probabilities for all parameters relating to males and females, check the boxes for "Derived Estimates" and "Input Data in Output", and run 100 simulations for 100 males and 100 females. I am getting Deriv1 and Deriv2 in my output, which show numbers I would generally expect (~25 males, ~17 females).

Does this make sense? Or am I missing something?

Thank you to all who take the time to look at this- I greatly appreciate your input.

Best,
Mike
mtreglia
 
Posts: 12
Joined: Sun Apr 12, 2009 1:22 pm
Location: Texas A&M University

Postby mtreglia » Mon Apr 20, 2009 12:21 pm

Dear All,

In regard to my previous post, I think I am getting something horribly wrong in my simulations (or, more likely, in my understanding of what I am doing).

I understand that if I hypothetically have a closed population of 100 individuals and the detection probability (constant through time and across individuals) is 0.1, then with 2 encounter occasions I should typically catch 10 individuals on the first occasion and another 10 on the second, 1 of which would already be marked, correct? So a Lincoln-Petersen estimate would, on average, come out at 100 individuals (with huge confidence intervals). But I know there will be variation in how many are caught only once and in the total number of recaptures, so I thought it would be worthwhile to simulate these encounter histories for known populations to show this variability.

When I run a Huggins CC model simulation with 2 encounter occasions, specifying the appropriate detection probabilities as betas (p(g)=c(g)), I am getting low but almost reasonable estimates for my extremely low detection probabilities (males 0.08, females 0.04). For example, for a population of 100 individuals I am getting estimates of roughly 25 males and 17 females.

I expected an estimate somewhat near a Lincoln-Petersen estimate, which for males (catching 8 on the first occasion and 8 on the second, with 1 recapture) gives (8 x 8)/1 = 64. So MARK is giving me a lower estimate than I would expect. Is this because, conditionally, MARK is estimating that I am more likely to encounter the same individuals twice?
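
For what it's worth, the variability (and the failure mode) of the Lincoln-Petersen estimator at these detection probabilities can be seen with a small simulation outside MARK. This is a sketch of my own (the function name and setup are illustrative, not anything from MARK), assuming constant, independent detection:

```python
import random

def lp_simulation(N=100, p=0.08, iters=10_000, seed=1):
    """Simulate 2-occasion encounter histories for a closed population
    of size N with constant detection probability p, and compute the
    Lincoln-Petersen estimate N_hat = n1*n2/m2 wherever m2 > 0."""
    rng = random.Random(seed)
    estimates, undefined = [], 0
    for _ in range(iters):
        occ1 = [rng.random() < p for _ in range(N)]    # occasion-1 captures
        occ2 = [rng.random() < p for _ in range(N)]    # occasion-2 captures
        n1, n2 = sum(occ1), sum(occ2)
        m2 = sum(a and b for a, b in zip(occ1, occ2))  # recaptures
        if m2 == 0:
            undefined += 1            # N_hat is undefined (infinite) here
        else:
            estimates.append(n1 * n2 / m2)
    return estimates, undefined

est, undef = lp_simulation()
print(f"{undef} of 10000 iterations had no recaptures")
print(f"mean of the defined estimates: {sum(est) / len(est):.1f}")
```

With p = 0.08 the expected number of recaptures is only N*p^2 = 0.64, so over half the iterations yield no recaptures at all, and the mean of the surviving estimates falls short of the true N = 100.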

Similarly, when I raise the detection probability to, say, 0.9, I am getting gross overestimates (typically about 5x the hypothetical population size). But if I set the detection probability to 1, I get exactly the hypothetical population size, which makes sense.

Over 100 iterations of these simulations, does anybody have input on why I am getting such grossly wrong simulated population estimates?

Thank you for any assistance!
Cheers,
Mike

Postby cooch » Mon Apr 20, 2009 1:56 pm

mtreglia wrote:
When I run a Huggins CC model simulation

Quick thought (I'm late for a meeting): try something other than Huggins. Huggins can be really twitchy with extremely low capture probabilities. For p < 0.2 I've had real problems with Huggins (full-likelihood models seem much better behaved for low p).
cooch
 
Posts: 1654
Joined: Thu May 15, 2003 4:11 pm
Location: Cornell University

Postby jhines » Mon Apr 20, 2009 3:20 pm

You're right to expect 10 individuals captured in time 1, 10 in time 2, and 1 marked in time 2, when p=0.1. When p=0.08, you should expect 8 in time 1 and 8 in time 2, of which 0.64, on average, are marked. So if you averaged the number marked in time 2 over many simulations, you'd get 0.64. However, the individual simulations will have 0, 1, 2, ... marked in time 2. When that number is zero, the estimated p is 0 and the estimated population size is infinite. If you computed the average N from the average p, that would be OK, but you wouldn't be able to compute an average N from the individual N's.

The more serious problem occurs when the number captured in time 1 or time 2 (n1, n2) is zero. With p=0.08 that probably doesn't happen very often, but with p=0.04 it does. In that case you can't even compute p (p = 0/0). So what do you use to compute the average p? If you throw that iteration out, you're biasing the overall p, because you're omitting certain cases.
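
The "average N from the average p" point can be checked with a quick sketch (my own, not MARK's, assuming constant p and the function name just illustrative): average the per-iteration recapture rate m2/n1 and back a single N out of that average, instead of averaging per-iteration N's:

```python
import random

def n_from_average_p(N=100, p=0.08, iters=10_000, seed=2):
    """Estimate p in each iteration as the recapture rate m2/n1
    (skipping n1 = 0 iterations, which give 0/0), then compute one
    N from the average p and the average first-occasion catch."""
    rng = random.Random(seed)
    p_hats, n1s = [], []
    for _ in range(iters):
        n1 = sum(rng.random() < p for _ in range(N))   # caught at time 1
        m2 = sum(rng.random() < p for _ in range(n1))  # of those, recaught
        if n1 > 0:
            p_hats.append(m2 / n1)
            n1s.append(n1)
    p_bar = sum(p_hats) / len(p_hats)
    n1_bar = sum(n1s) / len(n1s)
    return p_bar, n1_bar / p_bar    # N_hat = (mean catch) / (mean p)

p_bar, n_hat = n_from_average_p()
print(f"average p: {p_bar:.3f}, N from average p: {n_hat:.1f}")
```

The averaged p lands close to the input 0.08 and the resulting N close to 100, whereas averaging the individual Lincoln-Petersen N's becomes impossible as soon as a single iteration has m2 = 0.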

If you're able to compute p for all simulations, I'd expect the estimated p to be close to the input p (I tried it with 0.08, 0.07, ..., 0.04, and they matched pretty well). Once you start throwing out simulations, I'd expect them not to match as well.

Jim
jhines
 
Posts: 632
Joined: Fri May 16, 2003 9:24 am
Location: Laurel, MD, USA

Postby mtreglia » Mon Apr 20, 2009 3:47 pm

Evan and Jim, Thank you both for your input.

I tried simulating full-likelihood closed-capture models, which give me reasonable estimates down to about p=0.2, so that solved the issue I was having with high estimates at high capture probabilities.

But at p<=0.2 I get a few super-high estimates (orders of magnitude too high), which is a result of essentially no recaptures, correct? The number of high estimates increases with decreasing capture probability. So what is being shown here is that when capture probability is this low, you can't really derive useful population estimates, correct? Also, Jim, how did you compute p for all simulations? Is that an optional output of the simulations?

Thanks again- I really appreciate your time.
Cheers,
Mike

Postby darryl » Mon Apr 20, 2009 5:04 pm

mtreglia wrote:
So, what is being shown here is that when capture probability is so low, you can't really derive useful population estimates, correct?

Not with only 2 capture occasions. Possible solutions are to either use a greater number of capture occasions, or look at alternative methods to increase p.
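
A back-of-the-envelope way to see why more occasions help (my own arithmetic, assuming a constant, independent p per occasion): the probability of detecting an individual at least once, p* = 1 - (1-p)^k, grows quickly with the number of occasions k:

```python
def overall_capture_prob(p, k):
    """Probability an individual is detected at least once across k
    occasions, with constant, independent detection probability p."""
    return 1 - (1 - p) ** k

for k in (2, 3, 5, 10):
    print(f"k={k:2d}: p* = {overall_capture_prob(0.08, k):.3f}")
```

With p = 0.08, p* rises from roughly 0.15 at two occasions to over 0.55 at ten, so many more animals enter the data set and recaptures stop being such a rare event.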

Darryl
darryl
 
Posts: 498
Joined: Thu Jun 12, 2003 3:04 pm
Location: Dunedin, New Zealand

Postby jhines » Mon Apr 20, 2009 5:12 pm

I used the simulation procedure in MARK, with 1 group and p=0.08. In the simulation specification tab, you can tell it to output the 'real' parameters and standard errors. You can also tell it to print the 'derived' parameter (N), but that will give errors when p=0.

It should be 'all or nothing' when simulating data. If m2 (the number marked in time 2) is zero, then the estimated p is 0 and the estimated N is infinite; just one such iteration makes computing an average N impossible. However, computing N from the average p would still be possible.

How are you simulating the data? With the 'simulation' menu in MARK?

Postby mtreglia » Mon Apr 20, 2009 7:13 pm

Jim,

I am using the Simulation menu in MARK, selecting closed captures with 2 encounter occasions and 1 group. Then I am setting my PIMs so that p is constant in time and p=c.

I am using that PIM for the True Model. I am setting p=0.08 in the Beta tab and 100 in Beta 2 for N (I am somewhat confused by that; is that correct?). This is using an identity link.

In the Estimated Models tab, I am using the current model again, without setting any parameters, and using the sin link.

In the Population Sizes tab I am setting N=100

So when I look at the real parameters, REAL1 is p and REAL2 is the population size, correct? There is always a value for both (sometimes 0 for REAL1), but when REAL2 is orders of magnitude too high, is that MARK's way of flagging an error?

If I check the box in the simulation module for "Input Data in Output" and look at the output, it gives parameters such as real and derived population sizes. Are these computed using only the simulations that did not error? The real parameter in that output for parameter 1 is 0.15; is that also based only on the simulations that did not error?

Sorry to bombard you with all of these questions; I really appreciate your help. I have read the MARK guide (pretty thoroughly, I thought) but haven't found answers to these questions. I realize they probably don't come up very frequently.

Cheers,
Mike

Postby cooch » Mon Apr 20, 2009 8:14 pm

mtreglia wrote:
I am using the Simulation menu in MARK. Selecting closed captures, with 2 encounter occasions and 1 group. Then I am setting my PIM so that p is constant in time and p=c.

I am using that PIM for the True Model. I am setting p=0.08 in the Beta tab and 100 in Beta 2 for N (I am somewhat confused by that- is that correct?) This is using an Identity Link

No - beta for N should be 1.0 (assuming identity link), and then you enter the 'desired' N in the corresponding tab.

mtreglia wrote:
In the Population Sizes tab I am setting N=100

Which is where you should set it.

Postby jhines » Mon Apr 20, 2009 9:25 pm

I thought you were using the Huggins closed-capture model, which doesn't have N as a parameter. That's what I used when I simulated, so I only had to enter p=0.08 as a beta value (identity link) and N=100 for the population size. I would think this would behave better when p gets small, without causing N to blow up.
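
As a sanity check on the derived estimate, here is a sketch (my own simplification of the conditional estimator under constant p, not MARK's internal code) of how N is derived in a Huggins-style model: the number of distinct animals ever caught, divided by the probability of being caught at least once:

```python
def derived_n(m_total, p, k=2):
    """Huggins-style derived abundance: distinct animals caught
    (m_total) divided by p* = 1 - (1-p)^k, the probability of being
    caught at least once in k occasions with constant detection p."""
    p_star = 1 - (1 - p) ** k
    return m_total / p_star

# With N = 100 and p = 0.08 over 2 occasions, the expected number of
# distinct captures is 100 * p* = 15.36; plugging the expectation back
# in recovers N exactly.
expected_catch = 100 * (1 - (1 - 0.08) ** 2)
print(derived_n(expected_catch, 0.08))
```

Because p* is tiny when p is small, small sampling fluctuations in the realized catch (or in the estimated p) translate into large swings in the derived N, which is one way to think about the instability at low p.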
