Detection probabilities model

Post by smit8051 » Fri Feb 15, 2013 3:58 pm

Hi,
I am interested in using detection probabilities (p) to compare sampling techniques for a rare fish species in several river systems. We are currently using the multi-method approach under the single-season model outlined in Nichols et al. (2008). I would like to determine whether there is a gear effect (i.e., whether detection probabilities vary by sampling technique). To do this I coded one model, p(.), in which detection probability is constant across techniques, and a second model, p(gear), in which it varies by technique. To code the different sampling techniques I used several sample-level covariates. I should also mention that, because our interest is in comparing detection rates, we held psi and theta constant in these analyses. When I fit these models, I noticed that my global model, p(gear), fit poorly: the probability of a test statistic greater than or equal to the observed value, Pr(TS ≥ OBS), was 0.009, and c-hat was 15.348. The p(gear) model had a much lower AIC score than p(.), but since the models did not fit well, is it still valid to compare them? When I set c-hat to 15.348, the model with constant detection probabilities had the lower quasi-AIC (QAIC) score.

Such poor fit was not unexpected because I detected many more fish in one river system than in the other. Once I added a river system covariate, the fit improved dramatically, although the model still doesn't fit well (Pr(TS ≥ OBS) = 0.03, c-hat = 2.94). I would like to explore some a priori models in addition to the simple gear-type and system models, so fit would likely improve with several additional covariates. I do not believe there are any independence issues between sampled river reaches. Is there something I am missing, or is the poor fit likely just from omitting some influential covariates?

Finally, I would like to get estimates of the gear-specific detection probabilities and standard errors for each river system. As expected, the hoop net technique, which detected fish in more sampling events than the other gears, had a higher detection probability in the first river system. However, in the second river system the electrofishing technique accounted for two of the three sampling events with detections. Despite that, the model estimate of detection probability in the second river system was higher for the hoop net (0.044) than for electrofishing (0.0158). Does this seem reasonable? Shouldn't the technique that detected fish on more sampling events in the second river system have a higher detection probability than the technique that detected fish at only one sampling event in that river system? What are your thoughts?

Sorry for so many questions,
Chris
smit8051
 
Posts: 4
Joined: Fri Feb 15, 2013 1:42 pm

Re: Detection probabilities model

Post by bacollier » Mon Feb 25, 2013 12:47 pm

smit8051 wrote: Hi,
I am interested in using detection probabilities (p) to compare sampling techniques for a rare fish species in several river systems. We are currently using the multi-method approach under the single-season model outlined in Nichols et al. (2008). I would like to determine whether there is a gear effect (i.e., whether detection probabilities vary by sampling technique). To do this I coded one model, p(.), in which detection probability is constant across techniques, and a second model, p(gear), in which it varies by technique. To code the different sampling techniques I used several sample-level covariates. I should also mention that, because our interest is in comparing detection rates, we held psi and theta constant in these analyses. When I fit these models, I noticed that my global model, p(gear), fit poorly: the probability of a test statistic greater than or equal to the observed value, Pr(TS ≥ OBS), was 0.009, and c-hat was 15.348. The p(gear) model had a much lower AIC score than p(.), but since the models did not fit well, is it still valid to compare them? When I set c-hat to 15.348, the model with constant detection probabilities had the lower quasi-AIC (QAIC) score.


Saw no one had answered, so I thought I would give it a whirl. If nothing else, someone will tell me I am wrong and you will get a better answer then.

Caveat emptor, I am not a fisheries scientist so I may not be as up on the literature in that field as I should be on how this might apply, so take my comments with a grain of salt.

I tend to think about fisheries in the 'capture probability' sense rather than in terms of detection probabilities, as I assume you are using some sort of multiple-pass sampling (sensu multipass sampling under different gear types) rather than recording presence/absence relative to the gear type used to capture them. Below you say you used a hoop net and electrofishing. Can you provide details on how the sampling was actually implemented?

I am not really sure about the where/why of the Nichols approach, as the focus of that paper (IMHO) was on overall occupancy of the target species (Psi) and occupancy at the local scale (theta) (although see Efford and Dawson 2012 for some issues with this model). Were you sampling different locations within the system using different methods (as in the Nichols structure), or were you repeating samples at the same location using different methods? How many 'species' are you collecting data on? I guess I am asking what your data structure looks like, as I cannot quite work it out from what you said above.

As for the c-hat of >15, that seems to imply you have serious model issues, but you seem to have noticed that.
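For reference, the c-hat adjustment described above is the usual quasi-likelihood one, QAIC = -2 ln(L)/c-hat + 2K. Here is a minimal Python sketch of why the ranking can flip once c-hat gets that large. The log-likelihoods and parameter counts are made-up numbers for illustration, not values from this analysis, and some software also adds 1 to K for estimating c-hat.

```python
def aic(log_lik, k):
    """Akaike's Information Criterion for a model with k parameters."""
    return -2.0 * log_lik + 2.0 * k


def qaic(log_lik, k, c_hat):
    """Quasi-AIC: the deviance term is divided by the overdispersion estimate c-hat."""
    return -2.0 * log_lik / c_hat + 2.0 * k


# HYPOTHETICAL log-likelihoods and parameter counts, for illustration only.
models = {
    "p(.)": (-60.0, 3),     # psi, theta, one p
    "p(gear)": (-52.0, 5),  # psi, theta, three gear-specific p's
}
c_hat = 15.348  # the c-hat value reported in the thread

aic_scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
qaic_scores = {name: qaic(ll, k, c_hat) for name, (ll, k) in models.items()}

best_aic = min(aic_scores, key=aic_scores.get)
best_qaic = min(qaic_scores, key=qaic_scores.get)
print("Best by AIC: ", best_aic)
print("Best by QAIC:", best_qaic)
```

With a large c-hat the likelihood term shrinks and the 2K penalty dominates, so the simpler model tends to win regardless of how much better the bigger model tracks the data.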

smit8051 wrote: Such poor fit was not unexpected because I detected many more fish in one river system than in the other. Once I added a river system covariate, the fit improved dramatically, although the model still doesn't fit well (Pr(TS ≥ OBS) = 0.03, c-hat = 2.94). I would like to explore some a priori models in addition to the simple gear-type and system models, so fit would likely improve with several additional covariates. I do not believe there are any independence issues between sampled river reaches. Is there something I am missing, or is the poor fit likely just from omitting some influential covariates?

Finally, I would like to get estimates of the gear-specific detection probabilities and standard errors for each river system. As expected, the hoop net technique, which detected fish in more sampling events than the other gears, had a higher detection probability in the first river system. However, in the second river system the electrofishing technique accounted for two of the three sampling events with detections. Despite that, the model estimate of detection probability in the second river system was higher for the hoop net (0.044) than for electrofishing (0.0158). Does this seem reasonable? Shouldn't the technique that detected fish on more sampling events in the second river system have a higher detection probability than the technique that detected fish at only one sampling event in that river system? What are your thoughts?

Sorry for so many questions,
Chris


What is a river system? Are individual rivers your sampling unit? Do you only have 2? How many encounter histories do you have? Under a Nichols model, Psi would be your 'river system' estimate of occupancy, while theta*Psi would be your sample-location estimate. Is a constant theta appropriate in your system across species at the local scale? Are you sampling multiple times in multiple locations within the river, e.g., 50 locations within one river sampled 3 times each? Just trying to figure out what design you used, as most of your questions hinge on how you sampled.
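To make the roles of Psi, theta and p concrete, here is a rough Python sketch of how the probability of one detection history is built up under a multi-scale, multi-method structure as I understand it from Nichols et al. 2008. The function name, the data layout (one tuple of 0/1 per sampling occasion, one entry per method) and the parameter values are all assumptions for illustration, with everything held constant across occasions.

```python
from itertools import product


def history_probability(history, psi, theta, p):
    """Probability of one detection history.

    history: tuple of occasions; each occasion is a tuple of 0/1, one per method.
    psi:     large-scale occupancy probability.
    theta:   probability the species is locally present at an occasion, given psi.
    p:       per-method detection probabilities, given local presence.
    """
    prob_if_occupied = 1.0
    for occasion in history:
        # Probability of this occasion's outcomes if the species is locally present.
        p_obs = 1.0
        for detected, p_s in zip(occasion, p):
            p_obs *= p_s if detected else (1.0 - p_s)
        term = theta * p_obs
        if not any(occasion):
            # No detections is also consistent with the species being locally absent.
            term += 1.0 - theta
        prob_if_occupied *= term

    total = psi * prob_if_occupied
    if not any(any(occasion) for occasion in history):
        # An all-zero history is also consistent with the unit being unoccupied.
        total += 1.0 - psi
    return total


# HYPOTHETICAL parameter values: two occasions, two methods.
psi, theta, p = 0.6, 0.4, (0.3, 0.1)
occasions = list(product((0, 1), repeat=2))
all_histories = list(product(occasions, repeat=2))
total_prob = sum(history_probability(h, psi, theta, p) for h in all_histories)
print(len(all_histories), "possible histories, total probability", total_prob)
```

Summing over every possible history is a quick sanity check that the three levels (unit occupied, locally present, detected by each method) are combined consistently.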

Bret
bacollier
 
Posts: 231
Joined: Fri Nov 26, 2004 10:33 am
Location: Louisiana State University

Re: Detection probabilities model

Post by smit8051 » Fri Mar 01, 2013 1:14 pm

Our sampling design consisted of 500-m sampling reaches that were randomly assigned on two river systems. We subdivided the 500-m reaches into two 250-m sample sites. Once sampling sites were identified, one of the 250-m sites was sampled with small-mesh hoop nets and the remaining unit was sampled with a benthic trawl. The entire 500-m reach was sampled with electrofishing techniques (i.e., one electrofishing run in the upper river site along the bank and one run in the lower river site along the bank). Gear samples were replicated (i.e., 4 hoop net sets, 4 benthic trawl hauls, 2 electrofishing runs) at each sampling event. Sampled rivers were fairly large (e.g., 100 m minimum width), so we didn’t believe electrofishing would influence detectability of other sampling techniques. We collected data on all fishes, but for these analyses we are only interested in detectability of a single species. After our sampling was completed in the fall we did not catch any of the species of interest with the benthic trawl technique so we recently removed it from our occupancy models.

We have 11 sites in one river system and 15 sites in the other system. Sites were revisited from 2 to 6 times in the summer and fall seasons. We have a total of 62 sampling events with two sampling techniques (i.e., electrofishing and hoop nets) for a total of 124 encounter histories in the summer. In the fall we were unable to sample one of the river systems and only sampled 11 of the 15 sites in the remaining river system. We have a total of 26 encounter histories in the fall season. Since sites were the same in the summer and fall season, I have generated two separate models to investigate season-specific changes in detectability and relationships with habitat covariates.

Hopefully this clarifies my sampling design.

We are holding Psi and theta constant in these detectability analyses because our sole objective is to compare detectability rates among gears and evaluate the influence of habitat covariates. I wanted to minimize model variation by including only parameters that were of interest to us for this analysis. I also didn’t want to add any unnecessary parameters for my model because of the small sample size compared to the number of parameters I wanted to use. I do expect that theta*Psi (site occupancy estimate) would be different among sampling sites in a river system and I am planning to investigate those relationships in some models where Psi is the focus.

After we removed the technique that didn't detect the species of interest, our model fit improved dramatically, and the global models now appear to fit the data pretty well (i.e., c-hat = 1.5 or less). I also found that when I added a gear*system interaction term to my comparison of sampling techniques, the detectability estimates were more reasonable.

Is there no error parameter included in occupancy models? I have never seen or heard of them mentioned in any occupancy literature. Why are error parameters not included?

Thanks for letting me know about that new paper by Efford and Dawson (2012). I really appreciate your willingness to speak to me regarding these questions.

Chris

Re: Detection probabilities model

Post by bacollier » Mon Mar 04, 2013 2:26 pm

Sorry for slow response, was out of internet for a couple days.

smit8051 wrote:Our sampling design consisted of 500-m sampling reaches that were randomly assigned on two river systems. We subdivided the 500-m reaches into two 250-m sample sites. Once sampling sites were identified, one of the 250-m sites was sampled with small-mesh hoop nets and the remaining unit was sampled with a benthic trawl. The entire 500-m reach was sampled with electrofishing techniques (i.e., one electrofishing run in the upper river site along the bank and one run in the lower river site along the bank). Gear samples were replicated (i.e., 4 hoop net sets, 4 benthic trawl hauls, 2 electrofishing runs) at each sampling event. Sampled rivers were fairly large (e.g., 100 m minimum width), so we didn’t believe electrofishing would influence detectability of other sampling techniques. We collected data on all fishes, but for these analyses we are only interested in detectability of a single species. After our sampling was completed in the fall we did not catch any of the species of interest with the benthic trawl technique so we recently removed it from our occupancy models.

We have 11 sites in one river system and 15 sites in the other system. Sites were revisited from 2 to 6 times in the summer and fall seasons. We have a total of 62 sampling events with two sampling techniques (i.e., electrofishing and hoop nets) for a total of 124 encounter histories in the summer. In the fall we were unable to sample one of the river systems and only sampled 11 of the 15 sites in the remaining river system. We have a total of 26 encounter histories in the fall season. Since sites were the same in the summer and fall season, I have generated two separate models to investigate season-specific changes in detectability and relationships with habitat covariates.

Hopefully this clarifies my sampling design.


Yep, you assumed closure over the season, then you subsampled within a 500-m reach: half was sampled with method 1, the other half with method 2, and the entire reach with method 3. Based on your description above, it seems you are grouping your replicated sets of method 1 and method 2, such that a detection in any of the method 1 sets (hoop nets) counts as a detection (1) for that sampling session.
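In other words, something like the following, sketched in Python. The gear names and replicate outcomes are made up for illustration, and the function name is just a label I picked.

```python
# HYPOTHETICAL replicate-level outcomes for one sampling event
# (4 hoop net sets, 2 electrofishing runs); 1 = target species caught.
replicates = {
    "hoop_net": [0, 0, 1, 0],
    "electrofishing": [0, 0],
}


def collapse_replicates(replicates):
    """Score a gear as detected (1) if any of its replicate sets caught the species."""
    return {gear: int(any(sets)) for gear, sets in replicates.items()}


session = collapse_replicates(replicates)
print(session)
```

Note that collapsing this way means a gear with more replicate sets per event gets more chances to register a 1, which is worth keeping in mind when comparing p between gears.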

smit8051 wrote: We are holding Psi and theta constant in these detectability analyses because our sole objective is to compare detectability rates among gears and evaluate the influence of habitat covariates. I wanted to minimize model variation by including only parameters that were of interest to us for this analysis. I also didn’t want to add any unnecessary parameters for my model because of the small sample size compared to the number of parameters I wanted to use. I do expect that theta*Psi (site occupancy estimate) would be different among sampling sites in a river system and I am planning to investigate those relationships in some models where Psi is the focus.


Wouldn't it have been simpler to just use a single-season occupancy model if everything else is constant?

smit8051 wrote: After we removed the technique that didn't detect the species of interest, our model fit improved dramatically, and the global models now appear to fit the data pretty well (i.e., c-hat = 1.5 or less). I also found that when I added a gear*system interaction term to my comparison of sampling techniques, the detectability estimates were more reasonable.


Both of which make sense.
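On the interaction point, here is a small Python sketch of why it matters, assuming the earlier model had gear and system as additive effects on the logit scale (the thread doesn't state this outright). The coefficients are invented for illustration. With additive effects the gear difference is the same in every system, so whichever gear is ahead in the data-rich system is forced ahead in the other one too; an interaction term lets the ordering differ.

```python
import math


def inv_logit(x):
    """Map a linear predictor on the logit scale back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def detection_prob(electro, system2, b, interaction=0.0):
    """Detection probability from dummy-coded gear and system covariates.

    electro: 1 for electrofishing, 0 for hoop net.
    system2: 1 for the second river system, 0 for the first.
    """
    eta = (b["intercept"] + b["electro"] * electro + b["system2"] * system2
           + interaction * electro * system2)
    return inv_logit(eta)


# HYPOTHETICAL coefficients, not estimates from the thread.
b = {"intercept": -1.0, "electro": -1.0, "system2": -2.0}

additive = {(g, s): detection_prob(g, s, b) for g in (0, 1) for s in (0, 1)}
with_interaction = {(g, s): detection_prob(g, s, b, interaction=2.0)
                    for g in (0, 1) for s in (0, 1)}

for label, fit in (("additive", additive), ("interaction", with_interaction)):
    print(label, "system 2: hoop =", round(fit[(0, 1)], 3),
          "electro =", round(fit[(1, 1)], 3))
```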

smit8051 wrote: Is there no error parameter included in occupancy models? I have never seen or heard of them mentioned in any occupancy literature. Why are error parameters not included?


This goes back to the basics of probability theory and logistic regression. In a nutshell (there are several ways to say this, so no one jump on me): in these models the observation is Y, and we assume Y follows a Bernoulli distribution with parameter p. The Bernoulli distribution is a single-parameter distribution (the aforementioned p) with mean p and variance p(1-p), so once p is known all the variability is known and there is no separate error parameter to estimate.
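If it helps, you can check the mean and variance directly from the two possible outcomes. A minimal Python sketch; the value of p is just the hoop net estimate quoted earlier in the thread, used as an example.

```python
def bernoulli_moments(p):
    """Mean and variance of a Bernoulli(p) variable, computed from its two outcomes."""
    pmf = {0: 1.0 - p, 1: p}
    mean = sum(y * prob for y, prob in pmf.items())
    variance = sum((y - mean) ** 2 * prob for y, prob in pmf.items())
    return mean, variance


p = 0.044  # example value only
mean, variance = bernoulli_moments(p)
print("mean:", mean, "variance:", variance, "p(1-p):", p * (1.0 - p))
```

Contrast that with a normal model, where the variance is its own parameter and has to be estimated separately from the mean.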

Hope the above helps you out some,

Bret

Re: Detection probabilities model

Post by smit8051 » Mon Mar 04, 2013 3:26 pm

Yep, that helps. I think I have a little better understanding of the models now.

Thanks,
Chris

