www.phidot.org

by **jackieguzy** » Mon Jan 04, 2010 11:43 am

I work with frogs and wetlands and have a large data set for only 12 sites (sampling effort: 181 days, 15 species). Even when collapsing the sampling effort to 20 weeks (which is more biologically relevant), I have a sparse data set...so lots of 0's relative to 1's. As a result, modelling in PRESENCE didn't work (all estimates for individual species had std. errors of zero and CI's of 1). I then moved to generalized linear modelling because I have continuous and categorical site covariates. Results make sense, but I found it interesting that most (if not all) of the covariates in the top models for the GLZ (which also are significant) were the same in top models with PRESENCE.

Is it still possible to use the PRESENCE results to say which covariates could matter for occupancy and detection, since the GLZ cannot separate them?

I have RTFM but have not found much guidance on sparse data. Does anyone have any thoughts? I appreciate your time in reading this.

Jackie Guzy

by **dhewitt** » Mon Jan 04, 2010 2:55 pm

Jackie,

You can't squeeze blood from a turnip!

Occupancy estimation will be tough in your case with only 12 sites, no matter what. It will be the proportion of those 12 sites that were inhabited at some point. How general will your results be based on 12 sites?

Furthermore, it sounds like detection probability is really low if you have that many zeros. Low detection prob is a major problem for occupancy models (thus your estimation problems), just as low encounter prob is for capture-recapture models. AND, most relevant to your question, if detection is THE big issue in the data set, the GLM approach is bad -- it ignores detection probability.

So, ... I don't think there is much hope in estimating occupancy with your data set, and thus no chance of determining what influences occupancy or detection. You really need to find a better survey method that has a decent (say, >30%) chance of detecting these frogs when you visit (given that they are there).
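To put rough numbers on why low detection hurts, here is a minimal sketch assuming detection probability `p` is constant and surveys are independent (both simplifying assumptions, and the function names are made up for illustration). It gives the chance of never detecting a species at an occupied site, and the share of sites a detection-blind approach such as a GLM on collapsed 0/1 data would be expected to score as occupied.

```python
def prob_never_detected(p, n_surveys):
    """Chance of zero detections at an occupied site over n independent surveys."""
    return (1 - p) ** n_surveys


def expected_naive_occupancy(psi, p, n_surveys):
    """Expected share of sites with at least one detection when true occupancy is psi."""
    return psi * (1 - prob_never_detected(p, n_surveys))


# Hypothetical values spanning the range discussed in this thread.
for p in (0.1, 0.3):
    for n_surveys in (5, 20):
        missed = prob_never_detected(p, n_surveys)
        naive = expected_naive_occupancy(0.5, p, n_surveys)
        print(f"p={p}, surveys={n_surveys}: missed={missed:.4f}, naive occupancy={naive:.3f}")
```

With p = 0.1 and 20 surveys, an occupied site is still missed about 12% of the time; with p = 0.3 and 20 surveys the miss rate drops below 0.1%.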

Two other points:

It seems like your season is a bit long if you are *reducing* 181 surveys into 20 weeks! Does occupancy not change over such a long period, or are you using multi-season models?

You note that all estimates (presumably for all models) in Presence gave 0 std errs and CIs of 1. I don't see how you have inferred the importance of any covariates in your occupancy models if this is the case. Estimation failed, so why would you trust the results from the model set as a whole? Maybe I am missing something here.

- Dave

by **jackieguzy** » Mon Jan 04, 2010 5:43 pm

First, thank you for your time; I appreciate it. To clear up a couple of things...

I sampled over two years during the summer breeding season for my frogs. I am not interested in extinction or colonization, so I used a single-season analysis and combined the summers. I am very confident that I detected all species that were there, as I used automated recording devices programmed to record for five minutes of every hour, all night each night over 181 nights. I collapsed the detection histories into two-week intervals because if you sampled for two weeks during the breeding season, you should definitely be able to detect the species if it was there, especially since sampling is restricted to times when the animals are very "detectable" because they are calling to breed.

You are right, detection probabilities are low for these frogs (10-30%), thus occupancy modelling is hard.

Detection is not what I am interested in and is not the big question; rather, I want to know which covariates help predict occupancy for the 7 of 15 frogs that don't occur at all wetlands.

When I mentioned significant covariates, I was referring to results from the generalized linear model; I wasn't inferring importance of covariates from the PRESENCE models. In the GLZ, each species was significantly predicted by one or two site covariates, which made biological sense.

by **dhewitt** » Mon Jan 04, 2010 6:22 pm

By combining all the surveys from the two summers, you are assuming that none of the wetlands changed occupancy status for any of the species. Is that really reasonable?

If I read you correctly, you are only interested in 7 of the 15 species. The other 8 species occur at all the wetlands, so you simply document that and move on. Right? If so, for each of the 7 remaining species, you want then to ask what habitat features lead to their occupying some of the wetlands and not others. Perhaps species 6 likes a particular wetland plant, and without that plant being present you don't find species 6. Or maybe this is a multispecies model? And, if what you say is true -- that the species are very detectable and you are certain you detected all species present at a wetland (presumably across each 14-day interval) -- your histories should all be nothing but 1s. I must be misinterpreting what you are saying, because detection probs should be much higher (1, actually) if this were the case, not 10-30%. And, if this is the case, you know that there is no uncertainty about detection so it would be OK to use the GLM approach, but you would still be limited by n=12; can any inference be considered general?

I guess I am a bit unclear about what your observations are that create the encounter histories for analysis. My confusion probably comes from not being a frog/amphibian researcher. Do you review each night's tape and check presence/absence for each night based on whether you hear the species? Then collapse across 14-day intervals, creating a "1" for any wetland at which the species was heard on at least one night in that stretch? This is what I was thinking of in writing the above.
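The collapsing step described above can be sketched as follows. This is only an illustration of the idea, assuming one 0/1 value per night for a single species at a single wetland; the function name and the example record are hypothetical.

```python
def collapse_history(nightly, window=14):
    """Collapse nightly 0/1 detections into one 0/1 value per window of nights.

    A window scores 1 if the species was heard on at least one night in it.
    A shorter final window is kept as its own interval.
    """
    return [
        1 if any(nightly[start:start + window]) else 0
        for start in range(0, len(nightly), window)
    ]


# Hypothetical 28-night record: heard only on night 11.
nightly = [0] * 10 + [1] + [0] * 17
print(collapse_history(nightly))  # [1, 0]
```

Note that a longer window raises the per-interval detection probability but leaves fewer intervals, so the choice of window length is itself a modelling decision.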

by **jackieguzy** » Tue Jan 05, 2010 1:08 pm

Dave,

For one part of my question it is reasonable to combine data from the two summers. I have a drought year and a flood year, and combined they give an "average" for wetland "health". The question is wetland based, not so much frog based, and focuses on qualities that make a wetland suitable for frogs. The site covariates do not change during the two summers. In any case, I am looking at the years separately as well.

You are correct in your inferences... For the GLM I have collapsed detection histories, so I have a 1 or 0 for each species over each wetland. The 10-30% detection is for the frog overall...so Frog X showed up 7 out of 20 weeks, for a 35% detection. If you go out to that wetland any given week during the breeding season, you have a 35% chance of detecting it.

12 sites is obviously not ideal; anything I could infer comes with a large caveat.

I have a very unique dataset and sparse data because I oversampled (for various reasons). So my intent in posting was to ask if PRESENCE could be useful after all. I think the best that can be done is to use psi(.),p(.) models to get basic probabilities that species were missed, use those estimates to decide whether you really think you missed a species, cautiously add it if you did, then run other analyses on the covariate data.
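For anyone who wants to see what that psi(.),p(.) step involves, here is a rough sketch of the constant single-season occupancy likelihood and of the probability that a site with no detections was occupied anyway. It is not how PRESENCE does it internally; the crude grid search, the function names, and the example histories are all assumptions made for illustration.

```python
import math


def neg_log_lik(psi, p, histories):
    """Negative log-likelihood of the constant psi(.), p(.) single-season model."""
    nll = 0.0
    for h in histories:
        k, d = len(h), sum(h)  # survey occasions and detections at this site
        lik = psi * p ** d * (1 - p) ** (k - d)
        if d == 0:
            lik += 1 - psi  # an all-zero history may also mean "not occupied"
        nll -= math.log(lik)
    return nll


def fit_grid(histories, steps=99):
    """Crude maximum-likelihood fit over a grid of psi and p in (0, 1)."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return min(
        ((psi, p) for psi in grid for p in grid),
        key=lambda pair: neg_log_lik(pair[0], pair[1], histories),
    )


def prob_occupied_given_no_detections(psi, p, k):
    """P(site occupied | zero detections in k surveys)."""
    missed = psi * (1 - p) ** k
    return missed / (missed + (1 - psi))


# Hypothetical data: 12 sites, 4 occasions, species detected at 6 sites.
histories = [[1, 0, 1, 0]] * 6 + [[0, 0, 0, 0]] * 6
psi_hat, p_hat = fit_grid(histories)
print(psi_hat, p_hat, prob_occupied_given_no_detections(psi_hat, p_hat, 4))
```

With 20 occasions and p around 0.3, the chance that an all-zero site was actually occupied becomes tiny, which is the sense in which the estimates can tell you whether a species was plausibly missed.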

Sparse data...used a GLZ approach. Can I still use PRESENCE?
