www.phidot.org

by **dtempel** » Mon Feb 08, 2016 1:24 pm

I've completed a multi-season analysis of simple occupancy data (i.e., 1=detected, 0=not detected) for a single species. I had a number of site-specific covariates that varied across primary survey occasions, and I was interested in determining how much of the spatial variation and how much of the temporal variation were explained by the top model.

I ran my candidate models in PRESENCE and identified a top-ranked overall model using AIC values. Then I used MARK to run random-effects models. I ran two models, both using the best structure for detection probability: 1) extinction and colonization varied by year, and 2) extinction and colonization varied according to the covariates in the top-ranked model.

For these 2 models, I then used the 'variance components' module in MARK, where I specified a derived parameter (annual probability of occupancy) as a random effect and chose the 'intercept only' design matrix. The MARK output reports sigma^2, or the residual variance. I considered sigma^2 from the first model (i.e., colonization and extinction vary by year) as the unexplained temporal variation in occupancy, and then used the sigma^2 from the top-ranked model to determine how much of the temporal variation in occupancy was explained by the top-ranked model (the sigma^2 was always considerably lower for this model).

Is this a sound approach? If so, then I've identified how much of the temporal variation in occupancy has been explained by habitat (and sometimes climate) covariates. To address specifically how much variation is explained by habitat, I suppose I could just remove any covariates that aren't habitat-related from the top-ranked model, and run that model.

by **cooch** » Mon Feb 08, 2016 2:11 pm

No, I don't believe this is legitimate, for the same reason that you shouldn't include a temporally varying covariate in a time-dependent model.

But, more to the point, it might be simpler (and more defensible) to use ANODEV. ANODEV provides a means of evaluating the impact of a covariate by comparing the amount of deviance explained by the covariate against the amount of deviance not explained by this covariate, which sounds like exactly what you're after. This metric is often used as an analogue of r^2 (the coefficient of determination) for models.

In MARK, all you need to do is select the 3 models that you want for the Analysis of Deviance (ANODEV) test (which follows Skalski et al. 1993) by clicking on them with the mouse to highlight them. You must select 3 models: (1) Global model -- the model with the largest number of parameters, which explains the total deviance (a fully time-dependent model, say), (2) Covariate model -- the model with the covariate that you want to test, and (3) Constant model -- the model with the fewest parameters, which explains only the mean level of the effect you are examining. The 3 models you select will automatically be classified as the Global, Covariate, and Constant models based solely on their number of parameters. Note that if the 3 models you select are not properly nested to form the ANODEV, MARK will not tell you.

Although I don't believe ANODEV is 'built-in' to PRESENCE, there is no reason you couldn't use PRESENCE output to construct the ANODEV manually.
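As a rough illustration of building the ANODEV by hand from model output, here is a minimal Python sketch. It assumes the usual form of the Skalski et al. (1993) statistics: the r^2 analogue is the deviance explained by the covariate divided by the total deviance explained by the global model, and the F statistic compares the two pieces per degree of freedom. The function name `anodev` and the deviances and parameter counts in the example are made up for illustration; they are not from this thread or from MARK or PRESENCE output.

```python
def anodev(dev_constant, np_constant, dev_covariate, np_covariate,
           dev_global, np_global):
    """Analysis of deviance for three nested models.

    Returns (r_squared, f_stat, df_num, df_den). Inputs are the deviance
    and number of parameters of the constant, covariate and global models.
    """
    # Parameter counts must increase from constant to covariate to global.
    if not (np_constant < np_covariate < np_global):
        raise ValueError("expected np_constant < np_covariate < np_global")
    # For properly nested models, deviance cannot increase as parameters are added.
    if not (dev_global <= dev_covariate <= dev_constant):
        raise ValueError("deviances are not ordered as nested models require")
    total = dev_constant - dev_global          # deviance explained by the global model
    if total == 0:
        raise ValueError("global model explains no deviance beyond the constant model")
    explained = dev_constant - dev_covariate   # part explained by the covariate
    unexplained = dev_covariate - dev_global   # part the covariate leaves unexplained
    df_num = np_covariate - np_constant
    df_den = np_global - np_covariate
    r_squared = explained / total
    if unexplained == 0:
        f_stat = float("inf")
    else:
        f_stat = (explained / df_num) / (unexplained / df_den)
    return r_squared, f_stat, df_num, df_den


# Made-up example values, for illustration only.
r2, f_stat, df1, df2 = anodev(520.0, 4, 505.0, 8, 495.0, 20)
print(f"r^2 = {r2:.2f}, F({df1}, {df2}) = {f_stat:.2f}")
```

The ordering check is deliberate: since the program classifies the three models by parameter count alone, a covariate model that is not actually nested in the global model can produce a nonsensical r^2 (for example, greater than 1), and the sketch raises an error in that case instead of returning a number.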

by **dtempel** » Mon Feb 08, 2016 4:15 pm

You're right; it's essentially r^2 that I'm interested in, and I'll give that a try. Thanks for your help!

by **dtempel** » Mon Mar 28, 2016 5:01 pm

I'm getting some unusual results for one of my data sets with respect to analysis of deviance for simple occupancy. Three data sets are working fine, but not the fourth data set...

I did some preliminary modeling to determine the best structure for detection probability (p), then I ran three models:

1) Constant extinction and colonization, with best structure for p.
2) Full time model for extinction and colonization (different parameter estimate each year), with best structure for p.
3) Extinction and colonization depend upon two site-specific habitat variables, with best structure for p.

The problem is that the deviance for model 3 is actually lower than the deviance for model 2. I ran the models in PRESENCE as well, and got the same result. I'm not sure why this is happening. The habitat covariates aren't strongly correlated (r = -0.12), and all of the beta estimates are estimable with reasonable values and std errors. Has anyone encountered this problem before, or have any ideas why this is happening?

by **jhines** » Mon Mar 28, 2016 7:34 pm

You could get that sort of result if col/ext vary more among sites than among years. Since those two models (2 and 3) are not nested, there's nothing wrong with model 3 having a lower deviance than model 2. A more general model, where col/ext is season-specific and also varies by covariates, would have to have a deviance no higher than either of those two models, as both would be nested within it.
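The nesting argument can be sketched in a few lines of Python. This is a simplification that treats each model as a set of additive terms sharing the same intercept and the same structure for p, and calls one model nested in another when its terms are a proper subset. The helper `is_nested` is hypothetical; the term names follow the model notation used in this thread.

```python
def is_nested(smaller, larger):
    """True if every term of `smaller` also appears in `larger` (proper subset)."""
    return set(smaller) < set(larger)


# Terms for extinction/colonization in each model (intercept implied).
model2 = {"Season"}                               # full time model
model3 = {"Covariate1", "Covariate2"}             # habitat covariates only
general = {"Season", "Covariate1", "Covariate2"}  # season effects plus covariates

print(is_nested(model3, model2))   # False: no deviance ordering is guaranteed
print(is_nested(model2, general))  # True
print(is_nested(model3, general))  # True
```

Only the last two comparisons carry a guarantee about deviance; between models 2 and 3 either one can fit better.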

by **dtempel** » Tue Mar 29, 2016 10:37 am

Thanks, that makes sense. One thing I forgot to mention is that the habitat covariates also vary by time, in addition to site. So including season-specific effects and time-varying habitat covariates in the same model would be overfitting, correct? Maybe that's not an issue here because we don't mind overfitting in this case?

If that's not appropriate, I think another option would be just to obtain a single average value (over all years) of each habitat covariate for each site, so that habitat no longer varies by time. The habitat values typically don't change very much from year-to-year.

by **jhines** » Tue Mar 29, 2016 10:55 am

Not necessarily. Although the real estimates of p could be different for each site and survey when your habitat covariate varies by site and survey, the p's are still forced to be a linear function of the two beta values (on the logit scale). Generalizing the model to allow year-specific p's just allows a different slope and intercept each year. If you want to model col/ext as a function of habitat, you'll need to combine the covariates so that you have only one covariate value per season.
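To make the "linear on the logit scale" point concrete, here is a small Python sketch. It assumes a single covariate with an intercept and a slope; the function names, beta values, and covariate values are invented for illustration and do not come from any fitted model in this thread.

```python
import math


def inv_logit(x):
    """Map a value on the logit scale back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def real_estimate(beta0, beta1, covariate):
    """Probability implied by an intercept, a slope and one covariate value."""
    return inv_logit(beta0 + beta1 * covariate)


# Made-up betas and covariate values, for illustration only.
beta0, beta1 = -0.5, 1.2
covariate_values = [0.1, 0.4, 0.9]  # e.g. three site-by-occasion values
estimates = [real_estimate(beta0, beta1, x) for x in covariate_values]
print(estimates)
```

Every value in `estimates` is different, yet all of them are determined by the same two numbers, `beta0` and `beta1`. Adding year-specific terms amounts to letting those two numbers change from year to year.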

by **dtempel** » Tue Mar 29, 2016 6:57 pm

To model col/ext as a function of habitat, why do the habitat covariates have to be combined into a single covariate value? The model

ext(Covariate1,Covariate2), col(Covariate1,Covariate2)

is nested within the model

ext(Season,Covariate1,Covariate2), col(Season,Covariate1,Covariate2)

So forgive my ignorance, but why do Covariate1 and Covariate2 have to be combined (i.e. ext(Covariate12), col(Covariate12))?

by **jhines** » Tue Mar 29, 2016 10:21 pm

If habitat doesn't change within each season, then you don't need to do any combining. I was under the impression that you had covariates which changed with each survey within seasons. In that case, you would need to combine them so you only have one covariate value per season (to go with the col/ext parameter).
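If the covariates did vary by survey within a season, one way to get a single value per season is to average them. The Python sketch below assumes the mean is an acceptable summary (the thread does not say how to combine them) and that missing surveys are recorded as `None`. The function name, the nested-dictionary layout, and the data are all hypothetical.

```python
from statistics import mean


def collapse_to_season(survey_values):
    """Reduce survey-level covariate values to one value per site and season.

    survey_values: {site: {season: [value per survey, or None if missing]}}
    Returns {site: {season: mean of non-missing values, or None if all missing}}.
    """
    collapsed = {}
    for site, seasons in survey_values.items():
        collapsed[site] = {}
        for season, values in seasons.items():
            observed = [v for v in values if v is not None]
            collapsed[site][season] = mean(observed) if observed else None
    return collapsed


# Made-up covariate data, for illustration only.
habitat = {
    "siteA": {2014: [0.2, 0.4, 0.6], 2015: [0.5, None, 0.7]},
    "siteB": {2014: [None, None, None], 2015: [1.0, 1.0, 1.0]},
}
print(collapse_to_season(habitat))
```

The result has exactly one value per site and season, which matches the one col/ext parameter per season transition.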

variance components for multi-season occupancy models