Using AIC to select habitat variables (not a MARK question)

Postby sixtystrat » Fri Oct 07, 2005 4:09 pm

We are using logistic regression to predict occurrence of a plant species using GIS. We have about 600 locations of the plant and 10 GIS variables to go with each location. We are wondering if it is appropriate to fit every possible combination of the 10 variables (2^10 = 1,024 different models, counting the intercept-only model) to find the best ones (and then model average). The trouble is, we have the 10 variables and no strong biological rationale for eliminating some while keeping others. However, we want to avoid mistakes due to data dredging. Any help would be appreciated!
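
For reference, the size of an all-subsets model set can be counted directly. This is only a sketch of the enumeration, not of the model fitting; the variable names `var1` to `var10` are placeholders, not the actual GIS layers.

```python
from itertools import combinations

# Placeholder names standing in for the 10 GIS variables (hypothetical).
variables = [f"var{i}" for i in range(1, 11)]

# Every subset of the variables, from intercept-only (size 0) to all 10.
model_set = [
    subset
    for size in range(len(variables) + 1)
    for subset in combinations(variables, size)
]

print(len(model_set))  # 1024, i.e. 2**10
```

Each tuple in `model_set` would correspond to one candidate logistic regression; dropping the empty tuple leaves 1,023 models with at least one predictor.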
sixtystrat
 

Re: Using AIC to select habitat variables

Postby bmitchel » Fri Oct 07, 2005 10:38 pm

Just a couple of thoughts regarding your post...

1) Logistic regression is only appropriate if you are certain that your absences are really absences. If your plant species is difficult to detect when you survey, you may be better off using an occupancy model (whether you can do this depends on your survey design, since occupancy models need repeat surveys of the same sites).

2) If you really do not have any a priori knowledge of which variables might be better to help you inform your model set, then it may make sense to fit all possible models. Your variables are presumably ones that you felt would be important (as opposed to ones that would simply be easy to calculate), so you have already done some "a priori" variable selection to get this far. That said, when you fit all possible models (even with a subset of variables), to a certain extent you ARE data dredging (probably more polite to term it an "exploratory analysis"). Basically, the results of your all-possible-models analysis will hopefully provide you with some new insights, and allow future surveys/model sets to be better informed... but the analysis is not strictly "a priori".

3) It will probably help to look for correlations among your 10 variables (highly correlated predictors amount to model duplication and will produce unstable model averaging results).
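
Points 2 and 3 could be sketched roughly as follows, using only the Python standard library. Everything here is illustrative: the column names, the toy values, the AIC values and the 0.7 correlation cutoff are all made up for the example and are not from the original data or from any published rule. The first helper flags highly correlated predictor pairs; the second converts a list of AIC values into Akaike weights, which are the weights typically used for model averaging.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def correlated_pairs(columns, threshold=0.7):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.

    `threshold` is an arbitrary illustrative cutoff, not a recommendation.
    """
    flagged = []
    for a, b in combinations(columns, 2):
        r = pearson_r(columns[a], columns[b])
        if abs(r) >= threshold:
            flagged.append((a, b, r))
    return flagged


def akaike_weights(aic_values):
    """Convert AIC values into Akaike weights that sum to 1."""
    best = min(aic_values)
    # Relative likelihood of each model given its AIC difference from the best.
    rel = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(rel)
    return [v / total for v in rel]


# Toy predictor columns (made-up values, hypothetical variable names).
columns = {
    "elevation": [100, 200, 300, 400, 500],
    "temperature": [20, 18, 16, 14, 12],
    "slope": [5, 1, 4, 2, 3],
}
flagged = correlated_pairs(columns)

# Made-up AIC values for three hypothetical candidate models.
weights = akaike_weights([100.0, 102.0, 110.0])
```

In this toy data only the elevation/temperature pair exceeds the cutoff, so one of the two would be a candidate for dropping before building the model set. The weights show how quickly support falls off with AIC difference: a model 2 AIC units behind the best one gets about 1/e of the best model's weight.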
bmitchel
 

