Re-posting of: model-averaging beta parameters

questions concerning analysis/theory using program MARK

Re-posting of: model-averaging beta parameters

Postby Hadu » Mon Dec 05, 2005 4:11 pm

I found the post below in the archives and think it raises an important issue. Does anyone have an opinion on it? For example, suppose

you fit 6 models to explain variation in some response:

(1) y = x + z + x^2
(2) y = x + x^2
(3) y = x + z
(4) y = x
(5) y = z
(6) y = intercept only

where x and z are continuous covariates.

Model rankings based on the appropriate form of AIC strongly support models (4) and (2). Additionally, the parameter estimates indicate that, in the quadratic model (2), the x^2 term is not distinguishable from zero (i.e., the estimate is small, with a 95% CI overlapping zero).

If your goal is to obtain the "best" estimate of the coefficient on x and you model-average, then the estimates from models (2) and (4) receive most of the weight. However, the estimates of the x coefficient from these two models differ greatly, because the x term estimates different things in the two models. As a result, the model-averaged estimate of x has a large unconditional SE.
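To make the mechanics concrete, here is a minimal sketch (in Python, with invented AICc values and estimates rather than MARK output) of how Akaike weights, a model-averaged coefficient, and one standard form of its unconditional SE are computed, averaging here over the four models that contain x. The key point is that each model's squared deviation from the averaged estimate is folded into the unconditional SE, so disagreement among well-supported models inflates it:

[code]
import math

# Hypothetical AICc values and per-model estimates of the x coefficient (with
# conditional SEs). All numbers are invented purely to illustrate the mechanics;
# they are not from MARK output.
models = {
    "(1) x + z + x^2": {"aicc": 104.1, "beta_x": 0.95, "se": 0.40},
    "(2) x + x^2":     {"aicc": 100.6, "beta_x": 1.10, "se": 0.35},
    "(3) x + z":       {"aicc": 103.8, "beta_x": 0.35, "se": 0.15},
    "(4) x":           {"aicc": 100.2, "beta_x": 0.33, "se": 0.12},
}

# Akaike weights: w_i proportional to exp(-0.5 * (AICc_i - min AICc))
best = min(m["aicc"] for m in models.values())
for m in models.values():
    m["w"] = math.exp(-0.5 * (m["aicc"] - best))
total = sum(m["w"] for m in models.values())
for m in models.values():
    m["w"] /= total

# Model-averaged coefficient: weighted mean of the per-model estimates.
beta_avg = sum(m["w"] * m["beta_x"] for m in models.values())

# Unconditional SE (Buckland et al. form): each model's conditional variance
# plus the squared deviation of its estimate from the averaged estimate,
# weighted and summed. Disagreement among well-supported models inflates this.
se_uncond = sum(m["w"] * math.sqrt(m["se"] ** 2 + (m["beta_x"] - beta_avg) ** 2)
                for m in models.values())

for name, m in models.items():
    print(f"{name:16s} w = {m['w']:.3f}  beta_x = {m['beta_x']:.2f}")
print(f"model-averaged beta_x = {beta_avg:.3f}, unconditional SE = {se_uncond:.3f}")
[/code]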

Is it appropriate to model-average over all models in such situations, or should you average only over the subset of models that include x as a linear trend, or should you base inference only on the top-ranked model?

Thanks for your input on this.

DJR

***********original post**************
andy smith
Posted: Sat Feb 21, 2004 4:06 pm
Post subject: model-averaging beta parameters

I am studying natural selection on body size in dragonflies. I constrain survival estimates using size as a covariate. I use both linear (i.e., survival ~ size) and quadratic (i.e., survival ~ size + size^2) models to test for directional and variance selection, respectively, and use the beta parameters as estimates of the effect of body size on survival. I compare models using AIC, but often no single model has unqualified support; therefore, I am interested in model-averaging the beta parameters over the set of candidate models.

The problem is that, in the linear model, the size term estimates the slope of the entire function, while in the quadratic model the size term estimates the slope only at the origin (size = 0). Thus the linear terms from the two models are not directly comparable and cannot simply be model-averaged. Does anyone have advice on dealing with this issue? For example, can the linear term from the quadratic model somehow be transformed to estimate the mean slope over the entire function?
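One way to put the two size terms on a comparable footing (a sketch with invented numbers, not something from the original thread or from MARK) is to convert each model's coefficients into an average slope over the observed sizes: for a quadratic linear predictor b0 + b1*size + b2*size^2, the derivative with respect to size is b1 + 2*b2*size, so its average over the data is b1 + 2*b2*mean(size).

[code]
import numpy as np

# Invented coefficients and sizes, purely for illustration; "eta" denotes the
# linear predictor (e.g., logit survival), not survival itself.
sizes = np.array([18.0, 20.5, 21.0, 22.3, 24.1])   # observed body sizes

# Linear model:    eta = b0 + b1*size
#                  -> slope with respect to size is b1 everywhere
b1_linear = 0.30

# Quadratic model: eta = b0 + b1*size + b2*size^2
#                  -> slope with respect to size is b1 + 2*b2*size, which varies
b1_quad, b2_quad = 1.10, -0.02

# Averaging that derivative over the observed sizes gives b1 + 2*b2*mean(size),
# which (because the derivative is linear in size) equals the slope at mean size.
mean_slope_quad = b1_quad + 2.0 * b2_quad * sizes.mean()

print("linear model slope:          ", b1_linear)
print("quadratic model, mean slope: ", round(float(mean_slope_quad), 3))
[/code]

An equivalent trick is to center size on its mean before fitting; the linear coefficient in the quadratic model is then the slope at the mean size and is directly comparable to the linear model's coefficient.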

Thanks in advance for any help

Re: Re-posting of: model-averaging beta parameters

Postby bmitchel » Sat Dec 10, 2005 10:42 am

DJR recently posted a question about model averaging the following models:
y = x
y = x + x^2
where both models have support (high weights), but the quadratic term is small with a confidence interval overlapping zero.

DJR says: "the parameter estimates for x from these 2 models differ greatly since x estimates different things in each of the models. Thus, as a result, the model-averaged estimate of x has a large unconditional SE."

I disagree with the first statement. Assuming these models are specified correctly (i.e., that "x" represents the same quantity, such as size, in both models), x means the same thing in model one as in model two, and likewise x^2 means the same thing in both. Note that the coefficient on x^2 is constrained to equal 0 in the first model, but x^2 IS still a term in that model. Model fitting is just the process of estimating a slope coefficient for each term, so there is no problem with model-averaging the coefficients for x and for x^2.

When you model-average these two models, you will probably find that the unconditional SE for x increases, and that the unconditional SE for x^2 decreases, since you are averaging an estimate and variance of 0 (first model) with the estimated values from the second model. How this all affects estimates of y (which is probably what you are really concerned about) depends on the covariance between x and x^2.
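Since what ultimately matters is usually the prediction rather than the individual betas, another option is to model-average the predicted values directly, which sidesteps the question of what each beta "means" in each model. A rough sketch with invented weights and coefficients (point predictions only; the unconditional variance of an averaged prediction would additionally need each model's variance-covariance matrix, which is where the covariance between x and x^2 comes in):

[code]
import numpy as np

# Invented Akaike weights and coefficients for the two supported models; the
# quadratic coefficients are chosen so that both fits roughly agree over the
# observed range of x even though their linear terms differ greatly.
w_linear, w_quad = 0.45, 0.55

def predict_linear(x):
    b0, b1 = 0.10, 0.33
    return b0 + b1 * x

def predict_quad(x):
    b0, b1, b2 = -7.30, 1.10, -0.02
    return b0 + b1 * x + b2 * x**2

x = np.linspace(15.0, 25.0, 5)

# Model-average the predictions rather than the betas: weight each model's
# fitted value by its Akaike weight.
y_avg = w_linear * predict_linear(x) + w_quad * predict_quad(x)

for xi, yl, yq, ya in zip(x, predict_linear(x), predict_quad(x), y_avg):
    print(f"x = {xi:5.2f}  linear = {yl:6.3f}  quadratic = {yq:6.3f}  averaged = {ya:6.3f}")
[/code]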

DJR then asks: "Is it appropriate to model-average over all models in such situations, or should you average only over the subset of models that include x as linear trend, or base inference just on the top-ranked model?"

What you do with your results can vary based on your purpose and on the model weights. If your best model has a very high weight (e.g., > ~0.9), it may be valid to use only the best model; however, in that case you will probably find that model averaging barely changes the results anyway.

Regarding model-averaging over all models versus a subset, that is a tougher question. In most situations you will want to use all of your models, because this is the least biased approach. In some situations you might exclude parameters that have little predictive value in order to speed up a complex spatial analysis, but you would have to acknowledge that this is a biased approach (as Gary White recently pointed out to me). The specific example I am thinking of is a spatially explicit optimization model of habitat suitability, in which GIS metrics must be calculated for every cell, for every covariate, for every iteration of the model. With a 100x100 grid and 1,000 iterations, EACH metric must be calculated 10 million times. In that situation, I think there is a good argument for accepting some bias and excluding metrics whose CIs strongly overlap zero.

Brian

