Model-averaging and Variance-Covariance Matrix

questions concerning analysis/theory using program PRESENCE

Postby Bird Counter » Mon Sep 07, 2009 5:54 pm

There was an interesting thread earlier this year on model-averaging and individual covariates: http://www.phidot.org/forum/viewtopic.php?t=996&postdays=0&postorder=asc&start=0

Using Presence, I created a candidate model set with a global model and all subsets with no interactions or higher order terms. I then created model-averaged parameter and unconditional SE estimates using techniques described in the MARK text. For the covariate of interest, I set beta = 0 and SE = 0 in models where the covariate did not appear.
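As an illustration of the approach described above (the standard model-averaging formula from the MARK text, with beta and SE set to 0 where the covariate is absent), here is a minimal sketch with made-up weights and estimates; all numbers are hypothetical:

```python
import numpy as np

# Hypothetical 4-model candidate set: AIC weights, and the beta and SE
# for one covariate of interest. Beta and SE are set to 0 for the two
# models where the covariate does not appear.
weights = np.array([0.45, 0.30, 0.15, 0.10])   # AIC weights, sum to 1
betas   = np.array([0.82, 0.00, 0.75, 0.00])   # 0 where covariate absent
ses     = np.array([0.21, 0.00, 0.25, 0.00])

# Model-averaged beta: weighted sum of the per-model estimates.
beta_avg = np.sum(weights * betas)

# Unconditional SE: each model contributes its sampling variance plus
# the squared deviation of its estimate from the averaged estimate.
se_uncond = np.sum(weights * np.sqrt(ses**2 + (betas - beta_avg)**2))
```

(As the replies below note, zero-filling betas like this has pitfalls; the sketch only shows the mechanics of the formula.)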

I would like to graph the effect of changes in the logit estimate over a range of values of the covariate of interest.

I am confused as to the best way to model-average the variance-covariance matrix for the purpose of creating the confidence intervals associated with the model-averaged logits.

In the thread mentioned above, Darryl made the following observation (post #8 ):

snip… There can also be issues if you have correlated covariates, as there you probably need to model-average the entire variance-covariance matrix, not just SE's….snip


So, if the covariates are not highly correlated (<0.7), is model-averaging the variance-covariance matrix necessary?

To summarize, I have two questions.

How do you model-average variance-covariance matrices?

In regard to covariates, how correlated is correlated? I hope this question makes sense. I understand why multicollinearity is problematic, but in this context I am not sure what to look for.

Postby jlaake » Mon Sep 07, 2009 11:13 pm

Not sure what Darryl had in mind with regard to the correlation between covariates, as that is not relevant. What matters is the correlation between parameters in the model, which maybe is what he meant. If you fit a simple linear regression (with a single covariate), there is a correlation between the intercept and the slope for the single covariate. Let's say you had two models, 1: intercept only and 2: intercept + covariate. It makes no sense to get just the unconditional std errors of beta for the covariate and one for the intercept and ignore the covariance between the intercept and slope in model 2. As that thread concludes, if your interest is prediction then it is best to model-average the real parameters, and the issue of covariance for the betas is not relevant because it is captured in the variance of the reals.
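The intercept-slope correlation mentioned above is easy to see directly from the variance-covariance matrix of a simple linear regression. A minimal sketch with simulated data (all values made up):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 1.0 + 0.5 * x + rng.normal(scale=0.3, size=x.size)

# Design matrix with an intercept column and the covariate.
X = np.column_stack([np.ones_like(x), x])
beta_hat, res, *_ = np.linalg.lstsq(X, y, rcond=None)

# Variance-covariance matrix of (intercept, slope):
# sigma^2 * (X'X)^-1, with sigma^2 from the residual sum of squares.
sigma2 = res[0] / (x.size - 2)
vcov = sigma2 * np.linalg.inv(X.T @ X)

cov_intercept_slope = vcov[0, 1]   # nonzero: the two betas covary
```

For positive covariate values this covariance is negative, which is why the intercept and slope cannot be treated as independent when building intervals for predictions.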

So let's say you have a single covariate and you are interested in values 0, 10, 20, 30, 40, 50, 60. All you need to do is compute the value of the real parameter (presumably Psi in this case) and its std error for each model, and use the standard formula for model averaging and the unconditional SE. For models without the covariate you would use the same constant prediction at each covariate value.
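The steps above can be sketched as follows. The two models, their betas on the logit scale, the weights, and the per-model SEs of Psi are all hypothetical; in practice the estimates and SEs would come from PRESENCE output:

```python
import numpy as np

def inv_logit(eta):
    return 1.0 / (1.0 + np.exp(-eta))

# Two hypothetical models: (1) intercept only, (2) intercept + covariate.
weights = np.array([0.35, 0.65])                       # AIC weights
covariate_values = np.arange(0, 70, 10, dtype=float)   # 0, 10, ..., 60

# Psi-hat per model at each covariate value; the intercept-only model
# predicts the same constant at every value, as described above.
psi_m1 = np.full_like(covariate_values, inv_logit(0.4))
psi_m2 = inv_logit(-0.2 + 0.05 * covariate_values)
psi = np.vstack([psi_m1, psi_m2])          # shape (n_models, n_values)

# Hypothetical per-model SEs of Psi-hat (delta-method SEs in practice).
se = np.vstack([np.full_like(covariate_values, 0.05),
                np.full_like(covariate_values, 0.06)])

# Model-averaged Psi and its unconditional SE at each covariate value.
psi_avg = weights @ psi
se_uncond = weights @ np.sqrt(se**2 + (psi - psi_avg)**2)
```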

However, if you have more than one covariate of interest then things can get a little messy, as you really need to consider all possible combinations of each set of covariate values, and maybe this is what Darryl was getting at in regards to correlation in the covariates. The real parameter value of the model with covariate 1 = 10 and covariate 2 = 0 would not necessarily be the same as with covariate 1 = 10 and covariate 2 = 10. Note that I'm just making up values here as examples. So if you have 2 covariates, then you need to create predictions for each combination of values for each covariate, but the process would be the same: compute the weighted real values and the unconditional SE. I highly recommend avoiding model-averaging betas (as there are many potential pitfalls), especially if your goal is prediction of the real parameters.
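The combination step above amounts to predicting over a grid. A small sketch with made-up betas for a single two-covariate model (the same per-combination averaging from the one-covariate example would then apply):

```python
import numpy as np
from itertools import product

def inv_logit(eta):
    return 1.0 / (1.0 + np.exp(-eta))

# Hypothetical betas on the logit scale for a model with two covariates.
b0, b1, b2 = -0.5, 0.04, -0.08

cov1_values = [0, 10, 20, 30]
cov2_values = [0, 10]

# Psi at every combination of the two covariates' values; as noted
# above, Psi at (cov1=10, cov2=0) differs from Psi at (cov1=10, cov2=10).
psi = {(c1, c2): inv_logit(b0 + b1 * c1 + b2 * c2)
       for c1, c2 in product(cov1_values, cov2_values)}
```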

Now to answer your question: I believe the recommendation in B&A is to model-average the correlation matrix of each model and then multiply the model-averaged correlation matrix by the outer product of the unconditional std errors. Since corr(x,y) = cov(x,y)/(se(x)*se(y)), you can get cov(x,y) = corr(x,y)*se(x)*se(y).
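That two-step recipe can be sketched as follows, with hypothetical per-model variance-covariance matrices and unconditional SEs (in practice the v-c matrices come from each fitted model and the unconditional SEs from the model-averaging formula):

```python
import numpy as np

# Hypothetical per-model v-c matrices for the same two betas
# (intercept, slope), plus AIC weights.
weights = np.array([0.6, 0.4])
vcovs = [np.array([[0.040, -0.012], [-0.012, 0.009]]),
         np.array([[0.050, -0.020], [-0.020, 0.016]])]

def corr_from_vcov(v):
    """Convert a v-c matrix to a correlation matrix:
    corr(x,y) = cov(x,y) / (se(x)*se(y))."""
    s = np.sqrt(np.diag(v))
    return v / np.outer(s, s)

# Step 1: model-average the per-model correlation matrices.
corr_avg = sum(w * corr_from_vcov(v) for w, v in zip(weights, vcovs))

# Step 2: rescale by the outer product of the unconditional SEs
# (illustrative numbers here) to recover covariances:
# cov(x,y) = corr(x,y) * se(x) * se(y).
se_uncond = np.array([0.22, 0.12])
vcov_avg = corr_avg * np.outer(se_uncond, se_uncond)
```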

Hope this helps. --jeff

Postby darryl » Tue Sep 08, 2009 6:11 am

I actually did mean correlation between covariates. I may be wrong on this, but I was thinking in terms of the multicollinearity issue: if you have highly correlated covariates, the effect sizes (beta estimates) can be unstable among different models, depending on which combinations of covariates are in the model, in which case model-averaging betas may not be very meaningful.

Postby jlaake » Tue Sep 08, 2009 9:38 am

Yes, your point about multicollinearity is yet another reason to model-average real parameters rather than betas. But I think the comment about correlation between covariates was in respect to this line from your original posting.

"There can also be issues if you have correlated covariates, as there you probably need to model-average the entire variance-covariance matrix, not just SE's."

In fact, anytime you are going to make predictions of the real parameters from model-averaged betas, you need a model-averaged v-c matrix, because all of the estimated parameters will have some degree of correlation. But the bottom line is that you are better off model-averaging the real parameters, to avoid all sorts of pitfalls.

Postby Bird Counter » Wed Sep 09, 2009 4:40 pm

Thank you both for your responses.

I apparently missed something fundamental about model-averaging using betas, because I was unaware of these pitfalls until I read the forum.

As for the issue of multicollinearity, I agree as to the problems created by highly correlated covariates. I have seen estimates of how much correlation is a problem - often around 0.7. I suppose if the beta estimates of covariate A vary dramatically when covariate B is included, that might be an indication, even if the covariates did not exceed some correlation threshold.

An open question to the statisticians and biologists alike:

What information would you like to see included in tables in papers that use model-averaging to derive psi estimates?

