Dummy Individual Covariates and Real Parameter Estimates

questions concerning analysis/theory using program MARK

Post by npseudacris » Tue May 09, 2006 12:17 pm

Greetings:

I want to use sex as a dummy-coded individual covariate. I already have two groups based on adult morphology, and I want to be able to build both additive and interaction terms in the DM with group, sex, and a second covariate (condition). What I am confused about are the settings for the real parameter estimates. For my other individual covariate (body condition at first capture) I’ve used the “standardize individual covariate” option with the “mean value” to estimate real parameters. However, both the MARK book and the help say it doesn’t make biological sense to do the same when individual covariates are dummy variables. I see their point, but I’m not sure what a better alternative would be.

For example, what could I specify for the “user specified” option that would make more sense? Specifying one value would mean the whole population is one sex or the other, and if that is the case, why have it as a covariate? (It’s also not clear what you are allowed to specify: a single number or a range of values.)

I realize the third alternative is to use the individual covariate values for the first encounter history, but I wasn't sure how that works. Does that mean that the covariate is only used to calculate Phi 1, or does it only apply when you have a time-varying covariate?

Thanks,
Nicole

Post by jlaake » Fri May 12, 2006 5:28 pm

Nicole-

Not sure why you want to use sex as an individual covariate rather than a grouping covariate but I'll answer your question anyhow.

First of all, the option regarding the covariates is only for computation of the real parameters. It does not affect estimation. Also, you can always re-run the model and change the value of the covariate to get different real parameter estimates. For example, if you code the sex covariate as 0 for male and 1 for female (and you don't standardize covariates), then the beta intercept for your model will be for males and the beta for sex will be the additive amount by which females differ from males. Now, if you specify the covariate value as 0, the real parameter estimates will be for males. Then you can re-run it and specify 1 as the covariate value, and the real parameters will be for females. Note, however, that the beta estimates, lnL, AIC, etc. will all be the same because the model hasn't changed; only the covariate value used for prediction of the real parameter estimates has. If you were to choose the mean covariate value, the mean value of sex will be the proportion of females in the data. The reason this may not make sense is that the sex ratio in your data may be biased if p differs by sex, or, if it is a CJS model, you may have released critters with a sex ratio that doesn't match the population.
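To make the point above concrete, here is a minimal sketch of how real parameters follow from the same betas at different covariate values. It assumes a logit link, and the beta values and the proportion of females are made-up numbers for illustration only; none of this is MARK's own code.

```python
import math


def inv_logit(x: float) -> float:
    """Back-transform a value from the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical beta estimates on the logit scale (not from any real analysis).
beta_intercept = 0.8  # males, because sex is coded 0 for male
beta_sex = -0.5       # additive amount by which females (sex = 1) differ


def real_phi(sex_value: float) -> float:
    """Real survival estimate for a chosen value of the sex covariate."""
    return inv_logit(beta_intercept + beta_sex * sex_value)


phi_male = real_phi(0.0)    # "user specified" value of 0
phi_female = real_phi(1.0)  # "user specified" value of 1

# Using the "mean value" option: the mean of a 0/1 covariate is the
# proportion of females in the data (hypothetical value here).
prop_female = 0.6
phi_at_mean = real_phi(prop_female)

# For comparison: the sex-ratio-weighted average of the two real estimates.
phi_weighted = prop_female * phi_female + (1.0 - prop_female) * phi_male

print(f"male: {phi_male:.4f}  female: {phi_female:.4f}")
print(f"at mean covariate: {phi_at_mean:.4f}  weighted average: {phi_weighted:.4f}")
```

The betas are identical in all three calls; only the covariate value plugged in for prediction changes. The estimate at the mean covariate value describes neither sex, and it is not the same number as the weighted average of the two sex-specific estimates.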

Hope this helps --jeff

Post by egc » Fri May 12, 2006 8:40 pm

jlaake wrote:Nicole-

Not sure why you want to use sex as an individual covariate rather than a grouping covariate but I'll answer your question anyhow.



A side point: one reason is that it lets you deal with group effects without having to modify the PIM structure in any way.

Post by jlaake » Thu May 18, 2006 5:36 pm

Fair point, but something like sex, with just 2 categories, doesn't expand the PIMs substantially, and groups are much faster to run than individual covariates in MARK. If you can accomplish the same task with groups, it is best to avoid individual covariates, especially if you have a large data set (lots of critters), so that MARK doesn't have to create a design matrix for each encounter history. Also, you don't have to re-run it to get predictions for the other sex.

Post by egc » Thu May 18, 2006 7:12 pm

jlaake wrote:Fair point, but something like sex, with just 2 categories, doesn't expand the PIMs substantially, and groups are much faster to run than individual covariates in MARK. If you can accomplish the same task with groups, it is best to avoid individual covariates, especially if you have a large data set (lots of critters), so that MARK doesn't have to create a design matrix for each encounter history. Also, you don't have to re-run it to get predictions for the other sex.


Yes, because of the way the likelihood is constructed, models with individual covariates take longer to run (this is mentioned at several points in the book). However, in some cases (e.g., groups with >2 levels), it can be more efficient to handle things with individual covariate coding rather than with PIMs.

Post by Todd » Sun May 20, 2007 2:50 pm

Just to follow up on this concept, can I include a dummy variable such as "treatment" as a covariate if it has three levels (1,2,3)?

Then I can test for a "treatment" effect on survival by including a model with the covariate and a model without?

I'm concerned that because the covariate is a 1, 2 or 3 in my encounter history input file it would falsely confine the model to think that treatment 3 has 3x the effect that treatment one does because the corresponding Beta will now be multiplied by 3, 2 or 1.

Is that a problem? Will there be a linear constraint now imposed? What if the relationship between the treatments is not linear?

Post by cooch » Sun May 20, 2007 4:47 pm

Todd wrote:Just to follow up on this concept, can I include a dummy variable such as "treatment" as a covariate if it has three levels (1,2,3)?

Then I can test for a "treatment" effect on survival by including a model with the covariate and a model without?

I'm concerned that because the covariate is a 1, 2 or 3 in my encounter history input file it would falsely confine the model to think that treatment 3 has 3x the effect that treatment one does because the corresponding Beta will now be multiplied by 3, 2 or 1.

Is that a problem? Will there be a linear constraint now imposed? What if the relationship between the treatments is not linear?


The questions you're asking imply a basic lack of understanding of linear models. I suggest you read Chapter 7 - thoroughly. If you don't want your treatment effects to be evaluated by an ordinal, strictly linear model, then why did you encode them in your .INP file that way in the first place? If you read Chapter 7, and then look at Chapter 2 (data formatting), you'll not only be able to answer the question, but you'll also see what you want to do instead. If all you want to do is a basic 'ANOVA-like' linear model, looking for differences among 3 levels of a treatment, then you don't want to code things as linear covariates in the input file.
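As an illustration of the distinction drawn above, the following sketch compares the linear predictor under the two codings for a three-level treatment. It assumes a logit-scale linear model; the beta values and the choice of treatment 1 as the reference level are hypothetical, chosen only to show the structure.

```python
import math

# Coding 1: a single covariate with values 1, 2, 3 and one slope.
# Hypothetical betas on the logit scale.
b0, b1 = 0.2, 0.4
linear_logits = [b0 + b1 * level for level in (1, 2, 3)]
linear_steps = [
    linear_logits[1] - linear_logits[0],
    linear_logits[2] - linear_logits[1],
]

# Coding 2: indicator (dummy) columns, treatment 1 as the reference level.
# Each row is (intercept, is_treatment_2, is_treatment_3).
design_rows = {1: (1, 0, 0), 2: (1, 1, 0), 3: (1, 0, 1)}
indicator_betas = (0.6, 0.1, 0.9)  # hypothetical: intercept, trt 2, trt 3
indicator_logits = [
    sum(x * b for x, b in zip(row, indicator_betas))
    for row in design_rows.values()
]
indicator_steps = [
    indicator_logits[1] - indicator_logits[0],
    indicator_logits[2] - indicator_logits[1],
]

# The single-slope coding forces equal spacing between adjacent levels,
# whatever values b0 and b1 take. The indicator coding does not.
print("single covariate steps:", linear_steps)
print("indicator coding steps:", indicator_steps)
```

With the single covariate, the step from treatment 1 to 2 always equals the step from 2 to 3 on the logit scale, which is the ordinal, strictly linear constraint. The indicator coding spends one extra beta to let each treatment differ freely from the reference.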

Post by abreton » Sun May 20, 2007 5:05 pm

Todd - You'll want to bring in your three treatments as 'groups', not as an ordered (ordinal, discrete) covariate. If you already have 'groups' incorporated into your encounter histories (EH), then these will have to be expanded to accommodate the three treatment groups. Otherwise, assuming (e.g.) a four-occasion study, individual EHs with three groups would look like this:

1011 1 0 0;
...an individual in the first group (or treatment).
0010 0 1 0;
...an individual in the 2nd group (or treatment).
0101 0 0 1;
...an individual in the last group (or treatment).

After specifying your three groups and importing EHs into MARK, note that you'll get PIMs specified for each group. Now you'll need to code these such that your global or most general model includes treatment effects. I generally code the global model in the PIMs, close them, and then fit/run all models in the DM (starting with the global PIM model with or without interactions depending on data sparsity). Testing for a treatment effect with treatments coded as groups in your EH is very straightforward. See chapters recommended by Cooch for details.
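If the treatment is currently stored as a single 1/2/3 value per animal, the group-column layout shown above can be generated with a few lines of code. This is only a sketch: the function name and the in-memory record format are hypothetical, and it assumes one animal per line with a frequency of 1 in its own group column.

```python
def format_history(history: str, group: int, n_groups: int = 3) -> str:
    """Return one encounter-history line with a frequency column per group."""
    if not 1 <= group <= n_groups:
        raise ValueError(f"group must be between 1 and {n_groups}")
    # A 1 in the animal's own group column, 0 in every other column.
    freqs = ["1" if g == group else "0" for g in range(1, n_groups + 1)]
    return f"{history} {' '.join(freqs)};"


# Hypothetical records: (encounter history, treatment level).
records = [("1011", 1), ("0010", 2), ("0101", 3)]
lines = [format_history(history, group) for history, group in records]

for line in lines:
    print(line)
```

Run on the three example animals, this reproduces the three lines listed above.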

Post by Todd » Mon May 21, 2007 10:03 am

cooch wrote:
Todd wrote:Just to follow up on this concept, can I include a dummy variable such as "treatment" as a covariate if it has three levels (1,2,3)?

Then I can test for a "treatment" effect on survival by including a model with the covariate and a model without?

I'm concerned that because the covariate is a 1, 2 or 3 in my encounter history input file it would falsely confine the model to think that treatment 3 has 3x the effect that treatment one does because the corresponding Beta will now be multiplied by 3, 2 or 1.

Is that a problem? Will there be a linear constraint now imposed? What if the relationship between the treatments is not linear?


The questions you're asking imply a basic lack of understanding of linear models. I suggest you read Chapter 7 - thoroughly. If you don't want your treatment effects to be evaluated by an ordinal, strictly linear model, then why did you encode them in your .INP file that way in the first place? If you read Chapter 7, and then look at Chapter 2 (data formatting), you'll not only be able to answer the question, but you'll also see what you want to do instead. If all you want to do is a basic 'ANOVA-like' linear model, looking for differences among 3 levels of a treatment, then you don't want to code things as linear covariates in the input file.


I'm aware of the difference between coding as a group versus coding as a covariate. And I'm also aware that continuous covariates, such as size, cause parameters to be evaluated in a linear manner unless you transform the covariates.

However, I'm also aware that coding for sex as a binary covariate (0 or 1, dummy variable) instead of as separate groups eliminates the consequent doubling of parameters in your PIM chart, while still allowing you to test for sex effects in the Design Matrix by adding the "sex" covariate.

In the case of my models, if I enter my data as three groups, my starting PIM chart will have 350+ parameters.

Thus, I suppose my real question is whether "dummy variables" must be binary or not. If they are not binary, must they be linear?

The issue of dummy variables isn't handled clearly in any of the guide chapters and I don't recall that question coming up in the MARK workshop last year.

Post by cooch » Mon May 21, 2007 10:13 am

Todd wrote:I'm aware of the difference between coding as a group versus coding as a covariate. And I'm also aware that continuous covariates, such as size, cause parameters to be evaluated in a linear manner unless you transform the covariates.


Your original post suggested otherwise.

Todd wrote:However, I'm also aware that coding for sex as a binary covariate (0 or 1, dummy variable) instead of as separate groups eliminates the consequent doubling of parameters in your PIM chart, while still allowing you to test for sex effects in the Design Matrix by adding the "sex" covariate.


This works because sex has only two levels.

Todd wrote:In the case of my models, if I enter my data as three groups, my starting PIM chart will have 350+ parameters.


...which will undoubtedly be reduced considerably anyway, so that shouldn't be a big deal.

Todd wrote:Thus, I suppose my real question is whether "dummy variables" must be binary or not. If they are not binary, must they be linear?


You can use the 'individual covariate' trick in a number of situations, but the most typical use is for situations where a classification variable is binary. But, to handle multiple groups, using the covariate trick is not recommended (even if you can figure out how to handle it).

Todd wrote:The issue of dummy variables isn't handled clearly in any of the guide chapters and I don't recall that question coming up in the MARK workshop last year.


Actually, it is. Dummy (or, more commonly, indicator) variables are discussed in some detail in chapter 7. What you're referring to as a dummy variable is simply using a binary 1/0 variable as an individual covariate to code for sex. We don't talk about using that approach for handling >2 groups for a simple reason: you probably shouldn't do it.