Dummy coding, AA.INP

questions concerning analysis/theory using program MARK

Postby jbruggin » Thu Jun 09, 2005 3:28 pm

Section 2.1, chapters 6 and 7 and the answers to various posts have convinced me that dummy variables (e.g., sex) need to be coded, for example, as '1 0' for males and '0 1' for females rather than 1 for males, 0 for females. Thus, I'm confused by the example on page 2-7 (from BLCKDUCK.INP) where age appears to be coded as 0 for subadult and 1 for adult. Is this correct?

I didn't get the file AA.INP referred to in Chapter 6 with my recent download of MARK. Is it still available?
jbruggin
 
Posts: 14
Joined: Wed May 25, 2005 10:18 am

Re: Dummy coding, AA.INP

Postby cooch » Thu Jun 09, 2005 5:39 pm

jbruggin wrote:Section 2.1, chapters 6 and 7 and the answers to various posts have convinced me that dummy variables (e.g., sex) need to be coded, for example, as '1 0' for males and '0 1' for females rather than 1 for males, 0 for females. Thus, I'm confused by the example on page 2-7 (from BLCKDUCK.INP) where age appears to be coded as 0 for subadult and 1 for adult. Is this correct?


You need to distinguish between dummy variable coding in the design matrix and frequency coding in the input file. The former is discussed at length in Chapter 7, and the latter in Chapter 2.

jbruggin wrote:I didn't get the file AA.INP referred to in Chapter 6 with my recent download of MARK. Is it still available?


The example data files referred to in 'the book' are not distributed with MARK, but are available on the website where you downloaded 'the book' (last item in the drop-down list).
cooch
 
Posts: 1654
Joined: Thu May 15, 2003 4:11 pm
Location: Cornell University

Postby jbruggin » Mon Jun 13, 2005 2:02 pm

OK, my question referred primarily to chapter 2 of ‘the book’ as I am primarily concerned with input file formatting at this point. Sorry if I’m being obtuse but I’ve reread it (again), and remain confused. Near the top of page 2-4 is the following example of how to code individual encounter histories:

110000101 1 0;
110000101 1 0;
110000101 1 0;
110000101 1 0;
110000101 0 1;
110000101 0 1;

Where “the coding ‘1 0’ indicates that the individual is a male, and ‘0 1’ indicates the individual is a female.”

In contrast, later in chapter 2 (page 2-7) this example of covariate coding is given:

/* 01 */ 1100000000000000 1 1 1.16 27.7 4.19;
/* 04 */ 1011000000000000 1 0 1.16 26.4 4.39;
/* 05 */ 1011000000000000 1 1 1.08 26.7 4.04;
/* 06 */ 1010000000000000 1 0 1.12 26.2 4.27;
/* 07 */ 1010000000000000 1 1 1.14 27.7 4.11;
/* 08 */ 1010110000000000 1 1 1.20 28.3 4.24;
/* 09 */ 1010000000000000 1 1 1.10 26.4 4.17;

Here the first covariate “is a dummy variable representing age (0 = subadult, 1 = adult)” in a single column.

Are not both examples dealing with frequency coding? If I need 3 columns to code for sex in the first example, why don’t I need 3 to code for age in the second? Do I need to digest chapter 7 in order to understand this aspect of chapter 2?

Thanks.

John
jbruggin
 
Posts: 14
Joined: Wed May 25, 2005 10:18 am

input file coding

Postby ganghis » Mon Jun 13, 2005 2:21 pm

John,

In the first case, sex is treated as a group; in the second, age is treated as an individual covariate. Which one you use will depend on your purposes, but you will generally need to do more work in the design matrix if you treat the variable as a covariate. Note that if there were more than 2 groups (e.g., male, female, and unknown), you would almost always want to code them as different groups in the input file.

-Paul
ganghis
 
Posts: 84
Joined: Tue Aug 10, 2004 2:05 pm

Postby cooch » Mon Jun 13, 2005 2:26 pm

jbruggin wrote:
Are not both examples dealing with frequency coding? If I need 3 columns to code for sex in the first example, why don’t I need 3 to code for age in the second? Do I need to digest chapter 7 in order to understand this aspect of chapter 2?

Thanks.

John


In the INP file, you're not 'coding' for anything: each group (or sub-group...) simply needs its own column. I suspect you're confusing design matrix coding with having a column of frequencies for each group in the INP file.

So, if your groups are 'males' and 'females' (as per the example), you have two columns of frequencies - one for males, and one for females. If you see

1011101  1 0;


the last two columns are not coding columns, they are frequency columns.
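To make the "frequency columns" reading concrete, here is a minimal Python sketch (not part of MARK) that reads the six records quoted from page 2-4 and adds up each group's column. The function name `parse_group_line` is hypothetical, and the column order (males first, females second) is taken from the example in this thread.

```python
def parse_group_line(line, n_groups):
    """Split one INP record into (encounter history, [frequency per group])."""
    fields = line.strip().rstrip(";").split()
    history = fields[0]
    freqs = [int(x) for x in fields[1:1 + n_groups]]
    return history, freqs


# The six records from page 2-4: column 1 = males, column 2 = females.
records = [
    "110000101 1 0;",
    "110000101 1 0;",
    "110000101 1 0;",
    "110000101 1 0;",
    "110000101 0 1;",
    "110000101 0 1;",
]

# Sum the frequencies per encounter history, one running total per group.
totals = {}
for rec in records:
    history, freqs = parse_group_line(rec, n_groups=2)
    running = totals.setdefault(history, [0] * len(freqs))
    for i, f in enumerate(freqs):
        running[i] += f

print(totals)
```

Read this way, '1 0' is a count (one male, zero females with that history), which is why each group needs its own column.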

In the second example, with individual covariates, 'age' is itself coded as an individual covariate. And, it is indeed coded, so only one column is needed.
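For contrast, here is a rough Python sketch of how a record from the page 2-7 example could be read. It assumes the first number after the encounter history is a single frequency column and that the remaining numbers are individual covariates, the first being the age dummy (0 = subadult, 1 = adult) as quoted above. The function name and the covariate names other than age are hypothetical placeholders.

```python
import re


def parse_covariate_line(line, covariate_names):
    """Split one record into (history, frequency, {covariate name: value})."""
    body = re.sub(r"/\*.*?\*/", "", line)  # drop /* ... */ comments
    fields = body.strip().rstrip(";").split()
    history = fields[0]
    frequency = int(fields[1])  # assumed: one frequency column, no groups
    values = [float(x) for x in fields[2:]]
    return history, frequency, dict(zip(covariate_names, values))


# Only "age" comes from the thread; the other names are placeholders.
names = ["age", "cov2", "cov3", "cov4"]

history, frequency, covariates = parse_covariate_line(
    "/* 04 */ 1011000000000000 1 0 1.16 26.4 4.39;", names
)

age_label = "adult" if covariates["age"] == 1 else "subadult"
print(history, frequency, age_label)
```

Here the 0 in the age column is a coded value describing the individual, not a count, so a single column is enough.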

Paul has explained why you might choose one over another.
cooch
 
Posts: 1654
Joined: Thu May 15, 2003 4:11 pm
Location: Cornell University

