design matrix problem

questions concerning analysis/theory using program MARK

design matrix problem

Postby procyon » Wed Feb 13, 2008 3:06 pm

Greetings all,
I've run into a problem and I'm hoping that someone may be able to point me in the right direction. I am trying to estimate raccoon abundance using Huggins closed capture modeling. My data are sparse, so I must pool them to estimate p and c. I have created an attribute group for each site and I'm also using individual covariates. Essentially, when I attempt to open the reduced design matrix, I am not allowed to specify the number of rows that I would like the matrix to have. Has anyone had this problem? Any suggestions are greatly appreciated. Yes, I have downloaded the newest version of MARK.

Thanks.

Bill
procyon
 
Posts: 1
Joined: Tue Feb 12, 2008 2:43 pm

Re: design matrix problem

Postby bmcclintock » Wed Feb 13, 2008 3:35 pm

Hi Bill,

You can only specify the number of columns when you open the reduced design matrix. The number of rows is determined by your PIM structure. It would be best if you consult the first seven chapters of "The Book" to get a good idea of how the PIMs and design matrix are related.

Best,
Brett
bmcclintock
 
Posts: 46
Joined: Mon Feb 12, 2007 6:10 pm
Location: NOAA National Marine Mammal Laboratory

design matrix problem

Postby ELR » Wed Feb 13, 2008 3:37 pm

Hi Bill,
I have recently had this same problem. I have 2 release groups for each of my treatments and in the .inp file I have assigned these release groups a group covariate/attribute. My starting model is the full CJS but I would like to report estimates for each treatment (not replicate/release group) so I must pool the groups. The only way I could pool my groups in the design matrix was to start off with the column # equal to the number of rows I want, then add columns.
Alternatively, I think we could lose the group attribute designation in the .inp file (which would make the design matrix cooperate better), but I am not sure how that affects the variance estimate (after all, the blue book has an entire chapter on the importance of replication, so do I really want to pool my groups in the .inp file?). I have run the analysis both ways (forcing the design matrix to pool and pooling groups in the .inp file) and the results are nearly identical (to within rounding error of course!) but maybe I just got lucky.
Clarification by the experts would be greatly appreciated.
Cheers, Erin
ELR
 
Posts: 7
Joined: Wed Feb 21, 2007 9:44 pm
Location: Vancouver, BC, Canada

design matrix problem

Postby ELR » Wed Feb 13, 2008 5:33 pm

Hi again,
I should have specified that once I created my general model with the PIMs, I created the same model with the design matrix, and from here on out I have been building linear models using the design matrix, as per the recommendation in the MARK book (page 7-34). This method is problematic if you want to pool your attribute groups in the design matrix, and thus I was having the same "how do I change my rows?" problem that Bill is having. (I described how I initially got around this in my last post, although looking back I realized that I had to run the pooled model from the PIMs and then open the corresponding DM.)
After reading Jeff's response however (which was posted at the same time as mine so I didn't see it before I hit "submit"), I played around with the PIMs and DM and found another way to control the rows when you want to pool groups.
As Jeff said, the DM is controlled by the PIMs. So open the PIM chart for your general model and pool the groups. Do not run this model; instead go up to Design - Reduced, and the window that opens will have the appropriate number of rows with the appropriate labels. Close the PIM chart and use the new DM.
If you have converted over to using only the design matrix for setting up models, you might not think of getting it started with the PIM chart. This would really only apply if you were pooling groups, e.g., for the dipper dataset, if you wanted to combine the males and females in the DM.

I have not seen any other discussions of pooling groups in the DM so I hope this helps, Bill. Thanks for making me rethink this problem, Jeff.

Erin
ELR
 
Posts: 7
Joined: Wed Feb 21, 2007 9:44 pm
Location: Vancouver, BC, Canada

Re: design matrix problem

Postby cooch » Wed Feb 13, 2008 7:24 pm

ELR wrote:Hi again,
I should have specified that once I created my general model with the PIMs, I created the same model with the design matrix, and from here on out I have been building linear models using the design matrix, as per the recommendation in the MARK book (page 7-34). This method is problematic if you want to pool your attribute groups in the design matrix,



I must be missing the point. You pool groups in a DM simply by deleting the column(s) (and interactions, as appropriate) corresponding to the grouping variable. For example, if I want to pool male and female dippers, I simply delete the sex column from the DM.
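To make the column deletion concrete, here is a minimal sketch in plain Python. It is not MARK code: the column labels, the group order, and the choice of the last interval as the reference level are all assumptions made for illustration. It builds a g*t design matrix for phi with 2 sexes and 6 intervals, then drops the sex column and the sex-by-time interaction columns.

```python
# Hypothetical sketch: phi design matrix for 2 groups x 6 intervals (g*t coding).
# Labels, group order, and reference levels are assumptions, not MARK defaults.
N_INTERVALS = 6
N_GROUPS = 2


def build_gt_design_matrix():
    labels = ["intercept", "sex"]
    labels += [f"t{j}" for j in range(1, N_INTERVALS)]      # 5 time dummies
    labels += [f"sex.t{j}" for j in range(1, N_INTERVALS)]  # 5 interactions
    rows = []
    for g in range(N_GROUPS):
        sex = 1 if g == 0 else 0  # first group coded 1, second coded 0
        for i in range(1, N_INTERVALS + 1):
            time = [1 if i == j else 0 for j in range(1, N_INTERVALS)]
            interaction = [sex * t for t in time]
            rows.append([1, sex] + time + interaction)
    return labels, rows


def drop_columns(labels, rows, prefix):
    """Remove every column whose label starts with `prefix`."""
    keep = [k for k, name in enumerate(labels) if not name.startswith(prefix)]
    return [labels[k] for k in keep], [[row[k] for k in keep] for row in rows]


labels, dm = build_gt_design_matrix()
pooled_labels, pooled_dm = drop_columns(labels, dm, "sex")

print(len(dm), len(dm[0]))                # 12 12
print(len(pooled_dm), len(pooled_dm[0]))  # 12 6
```

The row count does not change when columns are removed; only the coding does. After the deletion, the rows for the two sexes are coded identically.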
cooch
 
Posts: 1654
Joined: Thu May 15, 2003 4:11 pm
Location: Cornell University

Postby abreton » Wed Feb 13, 2008 8:52 pm

I've found that Gary's advice (MARK Workshops at CSU ~2003 and 2004) to accommodate your general model in the PIMs, close them and then do all subsequent model building/running from the design matrix is an efficient way to proceed in MARK. If you want to 'pool groups', you do not need to remove rows from your DM, you only need to change how they are coded.

For example, have a look at the design matrix on page 7-10 (Chapter 7) in the Gentle Intro: note that the 2nd column (half 1's; half 0's) codes for sex. To 'pool sexes' (groups), all you would have to do is remove column 2 from this DM. Check out other examples in Chapter 7.

I never asked Gary why he suggested the approach I described, but I suspect it is to reduce confusion...a real problem when your model results include a variety of PIM and DM structures.
abreton
 
Posts: 111
Joined: Tue Apr 25, 2006 8:18 pm
Location: Insight Database Design and Consulting

Postby cooch » Wed Feb 13, 2008 9:11 pm

abreton wrote: I've found that Gary's advice (MARK Workshops at CSU ~2003 and 2004) to accommodate your general model in the PIMs, close them and then do all subsequent model building/running from the design matrix is an efficient way to proceed in MARK. If you want to 'pool groups', you do not need to remove rows from your DM, you only need to change how they are coded.

For example, have a look at the design matrix on page 7-10 (Chapter 7) in the Gentle Intro: note that the 2nd column (half 1's; half 0's) codes for sex. To 'pool sexes' (groups), all you would have to do is remove column 2 from this DM. Check out other examples in Chapter 7.

I never asked Gary why he suggested the approach I described, but I suspect it is to reduce confusion...a real problem when your model results include a variety of PIM and DM structures.



The DM represents a set of linear constraints applied to the underlying PIMs. If the PIMs change, then by definition so does the corresponding DM.

A thorough reading of Chapter 7 will probably help with some of the confusion a few folks are apparently having. You set the basic parameter structure of your problem with the PIMs, construct the DM corresponding to it, and then build models nested within the general model by modifying the corresponding DM (e.g., adding or deleting columns).

But, to reiterate, a simple pooling across levels of a classification variable for a given underlying PIM (parameter) structure involves recoding elements of the DM. It does *not* require recoding the PIMs.

And, finally, the number of rows in the DM is determined by the number of parameters in the PIMs, and the number of levels of one or more classification variables. So, for example, for a phi(t) model, with 7 occasions (6 intervals), and 3 groups, you'd have 6x3 = 18 rows for the parameter phi. It's really not much more complicated than that.
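The row arithmetic above can be written out as a tiny sketch. The function name is made up for illustration, and it assumes a fully time-dependent phi with a separate PIM per group, as in the example just given.

```python
def phi_row_count(n_occasions, n_groups):
    """Rows for phi in the DM, assuming time-dependent phi and one PIM per group."""
    n_intervals = n_occasions - 1  # survival is estimated per interval
    return n_intervals * n_groups


print(phi_row_count(7, 3))  # 18
print(phi_row_count(7, 2))  # 12
```

The second call matches the dipper case discussed in this thread: 7 occasions and 2 sexes give 12 rows for phi.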
cooch
 
Posts: 1654
Joined: Thu May 15, 2003 4:11 pm
Location: Cornell University

Postby jlaake » Wed Feb 13, 2008 9:38 pm

I assume that Erin meant Brett instead of Jeff. I don't remember posting on this subject.

In regard to the last posting, mixing PIMs is probably not a good idea because it can take more time getting all the PIMs correct. Also, I think it would make it very confusing to model average across models where the same parameter is represented by different indices in different models, and I'm not even sure that the interface will allow model averaging in that case. Evan or Gary can correct me if I'm wrong on that.

This discussion is pertinent to the way RMark works. By default it uses the most general PIM (all-different) and then constructs the necessary design matrix for the model formula. Then, transparently to the user, for most parameters it changes (simplifies) the underlying PIM structure before the model is sent to mark.exe. This speeds up execution times, which are in part related to the number of rows in the design matrix. When it retrieves the parameter estimates from the output, it maps them back to the most general PIM so that model averaging is possible.
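One way to picture this kind of simplification is shown below. This is a sketch in Python, not RMark's actual code, and it rests on an assumption: that rows of the all-different design matrix with identical coding can share one parameter index. The function name and the example matrix are hypothetical.

```python
def simplify_pim(design_rows):
    """Map each row of an all-different DM to the index of its first identical row.

    Returns the simplified index for every original row, plus the reduced DM.
    """
    index_of = {}
    simplified = []
    unique_rows = []
    for row in design_rows:
        key = tuple(row)
        if key not in index_of:
            unique_rows.append(list(row))
            index_of[key] = len(unique_rows)  # 1-based, like PIM indices
        simplified.append(index_of[key])
    return simplified, unique_rows


# Time-only coding for 6 intervals (intercept + 5 dummies), repeated for 2 groups.
time_rows = [[1] + [1 if i == j else 0 for j in range(5)] for i in range(6)]
dm = time_rows + time_rows

simplified, reduced_dm = simplify_pim(dm)
print(simplified)       # [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]
print(len(reduced_dm))  # 6
```

The `simplified` list is what lets estimates from the 6-row fit be copied back to all 12 positions of the general structure, so every model reports the same set of indices.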
jlaake
 
Posts: 1480
Joined: Fri May 12, 2006 12:50 pm
Location: Escondido, CA

design matrix problem

Postby gwhite » Wed Feb 13, 2008 10:07 pm

All:

Andre is exactly right that I recommend setting up the PIMs for the global model, and then building all the reduced models with the design matrix. The reason is that you can use the parameter estimates from the simpler models to start the optimization of the more complex models. So, you build the PIMs, and then start with the dot model in the design matrix, i.e., the single intercept. You should always be able to get an estimate from the dot model. Now add in the group covariates, but retrieve the parameter estimates from the dot model to start the optimization. MARK matches the beta parameters up by examining the columns of the design matrix when you retrieve parameter estimates from a previous model. So, the group covariates are started at zero, but the intercept starts at the value from the dot model. Now, do the same for the time model. Depending on whether there are more groups or more time intervals, use one of these models to initialize estimates for the g+t model. Also, you can retrieve columns out of other design matrices to build the g+t model. That is, retrieve the g model, and then retrieve the time columns out of the time model to build the t+g model.
Finally, when you run the g*t model, use the product option to create the design matrix from the g+t model, and then retrieve estimates from the g+t model to start the optimization.
This strategy is both time efficient on your part, and is a good strategy to obtain the appropriate solution for the global g*t model.
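The idea of carrying estimates forward can be sketched as follows. This is an illustration in Python, not MARK's implementation: it matches columns by label, which is a simplifying assumption, and the beta values are placeholders rather than results from any analysis.

```python
def starting_values(new_columns, previous_betas):
    """Reuse betas whose column also exists in the new model; start the rest at 0."""
    return [previous_betas.get(label, 0.0) for label in new_columns]


# Placeholder estimates, invented purely to show the mechanics.
dot_model = {"intercept": 0.50}
g_columns = ["intercept", "g1"]
g_start = starting_values(g_columns, dot_model)
print(g_start)  # [0.5, 0.0]

g_model = {"intercept": 0.40, "g1": 0.20}
gt_columns = ["intercept", "g1", "t1", "t2"]
gt_start = starting_values(gt_columns, g_model)
print(gt_start)  # [0.4, 0.2, 0.0, 0.0]
```

Each more complex model starts from a point where the simpler model already fit, with only the new columns starting at zero.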

Gary
gwhite
 
Posts: 340
Joined: Fri May 16, 2003 9:05 am

design matrix problem

Postby ELR » Wed Feb 13, 2008 11:02 pm

Hi All,
Yes, I did mean Brett. I've just seen Jeff's name on the forum so often...

If everyone can just humor me for a moment...
I understand that linear models have to be built in the DM, this is clear. The problem that I was trying to solve is how to pool multiple release groups from the same treatment and then use that for the global/starting model.
So for example, in response to Evan's 1st reply about pooling the male and female dippers, if we delete the column representing sex (and the interactions) in the DM then we are left with 6 columns and 12 rows for phi. Since we want the survival parameters for the entire group, we *know*, and have to remember, that the bottom half of the phi quadrant is to be ignored (rows 7-12). When we run the model we get 6 beta parameters for survival and 12 real/reconstituted parameters for survival. The real estimates for 1-6 and 7-12 are identical. Now we have to remember that parameters 7-12 in the real parameter output are to be ignored. With the dipper data set this is easy to figure out, but when we have huge DM's and multiple groups this could get very confusing.
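The duplication described here can be reproduced with a small sketch. It is plain Python rather than MARK output, it assumes a logit link, and the six beta values are placeholders chosen only to show the mechanics.

```python
import math


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


# Pooled DM for phi: intercept + 5 time dummies, same coding for both sexes.
time_rows = [[1] + [1 if i == j else 0 for j in range(5)] for i in range(6)]
dm = time_rows + time_rows  # 12 rows, 6 columns

# Placeholder betas, one per column (not estimates from the dipper data).
betas = [0.3, -0.1, 0.2, 0.0, 0.4, -0.2]

# Real parameters: back-transform the linear predictor for every row.
real = [inv_logit(sum(b * x for b, x in zip(betas, row))) for row in dm]

print(len(betas), len(real))  # 6 12
print(real[:6] == real[6:])   # True
```

Six betas produce twelve real parameters, and the second block of six repeats the first because the rows are coded identically.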
Alternatively, we could pool the sexes in the PIM chart, open the corresponding DM, and voilà, there are 6 rows for the phi parameters. There are also only 6 real phi parameters reported. (Could someone please try it and confirm this?) At this point you can add dummy or real covariates, and as many interactions as you please, to your pooled data with the DM.
I will venture to say that it sounds like this is what happens in RMark as well.
Erin
ELR
 
Posts: 7
Joined: Wed Feb 21, 2007 9:44 pm
Location: Vancouver, BC, Canada
