Discrepancies between age structure model using PIM and DM

questions concerning analysis/theory using program MARK

Post by cmb » Wed Jun 15, 2005 9:42 am

I am a new MARK user, and have what is probably a fairly basic question. I'm trying to fit a model with two age classes. Before doing anything fancy (constraining the model), I want to be certain I know what's going on with the fully time-dependent model. Consequently, I've set up the model using the PIMs as described in the latest MARK book, and get a reasonable fit.

(The PIM is structured as follows, though the real one is obviously larger.)

1  7  8  9 10 11
   2  8  9 10 11
      3  9 10 11
         4 10 11
            5 11
               6
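
The numbering pattern in a PIM like this can be generated rather than typed by hand. The following is a minimal Python sketch, not MARK code: the helper names `age2_pim` and `format_pim` are made up for illustration, and the sketch assumes two age classes in which the first class lasts exactly one interval after release.

```python
def age2_pim(n_intervals):
    """Rows of an upper-triangular PIM for two fully time-dependent
    age classes (hypothetical helper, not part of MARK)."""
    rows = []
    for cohort in range(n_intervals):
        # Diagonal cell: first age class, numbered 1..n by release occasion.
        first_class = cohort + 1
        # Remaining cells: second age class, numbered by interval (column).
        second_class = [n_intervals + col
                        for col in range(cohort + 1, n_intervals)]
        rows.append([first_class] + second_class)
    return rows


def format_pim(rows):
    """Right-align the rows so the triangle lines up by column."""
    n = len(rows)
    width = len(str(max(max(r) for r in rows)))
    lines = []
    for r in rows:
        pad = [" " * width] * (n - len(r))
        cells = pad + [str(v).rjust(width) for v in r]
        lines.append(" ".join(cells).rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    # Six intervals reproduces the 11-parameter layout shown above.
    print(format_pim(age2_pim(6)))
```

With six intervals this gives indices 1-6 on the diagonal (first age class) and 7-11 off the diagonal (second age class), 11 parameters in all.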

As constraining the model requires working with the design matrix, my first step was to construct the DM that corresponds to the fully time-dependent PIM, something like the following:

1 1 0 0 0 0 0 0 0 0 0
1 1 1 0 0 0 0 0 0 0 0
1 1 0 1 0 0 0 1 0 0 0
1 1 0 0 1 0 0 0 1 0 0
1 1 0 0 0 1 0 0 0 1 0
1 1 0 0 0 0 1 0 0 0 1
1 0 0 0 0 0 0 0 0 0 0
1 0 0 1 0 0 0 0 0 0 0
1 0 0 0 1 0 0 0 0 0 0
1 0 0 0 0 1 0 0 0 0 0
1 0 0 0 0 0 1 0 0 0 0

I run this model and get exactly the same point estimates (which is encouraging) as when using the PIM, but the SEs of some of these estimates differ from those generated using the PIM, and the total deviance reported is also (marginally) different. The differences are so small that I suspect they're unimportant, and that they probably reflect the message that pops up when using the PIM method, which prompts me to use the identity matrix because no other design matrix is defined. The difference between the identity matrix and the design matrix I specified might be the key to these small differences, but I would like to understand this more fully if possible.

Hope someone can help,

Colin
cmb
 
Posts: 4
Joined: Wed Jun 15, 2005 4:30 am

Re: Discrepancies between age structure model using PIM and

Post by cooch » Wed Jun 15, 2005 10:54 am

cmb wrote: I am a new MARK user, and have what is probably a fairly basic question.

Which, at this point, leads to the suggestion: make sure you've read the relevant documentation before plunging in too far...


"Something like" doesn't help much - post the full PIM and DM structure, it helps folks answer questions. But, to the point at hand...

As discussed at length in Chapter 8, if the PIM is

1 6 7 8 9
  2 7 8 9
    3 8 9
      4 9
        5


then the DM corresponding to this PIM is either

1 1 1 0 0 0 0 0 0
1 1 0 1 0 0 1 0 0
1 1 0 0 1 0 0 1 0
1 1 0 0 0 1 0 0 1
1 1 0 0 0 0 0 0 0
1 0 0 1 0 0 0 0 0
1 0 0 0 1 0 0 0 0
1 0 0 0 0 1 0 0 0
1 0 0 0 0 0 0 0 0

(This DM is appropriate if you ultimately want to test an additive model, i.e. a model where the estimates for the two age classes parallel each other on the logit scale.) Column 1 is the intercept, column 2 codes for age class, columns 3 -> 6 code for time, and columns 7 -> 9 are the age x time interactions. Note that there is no adult age class for the first interval, which is why there is no interaction column for that interval.

or (if you don't care about additive models...)

1 1 0 0 0 0 0 0 0
1 0 1 0 0 0 0 0 0
1 0 0 1 0 0 0 0 0
1 0 0 0 1 0 0 0 0
1 0 0 0 0 0 0 0 0
0 0 0 0 0 1 1 0 0
0 0 0 0 0 1 0 1 0
0 0 0 0 0 1 0 0 1
0 0 0 0 0 1 0 0 0


This is the interaction model, but since each age class has its own intercept, you can't build an additive model from it. But you may or may not want to. The key is that both DMs have the same number of columns (and are entirely equivalent in terms of the real parameter estimates MARK will report).
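
One way to check the equivalence numerically: each DM above is a 9 x 9 matrix, and if its columns are linearly independent then any set of 9 values on the link scale can be reproduced by some beta vector, so both are re-expressions of the same 9-parameter model. The following is a minimal sketch in plain Python, not anything MARK does internally; `matrix_rank`, `parse`, and the two constant names are illustrative.

```python
from fractions import Fraction


def matrix_rank(matrix):
    """Rank by Gauss-Jordan elimination, using exact fractions."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry.
        pivot = next((r for r in range(rank, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Clear this column in every other row.
        for r in range(n_rows):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def parse(text):
    """Turn a whitespace-separated block of 0/1 rows into a list of lists."""
    return [[int(v) for v in line.split()] for line in text.strip().splitlines()]


# First DM above: shared intercept, age, time, and age x time columns.
DM_SHARED_INTERCEPT = parse("""
1 1 1 0 0 0 0 0 0
1 1 0 1 0 0 1 0 0
1 1 0 0 1 0 0 1 0
1 1 0 0 0 1 0 0 1
1 1 0 0 0 0 0 0 0
1 0 0 1 0 0 0 0 0
1 0 0 0 1 0 0 0 0
1 0 0 0 0 1 0 0 0
1 0 0 0 0 0 0 0 0
""")

# Second DM above: a separate intercept and time block per age class.
DM_SEPARATE_INTERCEPTS = parse("""
1 1 0 0 0 0 0 0 0
1 0 1 0 0 0 0 0 0
1 0 0 1 0 0 0 0 0
1 0 0 0 1 0 0 0 0
1 0 0 0 0 0 0 0 0
0 0 0 0 0 1 1 0 0
0 0 0 0 0 1 0 1 0
0 0 0 0 0 1 0 0 1
0 0 0 0 0 1 0 0 0
""")

if __name__ == "__main__":
    for name, dm in [("shared intercept", DM_SHARED_INTERCEPT),
                     ("separate intercepts", DM_SEPARATE_INTERCEPTS)]:
        print(name, "rank =", matrix_rank(dm), "of", len(dm[0]), "columns")
```

Full rank only says the two structures can describe the same set of real parameters. It says nothing about how many of those parameters the data can actually support.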

If you use PIMs, the default is the sin link. If you use a DM approach, the link function is a logit link. This difference should not affect the deviance, but may affect the number of estimated parameters. This is discussed in some detail in the book - read chapters 7 and 8, carefully.
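
For reference, the two links map a beta value back to a probability in different ways. The following is a minimal Python sketch, assuming the usual definitions (sin link: (sin(beta) + 1)/2; logit link: 1/(1 + exp(-beta))). The function names are illustrative and are not MARK's.

```python
import math


def sin_link_inverse(beta):
    """Back-transform from the sin link; stays within [0, 1] for any beta."""
    return (math.sin(beta) + 1.0) / 2.0


def logit_link_inverse(beta):
    """Back-transform from the logit link; only approaches 0 or 1
    as beta grows without bound."""
    return 1.0 / (1.0 + math.exp(-beta))


if __name__ == "__main__":
    for beta in (-3.0, 0.0, math.pi / 2, 3.0, 10.0):
        print(f"beta={beta:7.3f}  "
              f"sin={sin_link_inverse(beta):.6f}  "
              f"logit={logit_link_inverse(beta):.6f}")
```

The sin link reaches a probability of 1 at a finite beta (pi/2), while the logit link gets there only in the limit. That difference in behaviour near the boundary is a plausible reason the count of estimated parameters can differ between links when an estimate sits close to 0 or 1.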
cooch
 
Posts: 1654
Joined: Thu May 15, 2003 4:11 pm
Location: Cornell University

Post by cmb » Thu Jun 16, 2005 4:30 am

Thanks for this - I was not precise enough in my original posting.

cooch wrote: "Something like" doesn't help much - post the full PIM and DM structure, it helps folks answer questions. But, to the point at hand...

As discussed at length in Chapter 8, if the PIM is

1 6 7 8 9
  2 7 8 9
    3 8 9
      4 9
        5

then the DM corresponding to this PIM is either

1 1 1 0 0 0 0 0 0
1 1 0 1 0 0 1 0 0
1 1 0 0 1 0 0 1 0
1 1 0 0 0 1 0 0 1
1 1 0 0 0 0 0 0 0
1 0 0 1 0 0 0 0 0
1 0 0 0 1 0 0 0 0
1 0 0 0 0 1 0 0 0
1 0 0 0 0 0 0 0 0

"Something like" refers to the fact that my actual DM is far larger - I should have been more precise. Note, however, that the DM I posted and the suggested version are equivalent - the only difference being that I use the first year in each group as the reference rows, whereas the suggested DM uses the last. I chose this design because it seems unwise to make the last parameter (which is unestimable anyway) the reference parameter - as described in the new book on page 7-58. I did in fact first try the suggested DM, and ended up with a model with a different number of estimable parameters from the default generated using the PIM. Using the first year as the reference level gave me the same number of estimable parameters. As these DMs are equivalent, I am reassured that I am using the correct DM at least.

cooch wrote: If you use PIMs, the default is the sin link. If you use a DM approach, the link function is a logit link. This difference should not affect the deviance, but may affect the number of estimated parameters. This is discussed in some detail in the book - read chapters 7 and 8, carefully.


Sorry, should have mentioned - recognising this issue I had changed the default using the PIMs to the logit link to ensure I was comparing like with like. So my question remains - if the DM I am using is correct and I correctly use the logit link, why do I still end up with (very small) differences in the estimated SEs?

Post by cooch » Thu Jun 16, 2005 6:10 pm

cmb wrote: "Something like" refers to the fact that my actual DM is far larger - I should have been more precise. Note, however, that the DM I posted and the suggested version are equivalent - the only difference being that I use the first year in each group as the reference rows, whereas the suggested DM uses the last. I chose this design because it seems unwise to make the last parameter (which is unestimable anyway) the reference parameter - as described in the new book on page 7-58. I did in fact first try the suggested DM, and ended up with a model with a different number of estimable parameters from the default generated using the PIM. Using the first year as the reference level gave me the same number of estimable parameters. As these DMs are equivalent, I am reassured that I am using the correct DM at least.


Indeed - this happens with more frequency than we think - glad you picked up on the suggestion (in the book) to not use a confounded parameter as the reference level in the DM. In my experience, this is generally a good policy to follow...

cmb wrote: Sorry, should have mentioned - recognising this issue I had changed the default using the PIMs to the logit link to ensure I was comparing like with like. So my question remains - if the DM I am using is correct and I correctly use the logit link, why do I still end up with (very small) differences in the estimated SEs?


What - exactly - were the model deviances? They should be identical - the AIC values will differ on occasion (reflecting differences in numbers of estimable parameters), but the deviances shouldn't differ.

Post by cmb » Fri Jun 17, 2005 4:32 am

cooch wrote: Indeed - this happens with more frequency than we think - glad you picked up on the suggestion (in the book) to not use a confounded parameter as the reference level in the DM. In my experience, this is generally a good policy to follow...


I would strongly agree, and was slightly surprised that it wasn't the default method anyway! Still, the models are:

{Phi(a2, t/t)P(t) DM}   QAIC = 1063.2481; No. Par. = 66; Dev. = 335.885
{Phi(a2, t/t)P(t) PIM}  QAIC = 1063.1754; No. Par. = 66; Dev. = 335.812


(The model names describe 2 age classes, both fully time-varying, time dependence in P (not differing by age), and the method used to build each model.) So as you can see, the differences are tiny (rounding error? differences between the identity matrix from the PIM and the actual DM?), but I'm surprised they're there at all, and would feel happier if I understood why. Interestingly, for any of the reduced models where either the first or the second age class (or in fact both) is constant, the deviance is identical - e.g.

{Phi(a2, t/.)P(t) DM}   QAIC = 1046.3785; No. Par. = 44; Dev. = 372.004
{Phi(a2, t/.)P(t) PIM}  QAIC = 1046.3785; No. Par. = 44; Dev. = 372.004


It might be worth mentioning that the best model (by some margin) has the first age class (actually transients) constant, and the second constrained to winter rainfall (QAIC = 1031.8). Also, to reiterate, there are unestimable parameters here (resighting probability is very close to 1 in several years).
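
The four model lines above can be cross-checked with a little arithmetic. The sketch below assumes that QAIC has the form (fit term) + 2K, and that the reported "Dev." is on the same c-hat-adjusted scale as the fit term. Both are assumptions about the output, not something stated in the thread. The dictionary and the `compare` helper are purely illustrative.

```python
# Values copied from the model lines quoted above.
models = {
    "Phi(a2, t/t)P(t) DM":  {"qaic": 1063.2481, "npar": 66, "dev": 335.885},
    "Phi(a2, t/t)P(t) PIM": {"qaic": 1063.1754, "npar": 66, "dev": 335.812},
    "Phi(a2, t/.)P(t) DM":  {"qaic": 1046.3785, "npar": 44, "dev": 372.004},
    "Phi(a2, t/.)P(t) PIM": {"qaic": 1046.3785, "npar": 44, "dev": 372.004},
}


def compare(dm_name, pim_name):
    """Return (QAIC gap with the 2K penalty removed, deviance gap)."""
    dm, pim = models[dm_name], models[pim_name]
    d_qaic = dm["qaic"] - pim["qaic"]
    d_par = dm["npar"] - pim["npar"]
    d_dev = dm["dev"] - pim["dev"]
    # Subtracting the penalty difference leaves only the fit term.
    return d_qaic - 2 * d_par, d_dev


if __name__ == "__main__":
    pairs = [
        ("Phi(a2, t/t)P(t) DM", "Phi(a2, t/t)P(t) PIM"),
        ("Phi(a2, t/.)P(t) DM", "Phi(a2, t/.)P(t) PIM"),
    ]
    for dm_name, pim_name in pairs:
        fit_gap, dev_gap = compare(dm_name, pim_name)
        print(f"{dm_name} vs PIM: fit gap = {fit_gap:.4f}, "
              f"deviance gap = {dev_gap:.3f}")
```

For the fully time-dependent pair the two gaps agree to within rounding, and the parameter counts are equal, so under these assumptions the whole QAIC difference comes from the fit term rather than from parameter counting. That narrows down where the discrepancy sits but does not explain its cause.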

