design matrices and parameter estimates in RMark

posts related to the RMark library, which may not be of general interest to users of 'classic' MARK

design matrices and parameter estimates in RMark

Postby robrob » Thu Feb 25, 2010 1:38 pm

Hi folk,
apologies if i've missed something somewhere along the line, but some guidance from more experienced hands would be welcome. I'm estimating CJS 'survival' of birds, which may differ additively by sex, at a bunch of sites over a number of years. Although all sites have a decent run of data, they don't all start and finish in the same year and sites are likely to vary in quality, and hence in the estimated survival probabilities; but I am hopeful that annual variation is additive (we see good years and bad years across sites). Now I can easily fit, and test, such models but I stumble when it comes to parameter estimation. What I would like to do is come up with a series of annual survival probabilities averaged across the sites. I can see how to do this through the design matrix in MARK, but not in RMark because it (quite sensibly) generates estimates with regard to a particular site (which might be particularly good or bad and which, because the exercise will be repeated annually, may change). I'm nervous of going in and fiddling with RMark's DMs without a clear idea of what I am doing but I thought someone, surely, must have done this sort of thing before and can point me in the right direction?
any thoughts gratefully received
thanks and best wishes
rob
robrob
 
Posts: 4
Joined: Thu Feb 25, 2010 6:00 am
Location: Thetord, Norfolk, UK

Re: design matrices and parameter estimates in RMark

Postby jlaake » Thu Feb 25, 2010 9:27 pm

Rob-

I'm not quite following your post. However, RMark is not designed such that you "fiddle" with the DM. That is all done with formula. So if you you wanted an additive model of sex and year that would be ~sex+time. If you wanted an additive model of sex,site,time that would be ~sex+site+time. I'm not certain what you mean when you say that you could create a DM in MARK that would average across sites. Such a model would simply exclude site as a factor.

But from your description maybe you really want to estimate a separate time-varying survival for each site and then use variance components to average across sites with year. In RMark you could fit ~sex +site:time. The site:time would handle the fact that not all years are sampled at each site. Then you could use var.components to average across sites in a each year and compute the process variance for each year. However, the current code would only work for a single year.

So the above is all guesses at what you might possibly want. --jeff
jlaake
 
Posts: 1479
Joined: Fri May 12, 2006 12:50 pm
Location: Escondido, CA

Re: design matrices and parameter estimates in RMark

Postby robrob » Fri Mar 26, 2010 9:38 am

Jeff,
thanks for the reply. Apologies for not being clearer (and apologies for the time-out also, other projects intervened). Yes, the variance components approach is the sort of thing I'm trying to do - the problem being that for some sites the data are such that estimates for a particular year may be very poor, so not sure how well it will work, hence the reason for trying to consider sites together.
So, what I am doing is fitting Phi ~ -1+sex+site+year (one can obviously test against site:year to check the interaction). What I would like to present is simply the time-specific variation in Phi, which may be for up to 15 years. I could use the real parameter estimates, but this means representing a single site, which to choose? I could back-transform the beta estimates but, because of identifiability, the first year's estimate is 0 and others are an offset from that, so the estimates will always center around 0.5. I guess I could add an intercept (had I fitted it), but one is then still implicitly representing a single site (and sex), that which has been set to 0, you also need to figure out the arithmetic for combining the se's. In its simplest form what I need is an average of the site parameters to anchor the time-series.
Where I was coming from was the approach Evan demonstrates in his marvellous book on p6-85, where by casting the design matrix in an alternative fashion (using -1s) one can make the beta's specify the mean of a set of parameters rather than an intercept+offset, which seems to give me what I want - but I'm not sure this is possible using the formula approach?
Hope that is clearer, it seems such a straightforward thing to want to do, I feel I must be missing something somewhere. Any thoughts welcome and thanks again for your time
best wishes
rob
robrob
 
Posts: 4
Joined: Thu Feb 25, 2010 6:00 am
Location: Thetord, Norfolk, UK

Re: design matrices and parameter estimates in RMark

Postby jlaake » Fri Mar 26, 2010 1:38 pm

First I'll answer your question about DMs. In R you can set the following:

options(contrasts=c("contr.sum","contr.poly"))

and it will use the type of DM that is described on page 6-85. Personally I think treatment contrasts are more understandable. Below are some examples:

> x=data.frame(time=factor(1:4),site=factor(c("a","a","b","b")))
> model.matrix(~time,x)
(Intercept) time1 time2 time3
1 1 1 0 0
2 1 0 1 0
3 1 0 0 1
4 1 -1 -1 -1
attr(,"assign")
[1] 0 1 1 1
attr(,"contrasts")
attr(,"contrasts")$time
[1] "contr.sum"

> model.matrix(~time+site,x)
(Intercept) time1 time2 time3 site1
1 1 1 0 0 1
2 1 0 1 0 1
3 1 0 0 1 -1
4 1 -1 -1 -1 -1
attr(,"assign")
[1] 0 1 1 1 2
attr(,"contrasts")
attr(,"contrasts")$time
[1] "contr.sum"

attr(,"contrasts")$site
[1] "contr.sum"

While it is possible to use separate contrasts with factor variables as an argument to model.matrix, this is not supported by RMark. That could be something I could add if deemed useful, but not soon.

With regard to some of your other points, with treatment constrasts the mean of the betas for time will not be 0 as I think you are stating. The other issue you need to consider is that you can't completely divorce the time effects from site. Certainly these are additive on the logit scale but the transformation to the real parameters is non-linear. Thus, the impact of the variation of the time betas will depend on the intercept. To see that consider the following with different intecepts but the same range of differences for beta. The one ranges from 0.5-0.88 and the other from 0.73-0.95 based on the intecept difference.

> plogis(1+1)
[1] 0.880797
> plogis(1+-1)
[1] 0.5
> plogis(2+1)
[1] 0.9525741
> plogis(2+-1)
[1] 0.7310586

regards --jeff
jlaake
 
Posts: 1479
Joined: Fri May 12, 2006 12:50 pm
Location: Escondido, CA


Return to RMark

Who is online

Users browsing this forum: No registered users and 2 guests

cron