www.phidot.org

by **stellatus** » Mon Feb 21, 2011 1:41 am

Hello,

I have four questions regarding fitting models to data with group design covariates, with the explanation of the problem given below:

1. Why are parameter estimates produced for the finest group structure even for models which don't specify all group design covariates?
2. How do I get estimates and confidence intervals for parameters only for groups specified in a model, even when a more detailed group structure is present in the design data list?
3. If there is no possible way to do Question 2, and considering my sparse data, do I need to set up a number of different design data lists for each of the relevant combinations of group design covariates (e.g. ddl1<-groups=c("age","sex"), ddl2<-groups=c("sex","fire"), etc.) and is it valid to compare models based on different design data lists?
4. Why are the model selection results so different between a ddl with three groups and one with 13 groups even when the candidate models were exactly the same?

If anyone can help with this it will be greatly appreciated,
Annabel.

Explanation of problem:

I am working with the robust design with the aim of estimating the effect of time since fire category (early, medium, late) on abundance, survival and capture probabilities. I also want to investigate the effect of the sex and age of the animals and the occasion-specific covariates rainfall and temperature on the model parameters. I have 43 secondary occasions within six primary periods of 8,7,6,8,7,7 days in length.

I set up my design data list with all three group design covariates specified (age, sex and fire category) and then added temp and rainfall to the design data. With all possible combinations of the three group design covariates, the data consist of 13 different groups. My data are fairly sparse: I have a total of 139 animals and daily recapture rates between 5 and 25 %. Dividing them into 13 groups therefore drastically reduces power. Given this constraint, I am trying to limit the number of parameters that I include in my models. For example, I would like to compare models that include fire effects to those with sex effects, and also examine interactive effects, for e.g. between fire and sex. Even though I have a maximum of 13 groups, I would probably not include all three group design covariates in one model, but want them all available for comparing different models.

Even when I write a model including only temperature effects on p and fire effects on N, based on a ddl with three group design covariates (mdd1):

mod2<-mark(process1,mdd1,
model.parameters=list(S=S.dot,GammaDoublePrime=GDP.dot,p=p.mintemp,N=N.fire),
model.name="13 groups,mod2",output=FALSE)

the derived parameters for N are still estimated for each of the 13 groups, when I expected to get results for only three groups (fire category: early, medium and late). The real parameters are given only for x of the 13 groups, where x is the number of groups specified for a particular parameter. In the above example the sum of N for all groups within each fire category equals a number close to the actual number that I captured in each category. But I don't know how to get an actual estimate for the group with confidence intervals.

I created a new design data list which only specified fire category (three groups). I ran four identical candidate models on the ddl with 13 groups and the ddl with 3 groups. I modelled the parameters only as functions of either 1, time related variables, or fire (details in the script below). The results from the model list are different for the ddl with 13 groups and the ddl with 3 groups. The top model for 13 groups is different to the top model for 3 groups, even though only fire groups were specified for both. The models with 3 groups have a much better fit to the data, even though it is the same data, and the same parameters, but different group structure.

model npar AICc
3 groups,mod7 16 609.125
3 groups,mod8 26 611.516
3 groups,mod6 7 623.669
3 groups,mod5 4 671.690
13 groups,mod2 7 1038.545
13 groups,mod3 16 1039.424
13 groups,mod1 4 1051.463
13 groups,mod4 26 1058.053

# Script:

dat<-import.chdata(filename="pinks_all_covar2.txt",header=T)
int<-read.table(file="int.txt",header=F)

# Group design based on 13 groups: all possible combinations of "fire" (early, med, late), "sex" (M, F, U)
# and "AdJ" (adult, juvenile). "U" in "sex" stands for unknown and is restricted to juveniles (it's sometimes not
# possible to determine the sex of a juvenile).

process1<-process.data(dat,model="Robust",begin.time=1,time.intervals=int,groups=c("fire","sex","AdJ"))
mdd1<-make.design.data(process1)

# Group design based on three groups only (fire category): early, medium and late

process2<-process.data(dat,model="Robust",begin.time=1,time.intervals=int,groups="fire")
mdd2<-make.design.data(process2)

# I have also specified occasion specific data for each design data list, details of which are not shown because the script is very long.
# My study species is a gecko and capture rates are likely to be highly dependent on daily temperature.

# Parameters

S.dot<-list(formula=~1)
S.fire<-list(formula=~fire)
S.fire_time<-list(formula=~fire+time)
GDP.dot<-list(formula=~1,share=TRUE)
p.dot<-list(formula=~1,share=TRUE)
p.mintemp<-list(formula=~mintemp,share=TRUE)
p.mintemp_fire<-list(formula=~mintemp+fire,share=TRUE)
p.mintemp_fire_session<-list(formula=~mintemp+fire+session,share=TRUE)
N.dot<-list(formula=~1)
N.fire<-list(formula=~fire)
N.fire_session<-list(formula=~fire+session)
N.rfall_fire_session<-list(formula=~rfall+fire+session)

# Models for the data with 13 groups:

mod1<-mark(process1,mdd1,
model.parameters=list(S=S.dot,GammaDoublePrime=GDP.dot,p=p.dot,N=N.dot),
model.name="13 groups,mod1",output=FALSE)
mod2<-mark(process1,mdd1,
model.parameters=list(S=S.dot,GammaDoublePrime=GDP.dot,p=p.mintemp,N=N.fire),
model.name="13 groups,mod2",output=FALSE)
mod3<-mark(process1,mdd1,
model.parameters=list(S=S.fire,GammaDoublePrime=GDP.dot,p=p.mintemp_fire,N=N.fire_session),
model.name="13 groups,mod3",output=FALSE)
mod4<-mark(process1,mdd1,
model.parameters=list(S=S.fire_time,GammaDoublePrime=GDP.dot,p=p.mintemp_fire_session,N=N.rfall_fire_session),
model.name="13 groups,mod4",output=FALSE)

# Models for the data with 3 groups (exactly the same parameters as mod1-mod4):

mod5<-mark(process2,mdd2,
model.parameters=list(S=S.dot,GammaDoublePrime=GDP.dot,p=p.dot,N=N.dot),
model.name="3 groups,mod5",output=FALSE)
mod6<-mark(process2,mdd2,
model.parameters=list(S=S.dot,GammaDoublePrime=GDP.dot,p=p.mintemp,N=N.fire),
model.name="3 groups,mod6",output=FALSE)
mod7<-mark(process2,mdd2,
model.parameters=list(S=S.fire,GammaDoublePrime=GDP.dot,p=p.mintemp_fire,N=N.fire_session),
model.name="3 groups,mod7",output=FALSE)
mod8<-mark(process2,mdd2,
model.parameters=list(S=S.fire_time,GammaDoublePrime=GDP.dot,p=p.mintemp_fire_session,N=N.rfall_fire_session),
model.name="3 groups,mod8",output=FALSE)

res1<-collect.models(type="Robust")
res1

by **jlaake** » Mon Feb 21, 2011 8:19 pm

1) Because there are real parameters specified for each group and you could easily specify a more complex model. Possibly you could make the code smarter enough to see that only a subset of the real parameters differ but I've not put much thought into it.

2) You can select a subset of parameters to display by selecting the indices from the PIMS. However, that will not work with N because there is a real parameter N for each group which is the fundamental issue you are having here.

3&4) These go hand and hand in this case. If this was a CJS model, then you could create different design data lists and you could compare models. I've not used the Robust design myself but I believe N is in the likelihood and each group represents a different multinomial structure. Thus, when you use different group structures you cannot compare AICs. Even if you use the same model for N there are 13 different Ns in the one case and 3 in the other. Also, I'm not sure it makes much sense to do something like ~fire for N. Have you just compared it to ~group in each case?

--jeff

by **stellatus** » Tue Feb 22, 2011 2:57 am

Thanks very much for your answers Jeff.

jlaake wrote: Have you just compared it to ~group in each case?

I just tried this and it produces a lower AIC than all the other models. The reason I used ~fire rather than ~group was that I thought I would get derived parameter estimates based only on fire groups, or whatever groups I put in the model. But now that I realise that's not how it works, I see you're right that ~group makes more sense.

The thing I'm really interested in is the effect of fire on abundance and survival. So I think I will try reducing the number of parameters by pooling juveniles into fire groups only, and keeping some of the time related variables constant. I'll do some preliminary analyses to explore the effect of age and sex and possibly pool them further if they're not important.

Cheers,
Annabel

by **stellatus** » Tue Feb 22, 2011 10:22 pm

Hi Jeff, I've been thinking more about something you said regarding my analysis and now I'm confused:

jlaake wrote:Also, I'm not sure it makes much sense to do something like ~fire for N. Have you just compared it to ~group in each case?

If I've got fire and sex groups (I'll just forget about adult/juvenile categories for now), can't I model N~fire, N~sex and N~fire+sex to look for the relative effects of these variables? What exactly would N~group tell me if I've got both fire and sex groups? Is the interaction?

I've been reading chapter 6 of the Gentle Introduction book where the dipper data are analysed for treatment and fire effects, which is just what I want to do too. I'm struggling to see the difference between this analysis (section 6.3) and mine. In the dipper example sex was entered is as a group design covariate (using the attribute group button to name groups specified in the .inp file), and flood was added to the design matrix as a dummy variable. But the group variable sex is coded in the design matrix in the same way as a dummy variable. Is the "group" function in RMark different to having groups specified in the encounter history file as you would in an .inp file?

Can I look for sex and treatment effects with both fire and sex as group design covariates?

I appreciate the help you've given me, thanks again,
Annabel.

by **jlaake** » Tue Feb 22, 2011 10:59 pm

I should re-iterate that I've not used the Robust Design and you should look through the material in Cooch and White on this model. Given that caveat, I believe the real parameter N that you are modelling is actually f0 which is the number not caught and not total abundance. At least that is the way it is handled with the POPAN model and I think N is handled consistently across MARK models in that way but please double check by reading that section of Cooch and White. I can imagine scenarios in which N was related to fire but f0 was not because p varied by fire. Imagine if p was high for fire and abundance was too and it worked out that f0 was the same for fire and non-fire areas. I can also imagine that f0 is different for fire strata when abundance was the same.

I can and have programmed the RMark interface for many different models in MARK without really understanding the model and all of its subtleties. Doing so in RMark is quite easy for most models once I created the underlying infrastructure. Also, while you can write a model like ~fire or ~fire+sex with RMark it doesn't guarantee that it makes sense. You have to make those decisions. RMark is only an interface to avoid doing the tedious work of creating design matrices by hand and for automating model development. I hope this helps.

In regard to your specific question, if groups=fire,sex then ~group is the same as ~fire*sex. You can create ~fire or ~sex or ~fire+sex. I'm just not sure how you would interpret those parameters. In the case of the dipper data, the flood, sex covariates are affecting p and Phi and the effects are interpretable. Not certain about the f0 case.

--jeff

by **stellatus** » Wed Feb 23, 2011 12:28 am

Ah yes, thanks for pointing that out Jeff. I realise this is more about my understanding of the analysis in general rather than an RMark issue. I'm just now going through the robust design chapter again (I have actually read it many times, it's just hard to remember everything).

I understand what you were saying about RMark being just an interface but what an excellent interface it is. I would be having the same problems understanding my analysis whether I was building models with the design matrix in MARK or with RMark, but at least with RMark the analysis is streamlined and it's easy to keep track of everything I've done.

Thanks,
Annabel

www.phidot.org

RMark: interpreting results from different design data lists

RMark: interpreting results from different design data lists

Re: RMark: interpreting results from different design data l

Re: RMark: interpreting results from different design data l

Re: RMark: interpreting results from different design data l

Re: RMark: interpreting results from different design data l

Re: RMark: interpreting results from different design data l

Who is online