time.bins for constraining gamma parameters

Post by stellatus » Wed Mar 09, 2011 7:11 am

Hello,

I would like to make my last and second-to-last gamma parameters equal to avoid confounding. I'm having trouble understanding the instructions in the help file for the robust example data, which uses "time.bins" for this purpose:

here it is done by binning the times so that times 3 and 4 are in the same bin, so the time model
# has 3 levels (1,2, and 3-4). By doing so the parameters become identifiable but this may not be
# reasonable depending on the particulars of the data. Note that the same time binning must be done both for
# GammaPrime and GammaDoublePrime because the parameters are the same in the random emigration model. If you
# forget to bin one of the parameters across time it will fit a model but it won't be what you expect as it will
# not share parameters. Note the use of the argument "right". This controls whether
# binning is inclusive on the right (right=TRUE) or on the left (right=FALSE). Using "right" nested in the list
# of design parameters is equivalent to using it as a calling argument to make.design.data or add.design.data.
#
S.time=list(formula=~time)
p.time.session=list(formula=~-1+session:time,share=TRUE)
GammaDoublePrime.random=list(formula=~time,share=TRUE)
model.4=mark(data = robust, model = "RDHuggins", time.intervals=time.intervals,
    design.parameters=list(GammaDoublePrime=list(time.bins=c(1,2,5))),
    right=FALSE,
    model.parameters=list(S=S.time,GammaDoublePrime=GammaDoublePrime.random,p=p.time.session))


1. In the example data, there are four gammas, and the third and fourth are set equal (i.e. put in the same bin) using time.bins. Why are the time bins c(1,2,5)? I don't understand where the 5 comes from.

2. In the example, right=FALSE, meaning that "binning is inclusive on the left". I don't understand what this means (sorry!).

3. The instructions point out that the same binning must be done for both Gamma' and Gamma", but in the example model above, the time.bins are only created for GammaDoublePrime and the share argument is used. Is it necessary to create bins for both parameters if using "share"?

4. If my time intervals are uneven when I create bins, do I use 1, 2, etc. as above, or do I create time.bins based on the time values in my ddl (e.g. 1, 57, etc.)?

Here are the first few lines of my design data for GammaDoublePrime:

    group cohort age time Cohort Age Time  fire sex time_dummy
1  EarlyF      1   0    1      0   0    0 Early   F          0
2  EarlyF      1  56   57      0  56   56 Early   F          0
3  EarlyF      1  72   73      0  72   72 Early   F          1
4  EarlyF      1 336  337      0 336  336 Early   F          0
5  EarlyF      1 384  385      0 384  384 Early   F          0
6  EarlyF     57   0   57     56   0   56 Early   F          0
7  EarlyF     57  16   73     56  16   72 Early   F          1
8  EarlyF     57 280  337     56 280  336 Early   F          0
9  EarlyF     57 328  385     56 328  384 Early   F          0
10 EarlyF     73   0   73     72   0   72 Early   F          1


Thank you!

Annabel.

Post by jlaake » Thu Mar 10, 2011 12:10 am

I can see that this example needs work. Had it been split into the process.data and make.design.data steps, you could have looked at the design data and seen what the various arguments are doing. To create bins, the code uses the cut function in R, so if you type ?cut in R you'll see the right argument and what it does. With the breaks c(1,2,5) and right=FALSE, it creates the intervals [1,2) and [2,5). The ( means open (doesn't include the endpoint) and [ means closed (includes it), so [2,5) includes the times 2, 3 and 4 but not 5. If right=TRUE, the intervals are (1,2] and (2,5]. cut also has an include.lowest argument, but I didn't include it in the RMark functions. If include.lowest=TRUE and right=FALSE then the last interval is [2,5], and with right=TRUE the first interval is [1,2]. Because I don’t have the include.lowest argument, I used 5 as the upper number so the last interval would include 4.
Now, as I'm sure you have caught on, what is written in the text is incorrect: it implies that there are 3 time bins (1, 2, and 3-4) when in fact there are just two (1 and 2-3-4). To get the 3 levels as described, it should have been time.bins=c(0,1,2,4),right=TRUE.
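The interval logic above can be checked directly with base R's cut on the four occasion times. This is only a sketch: the object names (times, help_bins, short_bins, fixed_bins) are made up for illustration, and it calls cut itself rather than any RMark function.

```r
# Times of the four gamma parameters in the robust example
times <- 1:4

# Breaks from the help file: left-closed bins [1,2) and [2,5)
help_bins <- cut(times, breaks = c(1, 2, 5), right = FALSE)

# Using 4 as the upper break with right=FALSE leaves time 4 unbinned (NA),
# which is why the upper break has to exceed the last time
short_bins <- cut(times, breaks = c(1, 2, 4), right = FALSE)

# Corrected breaks: right-closed bins (0,1], (1,2] and (2,4]
fixed_bins <- cut(times, breaks = c(0, 1, 2, 4), right = TRUE)

print(data.frame(time = times, help_bins, short_bins, fixed_bins))
```

The first column of bins has two levels (1 and 2-3-4), while the corrected breaks give the three levels (1, 2 and 3-4) that the help text describes.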
With regard to your question 3, this was an example of what happens when you don’t specify the same time.bins for gamma prime and gamma double prime. That could have been made clearer in the text. To do it correctly, the same time.bins should have been used for both. With regard to question 4, you would use the time values in your design data (e.g. 1, 57, etc.).
You don’t need to use the time.bins with design.parameters. If you use the process.data and make.design.data steps, you can use any R commands like cut to create the bins or a new time bin field in the design data as you see fit.
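As a rough sketch of creating a bin field by hand when the time values are uneven, the snippet here uses a mock data frame in place of real design data. The names gdp and time_bin are hypothetical, the time values are taken from the design data posted above, and the choice to pool the last two times is only an illustration of the original question. It assumes, as in RMark design data, that time is stored as a factor.

```r
# Mock stand-in for the GammaDoublePrime design data (hypothetical name)
gdp <- data.frame(time = factor(c(1, 57, 73, 337, 385)))

# Recover the actual time values from the factor labels
time_values <- as.numeric(as.character(gdp$time))

# Breaks are on the scale of the time values; the last bin (73,385]
# pools times 337 and 385 so they share one level
gdp$time_bin <- cut(time_values, breaks = c(0, 1, 57, 73, 385), right = TRUE)

print(gdp)
```

The same breaks would need to be applied to the GammaPrime design data so that the two parameters are binned identically.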
--Jeff

Post by stellatus » Thu Mar 10, 2011 8:03 am

Thanks once again for your help Jeff, it's making more sense now. I see that the add.design.data function in RMark is using this "cut" function too.

After thinking about it more (and re-reading the "more complex example" in 15.6.2 of C&W) I realised that putting more realistic constraints on my gammas would allow me to avoid confounding while also using time variables which make more biological sense.

I have six primary periods spanning two summer seasons, with three periods in the first summer, and three in the second. There are thus two short time intervals (20-40 days), followed by a long interval (~300 days), then two more short intervals. I made a new variable (time_season) by assigning new values to Time (1,1,2,1,1) then converting it to a factor. This will reduce the number of time parameters and take into account any differences between within- and among-season S, G" and G'. Given that S1,S2,S4 and S5 (and corresponding gammas) are all constant in time_season, I think all the parameters will be identifiable.
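One possible way to build a variable like time_season is a simple lookup from each time value to its season code. This is a sketch on a mock data frame, not the code actually used: gdp and season_lookup are hypothetical names, and the time values are those shown in the design data in this thread.

```r
# Mock stand-in for the first ten rows of the GammaDoublePrime design data
gdp <- data.frame(time = factor(c(1, 57, 73, 337, 385, 57, 73, 337, 385, 73)))

# Within-season intervals get 1; the long between-season interval (time 73) gets 2
season_lookup <- c("1" = 1, "57" = 1, "73" = 2, "337" = 1, "385" = 1)

# Index the lookup by the factor labels, drop the names, then convert to a factor
gdp$time_season <- factor(unname(season_lookup[as.character(gdp$time)]))

print(gdp)
```

A lookup is used here because the pattern 1,1,2,1,1 is not a set of contiguous intervals, so it cannot be produced by a single call to cut.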

My second season had much higher rainfall than the first and was thus a much "better" season (higher capture rates etc.), so I also wanted a time variable that would reflect this. I used "cut" as you suggested to bin the default time variable for S, G" and G' into three bins (I called it time_rfall):

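# as.numeric() on the factor "time" returns its level codes (1 to 5), not the day values
mdd1$GammaDoublePrime$time_rfall <- as.numeric(mdd1$GammaDoublePrime$time)
# bins (0,2], (2,3] and (3,5] on those codes, labelled "1", "2" and "3"
mdd1$GammaDoublePrime$time_rfall <- cut(mdd1$GammaDoublePrime$time_rfall, breaks=c(0,2,3,5), labels=c("1","2","3"))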


I had to change it first to numeric because cut only works on numeric data (or I could have just used Time instead of time). I imagine I could have also used add.design.data for the same purpose? So time_rfall will reflect any differences between the two seasons as well as any difference for the long time interval.
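One detail worth spelling out: as.numeric applied to a factor returns the integer level codes, not the labels, which is why breaks of c(0,2,3,5) work on time here. The sketch uses hypothetical object names and the time values from this thread to compare the two scales. Binning on the actual time values (as a numeric field like Time would hold, offset by one in this data) would need breaks on that scale instead.

```r
# time as stored in the design data: a factor whose labels are the time values
time <- factor(c(1, 57, 73, 337, 385))

codes  <- as.numeric(time)                # level codes: 1 2 3 4 5
values <- as.numeric(as.character(time))  # actual values: 1 57 73 337 385

# Same three bins expressed on each scale
rfall_codes  <- cut(codes,  breaks = c(0, 2, 3, 5),     labels = c("1", "2", "3"))
rfall_values <- cut(values, breaks = c(0, 57, 73, 385), labels = c("1", "2", "3"))

print(data.frame(time, codes, values, rfall_codes, rfall_values))
```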

Thanks so much,

Annabel.

Here are the first few rows of design data for GammaDoublePrime:

     group cohort age time Cohort Age Time   fire sex time_season time_rfall
1   EarlyF      1   0    1      0   0    0  Early   F           1          1
2   EarlyF      1  56   57      0  56   56  Early   F           1          1
3   EarlyF      1  72   73      0  72   72  Early   F           2          2
4   EarlyF      1 336  337      0 336  336  Early   F           1          3
5   EarlyF      1 384  385      0 384  384  Early   F           1          3
6   EarlyF     57   0   57     56   0   56  Early   F           1          1
7   EarlyF     57  16   73     56  16   72  Early   F           2          2
8   EarlyF     57 280  337     56 280  336  Early   F           1          3
9   EarlyF     57 328  385     56 328  384  Early   F           1          3
10  EarlyF     73   0   73     72   0   72  Early   F           2          2

Post by jlaake » Thu Mar 10, 2011 10:57 am

You have got it. You can create design data in any way you want, and often you have to use R commands to do so because what make.design.data and add.design.data offer is very limited. Your first example of 1,1,2,1,1 is a case where cut would not work, whereas it does for rfall, and there you could use add.design.data or cut directly. By using cut as you show, you can also add the labels that you want. The key is to ALWAYS look at your design data once you are done to make sure it is what you intended. As much care should be given to the design data as to your observation data.
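A quick way to look over new design covariates is to list the distinct combinations of time and each derived field. This is a minimal sketch on a mock data frame that copies the ten rows posted above; gdp and mapping are hypothetical names, and with real design data the same calls would be applied to the relevant parameter's data frame.

```r
# Mock copy of the posted GammaDoublePrime rows (time and the two new fields)
gdp <- data.frame(
  time        = factor(c(1, 57, 73, 337, 385, 57, 73, 337, 385, 73)),
  time_season = factor(c(1, 1, 2, 1, 1, 1, 2, 1, 1, 2)),
  time_rfall  = factor(c(1, 1, 2, 3, 3, 1, 2, 3, 3, 2))
)

# One row per distinct combination: each time should appear exactly once
mapping <- unique(gdp[, c("time", "time_season", "time_rfall")])
print(mapping[order(mapping$time), ], row.names = FALSE)

# Cross-tabulation: every row should have a single non-zero column
print(table(time = gdp$time, time_rfall = gdp$time_rfall))

# Fails loudly if any time value was assigned to more than one bin
stopifnot(!any(duplicated(mapping$time)))
```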

In the crm function for CJS and JS models (independent of MARK) there is no distinction between design data and observation data, which makes it less confusing and more apparent that they are equally important. But for the time being, that approach is not fully developed and only works for those 2 types of models.

--jeff