Naming time-varying individual covariates

Post by jlaake » Wed Dec 11, 2013 8:59 pm

The recent interchange with Tyler on phidot made me realize that there may be a fair amount of confusion among users about time-varying individual covariates. Heck, I even forgot what the code does, and I haven't been good about keeping up with the documentation. I need an Evan Cooch for RMark.

Time-varying individual covariates were all simple prior to adding Robust Design (RD) models. With RD, the covariates can align with sessions (primary), times (secondary), or session-time values.

Here is the logic the code uses when it encounters a variable name in the formula that is not in the design data. I use cov as the variable name used in the formula. If there is a variable called cov in the data, then it skips everything below because the covariate is static (not time-varying).

1) First it looks in the data (where the individual covariates are stored) for variables named covst, where cov is the base name of the variable in the formula, s is the session and t is the time. If all of those variables are available in the data, then it assumes that cov depends on both session and time.

2) If not, then it looks for covs in the data, and if it finds all of those, it assumes the covariate is session dependent.

3) If any of those are missing, then it constructs the names covt, and if all of those are available, it assumes time dependence.

4) If none of the above then the code whines at you that it can't find the variable you used. Something like:
Error in make.mark.model(data.proc, title = title, parameters = model.parameters, :

Error: Variable AirT used in formula is not defined in data
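
The lookup order above can be sketched in a few lines of base R. This is only an illustration of the logic as described, not the actual RMark code; the function name resolve_covariate and the small ddl data frame are made up for the example.

```r
# Sketch of the lookup order described above; NOT RMark's actual code.
# `ddl` stands in for one parameter's design data with `session` and `time` columns.
resolve_covariate <- function(cov, data_names, ddl) {
  # a variable with the bare name is treated as static
  if (cov %in% data_names) return("static")
  # 1) session-time names, e.g. AirT11, AirT12, ...
  session_time <- unique(paste0(cov, ddl$session, ddl$time))
  if (all(session_time %in% data_names)) return("session-time")
  # 2) session names, e.g. AirT1, AirT2, ...
  by_session <- unique(paste0(cov, ddl$session))
  if (all(by_session %in% data_names)) return("session")
  # 3) time names, e.g. AirT1, AirT2, AirT3, ...
  by_time <- unique(paste0(cov, ddl$time))
  if (all(by_time %in% data_names)) return("time")
  # 4) nothing matched
  stop("Variable ", cov, " used in formula is not defined in data")
}

# Hypothetical design data: 2 sessions with 2 and 3 secondary occasions
ddl <- data.frame(session = c(1, 1, 2, 2, 2), time = c(1, 2, 1, 2, 3))

resolve_covariate("AirT", c("AirT11", "AirT12", "AirT21", "AirT22", "AirT23"), ddl)
resolve_covariate("AirT", c("AirT1", "AirT2"), ddl)
# Names intended as time dependent still match session first
resolve_covariate("AirT", c("AirT1", "AirT2", "AirT3"), ddl)
```

The last call returns "session" even though three time-named covariates were supplied, because the session names are checked before the time names.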

What I may need to add is a way to force the use of time when #sessions<=#times. In that case, the problem is that because the code first tries to match the covariate to session, there will be a covariate name for each session if the sequence of session numbers is the same as the times (which is what happens with begin.time=1, the default). I guess I could add code to see whether there are more covariates than the # of sessions, but that will still fall down if #sessions=#times.

Some suggestions for this are:
1) change the value of begin.time (used to label sessions in RD model) (see example below)
2) use the covst (session-time) naming approach, but then some of your covariate values are redundant.
3) modify session values in your ddl so they don't include time values and rename covariates so they match time and not session. This is what setting begin.time does.

I'm not entirely sure how to get around this problem other than with the suggestions above, and I'm also not sure how often it happens that you have a covariate that is constant across secondary sampling occasions. Feedback? I could add another attribute for a formula that specifies which variable names are time, session or session-time dependent. I'll give this some more thought if folks think this would be useful.

Here is an example of Tyler's design data when the default begin.time=1 is used and when begin.time=0 is used. With begin.time=0, session numbers are different from times and the code won't get confused, because the covariates will have to be named cov0,cov1,cov2 to be session dependent and cov1,cov2,...,cov6 to be time dependent.

> LIBL.ddl$p
   par.index model.index group time session Time pA pB pC
1          1           6     1    1       1    0  1  0  0
2          2           7     1    2       1    1  1  0  0
3          3           8     1    3       1    2  1  0  0
4          4           9     1    1       2    0  1  1  0
5          5          10     1    2       2    1  1  1  0
6          6          11     1    3       2    2  1  1  0
7          7          12     1    1       3    0  1  0  1
8          8          13     1    2       3    1  1  0  1
9          9          14     1    3       3    2  1  0  1
10        10          15     1    4       3    3  1  0  1
11        11          16     1    5       3    4  1  0  1
12        12          17     1    6       3    5  1  0  1
> LIBL.process=process.data(LIBL.inp, model="RDOccupEG", time.intervals=c(0,0,1,0,0,1,0,0,0,0,0),begin.time=0)
> LIBL.ddl=make.design.data(LIBL.process)
> LIBL.ddl$p
   par.index model.index group time session Time
1          1           6     1    1       0    0
2          2           7     1    2       0    1
3          3           8     1    3       0    2
4          4           9     1    1       1    0
5          5          10     1    2       1    1
6          6          11     1    3       1    2
7          7          12     1    1       2    0
8          8          13     1    2       2    1
9          9          14     1    3       2    2
10        10          15     1    4       2    3
11        11          16     1    5       2    4
12        12          17     1    6       2    5
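
As a rough check on why begin.time=0 helps, here is a small base R sketch. It assumes sessions are labelled begin.time, begin.time+1, ... and that times within a session are labelled 1, 2, ..., as in the design data above; needed_names is a made-up helper, not an RMark function.

```r
# Sketch only: session labels start at begin.time; time labels start at 1,
# matching the design data printed above.
n_sessions <- 3
max_times  <- 6   # secondary occasions in the longest session

needed_names <- function(cov, begin.time) {
  list(session = paste0(cov, begin.time + seq_len(n_sessions) - 1),
       time    = paste0(cov, seq_len(max_times)))
}

supplied <- paste0("cov", 1:6)   # user intends time dependence

# begin.time = 1: cov1..cov3 are all supplied, so the session match succeeds first
session_match_bt1 <- all(needed_names("cov", 1)$session %in% supplied)
# begin.time = 0: cov0 is missing, so the session match fails ...
session_match_bt0 <- all(needed_names("cov", 0)$session %in% supplied)
# ... and the time names cov1..cov6 are matched instead
time_match_bt0 <- all(needed_names("cov", 0)$time %in% supplied)

c(session_match_bt1, session_match_bt0, time_match_bt0)
```

With these assumptions the three checks come out TRUE, FALSE, TRUE, which is the behaviour described above.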



regards --jeff

Re: Naming time-varying individual covariates

Post by TGrant » Tue Aug 05, 2014 3:20 pm

I ran into this issue with a new analysis I am working on. I am using the robust design occupancy model where frog life stages are considered "seasons", i.e., season 1 is calling frogs, season 2 is tadpoles, and season 3 is metamorphs (baby frogs just changed from tadpoles). Using the model this way allows estimation of how many sites successfully produce metamorphs compared to how many sites male frogs are attempting to breed. However, modeling of p for each "season" takes completely different covariates and other considerations.

For example, I want to model p in season 1 as a function of air temperature, but I don't have air temp measurements for seasons 2 and 3. Similarly, I want to model detection of metamorphs as a function of shoreline vegetation, but shoreline vegetation has nothing to do with p of calling frogs and tadpoles.

What I've done is create covariates that are entirely 0's. So AirT11, AirT12... have my air temp measurements, but AirT21, AirT22... and AirT31, AirT32... are all 0's. I would expect/hope this would give me correct estimates. I tested this in MARK proper and it had no effect on parameter estimates or AICc, so it seems to me to be a valid workaround.
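
Setting up the zero-filled columns could look something like this in base R. The site values and the layout of 2 secondary occasions per season are made up for illustration; only the AirTst naming pattern comes from the description above.

```r
# Hypothetical layout: 3 seasons with 2 secondary occasions each
occasions <- c(2, 2, 2)

# Made-up site-level data; AirT was measured in season 1 only
sites <- data.frame(AirT11 = c(12.5, 15.0), AirT12 = c(13.0, 16.5))

# Add all-zero covariates for seasons 2 and 3 so every AirTst name exists
for (s in 2:3) {
  for (t in seq_len(occasions[s])) {
    sites[[paste0("AirT", s, t)]] <- 0
  }
}

names(sites)
```

This only takes care of the naming; whether 0 is a sensible value for the other seasons depends on the rest of the model.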

Tyler

Re: Naming time-varying individual covariates

Post by jlaake » Tue Aug 05, 2014 6:24 pm

Tyler-

Be careful with that. When you are using an individual covariate, it is a numeric variable (thus a slope). By setting the other values to 0, you are using the intercept of the relationship. If you did such a thing and did NOT use a separate value for each season (i.e., put season in the model), then you are using the intercept value from season 1 for the other 2 seasons, which is the predicted value at an air temp of 0. If you were to subtract the mean value from the numeric covariate, then a covariate value of 0 would correspond to the average air temp in season 1, and that would certainly make more sense than using an air temp of 0.
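
A small numeric sketch of that point, assuming a logit link for p. The coefficients and air temps are made-up numbers, not estimates from any data.

```r
# Hypothetical coefficients on the logit scale and made-up season 1 air temps
b0 <- -0.5
b1 <- 0.1
airT <- c(10, 14, 18)

# Uncentered covariate: a zero-filled value gives the prediction at air temp 0
p_zero <- plogis(b0 + b1 * 0)

# Centered covariate: the same line has intercept b0 + b1 * mean(airT),
# so a covariate value of 0 now means "average season 1 air temp"
airT_c <- airT - mean(airT)
b0_c <- b0 + b1 * mean(airT)
p_zero_centered <- plogis(b0_c + b1 * 0)

# The centered prediction at 0 equals the uncentered prediction at the mean
all.equal(p_zero_centered, plogis(b0 + b1 * mean(airT)))
```

So with centering, the seasons that carry 0's are assigned the p predicted at the average air temp rather than the p predicted at an air temp of 0.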

--jeff

Re: Naming time-varying individual covariates

Post by TGrant » Tue Aug 05, 2014 9:02 pm

I'm not sure what is happening. When I tested this in MARK, I used a species with 2 primary occasions (I couldn't use tadpoles so it was just calling frogs and metamorphs). I used a session-varying covariate: wetland size. I wanted to model p2, the det prob of metamorphs, as a function of wetland size. I created a covariate entirely of 0's which I called pWetSize. The DM's for the 3 models I ran in MARK (condensed - there were 6 secondary occs in total) are below:

Model 1
p1 1
p2 1 WetSize

Model 2
p1 1 pWetSize
p2 1 WetSize

Model 3
p1 1 pWetSize
p2 1 0 WetSize

All 3 models gave the exact same real parameter estimates for p1 and p2, the same cov vs. parm graph for p2, and the same AICc values. Beta estimates were the same except there was a beta for pWetSize in Model 3, but again the reals were the same. Based on that I was proceeding as I described before, but your post is making me doubt the validity of this. I'm not sure how they reconcile.

Re: Naming time-varying individual covariates

Post by jlaake » Thu Aug 07, 2014 11:00 am

Tyler-

I didn't fully understand your design matrices because it looked like you had both a WetSize and a pWetSize. Those could be valid models if they make sense for your data. For example, with Model 1 you have the same intercept for p1 and p2. Does it make sense that p1 is the same as p2 with WetSize=0? Because that is what the model describes. I was saying that it might make more sense to have

p1 1 1 0
p2 1 0 WetSize

Re: Naming time-varying individual covariates

Post by TGrant » Mon Aug 11, 2014 6:36 pm

No, this model doesn't make sense:

Model 1
p1 1
p2 1 WetSize

In my haste to figure this out I forgot to put in the basic additive effect for p2.

To be clear to any other readers, here is how I am modeling it. This is for 3 primary occasions/sessions, with a covariate that is constant during a primary occasion/session:

p1 1 0 0 ptWetSize
p2 1 1 0 ptWetSize
p3 1 0 1 ptWetSize

The covariate ptWetSize is a covariate of tadpole (p2) detection probability, so for p1 and p3 it is all 0's. The formula in RMark is simply ~session+ptWetSize. The covariate in the input file looks like this:

ptWetSize1 ptWetSize2 ptWetSize3
0 data 0
0 data 0
etc

And for the time-varying covariates it looks something like this:

MS11 MS12 MS13 MS21 MS22 MS31 MS32
data data 0 0 0 0 0
etc
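
For anyone who wants to see how a formula like ~session+ptWetSize expands into the design matrix rows above, base R's model.matrix gives the same column layout. This is only an analogy run outside of RMark, and the covariate values are made up.

```r
# One row per session for a single hypothetical site;
# ptWetSize is 0 for sessions 1 and 3 and has a made-up value for session 2
dm_data <- data.frame(session = factor(1:3), ptWetSize = c(0, 2.5, 0))

# Columns: intercept, session 2 offset, session 3 offset, covariate
dm <- model.matrix(~ session + ptWetSize, dm_data)
dm
```

The second and third columns are the additive session effects for p2 and p3, and the last column carries the covariate, which only contributes for session 2 because it is 0 elsewhere.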

Hopefully I'm clear on this now and it will be clear to others as well. It's a pretty simple fix to implement.

Tyler