Page 1 of 1

Dummy Covariates in RMark

PostPosted: Tue Dec 03, 2013 4:06 pm
by TGrant
I have an unusual covariate structure that I am trying to implement in RMark. I surveyed wetlands for frogs and sometimes the wetlands were dry or were created/destroyed by floods. Dummy covariates were created for present/not-present wetlands, essentially creating an intercept that is site-specific - present when the wetland exists and not present when the wetland is dry or destroyed. It runs in PRESENCE and MARK and I'm trying to match up the estimates from RMark with those estimates to make sure I'm doing it right before I run a bunch of models.

Here is the R code (A means a wetland was non-existent, B means it existed. AA means it transitioned from dry to dry, AB from dry to wet, etc.. This is a robust design occupancy study with 3 primary occasions.):

Code: Select all
covs = c("AA1","AB1","BB1","BA1","AA2","AB2","BB2","BA2","Day1","Day2","Day3","Day4","Day5","Day6","Day7","Day8","Day9",
         "Day10","Day11","Day12","Time1","Time2","Time3","Time4","Time5","Time6","Time7","Time8","Time9","Time10","Time11",
         "Time12","AirT1","AirT2","AirT3","AirT4","AirT5","AirT6","AirT7","AirT8","AirT9","AirT10","AirT11","AirT12",
         "MS1","MS2","MS3","MS4","MS5","MS6","MS7","MS8","MS9","MS10","MS11","MS12","WatT1","WatT2","WatT3","WatT4","WatT5",
         "WatT6","WatT7","WatT8","WatT9","WatT10","WatT11","WatT12","Size10","DistWet10","Slope10","EmerVeg10","BareGr10",
         "GrHerb10","Slope12","Size12","DistWet12","EmerVeg12","BareGr12","GrHerb12","Size13","DistWet13","Slope13",
         "EmerVeg13","BareGr13","GrHerb13")

LIBL.inp <- convert.inp("LIBL.inp", covariates = covs, use.comments = TRUE)

#add dummy covars for Psi and p
LIBL.inp[,"A1"]=apply(LIBL.inp[,c("AA1","AB1")],1,sum)
LIBL.inp[,"B1"]=apply(LIBL.inp[,c("BA1","BB1")],1,sum)
LIBL.inp[,"A2"]=apply(LIBL.inp[,c("AA1","BA1")],1,sum)
LIBL.inp[,"B2"]=apply(LIBL.inp[,c("AB1","BB1")],1,sum)
LIBL.inp[,"A3"]=apply(LIBL.inp[,c("AA2","BA2")],1,sum)
LIBL.inp[,"B3"]=apply(LIBL.inp[,c("AB2","BB2")],1,sum)

LIBL.process=process.data(LIBL.inp, model="RDOccupEG", time.intervals=c(0,0,1,0,0,1,0,0,0,0,0))
names(LIBL.process)

LIBL.ddl=make.design.data(LIBL.process)
#the following code makes 3 columns for the 3 primary occasions - probably not necessary because 'session' will do this
LIBL.ddl$p$pA=0
LIBL.ddl$p$pA[1:12]=1
LIBL.ddl$p$pB=0
LIBL.ddl$p$pB[4:6]=1
LIBL.ddl$p$pC=0
LIBL.ddl$p$pC[7:12]=1
LIBL.ddl

fit.LIBL.models=function()
{
  # Define Psi models
  Psi.dot=list(formula=~-1+B1+A1) 
  # Define Epsilon models
  Epsilon.dot=list(formula=~-1+AB+AA+BB+BA)
  # Define Gamma models
  Gamma.dot=list(formula=~-1+AB+AA+BB+BA)
  # Create model list
  # Define p models
  #p.session=list(formula=~session)
  #p.year=list(formula=~-1+session*B)
  #p.year=list(formula=~-1+session:B)
  p.year=list(formula=~-1+session:B+session:A)
  #p.year=list(formula=~-1+B1+B2+B3)
  #p.yearother=list(formula=~-1+pA:B1+pB:B2+pC:B3)
  cml=create.model.list("RDOccupEG")
  # Run and return marklist of models
  return(mark.wrapper(cml,data=LIBL.process,ddl=LIBL.ddl))
}

LIBL.models=fit.LIBL.models()
LIBL.models$Psi.dot.Epsilon.dot.Gamma.dot.p.year$design.matrix

#Need Parameter Index from unsimplified PIM to tell RMark which covariate to estimate
PIMS(LIBL.models[[1]],"p", simplified=F)
PIMS(LIBL.models[[1]],"Psi", simplified=F)
PIMS(LIBL.models[[1]],"Epsilon", simplified=F)
PIMS(LIBL.models[[1]],"Gamma", simplified=F)

cov.df=data.frame(B1=c(1,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0),  #B1
                   B2=c(0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0),  #B2
                   B3=c(0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1),  #B3
                   A1=c(rep(0,17)),  #A1
                   A2=c(rep(0,17)),  #A2
                   A3=c(rep(0,17)),  #A3
                   AB1=c(rep(0,17)),  #AB1
                   AB2=c(rep(0,17)),  #AB2
                   AA1=c(rep(0,17)),  #AA1
                   AA2=c(rep(0,17)),  #AA2
                   BB1=c(0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0),  #BB1
                   BB2=c(0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0),  #BB2
                   BA1=c(rep(0,17)),  #BA1
                   BA2=c(rep(0,17))  #BA2
                   )
cov.df$index=c(1:17)

Model.ests=covariate.predictions(LIBL.models[[1]], data=cov.df)


The design matrix looks the same as the one from MARK:

Code: Select all
> LIBL.models$Psi.dot.Epsilon.dot.Gamma.dot.p.year$design.matrix
                 Psi:B1 Psi:A1 Epsilon:AB Epsilon:AA Epsilon:BB Epsilon:BA Gamma:AB Gamma:AA Gamma:BB Gamma:BA p:session1:B
Psi g1 a0 t1     "B1"   "A1"   "0"        "0"        "0"        "0"        "0"      "0"      "0"      "0"      "0"         
Epsilon g1 a0 t1 "0"    "0"    "AB1"      "AA1"      "BB1"      "BA1"      "0"      "0"      "0"      "0"      "0"         
Epsilon g1 a1 t2 "0"    "0"    "AB2"      "AA2"      "BB2"      "BA2"      "0"      "0"      "0"      "0"      "0"         
Gamma g1 a0 t1   "0"    "0"    "0"        "0"        "0"        "0"        "AB1"    "AA1"    "BB1"    "BA1"    "0"         
Gamma g1 a1 t2   "0"    "0"    "0"        "0"        "0"        "0"        "AB2"    "AA2"    "BB2"    "BA2"    "0"         
p g1 s1 t1       "0"    "0"    "0"        "0"        "0"        "0"        "0"      "0"      "0"      "0"      "B1"       
p g1 s2 t1       "0"    "0"    "0"        "0"        "0"        "0"        "0"      "0"      "0"      "0"      "0"         
p g1 s3 t1       "0"    "0"    "0"        "0"        "0"        "0"        "0"      "0"      "0"      "0"      "0"         
                 p:session2:B p:session3:B p:session1:A p:session2:A p:session3:A
Psi g1 a0 t1     "0"          "0"          "0"          "0"          "0"         
Epsilon g1 a0 t1 "0"          "0"          "0"          "0"          "0"         
Epsilon g1 a1 t2 "0"          "0"          "0"          "0"          "0"         
Gamma g1 a0 t1   "0"          "0"          "0"          "0"          "0"         
Gamma g1 a1 t2   "0"          "0"          "0"          "0"          "0"         
p g1 s1 t1       "0"          "0"          "A1"         "0"          "0"         
p g1 s2 t1       "B2"         "0"          "0"          "A2"         "0"         
p g1 s3 t1       "0"          "B3"         "0"          "0"          "A3"       


However in trying to define what covariates to estimate, it isn't giving quite the same answers as MARK. I'm looking at estimates where BB1 = BB2 = 1, and so B1=B2=B3=1, and everything else is zero.

Here is the covariate dataframe from the code above and the estimates.

Code: Select all
> cov.df
   B1 B2 B3 A1 A2 A3 AB1 AB2 AA1 AA2 BB1 BB2 BA1 BA2 index
1   1  0  0  0  0  0   0   0   0   0   0   0   0   0     1
2   0  0  0  0  0  0   0   0   0   0   1   0   0   0     2
3   0  0  0  0  0  0   0   0   0   0   0   1   0   0     3
4   0  0  0  0  0  0   0   0   0   0   1   0   0   0     4
5   0  0  0  0  0  0   0   0   0   0   0   1   0   0     5
6   1  0  0  0  0  0   0   0   0   0   0   0   0   0     6
7   1  0  0  0  0  0   0   0   0   0   0   0   0   0     7
8   1  0  0  0  0  0   0   0   0   0   0   0   0   0     8
9   0  1  0  0  0  0   0   0   0   0   0   0   0   0     9
10  0  1  0  0  0  0   0   0   0   0   0   0   0   0    10
11  0  1  0  0  0  0   0   0   0   0   0   0   0   0    11
12  0  0  1  0  0  0   0   0   0   0   0   0   0   0    12
13  0  0  1  0  0  0   0   0   0   0   0   0   0   0    13
14  0  0  1  0  0  0   0   0   0   0   0   0   0   0    14
15  0  0  1  0  0  0   0   0   0   0   0   0   0   0    15
16  0  0  1  0  0  0   0   0   0   0   0   0   0   0    16
17  0  0  1  0  0  0   0   0   0   0   0   0   0   0    17

> Model.ests
$estimates
   vcv.index model.index par.index B1 B2 B3 A1 A2 A3 AB1 AB2 AA1 AA2 BB1 BB2 BA1 BA2 index  estimate         se       lcl       ucl
1          1           1         1  1  0  0  0  0  0   0   0   0   0   0   0   0   0     1 0.7432198 0.06528550 0.5968312 0.8498294
2          2           2         2  0  0  0  0  0  0   0   0   0   0   1   0   0   0     2 0.2548359 0.08059615 0.1295595 0.4400128
3          3           3         3  0  0  0  0  0  0   0   0   0   0   0   1   0   0     3 0.2548359 0.08059615 0.1295595 0.4400128
4          4           4         4  0  0  0  0  0  0   0   0   0   0   1   0   0   0     4 0.3359001 0.10056867 0.1728934 0.5503341
5          5           5         5  0  0  0  0  0  0   0   0   0   0   0   1   0   0     5 0.3359001 0.10056867 0.1728934 0.5503341
6          6           6         6  1  0  0  0  0  0   0   0   0   0   0   0   0   0     6 0.7010555 0.04846991 0.5984545 0.7867808
7          6           6         7  1  0  0  0  0  0   0   0   0   0   0   0   0   0     7 0.7010555 0.04846991 0.5984545 0.7867808
8          6           6         8  1  0  0  0  0  0   0   0   0   0   0   0   0   0     8 0.7010555 0.04846991 0.5984545 0.7867808
9          7           7         9  0  1  0  0  0  0   0   0   0   0   0   0   0   0     9 0.3447109 0.07262784 0.2188193 0.4969548
10         7           7        10  0  1  0  0  0  0   0   0   0   0   0   0   0   0    10 0.3447109 0.07262784 0.2188193 0.4969548
11         7           7        11  0  1  0  0  0  0   0   0   0   0   0   0   0   0    11 0.3447109 0.07262784 0.2188193 0.4969548
12         8           8        12  0  0  1  0  0  0   0   0   0   0   0   0   0   0    12 0.4750208 0.00000000 0.4750208 0.4750208
13         8           8        13  0  0  1  0  0  0   0   0   0   0   0   0   0   0    13 0.4750208 0.00000000 0.4750208 0.4750208
14         8           8        14  0  0  1  0  0  0   0   0   0   0   0   0   0   0    14 0.4750208 0.00000000 0.4750208 0.4750208
15         8           8        15  0  0  1  0  0  0   0   0   0   0   0   0   0   0    15 0.4750208 0.00000000 0.4750208 0.4750208
16         8           8        16  0  0  1  0  0  0   0   0   0   0   0   0   0   0    16 0.4750208 0.00000000 0.4750208 0.4750208
17         8           8        17  0  0  1  0  0  0   0   0   0   0   0   0   0   0    17 0.4750208 0.00000000 0.4750208 0.4750208


First of all there is a zero SE for the 3rd session p's. Secondly the estimates are close but not the same as the MARK estimates:

Parameter Estimate Standard Error Lower Upper
-------------------------- -------------- -------------- -------------- --------------
1:Psi 0.7445108 0.0655467 0.5972798 0.8513148
2:Epsilon 0.2161603 0.0782260 0.1003726 0.4053367
3:Epsilon 0.2161603 0.0782260 0.1003726 0.4053367
4:Gamma 0.3336947 0.1000520 0.1717176 0.5474715
5:Gamma 0.3336947 0.1000520 0.1717176 0.5474715
6:p Session 1 0.6998424 0.0488543 0.5964537 0.7862360
7:p Session 1 0.6998424 0.0488543 0.5964537 0.7862360
8:p Session 1 0.6998424 0.0488543 0.5964537 0.7862360
9:p Session 2 0.4069499 0.0500476 0.3136654 0.5074648
10:p Session 2 0.4069499 0.0500476 0.3136654 0.5074648
11:p Session 2 0.4069499 0.0500476 0.3136654 0.5074648
12:p Session 3 0.4601467 0.0463455 0.3715848 0.5512989
13:p Session 3 0.4601467 0.0463455 0.3715848 0.5512989
14:p Session 3 0.4601467 0.0463455 0.3715848 0.5512989
15:p Session 3 0.4601467 0.0463455 0.3715848 0.5512989
16:p Session 3 0.4601467 0.0463455 0.3715848 0.5512989
17:p Session 3 0.4601467 0.0463455 0.3715848 0.5512989

Have I defined cov.df correctly? Why is there a zero SE and why are the estimates not exactly the same? Thanks in advance for the help,

Tyler Grant

Re: Dummy Covariates in RMark

PostPosted: Wed Dec 04, 2013 9:50 am
by jlaake
I contacted Tyler off-list. What he encountered was a bug that Jonathon Cohen reported to me 3 weeks ago. In some unusual cases if you have many time-varying individual covariates and you use the same covariates in more than one formula (ie more than one parameter), unfortunately RMark could get the covariate names out of sync with the covariate data when it was passed to MARK. As you can imagine that is not a good thing. Version 2.1.7 corrects that problem and when Tyler updated the code his problem went away. I'll be posting v2.1.7 to CRAN after some more testing for changes made in MARK for the MultScalOcc model. In the meantime, you can get that version at https://docs.google.com/folder/d/0B77g1 ... YtWE0/edit
My apologies for any problems this may have caused.

regards --jeff

Re: Dummy Covariates in RMark

PostPosted: Thu Dec 05, 2013 12:10 pm
by TGrant
I'm having trouble getting RMark to represent some design matrices correctly. Using the simple formula p.AirT=list(formula=~-1+B+AirT) gives this design matrix:

Code: Select all
> LIBL.models$Psi.dot.Epsilon.dot.Gamma.dot.p.AirT$design.matrix
                 Psi:B1 Psi:A1 Epsilon:AB Epsilon:AA Epsilon:BB Epsilon:BA Gamma:AB Gamma:AA Gamma:BB Gamma:BA p:B  p:AirT
Psi g1 a0 t1     "B1"   "A1"   "0"        "0"        "0"        "0"        "0"      "0"      "0"      "0"      "0"  "0"   
Epsilon g1 a0 t1 "0"    "0"    "AB1"      "AA1"      "BB1"      "BA1"      "0"      "0"      "0"      "0"      "0"  "0"   
Epsilon g1 a1 t2 "0"    "0"    "AB2"      "AA2"      "BB2"      "BA2"      "0"      "0"      "0"      "0"      "0"  "0"   
Gamma g1 a0 t1   "0"    "0"    "0"        "0"        "0"        "0"        "AB1"    "AA1"    "BB1"    "BA1"    "0"  "0"   
Gamma g1 a1 t2   "0"    "0"    "0"        "0"        "0"        "0"        "AB2"    "AA2"    "BB2"    "BA2"    "0"  "0"   
p g1 s1 t1       "0"    "0"    "0"        "0"        "0"        "0"        "0"      "0"      "0"      "0"      "B1" "AirT1"
p g1 s2 t1       "0"    "0"    "0"        "0"        "0"        "0"        "0"      "0"      "0"      "0"      "B2" "AirT2"
p g1 s3 t1       "0"    "0"    "0"        "0"        "0"        "0"        "0"      "0"      "0"      "0"      "B3" "AirT3"


It's only using the first 3 occasions of the covariate. The design matrix should look like this (the column is labeled wrong):

https://drive.google.com/file/d/0B7p4Ri ... sp=sharing

The Julian date model is similar but slightly more complicated because I need a separate beta for each year.

https://drive.google.com/file/d/0B7p4Ri ... sp=sharing

If these links to my screenshots don't work let me know.

Thanks.

Re: Dummy Covariates in RMark

PostPosted: Thu Dec 05, 2013 12:45 pm
by jlaake
Tyler-

When I originally designed the individual time-varying covariates robust models were not in RMark so it was only designed for time variation. Then I added the use of both session-time covariates. p is a secondary parameter so time is labelled by the 3 occasions in the primary occasion. I need to add a way to specify when session should be used over time. As a fix, rename time to say xtime (or whatever) and then rename the session variable to time in the p design data. That should work but let me know.

This is an area where the documentation needs to be improved - for my sake as well.

--jeff

Re: Dummy Covariates in RMark

PostPosted: Tue Dec 10, 2013 3:54 pm
by TGrant
Jeff, that's not working for some reason.

I tried this for the p design data but no luck either:

Code: Select all
$p
   par.index model.index group xtime time Time pA pB pC
1          1           6     1     1    1    0  1  0  0
2          2           7     1     2    2    1  1  0  0
3          3           8     1     3    3    2  1  0  0
4          4           9     1     1    4    0  1  1  0
5          5          10     1     2    5    1  1  1  0
6          6          11     1     3    6    2  1  1  0
7          7          12     1     1    7    0  1  0  1
8          8          13     1     2    8    1  1  0  1
9          9          14     1     3    9    2  1  0  1
10        10          15     1     4   10    3  1  0  1
11        11          16     1     5   11    4  1  0  1
12        12          17     1     6   12    5  1  0  1