www.phidot.org

by **jheishma** » Fri Oct 14, 2022 2:15 pm

Hello, I have searched but have not been able to find an explicit solution to my question; please direct me towards any references that may be applicable here.

I am seeking to develop a CJS model from capture-recapture data for which individual "spawn year" is known. I am interested in estimating parameters of annual survival and encounter probability for these cohorts. My method and data are represented in spirit in this code chunk:

Code: Select all: # capture events represent annual detection results spanning 1993 - 2022 ch_df <- data.frame(ch = c(000000001100000000000000000000, 000000000000000000010011000000, 000000000000000000000000001010), spawn_year = c(1998, 2008, 2015), init.age = c(4, 5, 5))

Here is my first point of uncertainty: in general, what are the differences to be expected when generating parameter estimates by using a grouping variable such as spawn_year vs translating spawn_year into a measure of age-at-encounters, as I understand can be accomplished within the "initial.ages" argument in process.data. I should clarify that I am interpretting the "initial.ages" to be "the age when the individual was first encountered", which corresponds to the first "1" in my capture histories. Correct me if this is incorrect, here.

-------

I have been proceeding to calculate the age-at-initial-encounter from the data above when processing data, as below:

Code: Select all: dat.pr <- process.data(ch_df, initial_ages = c(4 , 5)) dat.ddl <- make.design.data(dat.pr) # Define parameters Phi.age <- list(formula = ~ age) p.dot <- list(formula = ~ 1) # Construct models results <- mark.wrapper(create.model.list("CJS", data = ch_df, ddl = dat.ddl) # output in question results$Phi.age.p.dot$results$real

My largest question is relating to the interpretation of this output. It appears as though I am not generating parameter estimates for age-structured survival, as the output looks something like this, where x-variables represent unique annual estimates:

Phi g1 c1 a0 t1 x1
Phi g1 c1 a1 t2 x2
...
Phi g1 c1 a28 t29 x29
p g1 c1 a1 t2 x30

This doesn't appear to be generating estimates of survival-at-age, so much as it resembles a model of time-structured survival estimates. I will note that the annual estimates do vary from those generated from a corresponding Phi ~ time model, although I am not certain why. Am I making some error in my code to indicate the variable use for assigning known age-at-encounters? Or am I making an error in my interpretation of parameter estimates?

by **jlaake** » Fri Oct 14, 2022 3:02 pm

First of all thank you for creating a simple reproducible example. This makes it a lot easier to answer the question. Although that must not be the code you used because it gives a fair number of errors. Please try your code before posting in a message. The argument is initial.ages and not initial_ages. Also you need to add "" around ch values to make them character values and a missing parenthesis.

Age models are discussed in section 6.5 of the workshop notes. A link to a documentation zip with the workshop notes and other info is given every time you type library(RMark). Unfortunately this is often over-looked.

You are correct that initial.ages value is the age at first capture/release but what you are missing is that this only works if you define a group variable that defines all of the animals with a particular age at initial release. Right now your initial.ages variable is setting the initial age for all critters at 4 because there is only one group. See your example below

Code: Select all: ch_df <- data.frame(ch = c("000000001100000000000000000000", "000000000000000000010011000000", "000000000000000000000000001010"), spawn_year = c(1998, 2008, 2015), init.age = c(4, 5, 5)) str(ch_df) dat.pr <- process.data(ch_df, initial.ages = c(4 , 5)) dat.ddl <- make.design.data(dat.pr)

When you look at the first few records of the Phi design data you see the age values start at 4

Code: Select all: > head(dat.ddl$Phi) par.index model.index group cohort age time occ.cohort Cohort Age Time 1 1 1 1 1 4 1 1 0 4 0 2 2 2 1 1 5 2 1 0 5 1 3 3 3 1 1 6 3 1 0 6 2 4 4 4 1 1 7 4 1 0 7 3 5 5 5 1 1 8 5 1 0 8 4 6 6 6 1 1 9 6 1 0 9 5

Using your example, this is what you would want to do. Note that init.age is numeric and it just warns you that it is using it as a factor variable. No biggie but group variables must be categorical (factor) variables. Below I show that the initial age is 4 for group 4 and 5 for group 5. In your actual data you would want to have a group for each initial age value in your data.

Code: Select all: > dat.pr <- process.data(ch_df, groups="init.age", age.var=1, initial.ages = c(4 , 5)) Warning message: In process.data(ch_df, groups = "init.age", age.var = 1, initial.ages = c(4, : init.age is not a factor variable. Coercing to factor. > dat.ddl <- make.design.data(dat.pr) > head(dat.ddl$Phi) par.index model.index group cohort age time occ.cohort Cohort Age Time init.age 1 1 1 4 1 4 1 1 0 4 0 4 2 2 2 4 1 5 2 1 0 5 1 4 3 3 3 4 1 6 3 1 0 6 2 4 4 4 4 4 1 7 4 1 0 7 3 4 5 5 5 4 1 8 5 1 0 8 4 4 6 6 6 4 1 9 6 1 0 9 5 4 > head(dat.ddl$Phi[dat.ddl$Phi$group=="5",]) par.index model.index group cohort age time occ.cohort Cohort Age Time init.age 436 436 436 5 1 5 1 1 0 5 0 5 437 437 437 5 1 6 2 1 0 6 1 5 438 438 438 5 1 7 3 1 0 7 2 5 439 439 439 5 1 8 4 1 0 8 3 5 440 440 440 5 1 9 5 1 0 9 4 5 441 441 441 5 1 10 6 1 0 10 5 5

by **jheishma** » Fri Oct 21, 2022 3:41 pm

Thanks for your attentive reply, and my apologies for the syntax errors that I included in the sample code. I have been working through the documentation you referenced, but I have a few lingering follow-up questions regarding the output which I received after adapting the code you corrected.

Code: Select all: # example code from this thread library(RMark) ch_df <- data.frame(ch = c("000000001100000000000000000000", "000000000000000000010011000000", "000000000000000000000000001010"), spawn_year = c(1998, 2008, 2015), init.age = c(4, 5, 5)) str(ch_df) dat.pr <- process.data(ch_df, initial.ages = c(4 , 5)) dat.ddl <- make.design.data(dat.pr) Phi.age <- list(formula = ~ age) p.dot <- list(formula = ~ 1) # Construct competing models results <- mark.wrapper(create.model.list("CJS"), data = dat.pr, ddl = dat.ddl) # View parameter estimates results$Phi.age.p.dot$results$real

Upon running this code, I get an output with parameter labels that look a bit like this:

Phi g1 c1 a4 t1
Phi g1 c1 a5 t2
...
Phi g1 c1 a32 t29
p g1 c1 a5 t2

Initially, I am confused about the interpretation here, as it appears that I have only generated survival estimates for a single cohort (the cohort that is 4 years old at time t1), which omits the differential survival of those individuals which were born in 2008 or 2015 (t16, t23). Am I misinterpreting these labels?

------------------

Following this, I have another parameter label interpretation question from the output of a separate dataset I am working on which followed the sample code's process for generating design data with exception in the initial.ages argument shown here:

Code: Select all: dat.pr <- process.data(ch_df, groups = "init.age", age.var = 1, initial.ages = c(0:13))

Phi g0 c1 a0 t1
Phi g0 c1 a1 t2
...
Phi g0 c1 a28 t29
Phi g1 c1 a29 t29
Phi g2 c1 a30 t29
...
Phi g13 c1 a41 t29
p g0 c1 a1 t2

What is going on here? I reviewed Ch. 7 of the Gentle Introduction looking for clues in why the groups are confounding my age/time question listed above here, but haven't found any hints yet. I will appreciate greatly any references where I can review these parameter assignments. Thanks for your patience.

by **jlaake** » Fri Oct 21, 2022 4:42 pm

From section 6.10 of the workshop notes on PIM simplification

RMark maintains the original indices with the simplified indices for model averaging real parameters.
As the models change, the number and indices of the simplified parameters change to match the model
complexity, but the underlying original typically all-different indices remain the same. Because the simplified
indices differ across models they would not be useful for averaging across models. The original non-simplified
indices are also in model.index in the design data. The one side-effect of simplification is that the real
parameter labels in the output file produced by MARK are not useful. However, labels in the RMark
summaries can be useful although the pim formatted output provided by summary is typically more useful.
The real labels are pasted values of group (g), cohort (c), time (t), and age (a).

The real parameter labels were originally setup with the all-different PIM structure when RMark was first created. It became apparent fairly quickly with large data sets that the design matrix and number of real parameters could become so large that the variance-covariance matrix of the real parameters required too much memory to run on most computers. Thus I had to devise PIM simplification which reduces the design matrix to the unique rows of the design matrix and then devises new simpler PIMs with the unique PIM values. This is done without any concern for the row labels (real parameter labels) of the design matrix which are carried forward with the real parameters. Thus those labels are useless and should be ignored. I probably should remove them but have never done so. The upside of PIM simplification beyond enabling larger problem size is that models will run much faster due to the smaller number of rows in the design matrix.

The correct real parameter labels are given with the summary function as in:

Code: Select all: summary(results$Phi.age.p.dot,se=TRUE)$reals$Phi

You can explore PIM structure with the PIMS function as in

Code: Select all: > PIMS(results$Phi.age.p.dot,parameter="Phi") group = Group 1 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 1 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 2 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 3 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 4 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 5 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 6 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 7 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 8 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 9 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 10 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 11 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 12 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 13 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 14 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 15 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 1 2 3 4 5 6 7 8 9 10 11 12 13 14 17 1 2 3 4 5 6 7 8 9 10 11 12 13 18 1 2 3 4 5 6 7 8 9 10 11 12 19 1 2 3 4 5 6 7 8 9 10 11 20 1 2 3 4 5 6 7 8 9 10 21 1 2 3 4 5 6 7 8 9 22 1 2 3 4 5 6 7 8 23 1 2 3 4 5 6 7 24 1 2 3 4 5 6 25 1 2 3 4 5 26 1 2 3 4 27 1 2 3 28 1 2 29 1 > PIMS(results$Phi.age.p.dot,parameter="Phi",simplified=FALSE) group = Group 1 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 1 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 2 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 3 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 4 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 5 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 6 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 7 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 8 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 9 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 10 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 11 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 12 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 13 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 14 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 15 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 16 331 332 333 334 335 336 337 338 339 340 341 342 343 344 17 345 346 347 348 349 350 351 352 353 354 355 356 357 18 358 359 360 361 362 363 364 365 366 367 368 369 19 370 371 372 373 374 375 376 377 378 379 380 20 381 382 383 384 385 386 387 388 389 390 21 391 392 393 394 395 396 397 398 399 22 400 401 402 403 404 405 406 407 23 408 409 410 411 412 413 414 24 415 416 417 418 419 420 25 421 422 423 424 425 26 426 427 428 429 27 430 431 432 28 433 434 29 435 >

You can see the design matrix with the simplified PIMS with

Code: Select all: results$Phi.age.p.dot$design.matrix > nrow(results$Phi.age.p.dot$design.matrix) [1] 30

There are 30 rows in the simplified design matrix. If it was not simplified there would have been 436 rows.

Note that with the exception of the RMark appendix of the Gentle Introduction, there won't be useful information in it to answer questions about RMark output and labelling etc. Look to the RMark Appendix of the Gentle Introduction (which is somewhat dated) and preferably the Workshop notes that I mentioned in the last post. I recommend a very careful read of the entire Workshop notes before you get too far into using RMark. If you are using RMark the Gentle Introduction will be most useful for understanding PIMS, design matrices (LinearModels.pdf in RMark Documentation zip also useful here) and the multitude of chapters describing the various models in MARK.

by **jlaake** » Fri Oct 21, 2022 4:52 pm

Wasn't thinking quite correctly in my discussion of number of rows of the design matrix. With PIM simplification there are 30 rows and without simplification there would have been 870 rows of the full design matrix because there would be 435 Phi PIM values and 435 p PIM values for this example. Without simplification the variance-covariance matrix of the real parameters would be 870 by 870 (756,900 elements) and with simplification it is 30 by 30 (900 elements). This illustrates the benefits of PIM simplification fairly clearly.

www.phidot.org

Age-specific survival parameters

Age-specific survival parameters

Re: Age-specific survival parameters

Re: Age-specific survival parameters

Re: Age-specific survival parameters

Re: Age-specific survival parameters

Who is online