Closed population with covariates: problems with "p" and "N"

Postby samastete » Wed Apr 27, 2011 2:45 pm

Estimation of a closed population with covariates: problems with “p” and “N”

Dear forum members


I’m currently using MARK to estimate the abundance (N) of a closed population. Previous analyses with CAPTURE suggest that model Mh is the best estimator for this population. To improve the analysis, and based on my observations, I’m using two covariates for “p” and “c” (sex and time) and one covariate for “N” (sex). The resulting candidate models in MARK (almost 60) look like the following:

p(sex*t)c(sex+t)N(sex)
p(t)c(sex)N(.)
p(sex)c(.)N(sex)
etc.

All the models were created using the Full Design (Matrix), modifying the most complex model {p(sex*t)c(sex*t)N(sex)} to create the simpler ones. It wasn’t possible to do this with the PIM Chart because of the difficulty of dealing with interactions (e.g. “p(sex*t)”).
After looking at the results, I found two problems:

1 – Problems with “p”.

Please take a quick look at the last “Pi” values (17:p and 34:p) in the following two summaries of the parameter estimates.

FUNCTION A (17 Capture Events)
Real Function Parameters of {p(t)c(sex)N(sex)}
                                             95% Confidence Interval
Parameter  Estimate         Standard Error   Lower            Upper
---------  ---------------  ---------------  ---------------  ---------------
1:p        0.1999999        0.1264911        0.0504114        0.5407150
2:p        0.1408542E-063   0.2355231E-061   -0.4602168E-061  0.4630339E-061
(...)
17:p       1.0000000        0.0000000        1.0000000        1.0000000
18:p       0.1999999        0.1264911        0.0504114        0.5407150
19:p       0.1408542E-063   0.2355231E-061   -0.4602168E-061  0.4630339E-061
(...)
34:p       1.0000000        0.0000000        1.0000000        1.0000000
(...)
66:c       0.2333333        0.0772202        0.1155105        0.4149549
67:N       7.0000000        0.0000000        7.0000000        7.0000000
68:N       3.0000000        0.0000000        3.0000000        3.0000000


FUNCTION B (17 Capture Events)
Real Function Parameters of {p(sex)c(sex)N(.)}
                                             95% Confidence Interval
Parameter  Estimate         Standard Error   Lower            Upper
---------  ---------------  ---------------  ---------------  ---------------
1:p        0.1794872        0.0614507        0.0880518        0.3313705
2:p        0.1794872        0.0614507        0.0880518        0.3313705
(...)
17:p       0.1794872        0.0614507        0.0880518        0.3313705
18:p       0.1428570        0.0763603        0.0467958        0.3613554
(...)
34:p       0.1428570        0.0763603        0.0467958        0.3613554
35:c       0.3000000        0.0512348        0.2099066        0.4087569
(...)
66:c       0.2333333        0.0772202        0.1155105        0.4149548
67:N       7.0000000        0.1107337E-003   7.0000000        7.0000683
68:N       3.0000000        0.1107337E-003   3.0000000        3.0000683


The “Gentle Introduction Manual” (p. 530-532, “14.3.1 constraining the final p”) and the 2008 White article (Closed population estimation models and their extensions in Program MARK, page 4) both warn that if no constraint is imposed on the last “Pi”, the estimated abundance “N” will simply be M(t+1), with the last “Pi” estimate equal to 1. As you can see, this happens in FUNCTION A and in every model that includes “time” in “p” (see the values of 17:p and 34:p). When I impose a constraint, such as including “sex” in “p” or making it constant, as in FUNCTION B, the last “p” takes a value other than 1.

My question is: are models with the last p = 1 appropriate for estimating abundance? If not, is there a way to correct or modify them so that the last “Pi” is not 1? Is this error caused by a mistake of mine while building the Full Design matrix, or by some other mistake?


2 – Problem with “N”.

Please also look at the abundance estimates “N” in FUNCTION A and B (67:N and 68:N). FUNCTION A includes N(sex), so two Ns are expected, one for males and one for females (67:N and 68:N). To create FUNCTION B, with N(.), I deleted the column corresponding to sex in the Full Design Matrix, leaving only the intercept for N, to make it constant. Nonetheless, I still get two estimates of N (67:N and 68:N). Why don’t the real parameter estimates reflect this modification? Curiously, when I work with the PIM Chart and make N constant, N(.), the real parameter estimates show only one “N”, as expected. Why does this not work with the Full Design Matrix?

Thank you very much for any help, and please excuse the length of this message.

Cordially,
Samuel
samastete
 
Posts: 3
Joined: Tue Mar 15, 2011 4:11 pm

Re: Closed population with covariates: problems with "p" and

Postby cooch » Wed Apr 27, 2011 8:37 pm

samastete wrote:Estimation of a closed population with covariates: problems with “p” and “N”

Dear forum members


I’m currently using MARK to estimate the abundance (N) of a closed population. Previous analyses with CAPTURE suggest that model Mh is the best estimator for this population. To improve the analysis, and based on my observations, I’m using two covariates for “p” and “c” (sex and time) and one covariate for “N” (sex). The resulting candidate models in MARK (almost 60) look like the following:

p(sex*t)c(sex+t)N(sex)
p(t)c(sex)N(.)
p(sex)c(.)N(sex)
etc.



60 models? You're data dredging (i.e., trying all possible models). Given that your main interest is in estimating N, I suppose you can argue this is reasonable. But it's still probably too many models -- I'd suggest thinking harder about what is and is not biologically plausible.

samastete wrote: All the models were created using the Full Design (Matrix), modifying the most complex model {p(sex*t)c(sex*t)N(sex)} to create the simpler ones. It wasn’t possible to do this with the PIM Chart because of the difficulty of dealing with interactions (e.g. “p(sex*t)”).


No, models with interactions can be built with PIMs -- the only models you can't build with PIMs are those that have additive effects, or where you want to constrain estimates to be a linear function of one or more covariates.

samastete wrote: After looking at the results, I found two problems:

1 – Problems with “p”.

Please take a quick look at the last “Pi” values (17:p and 34:p) in the following two summaries of the parameter estimates.


I assume you mean P(i) -- the ith indexed value of p -- rather than Pi, which is a parameter estimated in finite mixture models.

samastete wrote: The “Gentle Introduction Manual” (p. 530-532, “14.3.1 constraining the final p”) and the 2008 White article (Closed population estimation models and their extensions in Program MARK, page 4) both warn that if no constraint is imposed on the last “Pi”, the estimated abundance “N” will simply be M(t+1), with the last “Pi” estimate equal to 1.


Correct -- the rationale is explained in 'algebraic detail' in section 14.3.1 (as an aside, p. 530-532 doesn't mean anything -- chapters are all 'internally numbered', meaning each chapter begins with page 1. So, in referring to anything in the book, either note the section or subsection, or the page number(s) within the chapter -- upper right-hand corner of each page).

samastete wrote: As you can see, this happens in FUNCTION A and in every model that includes “time” in “p” (see the values of 17:p and 34:p). When I impose a constraint, such as including “sex” in “p” or making it constant, as in FUNCTION B, the last “p” takes a value other than 1.


As it should.

samastete wrote: My question is: are models with the last p = 1 appropriate for estimating abundance?


You mean models where the final p=1 because you don't impose a constraint? Then obviously no -- since the 'estimate' isn't an estimate, it's simply M(t+1).
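
To see why an unconstrained final p gives N = M(t+1), here is a minimal Python sketch, not MARK's own code. It assumes a model with a separate p for every occasion and separate c parameters, so that only first captures inform p and N. The first-capture counts `u` are hypothetical, and the function names are made up for illustration. For each candidate N it plugs in the best-fitting p(i) and evaluates the log-likelihood; the maximum lands on the smallest admissible N, which is M(t+1), and the last p is then exactly 1.

```python
import math


def xlogy(x, y):
    """Return x * log(y), using the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def profile_loglik_time_varying_p(N, u):
    """Profile log-likelihood of N when every p(i) is free (no constraint).

    u[i] is the number of animals first caught on occasion i.
    Recaptures only inform c in this kind of model, so they are left out.
    """
    ll = 0.0
    marked = 0  # M(i): animals already marked before occasion i
    for u_i in u:
        at_risk = N - marked  # unmarked animals still available
        p_hat = u_i / at_risk if at_risk > 0 else 0.0  # best p(i) for this N
        # log of the binomial coefficient C(at_risk, u_i)
        ll += (math.lgamma(at_risk + 1) - math.lgamma(u_i + 1)
               - math.lgamma(at_risk - u_i + 1))
        ll += xlogy(u_i, p_hat) + xlogy(at_risk - u_i, 1.0 - p_hat)
        marked += u_i
    return ll


# Hypothetical first-capture counts over 5 occasions (not the poster's data).
u = [2, 0, 3, 1, 4]
m_t1 = sum(u)  # M(t+1): number of distinct animals caught

candidates = range(m_t1, m_t1 + 51)
best_N = max(candidates, key=lambda N: profile_loglik_time_varying_p(N, u))
p_last = u[-1] / (best_N - sum(u[:-1]))

print(best_N, m_t1, p_last)
```

The point of the design is that nothing in the data can separate "the last p is high" from "few animals remain uncaught" unless the last p is tied to something else.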

samastete wrote: If not, is there a way to correct or modify them so that the last “Pi” is not 1? Is this error caused by a mistake of mine while building the Full Design matrix, or by some other mistake?


I have no idea what you're asking here.


samastete wrote: 2 – Problem with “N”.

Please also look at the abundance estimates “N” in FUNCTION A and B (67:N and 68:N). FUNCTION A includes N(sex), so two Ns are expected, one for males and one for females (67:N and 68:N). To create FUNCTION B, with N(.), I deleted the column corresponding to sex in the Full Design Matrix, leaving only the intercept for N, to make it constant. Nonetheless, I still get two estimates of N (67:N and 68:N). Why don’t the real parameter estimates reflect this modification? Curiously, when I work with the PIM Chart and make N constant, N(.), the real parameter estimates show only one “N”, as expected. Why does this not work with the Full Design Matrix?


Sounds like you don't really understand design matrices. Go back and thoroughly study chapters 6 -> 7 (6 for basics of linear models, 7 for the TSM ultrastructure which shows up a lot in closed abundance DM's), and then start over. Seriously.
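
A possible reading of the two N rows, sketched in Python under an assumption that is not stated in this thread: that MARK models abundance through f0 = N - M(t+1) on a log scale, so the design matrix acts on f0 rather than on N itself. The helper name and the beta values are hypothetical; the M(t+1) values of 7 and 3 are read off the N estimates of FUNCTION A, where the last p is 1. With a single intercept column both groups share one f0, yet the output still lists one N per group because each group has its own M(t+1).

```python
import math


def real_N_from_design_matrix(X, beta, m_t1):
    """Return one N per design-matrix row: N_g = M(t+1)_g + exp(X_g . beta).

    ASSUMPTION (not from the thread): abundance enters the model as
    f0 = N - M(t+1) with a log link.
    """
    estimates = []
    for row, m in zip(X, m_t1):
        linear_predictor = sum(x * b for x, b in zip(row, beta))
        f0 = math.exp(linear_predictor)  # animals never caught, always >= 0
        estimates.append(m + f0)
    return estimates


# Animals actually caught per sex, read off FUNCTION A (67:N and 68:N).
m_t1 = [7, 3]

# N(sex): intercept plus a sex column (1 = male, 0 = female) -> two betas.
X_sex = [[1, 1],
         [1, 0]]
beta_sex = [math.log(0.5), math.log(4.0)]  # hypothetical values

# N(.): sex column deleted -> a single beta shared by both rows.
X_dot = [[1],
         [1]]
beta_dot = [math.log(0.5)]  # hypothetical value

N_sex = real_N_from_design_matrix(X_sex, beta_sex, m_t1)
N_dot = real_N_from_design_matrix(X_dot, beta_dot, m_t1)

# Under N(.) the two rows share f0 but still differ by M(t+1).
print(N_sex, N_dot)
```

If this assumption holds, identical standard errors on 67:N and 68:N, as in FUNCTION B, are what a shared f0 would produce.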
cooch
 
Posts: 1654
Joined: Thu May 15, 2003 4:11 pm
Location: Cornell University

Re: Closed population with covariates: problems with "p" and

Postby samastete » Mon May 09, 2011 3:00 pm

.
Last edited by samastete on Mon May 09, 2011 3:13 pm, edited 1 time in total.
samastete
 
Posts: 3
Joined: Tue Mar 15, 2011 4:11 pm

Re: Closed population with covariates: problems with "p" and

Postby samastete » Mon May 09, 2011 3:12 pm

Dear forum members,

Excuse me for the delay in answering. Cooch, thank you very much for your comments and observations. I have some answers to them; to make them easier to read, I'm quoting them in the text. I have also included some images, which I hope will make my questions easier to understand. I hope they will be useful for other new members like me.

Cordially,

Samuel



cooch wrote:
samastete wrote:Estimation of a closed population with covariates: problems with “p” and “N”

Dear forum members


I’m currently using MARK to estimate the abundance (N) of a closed population. Previous analyses with CAPTURE suggest that model Mh is the best estimator for this population. To improve the analysis, and based on my observations, I’m using two covariates for “p” and “c” (sex and time) and one covariate for “N” (sex). The resulting candidate models in MARK (almost 60) look like the following:

p(sex*t)c(sex+t)N(sex)
p(t)c(sex)N(.)
p(sex)c(.)N(sex)
etc.


cooch wrote: 60 models? You're data dredging (i.e., trying all possible models). Given that your main interest is in estimating N, I suppose you can argue this is reasonable. But it's still probably too many models -- I'd suggest thinking harder about what is and is not biologically plausible.


Yes, because I'm new to the MARK universe, I tried to be very inclusive in my models. I also have the biological information to discard most of the candidate models. As I said, I prefer to discard models using a posteriori criteria rather than be left wondering whether I have omitted one. At present, I'm working with a dozen models, which are the most biologically plausible.




samastete wrote: All the models were created using the Full Design (Matrix), modifying the most complex model {p(sex*t)c(sex*t)N(sex)} to create the simpler ones. It wasn’t possible to do this with the PIM Chart because of the difficulty of dealing with interactions (e.g. “p(sex*t)”).


cooch wrote: No, models with interactions can be built with PIMs -- the only models you can't build with PIMs are those that have additive effects, or where you want to constrain estimates to be a linear function of one or more covariates.


You are right, thank you!




samastete wrote: After looking at the results, I found two problems:

1 – Problems with “p”.

Please take a quick look at the last “Pi” values (17:p and 34:p) in the following two summaries of the parameter estimates.


cooch wrote: I assume you mean P(i) -- the ith indexed value of p -- rather than Pi, which is a parameter estimated in finite mixture models.

Yes! It's P(i) or, more simply, Pi, as written in the Gentle Introduction book. I just discovered the HTML option to format text!

samastete wrote: The “Gentle Introduction Manual” (p. 530-532, “14.3.1 constraining the final p”) and the 2008 White article (Closed population estimation models and their extensions in Program MARK, page 4) both warn that if no constraint is imposed on the last “Pi”, the estimated abundance “N” will simply be M(t+1), with the last “Pi” estimate equal to 1.


cooch wrote: Correct -- the rationale is explained in 'algebraic detail' in section 14.3.1 (as an aside, p. 530-532 doesn't mean anything -- chapters are all 'internally numbered', meaning each chapter begins with page 1. So, in referring to anything in the book, either note the section or subsection, or the page number(s) within the chapter -- upper right-hand corner of each page).

samastete wrote: As you can see, this happens in FUNCTION A and in every model that includes “time” in “p” (see the values of 17:p and 34:p). When I impose a constraint, such as including “sex” in “p” or making it constant, as in FUNCTION B, the last “p” takes a value other than 1.


cooch wrote: As it should.

samastete wrote: My question is: are models with the last p = 1 appropriate for estimating abundance?


cooch wrote: You mean models where the final p=1 because you don't impose a constraint? Then obviously no -- since the 'estimate' isn't an estimate, it's simply M(t+1).

Every candidate model can be used to calculate an estimate of N. But apparently, as you explained, the models with p = 1 simply return M(t+1), which is far from being the most adequate estimate of N. So they are not useful for finding the most appropriate estimate of N, OK. They could be useful for other goals, but apparently that is not the case here. Is that right?


samastete wrote: If not, is there a way to correct or modify them so that the last “Pi” is not 1? Is this error caused by a mistake of mine while building the Full Design matrix, or by some other mistake?


cooch wrote: I have no idea what you're asking here.

Forgive my English. My question is whether there is a way to modify (or correct) a model whose last Pi value equals 1, in order to avoid this result. If there is no such way, and if this is not the consequence of an error on my part (i.e. it is in the nature of the model), then I will eliminate those models from the analysis. Thank you!
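
One way to illustrate the constraint discussed in section 14.3.1 is the sketch below (Python, hypothetical data, not MARK's implementation). It assumes the simplest possible constraint, a single p shared by all occasions with c modelled separately, and made-up first-capture counts `u` that decline over time. Under that assumption the profile log-likelihood has an interior maximum, so the estimate of N is larger than M(t+1).

```python
import math


def profile_loglik_constant_p(N, u):
    """Profile log-likelihood of N when one p is shared by all occasions.

    u[i] is the number of animals first caught on occasion i.
    Terms that do not depend on N (the u! factorials) are dropped.
    """
    total = sum(u)  # M(t+1): distinct animals caught
    marked_before = 0
    at_risk_sum = 0  # sum over occasions of unmarked animals still available
    for u_i in u:
        at_risk_sum += N - marked_before
        marked_before += u_i
    p_hat = total / at_risk_sum  # best shared p for this value of N
    ll = math.lgamma(N + 1) - math.lgamma(N - total + 1)
    ll += total * math.log(p_hat)
    ll += (at_risk_sum - total) * math.log(1.0 - p_hat)
    return ll


# Hypothetical, declining first-capture counts (not the poster's data).
u = [30, 20, 14, 9, 6]
m_t1 = sum(u)

candidates = range(m_t1, 301)
best_N = max(candidates, key=lambda N: profile_loglik_constant_p(N, u))
p_hat = m_t1 / sum(best_N - sum(u[:i]) for i in range(len(u)))

print(m_t1, best_N, round(p_hat, 3))
```

Whether a given constraint is biologically sensible is a separate question; the sketch only shows that tying the last p to the earlier ones is what makes N estimable.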

samastete wrote: 2 – Problem with “N”.

Please also look at the abundance estimates “N” in FUNCTION A and B (67:N and 68:N). FUNCTION A includes N(sex), so two Ns are expected, one for males and one for females (67:N and 68:N). To create FUNCTION B, with N(.), I deleted the column corresponding to sex in the Full Design Matrix, leaving only the intercept for N, to make it constant. Nonetheless, I still get two estimates of N (67:N and 68:N). Why don’t the real parameter estimates reflect this modification? Curiously, when I work with the PIM Chart and make N constant, N(.), the real parameter estimates show only one “N”, as expected. Why does this not work with the Full Design Matrix?


cooch wrote: Sounds like you don't really understand design matrices. Go back and thoroughly study chapters 6 -> 7 (6 for basics of linear models, 7 for the TSM ultrastructure which shows up a lot in closed abundance DM's), and then start over. Seriously.

Once again, please forgive my English; maybe I did not explain my doubt clearly. To make it easier, I will include some screenshots from MARK. As I told you, I built the models in the design matrix, with 17 capture events. Let's see how they look in the matrix, along with their real parameter estimates.

Here is the first one, which is easy to understand, and its parameter estimates.

p(sex)c(.)N(sex)


Image
http://www.flickr.com/photos/45975405@N08/5703960583/in/photostream/

You can identify the intercept and the binary values for sex (1 male, 0 female).

And here are the real parameter estimates:


Image
http://www.flickr.com/photos/45975405@N08/5704527068/in/photostream

In the red circle you can see the two different estimates, corresponding to the male and female populations. Fine! Then I modified the matrix in order to create the model

p(sex)c(.)N(.)


Image
http://www.flickr.com/photos/45975405@N08/5704526810/in/photostream

Fine! N is constant, N(.). And here are the real parameter estimates:

Image
http://www.flickr.com/photos/45975405@N08/5703960755/in/photostream

Wait! What happened? I expected to get only one value, or at least a constant one (meaning 67:N and 68:N would have the same value). But here the two have different values. This is not coherent with a matrix built for p(sex)c(.)N(.)! But the most intriguing part comes when I use the PIM Chart and modify it to build the same model,

p(sex)c(.)N(.)

See how it looks:


Image
http://www.flickr.com/photos/45975405@N08/5703960635/in/photostream

And, as I said, here comes the most intriguing part, the real parameter estimates:

Image
http://www.flickr.com/photos/45975405@N08/5704526910/in/photostream

In this case, I get only one value for the overall (constant) population, 7, which coincidentally is the value for the male population in the model that includes N(sex). So, as you can guess, my question is... what happened? Did I do something wrong with the matrix? Why do I get two different values of N(estimate) for model p(sex)c(.)N(.) when I expected only one?

Thanks to all of you for your time and patience.

Cordially,

Samuel

samastete
 
Posts: 3
Joined: Tue Mar 15, 2011 4:11 pm

