www.phidot.org

by **simone77** » Wed Nov 23, 2011 8:35 am

I would like to finish my analyses in the next few days, and it would be great if someone could check a few points I am unsure about; I hope they are of some general interest.

This is the case study summary:
1 group,
0 individual covariates,
3 states (no antibodies, antibodies, dead),
4 events (not seen, no antibodies, antibodies, unknown)
1 age class
9 occasions

This is what the pattern matrices look like:

1. At occasion 4, all captured individuals have the event "unknown" (no blood samples were taken). I am setting the corresponding state assignment probabilities to zero; is that right?

2. In some cases, some parameters are not estimable because of the model structure, and you can tell E-SURGE to exclude them from the optimization process (which may help it reach convergence). For instance, in my case I have set up this umbrella model:
{pi(t) phi(f.t) psi(f.t) beta(a(1)+a(2).[f.t]) delta(a(1).[f.t]+a(2).[f.t])}
In this multi-event context, which parameters are not estimable because of the model structure? Based on some previous trials and on how parameters are confounded in simpler models (CJS, for example), my guess is that the following parameters are not estimable (Greek letters refer to the pattern matrices above):
- final pi,
- final phi(s),
- final psi(s),
- final beta(s) (referred to a(2)),
- final delta(s) (referred to a(2)),
- first beta (referred to a(1)).
In particular, I believe that the final psi(s) and delta(s) are not estimable because the final parameters on which they are conditional (phi(s) and beta(s), respectively) are not estimable.

However, if I fix these parameters to one, E-SURGE does not return a result for that run.
What is wrong?
As these final parameters should be confounded (their products are estimable, but not each one separately), I thought that fixing each of them to 1 might be creating a conflict, so I tried fixing the final parameters to the random values they were assigned by default (the multiple random option).
With that, the model ran, but I am not at all sure it is correct.
In fact, I get a much higher deviance than for the same model with no final parameters constrained.

For that reason I have run the same models a few times with the same IVFV settings, i.e. (i) the model with those final parameters fixed to their random values, and (ii) the model with only the delta(s) at occasion 4 fixed to zero.
In case (i) I get several very different sets of parameter estimates (perhaps it is wrong to fix some of these final values), and in case (ii) I get a few very different results that I guess could be due to local minima (although the window about saddle points did not pop up).

3. Is there a suggested way to proceed with model simplification? In a normal CJS analysis I was told to model the capture rate first and then go on to model the survival rate; is there a rule of thumb for this in the multi-event context?

4. Is there a way to automatically run the same models on a different dataset (with the same number of occasions, groups, age classes, states and events), or something similar?

5. During a run, the MS-DOS window quite often shows the message "Ten first histories incompatibles with the model". Should I be worried about it?
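
On question 2, the confounding of final-occasion parameters can be illustrated with a toy single-state CJS calculation. This is only a sketch, not E-SURGE code: the function names and the counts (30 animals seen, 70 not seen at the last occasion) are invented for illustration, and the multi-event case has more parameters than this.

```python
import math

def last_occasion_probs(phi, p):
    """Toy CJS: chance of being seen / not seen at the final occasion,
    given the animal was alive at the previous occasion."""
    seen = phi * p
    return {"seen": seen, "not_seen": 1.0 - seen}

def log_lik(phi, p, n_seen, n_not_seen):
    """Log-likelihood contribution of the final interval (hypothetical counts)."""
    probs = last_occasion_probs(phi, p)
    total = 0.0
    for count, key in ((n_seen, "seen"), (n_not_seen, "not_seen")):
        if count == 0:
            continue
        if probs[key] <= 0.0:
            return -math.inf  # observed data are impossible under these values
        total += count * math.log(probs[key])
    return total

# Three different (phi, p) pairs with the same product 0.4
a = log_lik(0.8, 0.5, 30, 70)
b = log_lik(0.5, 0.8, 30, 70)
c = log_lik(1.0, 0.4, 30, 70)   # one member fixed to 1, the other free

# Both members of the pair fixed to 1
d = log_lik(1.0, 1.0, 30, 70)

print(a, b, c, d)
```

The first three values are identical because only the product phi*p enters the likelihood. With both parameters fixed to 1, any history with a non-detection at the last occasion has probability zero, which may be one reason a run with every final parameter fixed to 1 fails; in the simple CJS case, fixing only one member of the pair leaves the other free to absorb the product.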

Thanks for any help,

Simone

by **ganghis** » Wed Nov 23, 2011 1:21 pm

Hi Simone,
Here's a shot at your questions (1)-(3). Hopefully someone else can provide input on (4) and (5).

Although the theoretical number of identifiable parameters should be similar to multistate CJS models, there are often problems with estimability when fitting these complicated models to real life data. In particular, the thing I'd worry most about with your dataset is having enough information in the data to estimate all the parameters. I would actually approach this the other way - start with a simple model (time invariant, say) and make sure you can estimate everything (Remi has some good numerical tools built in to assess this), then add on additional complexity. Once you've built a model from the 'ground up' this way, you can go to the reverse direction if you'd like.

Concerning occasion 4, it depends on what you mean by 'state assignment probabilities.' The initial state parameters (pi) should still be estimated, but the deltas can be constrained to zero for the 'known' states.
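
To make the occasion 4 constraint concrete, here is a minimal sketch of an event matrix for the three states and four events listed in the first post. It assumes, purely for illustration, a single capture probability `p` and a single assignment probability `delta` shared by both live states; the actual model in the thread lets these depend on state and time, and none of these names come from E-SURGE.

```python
import math

STATES = ("no antibodies", "antibodies", "dead")
EVENTS = ("not seen", "no antibodies", "antibodies", "unknown")

def event_matrix(p, delta):
    """Rows are states, columns are events.
    p     : probability that a live animal is captured (assumed common to both live states)
    delta : probability that its state is determined, given capture
    """
    return [
        [1 - p, p * delta, 0.0, p * (1 - delta)],  # state: no antibodies
        [1 - p, 0.0, p * delta, p * (1 - delta)],  # state: antibodies
        [1.0, 0.0, 0.0, 0.0],                      # state: dead, never seen
    ]

regular = event_matrix(p=0.6, delta=0.7)    # hypothetical values
occasion4 = event_matrix(p=0.6, delta=0.0)  # no blood samples taken

for state, row in zip(STATES, occasion4):
    print(state, dict(zip(EVENTS, row)))
```

With `delta` set to zero, every captured animal falls in the "unknown" column while each row still sums to one, so the constraint only removes the assignment step and leaves capture and the initial state probabilities untouched.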

Paul

by **simone77** » Wed Nov 23, 2011 4:32 pm

Hi Paul,

Thank you for your suggestions. I will try starting from the simpler model. I guess the numerical tools of Rémi's that you mention are those described in the E-SURGE manual for controlling local minima and convergence and for checking parameter identifiability (tolerance, initial values, parameter identifiability in the output, etc.). What you suggest sounds familiar, as I have read about this way of proceeding with model selection in a paper somewhere, but it is important for me to know that it is suitable for this specific situation.

Simone

by **simone77** » Mon Nov 28, 2011 8:48 am

Hi Paul,

ganghis wrote:
> In particular, the thing I'd worry most about with your dataset is having enough information in the data to estimate all the parameters. I would actually approach this the other way - start with a simple model (time invariant, say) and make sure you can estimate everything (Remi has some good numerical tools built in to assess this), then add on additional complexity. Once you've built a model from the 'ground up' this way, you can go to the reverse direction if you'd like.

1. It would be very useful for me to understand this better.
I am starting with the time-invariant model and adding complexity, but I am not sure I understand what the target model is: is it the most heavily parameterized model for which all non-redundant parameters are identifiable?
I would greatly appreciate any more details on how to do this.
2. I know there is some transience and trap-happiness effect in females, but given that (i) the intervals are uneven, which makes modeling an age effect difficult, (ii) the data are somewhat sparse for this kind of analysis, and (iii) the c-hat = Chi2/d.f. is not very high (between 1.5 and 2.5, depending on which of the three datasets, i.e. three fences, is considered), I thought I would calculate the c-hats on the CJS model and use them in the E-SURGE analyses. Is that appropriate?
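
For reference, the c-hat calculation and the way it is usually applied to model selection can be sketched as follows. The chi-square, degrees of freedom, deviance and parameter count are hypothetical numbers, not values from these datasets, and the function names are illustrative only.

```python
def c_hat(chi2, df):
    """Variance inflation factor: GOF chi-square divided by its degrees of freedom."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return chi2 / df

def qaic(deviance, n_params, chat):
    """Quasi-likelihood AIC: deviance scaled by c-hat plus the usual penalty."""
    return deviance / chat + 2 * n_params

chat = c_hat(chi2=45.0, df=30)          # hypothetical GOF result
score = qaic(deviance=600.0, n_params=10, chat=chat)
print(chat, score)
```

The same c-hat is applied to every model fitted to a given dataset, so it changes the relative weight of fit versus parameter count but not the deviances themselves. Some authors add one to the parameter count to account for estimating c-hat; that variant is not shown here.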

Thank you for any help,

Simone

by **CHOQUET** » Mon Nov 28, 2011 9:11 am

Concerning:

> 4. Is there a way to automatically run the same models on a different dataset (with the same number of
> occasions, groups, age classes, states and events), or something similar?

No, there is no way. Moreover, I do not recommend this approach because of the presence of local minima.
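
The local minima issue can be illustrated with a toy one-parameter "deviance" that has two minima. This sketch is not how E-SURGE optimizes; the function, step size and starting values are assumptions chosen only to show why several starting points are needed and why results should be inspected for each dataset.

```python
import random

def deviance(x):
    """Toy objective with a local minimum near x = 0.96 and a lower one near x = -1.04."""
    return (x * x - 1.0) ** 2 + 0.3 * x

def gradient(x):
    return 4.0 * x * (x * x - 1.0) + 0.3

def descend(start, rate=0.01, steps=5000):
    """Plain gradient descent from one starting value."""
    x = start
    for _ in range(steps):
        x -= rate * gradient(x)
    return x

def multistart(starts):
    """Run the optimizer from every start and keep the fit with the lowest deviance."""
    fits = [descend(s) for s in starts]
    return min(fits, key=deviance)

rng = random.Random(1)
starts = [rng.uniform(-2.0, 2.0) for _ in range(10)]
best = multistart(starts)
single = descend(1.5)  # a single unlucky start

print(best, deviance(best))
print(single, deviance(single))
```

A single start on the wrong side converges cleanly but to the higher minimum, and nothing in that run alone reveals the problem; only comparing several starts does.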

> 5. During a run, the MS-DOS window quite often shows the message "Ten first histories
> incompatibles with the model". Should I be worried about it?

If the optimisation process continues, no.
If the optimisation process stops, then yes. Perhaps the model does not fit the data at all, because of
a bad transition (or a problem in the data) or a badly fixed capture-recapture parameter.

E-SURGE has the distinctive feature of always trying to detect such problems with appropriate tools.

In the same way, E-SURGE gives you a reliable rank (not based on the Hessian)
and the list of parameters that are redundant when the model is not full rank.
This is really crucial for this kind of model.
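
What "not full rank" means can be shown on a toy CJS example by counting the independent directions in which the history probabilities respond to the parameters. This is a generic Jacobian-rank sketch under simplifying assumptions (single state, hand-written history probabilities, invented function names); it is not a description of the algorithm E-SURGE uses.

```python
def jacobian(func, theta, h=1e-6):
    """Central-difference Jacobian: rows are outputs, columns are parameters."""
    cols = []
    for j in range(len(theta)):
        up = list(theta); up[j] += h
        dn = list(theta); dn[j] -= h
        cols.append([(a - b) / (2 * h) for a, b in zip(func(up), func(dn))])
    return [list(row) for row in zip(*cols)]

def rank(matrix, tol=1e-6):
    """Numerical rank by Gaussian elimination with partial pivoting."""
    rows = [row[:] for row in matrix]
    r = 0
    for c in range(len(rows[0])):
        if r == len(rows):
            break
        pivot = max(range(r, len(rows)), key=lambda i: abs(rows[i][c]))
        if abs(rows[pivot][c]) < tol:
            continue  # no usable pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def final_interval(theta):
    """Last interval only: seen / not seen depend on phi and p through their product."""
    phi, p = theta
    return [phi * p, 1 - phi * p]

def constant_model(theta):
    """Histories 11, 10, 01, 00 after release, with phi and p constant over two intervals."""
    phi, p = theta
    q = phi * p
    h11, h10, h01 = q * q, q * (1 - q), phi * (1 - p) * q
    return [h11, h10, h01, 1 - h11 - h10 - h01]

rank_final = rank(jacobian(final_interval, [0.8, 0.5]))
rank_const = rank(jacobian(constant_model, [0.8, 0.5]))
print(rank_final, rank_const)
```

The final-interval model has two parameters but rank one, so phi and p are redundant there, whereas the constant model has rank two and both parameters are separately identifiable.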

by **ganghis** » Mon Nov 28, 2011 1:12 pm

Hi Simone,
It's difficult to answer your question about starting with a simple model and adding complexity to examine parameter redundancy. Unfortunately, dealing with parameter redundancy in (potentially) complicated models for 'imperfect' datasets is where modeling becomes more of an art than a science. My approach in the past has been to slowly add in complexity in a stepwise fashion to examine which parameters tend to be difficult to estimate. For instance, you might find that the model has difficulty accommodating temporal variation in some parameter (i.e., Remi's diagnostics show the model is not full rank) no matter what the underlying model for the other parameters is. This is useful to know, because it limits the complexity of your 'umbrella' model and reduces the number of models you have to fit.

The problem with the stepwise approach is that you can end up building different umbrella models depending on the order in which you add in complexity. In general, I don't think this is much of a problem, as you can let your intuition guide you. Which parameters seem most important to add temporal variation to? This might be a function of the hypotheses you're trying to test, or your knowledge of the study design (e.g., if there are appreciable differences between disease testing rates in different intervals you know a priori that you should try to fit a model with time varying delta parameters - although you might want to address this with a covariate rather than time dependency).

There hasn't been much written on this to my knowledge, probably because the simulated datasets people work with in model selection exercises are usually well behaved. Anyone have any other opinions?

Concerning GOF, I think your approach is a reasonable one, but I haven't done any work on GOF testing in hidden Markov models. I suspect that this is a current area of research, so maybe someone else could chime in.

Paul

E-SURGE: a case study and some questions
