by abreton » Tue Mar 20, 2012 3:34 pm
All of my advice assumes that you've specified the models correctly using the interface provided by MARK. If so, then whenever you fit multi-state models to real data (which are typically sparse), you'll want to apply two 'tricks' often suggested at MARK workshops and described in the MARK Book:
(1) use estimates from simpler models as starting values in more complex models.
(2) for complex models try simulated annealing -- an alternative optimization option in program MARK.
Both of these can help you overcome the issue you're describing, in particular poor results for estimates that are close to a boundary (0 or 1). I generally start any analysis by fitting the 'dot' model, that is, a model without variation in any parameter -- for example, for a CJS model we might describe this model as phi(.) p(.). From there, if data are sparse (they nearly always are), I gradually increase model complexity, each time using the parameter estimates from the previous model as starting values for the optimization of the next model. To do this, just before you run a model, check the 'Provide Initial Parameter Estimates' option on the Setup Numerical Optimization Form -- this option allows you to pick any previously run model as the source of the estimates. Using this approach to 'pull yourself up' to the most complex model can be very useful...and it might solve your problem.
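If it helps to see the 'pull yourself up' idea outside of MARK, here is a minimal Python sketch. It is NOT MARK and not a CJS or multi-state likelihood: it assumes a toy binomial survival model with made-up counts, and the names (released, survived, fit) and the generic 0.5 default start are all hypothetical choices for illustration. The dot model is fitted first, and its estimate is then handed to the time-varying model as the starting value.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def neg_log_lik(betas, released, survived):
    """Binomial negative log-likelihood, one logit-scale parameter per occasion."""
    total = 0.0
    for beta, n, y in zip(betas, released, survived):
        p = expit(beta)
        total -= y * math.log(p) + (n - y) * math.log(1.0 - p)
    return total

def fit(start, released, survived, max_iter=100, tol=1e-10):
    """Newton-Raphson on the logit scale; returns (estimates, iterations used)."""
    betas = list(start)
    for iteration in range(1, max_iter + 1):
        biggest_step = 0.0
        for t, (n, y) in enumerate(zip(released, survived)):
            p = expit(betas[t])
            step = (y - n * p) / (n * p * (1.0 - p))  # score / information
            betas[t] += step
            biggest_step = max(biggest_step, abs(step))
        if biggest_step < tol:
            return betas, iteration
    return betas, max_iter

# Made-up toy data: animals released and known survivors on three occasions.
released = [50, 40, 30]
survived = [45, 30, 27]

# Step 1: the 'dot' model (one survival probability for all occasions).
dot_phi = sum(survived) / sum(released)

# Step 2: the time-varying model, started two ways.
default_start = [0.0] * len(released)         # 0.5 on the probability scale
warm_start = [logit(dot_phi)] * len(released)  # estimate from the dot model

cold_fit, cold_iters = fit(default_start, released, survived)
warm_fit, warm_iters = fit(warm_start, released, survived)

print("dot model phi:", round(dot_phi, 3))
print("NLL at default start:", round(neg_log_lik(default_start, released, survived), 2))
print("NLL at warm start:   ", round(neg_log_lik(warm_start, released, survived), 2))
print("phi(t) estimates:", [round(expit(b), 3) for b in warm_fit])
print("iterations (default, warm):", cold_iters, warm_iters)
```

With well-behaved toy data like these, both starts end up at the same estimates; the point is only that the warm start begins much closer to the maximum. With sparse data and estimates near a boundary, that head start is what can keep the optimizer out of trouble.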
Simulated annealing takes LONGER to run than the standard optimization option in MARK. For small datasets, time is not an issue, but for larger datasets it can take many hours or even days for the simulated annealing option to finish. I suggest you first run it on a simple model to assess how long this might take. A good approach is to let MARK run overnight for the more complex models -- watch out for Windows' annoying habit of updating and restarting on its own. Note that trick (2) potentially renders trick (1) not very useful, because simulated annealing jumps around the (multi-dimensional) likelihood surface in order to avoid climbing a false summit (a local maximum). Imagine that you provide starting values: MARK starts searching for the maximum near this point on the surface, but then jumps to a different location...and then jumps again, etc. This decreases the usefulness of the starting values.
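To picture the 'jumping around', here is a generic simulated annealing sketch in Python. This is a textbook-style version on a made-up one-dimensional surface, not the algorithm as implemented in MARK; the surface, the cooling schedule, and every name and setting in it are assumptions for illustration. The surface has a false summit near -2 and the true summit near +2, and the search is started once right on the false summit and once somewhere else.

```python
import math
import random

def surface(x):
    """Toy stand-in for a likelihood surface: false summit near -2, true one near +2."""
    return 0.5 * math.exp(-(x + 2.0) ** 2) + 1.0 * math.exp(-(x - 2.0) ** 2)

def anneal(start, seed, steps=5000, temp=2.0, cooling=0.999, step_sd=1.0):
    """Maximize surface() by simulated annealing; returns (best_x, best_value)."""
    rng = random.Random(seed)
    x, fx = start, surface(start)
    best_x, best_f = x, fx
    for _ in range(steps):
        # Propose a random jump, kept inside the search interval [-5, 5].
        candidate = min(5.0, max(-5.0, x + rng.gauss(0.0, step_sd)))
        f_candidate = surface(candidate)
        # Always accept uphill moves; accept downhill moves with a probability
        # that shrinks as the temperature falls.
        if f_candidate >= fx or rng.random() < math.exp((f_candidate - fx) / temp):
            x, fx = candidate, f_candidate
            if fx > best_f:
                best_x, best_f = x, fx
        temp *= cooling  # cool down a little after every proposal
    return best_x, best_f

# Start once on the false summit and once far from it.
results = {start: anneal(start, seed=1) for start in (-2.0, 4.0)}
for start, (best_x, best_f) in results.items():
    print(f"start {start:+.1f} -> best x {best_x:.2f}, value {best_f:.3f}")
```

Early on, while the temperature is high, downhill moves are accepted often, so the search wanders away from wherever it started. That is what lets it escape the false summit, and also why the starting values carry much less weight than they do with a standard hill-climbing optimizer.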
I suggest that you use key words from my post to search the 'entire [MARK] book' which you've obviously been reading. If you search, e.g., on 'simulated annealing' you'll find many helpful sections of text. Similarly, search for 'starting values', also 'estimates on a boundary' or just 'boundary' perhaps. Note -- for a novice you've done well getting this far with multi-state models!
andre