www.phidot.org

by **GDistiller** » Tue Aug 31, 2010 8:37 am

I am trying to fit a model using MCMC and am getting an error message that looks as follows:

run-time error
M6203: MATH
-exp: OVERFLOW error
Image Pc Routine Line Source
MARK.exe 00683129 Unknown Unknown Unknown
MARK.exe 00682F87 Unknown Unknown Unknown
etc etc

It is a fairly simple model: a single-state model fitted to both recovery and recapture data for 32 occasions. Two independent hyperdistributions are specified (one for survival and one for recapture) and one chain is run. I have parameterised the model with 1's, 0's and -1's so that the intercept is the overall mean. I have also run it initially without MCMC so that I have starting values to supply (albeit some of them are dodgy).

Any feedback as to what is causing this would be much appreciated.

Thanks

Greg

by **GDistiller** » Fri Sep 03, 2010 7:44 am

I think I have discovered what is causing this problem...but unfortunately have no idea why. Since I am using -1 coding, I am trying to force the mean of the hyperdistribution to be zero, i.e. so that the intercept represents the overall mean and the random effects are deviations around that mean. In other words, focusing only on survival, the intercept is the overall mean survival rate and the yearly estimates are a combination of this mean plus a deviation, where these deviations ~ N(0, sigma).
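The -1 coding described above can be illustrated with a minimal sketch. It assumes standard effect (sum-to-zero) coding with an intercept column; the function names and the beta values are hypothetical and are not taken from MARK. Note that the intercept is the mean of the linear predictors on the link scale, which is not the same as the mean of the back-transformed survival probabilities.

```python
def effect_coded_rows(k):
    """Design-matrix rows for k occasions: an intercept column plus k-1 effect columns.

    Occasions 1..k-1 get a 1 in their own column; the last occasion gets -1
    in every effect column, so the effects sum to zero across occasions.
    """
    rows = []
    for t in range(k):
        if t < k - 1:
            effects = [1 if j == t else 0 for j in range(k - 1)]
        else:
            effects = [-1] * (k - 1)
        rows.append([1] + effects)
    return rows


def linear_predictors(rows, beta):
    """Row-by-row dot product of the design matrix with the beta vector."""
    return [sum(x * b for x, b in zip(row, beta)) for row in rows]


rows = effect_coded_rows(4)
beta = [0.8, 0.3, -0.1, 0.2]  # hypothetical: intercept, then 3 deviations
eta = linear_predictors(rows, beta)
mean_eta = sum(eta) / len(eta)  # equals the intercept (up to rounding)
```

With this coding the deviation for the last occasion is minus the sum of the others, which is why the occasion-specific values average to the intercept.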

In an earlier post where the -1 coding was suggested to me, I was also told that the way to force the mean of the hyperdistribution to be zero was to set the prior for that mean to have a mean of zero and a very small sigma, like 0.001. If I leave the default values, i.e. do not change the prior sigma for my mu to 0.001, then the model runs and I do not get the run-time error.

Have I misunderstood how to force the mean of the hyperdistribution to be zero? To confirm: a dialog box pops up to specify the parameters for the hyperdistributions, where one can supply 4 values for each mu and sigma. For the mean mu these 4 values represent: step size, initial value, mu, and sigma. It is here that I make sure that the 3rd value is zero, and I change the 4th value to 0.001. I have not changed the prior values for the sigma parameter, i.e. the default values are used (1/sigma^2 ~ gamma(alpha, beta))?

I would really appreciate any help with figuring this out...

Thanks!

Greg

by **gwhite** » Fri Sep 03, 2010 9:39 pm

Greg:
Sounds like you are correctly specifying the 3rd and 4th values for the hyperdistribution dialog, but I suspect you are leaving the initial value (2nd value) as "compute". In this case, you need to start the chain at 0, or the computed value will be outside the prior. Thus, change "compute" to zero, and see if the problem is solved.
Gary
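A minimal sketch of why a starting value outside a very tight prior can produce an exp overflow. This is an illustration under stated assumptions, not MARK's actual code: it assumes the 4th dialog value is a standard deviation, that a Metropolis-Hastings acceptance ratio is formed by exponentiating a difference of log densities, and the starting and proposed values are hypothetical.

```python
import math


def log_normal_prior(x, mu, sigma):
    """Log density of a normal distribution with mean mu and sd sigma at x."""
    return (-math.log(sigma) - 0.5 * math.log(2 * math.pi)
            - 0.5 * ((x - mu) / sigma) ** 2)


mu0, sigma0 = 0.0, 0.001  # tight prior on the hyperdistribution mean, as in the thread
start_computed = 1.2      # hypothetical "computed" starting value, far from 0
proposal = 1.1            # hypothetical proposal, slightly closer to 0

# Difference of log prior densities: about +115000 for these values.
log_ratio = (log_normal_prior(proposal, mu0, sigma0)
             - log_normal_prior(start_computed, mu0, sigma0))


def naive_ratio(log_r):
    """Exponentiate directly; raises OverflowError when log_r is very large."""
    return math.exp(log_r)


def safe_accept_prob(log_r):
    """Acceptance probability min(1, exp(log_r)) without overflowing."""
    return 1.0 if log_r >= 0 else math.exp(log_r)


# Starting the chain at 0 keeps the same kind of ratio well within range.
log_ratio_from_zero = (log_normal_prior(0.0005, mu0, sigma0)
                       - log_normal_prior(0.0, mu0, sigma0))
```

Here `naive_ratio(log_ratio)` overflows, while `naive_ratio(log_ratio_from_zero)` is an ordinary number below 1, which is consistent with the advice to change "compute" to zero.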

by **GDistiller** » Tue Sep 07, 2010 8:27 am

Thanks Gary! That was the problem, model is now busy running...

I have another larger multi-state model that I am trying to estimate with simulated annealing. It has been running for 2 weeks now...does this sound reasonable? I know that the documentation says it can take a very long time to run. Am I correct that if it got stuck somewhere it would abort the process by itself?

This leads me to my next question: are there any PC clusters around that can run MARK jobs? If there is a power failure then I lose everything and have to start again (this has already happened and cost me several days)...plus both my PC and MacBook are busy running models, making it difficult for me to use them for other work at the same time...

Thanks!

Greg

by **cooch** » Tue Sep 07, 2010 8:51 am

> GDistiller wrote: Thanks Gary! That was the problem, model is now busy running...
>
> I have another larger multi-state model that I am trying to estimate with simulated annealing. It has been running for 2 weeks now...does this sound reasonable? I know that the documentation says it can take a very long time to run. Am I correct that if it got stuck somewhere it would abort the process by itself?

Depending on the problem, it is possible for the chain to 'get stuck'. This is the nature of the beast - for distributions with multiple modes, there are all sorts of issues which can cause a chain to get stuck. There are some 'technical' solutions, but they require changing the underlying sampler (which is a general Metropolis-Hastings), and hence the underlying code base. Not going to happen in the short run. Alternatively, you can try very different starting points, and run multiple chains, to see what happens. Finally, the usual strategy of taking your time series data from one or more chains and dumping it into CODA (or some such) to look at various traces and diagnostics is always an option.
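One diagnostic of the kind mentioned above is the Gelman-Rubin potential scale reduction factor, which the R package coda provides as `gelman.diag`. The sketch below is a plain-Python version of the basic statistic for a single parameter, assuming two or more equal-length chains; the function name and the simulated chains are hypothetical, and it omits the refinements coda applies.

```python
import random


def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains of one parameter."""
    m = len(chains)
    n = len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    # Between-chain variance B and mean within-chain variance W.
    b = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)
    w = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m
    var_hat = (n - 1) / n * w + b / n
    return (var_hat / w) ** 0.5


rng = random.Random(1)
# Two chains sampling the same distribution: the statistic should be close to 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(2)]
# Two chains stuck around different modes: the statistic should be well above 1.
stuck = [[rng.gauss(0.0, 1.0) for _ in range(2000)],
         [rng.gauss(5.0, 1.0) for _ in range(2000)]]

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
```

Values near 1 suggest the chains agree; values well above 1 suggest that chains started from different points have not found the same region, which is the 'stuck' situation described above.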

As for the time to run, this always amuses me. In some fields, a single experiment can take months -> years to complete. But, in quantitative ecology, we seem to think we should get answers in seconds -> minutes. If the result is important to you, how long it takes should be relatively unimportant. And, the moment you move into MCMC, the time to completion can be much, much longer (I have jobs that took 2-3 months to complete - for a single model).

Of course, this presumes that your compute environment is stable (your second query)....

> This leads me to my next question: are there any pc clusters around that can run Mark jobs? If there is a power failure then I lose everything and have to start again (this has already happened and cost me several days)...plus both my pc and macbook are busy running models making it difficult for me to use them for other work at the same time...
>
> Thanks!
>
> Greg

Short answer - no (not that I'm aware of). If I'm guessing correctly that you're at a University, then that's what University compute infrastructure is designed to do (in other words, look for something at your end). In the modern era of computationally intensive statistical inference, you need a compute environment that is never going to be turned off, and is 100% failsafe reliable. Which often means big central server farms with 24x7 maintenance (this would include the 'cloud' paradigm in many cases). Which is what old farts like me remember back from the 'central mainframe' days. Interesting to see the paradigm swinging back away from the desktop to 'big iron' which is designed for reliability. Consider the acquisition of high-end, reliable computing a long-term 'research investment'.

by **GDistiller** » Tue Sep 07, 2010 9:33 am

Thanks very much for the quick reply!
BTW this particular model is not one within the MCMC framework so using multiple chains and CODA is not an option. This model is being estimated using simulated annealing. If the estimation does get stuck, will it realise this and abort? Or is it possible that it will go on ad infinitum?

I am based at a university (Cape Town) so will take your advice and look this side for any computing cluster solutions. We've also currently got a few people here from Creem in St Andrews so I will ask them as well...

Thanks again for all the help...

Greg

by **cooch** » Tue Sep 07, 2010 3:39 pm

> GDistiller wrote: Thanks very much for the quick reply!
> BTW this particular model is not one within the MCMC framework so using multiple chains and CODA is not an option. This model is being estimated using simulated annealing. If the estimation does get stuck, will it realise this and abort? Or is it possible that it will go on ad infinitum?

I've never had a simulated annealing problem get stuck, but that might be because I usually start it off in something 'close to the right vicinity', which I get by using starting values from a simpler model.
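For readers unfamiliar with the method, here is a generic simulated annealing sketch started from a value close to the right vicinity. It is not MARK's implementation: the objective function, the geometric cooling schedule, and all parameter names and values are hypothetical. In this sketch the loop always ends because the temperature falls below a fixed threshold after a finite number of steps; the thread does not say what stopping rule MARK uses.

```python
import math
import random


def objective(x):
    """Hypothetical multimodal objective with its global minimum (0) at x = 2."""
    u = x - 2.0
    return u * u + 1.0 - math.cos(3.0 * u)


def simulated_annealing(f, x0, step=0.5, t_start=1.0, t_end=1e-3,
                        cooling=0.95, sweeps=50, seed=0):
    """Minimise f from starting value x0 with a geometric cooling schedule."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_f = x, fx
    t = t_start
    while t > t_end:  # finite schedule: t shrinks by a constant factor each pass
        for _ in range(sweeps):
            cand = x + rng.uniform(-step, step)
            fc = f(cand)
            # Always accept improvements; accept worse moves with
            # probability exp(-(fc - fx) / t), which shrinks as t cools.
            if fc <= fx or rng.random() < math.exp(-(fc - fx) / t):
                x, fx = cand, fc
                if fx < best_f:
                    best_x, best_f = x, fx
        t *= cooling
    return best_x, best_f


x0 = 1.8  # hypothetical starting value, e.g. taken from a simpler model
best_x, best_f = simulated_annealing(objective, x0)
```

Because the best value seen is tracked separately from the current state, the result is never worse than the starting value, which is one reason a good start from a simpler model helps.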

As for the length of time, current record on my 'big machine' is 2+ weeks for a job I ran for Gary at one point.

> I am based at a university (Cape Town) so will take your advice and look this side for any computing cluster solutions. We've also currently got a few people here from Creem in St Andrews so I will ask them as well...
>
> Thanks again for all the help...
>
> Greg

If your uni doesn't have capacity, then this is your excuse to ask them what they are doing on the compute side besides managing email and hosting web-based services. However, in my experience, all 'major' schools have a central server room (or several) with dedicated failsafe networking and power, designed to host machines that 'can't get turned off'. Most such facilities also have capacity for number-crunching research machines. Whether they have CPU cycles to spare, or if they'll put 'your box in their nice room', is a different issue.

Run-time error?
