Scaling flipped my variable effect: positive to negative

Post by Andre23 » Wed May 14, 2025 7:30 am

I'm analysing juvenile survival in a multi-state capture-recapture study. My best model shows that juvenile survival fluctuates over time, while adult survival is stable across breeding states.

This is my best time/age model:
Code:
ms_model_time_age <- mark(ms_proc, ms_ddl,
                      model.parameters = list(
                        S = list(formula = ~-1 +
                                   juv:time +
                                   youngA:stratum +
                                   olderA:stratum),
                        p = list(formula = ~age1:time + age2plus:stratum:time),
                        Psi = list(formula =  ~-1 + stratum:tostratum:ageclass4:period)))


The Issue
When testing density effects on juvenile survival, I got completely opposite results with scaled vs. unscaled variables:

1. Using raw density ("monitored_pairs", range: 6-143 pairs):
Code:
ms_model_dens <- mark(ms_proc, ms_ddl,
                      model.parameters = list(
                        S = list(formula = ~-1 +
                                   juv:monitored_pairs +
                                   youngA:stratum +
                                   olderA:stratum),
                        p = list(formula = ~age1:time + age2plus:stratum:time),
                        Psi = list(formula =  ~-1 + stratum:tostratum:ageclass4:period)))

Model output showing a positive density dependence:
Parameter             Beta       SE         95% CI
S:juv:monitored_pair  0.0080300  0.0008769  0.0063112 to 0.0097487


2. Using density standardized with R's scale() function and then merged into the ddl:
Code:
# scale() returns a one-column matrix; as.numeric() stores it as a plain numeric column
dd.data$s_monitored_pairs = as.numeric(scale(dd.data$monitored_pairs))
ms_ddl$S=merge_design.covariates(ms_ddl$S,dd.data)


Code:
ms_model_s_dens <- mark(ms_proc, ms_ddl,
                      model.parameters = list(
                        S = list(formula = ~-1 +
                                   juv:s_monitored_pairs +
                                   youngA:stratum +
                                   olderA:stratum),
                        p = list(formula = ~age1:time + age2plus:stratum:time),
                        Psi = list(formula =  ~-1 + stratum:tostratum:ageclass4:period)))


Model output showing a negative density dependence:
Parameter             Beta        SE         95% CI
S:juv:s_monitored_pa  -0.1202963  0.0501466  -0.2185837 to -0.0220090


Has anyone encountered similar sign-flipping when scaling variables in multi-state models?
Andre23
 
Posts: 16
Joined: Thu Jun 20, 2024 6:53 am

Re: Scaling flipped my variable effect: positive to negative

Post by jlaake » Wed May 14, 2025 8:08 am

Scaling should not affect the result. Did both models converge? Multi-state models can be problematic. Check the log-likelihood values and make sure you have not changed something else.
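A minimal sketch, using base R `glm` on simulated data rather than MARK, of why standardizing is normally harmless and of one case where it is not. All names and numbers below are made up for illustration. With an intercept, `scale()` only reparameterizes the model: the log-likelihood is unchanged and the slope is multiplied by the covariate's standard deviation. Without an intercept, centering changes the model itself, because the linear predictor is forced through zero at a different covariate value. That may be worth checking here if `juv` is a 0/1 indicator (an assumption), since a formula of the form `~-1 + juv:monitored_pairs + ...` would then give juveniles a slope but no intercept.

```r
set.seed(42)
n <- 2000
pairs <- runif(n, 6, 143)                       # simulated density, same range as in the thread
y <- rbinom(n, 1, plogis(2.5 - 0.01 * pairs))   # simulated survival, declining with density
s_pairs <- as.numeric(scale(pairs))             # centered, then divided by the standard deviation

# With an intercept: same fit, slope rescaled by sd(pairs)
fit_raw    <- glm(y ~ pairs,   family = binomial)
fit_scaled <- glm(y ~ s_pairs, family = binomial)
ll_with <- c(raw    = as.numeric(logLik(fit_raw)),
             scaled = as.numeric(logLik(fit_scaled)))
slope_ratio <- unname(coef(fit_scaled)["s_pairs"] / coef(fit_raw)["pairs"])

# Without an intercept: centering produces a different model
fit_raw0    <- glm(y ~ -1 + pairs,   family = binomial)
fit_scaled0 <- glm(y ~ -1 + s_pairs, family = binomial)
ll_without <- c(raw    = as.numeric(logLik(fit_raw0)),
                scaled = as.numeric(logLik(fit_scaled0)))
slopes_without <- c(raw    = unname(coef(fit_raw0)),
                    scaled = unname(coef(fit_scaled0)))

print(ll_with)          # the two log-likelihoods agree
print(slope_ratio)      # matches sd(pairs)
print(ll_without)       # the two log-likelihoods differ
print(slopes_without)   # the slopes can even differ in sign
```

If this is what is happening, adding a juvenile intercept (for example a `juv` main effect alongside `juv:monitored_pairs`) should make the scaled and unscaled fits comparable; this is a suggestion to test, not a confirmed diagnosis.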
jlaake
 
Posts: 1479
Joined: Fri May 12, 2006 12:50 pm
Location: Escondido, CA

Re: Scaling flipped my variable effect: positive to negative

Post by Andre23 » Wed May 14, 2025 9:57 am

Code:
> cat("Unscaled model log-likelihood:", ms_model_dens$results$lnl, "\n")
Unscaled model log-likelihood: 160469.1
> cat("Scaled model log-likelihood:", ms_model_s_dens$results$lnl, "\n")
Scaled model log-likelihood: 160608.4
> cat("Unscaled model deviance:", ms_model_dens$results$deviance, "\n")
Unscaled model deviance: 151603.7
> cat("Scaled model deviance:", ms_model_s_dens$results$deviance, "\n")
Scaled model deviance: 151743


Searching the "mark.out" file for "convergence" with Ctrl+F finds nothing, so I guess there is no convergence issue?

Both models are exactly the same; I only swapped the non-scaled variable for the scaled one.

Re: Scaling flipped my variable effect: positive to negative

Post by jlaake » Wed May 14, 2025 10:50 am

But they are clearly not converging to the same log-likelihood value. Start the scaled model using the unscaled model in the initial argument so it starts at the same place. Multi-state models can have local max/min points on the likelihood surface.
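A small base R sketch of what "the same place" means once the covariate is rescaled. The numbers are hypothetical; the point is only the arithmetic: a slope on the raw scale corresponds to that slope times `sd(x)` on the standardized scale, and the shift introduced by centering has to be absorbed by an intercept-type term for the same group. One thing worth checking in `?mark` (an assumption here, not something confirmed in this thread) is whether betas from a model passed to `initial` are matched by name, in which case the beta for a renamed covariate would not carry over and could be converted by hand.

```r
# Hypothetical covariate: 20 occasions of density in the 6-143 range
set.seed(1)
x   <- round(runif(20, 6, 143))
s_x <- as.numeric(scale(x))

# Made-up betas on the raw scale (logit scale)
b0_raw <- 1.2
b1_raw <- -0.01

# Equivalent betas on the standardized scale
b1_scaled <- b1_raw * sd(x)
b0_scaled <- b0_raw + b1_raw * mean(x)

# Both parameterizations give the same linear predictor
eta_raw    <- b0_raw    + b1_raw    * x
eta_scaled <- b0_scaled + b1_scaled * s_x
max_diff   <- max(abs(eta_raw - eta_scaled))
print(max_diff)   # zero up to floating-point rounding
```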

Re: Scaling flipped my variable effect: positive to negative

Post by Andre23 » Tue May 20, 2025 11:32 am

I tried setting initial to my unscaled model:

Code:
ms_model_s_dens <- mark(ms_proc, ms_ddl,
                      model.parameters = list(
                        S = list(formula = ~-1 +
                                   juv:s_monitored_pairs +
                                   youngA:stratum +
                                   olderA:stratum),
                        p = list(formula = ~age1:time + age2plus:stratum:time),
                        Psi = list(formula =  ~-1 + stratum:tostratum:ageclass4:period)), initial = ms_model_dens)


Both models are still not converging to the same log likelihood:
Code:
> cat("Unscaled model log-likelihood:", ms_model_dens$results$lnl, "\n")
Unscaled model log-likelihood: 147855.6
> cat("Scaled model log-likelihood:", ms_model_s_dens$results$lnl, "\n")
Scaled model log-likelihood: 148027.8


For now I will continue using unscaled values for my models.

Re: Scaling flipped my variable effect: positive to negative

Post by jhines » Tue May 20, 2025 4:52 pm

Multi-state models can be very complicated and have lots of local maxima. I suggest starting with a simple model and working towards the more complicated one, using the results of each simpler model as initial values for the next, more complicated one. Also, as Jeff stated, it shouldn't matter whether you use scaled or unscaled covariates, but with complicated models, large (positive or negative) covariate values can cause numerical underflow and/or overflow, resulting in a failure to converge. I don't think MARK comes out and says the model didn't converge, but it will tell you if it thinks a parameter is unidentifiable. Also, if you look at the end of the output file, it lists the parameters in order of identifiability. Unidentifiable parameters don't necessarily mean the model didn't converge, but sometimes they do.
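A minimal base R illustration of the underflow/overflow point. It shows how a naively coded inverse logit behaves in double precision when a trial beta is multiplied by a large raw covariate; the function name, covariate values, and trial betas are hypothetical, and MARK's own numerical code is not shown here and may guard against this differently.

```r
# Naive inverse logit, written out to expose the overflow
naive_inv_logit <- function(eta) exp(eta) / (1 + exp(eta))

x_raw <- c(6, 50, 143)                  # raw density values
x_std <- as.numeric(scale(x_raw))       # standardized values, roughly within -1 to 1

trial_beta <- 5                         # hypothetical step tried by an optimizer

p_raw <- naive_inv_logit(trial_beta * x_raw)   # exp(715) overflows, giving Inf / Inf
p_std <- naive_inv_logit(trial_beta * x_std)   # moderate linear predictors, no overflow

# Even a stable inverse logit loses information at extreme values:
# plogis(42.9) rounds to exactly 1, so the log of its complement is -Inf
ll_term <- log(1 - plogis(0.3 * 143))

print(p_raw)
print(p_std)
print(ll_term)
```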

Another course of action is to run the model with many sets of random initial values. I've done this in a loop, saving the negative log-likelihood of each run. If many sets of initial values produce the same negative log-likelihood, and that value is also the smallest, then it is likely the overall lowest negative log-likelihood. If each run produces a different negative log-likelihood, then I suspect the model is over-parameterized.
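A minimal sketch of the random-starts check, using base R `optim` on a toy two-parameter function instead of a MARK model. `toy_nll` and its shape are made up for illustration: it has one global and one local minimum. With RMark, each pass through the loop would instead call `mark()` with a different vector passed to `initial` and store the value reported in `results$lnl`.

```r
# Toy negative log-likelihood with two minima (hypothetical stand-in for a model fit)
toy_nll <- function(b) (b[1]^2 - 1)^2 + 0.3 * b[1] + (b[2] - 0.5)^2 + 2

set.seed(123)
n_starts <- 25

# Fit from many random starting points and keep each result
fits <- lapply(seq_len(n_starts), function(i) {
  start <- runif(2, -3, 3)
  optim(start, toy_nll, method = "BFGS")
})

nll  <- sapply(fits, function(f) f$value)
best <- min(nll)

# How many starts reached the smallest value found?
n_at_best <- sum(nll - best < 1e-4)

print(table(round(nll, 3)))   # distinct end points and how often each was reached
print(n_at_best)
```

If most starts end at the same smallest value, that supports it being the overall minimum; a different value from nearly every start would point towards an over-parameterized model, as described above.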
jhines
 
Posts: 632
Joined: Fri May 16, 2003 9:24 am
Location: Laurel, MD, USA

