www.phidot.org

by **jhines** » Fri Jul 16, 2010 4:15 pm

In case someone had this question...

Just one more thing to confirm about normalizing and scaling: can I use both "normalize" and "scale" at the same time, or should I use just one of them? If I can use both, does the sequence (which one I select first) matter? Also, is it correct that "normalize" only needs to be done once, as it will change all covariates together, while "scale" should be done for each covariate?

By 'easier to interpret' and 'real' parameter estimate: in my case the occupancy model gives a beta estimate (SE) of 99.49 (45.43) for altitude based on the Digital Elevation Model, measured in meters (AltDEM_Avg). Can I say that the result means that with every 99.49 meter increase in altitude, there is a unit increase in the probability of tiger occupancy? If so, given that beta estimate, is there any way to tell the exact unit of the increase in probability for tiger occupancy?

Alternatively, since I scaled the covariate using the dropdown menu, can that 99.49 beta estimate for AltDEM_Avg be compared directly with beta estimates for other covariates, such as For07Area, whose beta estimate was 0.12 (0.10)? Does that mean that AltDEM_Avg is 825 (=99/0.12) times more important than For07Area in determining tiger probability of occurrence?

Thanks,

Dear S,

You can select 'normalize' or 'scale' for each covariate individually. I think if you're comparing estimates from different covariates that are in different units, you should choose 'normalize'. If the covariates are in the same units, I'd suggest 'scale', using the same constant for each covariate. The purpose of normalizing or scaling (from my programmer's point of view) is to reduce the size of the numbers which go into the calculation of the parameters, as the logit-link transformation involves the exponential function, and taking the exponential of a large number causes numerical problems.
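To make the numerical point concrete, here is a minimal Python sketch. It assumes that 'normalize' means subtracting the mean and dividing by the standard deviation, and that 'scale' means dividing by a user-chosen constant; the altitude values, the constant 1000, and the function names are made up for illustration and are not taken from the software.

```python
import math
from statistics import mean, stdev

def normalize(values):
    """Assumed meaning of 'normalize': (value - mean) / standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def scale(values, constant):
    """Assumed meaning of 'scale': divide every value by one constant."""
    return [v / constant for v in values]

# Hypothetical altitudes in meters (not data from this thread).
altitudes_m = [120.0, 450.0, 800.0, 1500.0, 2300.0]

# With raw values, even a slope of 1.0 pushes exp() past what a double can hold.
try:
    math.exp(1.0 * 1500.0)
    overflowed = False
except OverflowError:
    overflowed = True

normalized = normalize(altitudes_m)   # mean 0, standard deviation 1
scaled = scale(altitudes_m, 1000.0)   # now in kilometers, largest value 2.3

print("raw exp() overflowed:", overflowed)
print("normalized:", [round(v, 3) for v in normalized])
print("scaled:", scaled)
```

Either transformation keeps the inputs to exp() small; the difference is only in what "one unit" of the covariate means afterwards.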

If we didn't use the logit link transformation function, the formula for occupancy as a function of a single covariate (altitude) would be:

psi(site i) = beta0 + beta1*altitude(site i)

So, if beta0=0 and beta1=99.49 and altitude at site i is zero, then psi at that site would be zero. A site which is 1 meter higher would have psi=99.49.

This is why we use the logit transformation - to force all psi's to be between zero and one. So, psi is computed as:

Logit(psi(site i)) = beta0 + beta1*altitude(site i) or,

<pre>
                  exp(beta0 + beta1*altitude(site i))
psi(site i) = -------------------------------------------
                1 + exp(beta0 + beta1*altitude(site i))
</pre>

With this transformation and the same betas as above, psi at the 1st site (alt=0) would be 0.5, and psi at the 2nd site (alt=1) would be essentially 1.0.

Since the logit link function is non-linear, the effect of a change in altitude will depend on the initial altitude. The effect of going from 0 to 1 meter was an increase in psi of about 0.5. But going from 0 to 0.5 meters already gives an increase of about 0.5, while going from 0.5 to 1 meter adds essentially 0.0.
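The numbers in this example can be checked with a short Python sketch. It uses the same beta0=0 and beta1=99.49 as above; the function name inv_logit is just a label chosen here.

```python
import math

def inv_logit(x):
    """Inverse logit, written so exp() never sees a large positive argument."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

beta0, beta1 = 0.0, 99.49  # values used in the example above

# psi at altitudes of 0, 0.5 and 1 meter
psi = {alt: inv_logit(beta0 + beta1 * alt) for alt in (0.0, 0.5, 1.0)}

for alt, p in psi.items():
    print(f"altitude {alt} m -> psi {p:.6f}")

print("change 0 -> 0.5 m:", psi[0.5] - psi[0.0])
print("change 0.5 -> 1 m:", psi[1.0] - psi[0.5])
```

Nearly all of the change in psi happens in the first half meter, which is why the same beta cannot be read as a fixed change in probability per unit of altitude.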

My point is that the logit link function makes it difficult to make statements about exact numerical effects of a covariate, so changing the units by scaling or normalizing probably doesn't matter. You can say things about effects on the logit scale, but I think that's not as easy to understand as the effects without the logit transformation. The important things about the transformed estimates are the sign of the estimate and its size (relative to its standard error).

Regarding your question about which covariate is more important, I think I would look at the model AIC results to determine that. This means you would construct one model with AltDEM_Avg as the covariate on psi, then construct another model with For07Area as the covariate on psi, then see which of these models has the lower AIC. If you were going to compare estimates, you would have to take into account the standard error of each estimate. So, you would look at the estimate divided by its standard error.
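For the two estimates quoted in this thread, the estimate-to-standard-error ratio can be worked out with a small Python sketch. The dictionary and variable names are only for illustration, and the AIC comparison itself needs the fitted models, so it is not shown.

```python
# (beta estimate, standard error) as quoted earlier in the thread
estimates = {
    "AltDEM_Avg": (99.49, 45.43),
    "For07Area": (0.12, 0.10),
}

# Estimate divided by its standard error, per covariate
ratios = {name: beta / se for name, (beta, se) in estimates.items()}

for name, ratio in ratios.items():
    print(f"{name}: estimate/SE = {ratio:.2f}")
```

On this scale the two covariates come out at roughly 2.19 and 1.20, a far smaller gap than the 825-fold difference suggested by the raw betas.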

Cheers,

Jim

by **darryl** » Sun Jul 18, 2010 5:31 pm

Just a follow-up point to Jim's about the interpretation of beta parameters using the logit link. Like he says, because of the non-linear transformation it's difficult to talk about the nature of the effect on the probability scale, especially if it's a continuous covariate and/or you have more than 1 covariate in the model. An option, though, is to talk about the effect on the odds scale. Odds are just the ratio of the probabilities of success to failure (so how much more likely a success is compared to a failure), and we use that idea all the time in real life. exp(beta) tells us how much to multiply the base odds by for a 1-unit increase in the associated covariate. If you want to know more, try googling 'odds logit link' or 'odds logistic regression' and I'm sure you'll get a few relevant hits.
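A quick way to convince yourself of the exp(beta) statement is the following Python sketch. The beta values are hypothetical, picked only so the numbers are easy to read, and the helper names are my own.

```python
import math

def inv_logit(x):
    """Probability corresponding to a value on the logit scale."""
    return 1.0 / (1.0 + math.exp(-x))

def odds(p):
    """Odds of success: P(success) / P(failure)."""
    return p / (1.0 - p)

beta0, beta1 = -1.0, 0.4  # hypothetical intercept and slope

p = [inv_logit(beta0 + beta1 * x) for x in (0.0, 1.0, 2.0)]

# Each 1-unit step multiplies the odds by the same factor...
ratio_0_to_1 = odds(p[1]) / odds(p[0])
ratio_1_to_2 = odds(p[2]) / odds(p[1])

# ...while the change in probability differs from step to step.
diff_0_to_1 = p[1] - p[0]
diff_1_to_2 = p[2] - p[1]

print("exp(beta1):", math.exp(beta1))
print("odds ratios:", ratio_0_to_1, ratio_1_to_2)
print("probability changes:", diff_0_to_1, diff_1_to_2)
```

The odds ratio is the same for every 1-unit step, whereas the change in probability depends on where you start, which is the reason the odds scale is the easier one to report.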

by **cooch** » Sun Jul 18, 2010 6:30 pm

Or, you can look at section 6.12.1 in Chapter 6 of the MARK book (this is the linear models chapter). This goes into some detail on log-odds, effect size, and related issues.

http://www.phidot.org/software/mark/doc ... /chap6.pdf

by **cnagy** » Mon Sep 27, 2010 2:54 pm

I have read and re-read this old thread and it has been a big help. I do have one question, and I am worried my idea is horrible number wizardry. Based on what Darryl said...

exp(beta) tells us how much to multiply the base odds by for a 1-unit increase in the associated covariate.

can one apply this to normalized variables? In other words, am I right in thinking that a 1-unit increase in a normalized variable is really an increase of 1 standard deviation, and thus exp(beta) is the change in odds if you change the covariate by 1 standard deviation?

So as a further leap (off the cliff), if the raw standard deviation of that covariate is, say, 150 meters, does this mean a change of 150m increases/decreases the odds by exp(beta)?

To continue worsening my headache, if you have a number of covariates with different betas...say, 2 of them, with beta1 = 1.61 and beta2 = 2.31, but the SD of cov1 is 100 and the SD of cov2 is 200, then it seems that in reality the two have very similar effect sizes:

exp(beta1) = exp(1.61) = ~5.0
exp(beta2) = exp(2.31) = ~10.0
SD1 = 100 => A 100 unit increase in cov1 increases odds by 5...a 200 unit increase increases the odds by 10
SD2 = 200 => A 200 unit increase in cov2 increases odds by 10...a 100 unit increase increases the odds by 5

Basically, for my current project I normalized the covariate data because I was dealing with measurements across a huge range (0 to hundreds of meters). Now I would like to discuss the occupancy/detection rates in terms of the actual measurements.

Thx
chris
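One way to check this arithmetic is the Python sketch below. It assumes the covariates were normalized by dividing by their standard deviation, so that a raw change of d units is d/SD normalized units; the function name is hypothetical. Because odds ratios multiply rather than add, a change of two standard deviations squares the odds ratio instead of doubling it.

```python
import math

def odds_ratio(beta_normalized, sd, raw_change):
    """Odds multiplier for a change of `raw_change` raw units, given a beta
    estimated on a covariate that was divided by its standard deviation."""
    return math.exp(beta_normalized * raw_change / sd)

beta1, sd1 = 1.61, 100.0  # cov1 from the post above
beta2, sd2 = 2.31, 200.0  # cov2 from the post above

cov1_100 = odds_ratio(beta1, sd1, 100.0)  # one SD of cov1
cov1_200 = odds_ratio(beta1, sd1, 200.0)  # two SDs of cov1
cov2_200 = odds_ratio(beta2, sd2, 200.0)  # one SD of cov2
cov2_100 = odds_ratio(beta2, sd2, 100.0)  # half an SD of cov2

print(f"cov1: +100 units x{cov1_100:.2f}, +200 units x{cov1_200:.2f}")
print(f"cov2: +100 units x{cov2_100:.2f}, +200 units x{cov2_200:.2f}")
```

Under that assumption, a 1 SD change (100 units of cov1, 200 units of cov2) does multiply the odds by about 5 and about 10. But a 200-unit change in cov1 multiplies the odds by about 25, not 10, and a 100-unit change in cov2 multiplies them by about 3.2, not 5. So per raw unit the two effects are not the same: the slope per raw unit is beta/SD, which is 0.0161 for cov1 and about 0.0116 for cov2.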

normalizing/scaling covariates
