In case someone had this question...
Just one more thing to confirm about normalizing and scaling: can I use both "normalize" and "scale" at the same time, or should I use just one of them? If I can use both, does the order (which one I select first) matter? Also, is it correct that "normalize" only needs to be done once, as it will change all covariates together, while "scale" should be done for each covariate?
About the 'easier to interpret' and 'real' parameter estimates: in my case the resulting beta estimate (SE) is 99.49 (45.43) for altitude based on the Digital Elevation Model, measured in meters (AltDEM_Avg), in the occupancy model. Can I say that this result means that with every 101.09 meter increase in altitude, there is a unit increase in the probability of tiger occupancy? If so, given that beta estimate, is there any way to tell the exact unit of the increase in probability of tiger occupancy?
Alternatively, since I scaled the covariate using the dropdown menu, can the 99.49 beta estimate for AltDEM_Avg be compared directly with the beta estimates for other covariates, such as For07Area, whose beta estimate was 0.12 (0.10)? Does that mean that AltDEM_Avg is 825 (=99/0.12) times more important than For07Area in determining tiger probability of occurrence?
Thanks,
Dear S,
You can select 'normalize' or 'scale' for each covariate individually. I think if you're comparing estimates from different covariates that are in different units, you should choose 'normalize'. If the covariates are in the same units, I'd suggest 'scale', using the same constant for each covariate. The purpose of normalizing or scaling (from my point of view as a programmer) is to reduce the size of the numbers that go into the calculation of the parameters, because the logit-link transformation involves the exponential function, and taking the exponential of a large number causes numerical problems.
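To make the numerical point concrete, here is a minimal Python sketch. It assumes that 'normalize' means subtracting the mean and dividing by the standard deviation, and that 'scale' means dividing by a constant; the program's exact definitions may differ. The altitude values, the slope of 1.0, and the function names are made up for illustration.

```python
import math

def normalize(values):
    """Z-score each value: subtract the mean, divide by the sample SD.
    (Assumed meaning of 'normalize'; not confirmed by the program's docs.)"""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def scale(values, constant):
    """Divide each value by a fixed constant (assumed meaning of 'scale')."""
    return [v / constant for v in values]

# Hypothetical site altitudes in meters (illustrative only).
altitudes_m = [120.0, 480.0, 950.0, 1530.0, 2210.0]
beta1 = 1.0  # hypothetical slope, chosen only to show the overflow

# exp() of a raw altitude in meters overflows double precision.
try:
    math.exp(beta1 * max(altitudes_m))
    overflowed = False
except OverflowError:
    overflowed = True

normalized = normalize(altitudes_m)
scaled = scale(altitudes_m, 1000.0)  # meters -> kilometers

print("raw exp overflowed:", overflowed)
print("normalized:", [round(v, 3) for v in normalized])
print("scaled:", scaled)
```

After either transformation the covariate values are small enough that the exponential is well behaved.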
If we didn't use the logit link transformation function, the formula for occupancy as a function of a single covariate (altitude) would be:
psi(site i) = beta0 + beta1*altitude(site i)
So, if beta0=0 and beta1=99.49 and altitude at site i is zero, then psi at that site would be zero. A site which is 1 meter higher would have psi=99.49.
This is why we use the logit transformation - to force all psi's to be between zero and one. So, psi is computed as:
Logit(psi(site i)) = beta0 + beta1*altitude(site i) or,
<pre>
                 exp(beta0 + beta1*altitude(site i))
psi(site i) = -----------------------------------------
               1 + exp(beta0 + beta1*altitude(site i))
</pre>
With this transformation and the same betas as above, psi at the 1st site (alt=0) would be 0.5, and psi at the 2nd site (alt=1) would be essentially 1.0.
Since the logit link function is non-linear, the effect of a change in altitude will depend on the initial altitude. The effect of going from 0 to 1 meter was an increase in psi of about 0.5. The effect of going from 0 to 0.5 meters would also be about 0.5, and the effect of going from 0.5 to 1 meter would be essentially 0.0.
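The numbers in this example can be checked with a short Python sketch of the inverse-logit formula, using beta0=0 and beta1=99.49 from the example. The function name `psi` and the overflow-safe branching are illustrative choices, not the program's actual code.

```python
import math

def psi(altitude, beta0=0.0, beta1=99.49):
    """Occupancy probability under the logit link:
    psi = exp(eta) / (1 + exp(eta)), with eta = beta0 + beta1*altitude."""
    eta = beta0 + beta1 * altitude
    # Use the algebraically equivalent form that never exponentiates
    # a large positive number, so exp() cannot overflow.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)

# Same change in altitude, very different change in psi.
for start, end in [(0.0, 0.5), (0.5, 1.0), (0.0, 1.0)]:
    change = psi(end) - psi(start)
    print(f"altitude {start} -> {end}: change in psi = {change:.6f}")
```

The first half-meter step accounts for nearly the whole change in psi, and the second for almost none. This is why a single "increase per meter" figure cannot be read off the beta estimate.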
My point is that the logit link function makes it difficult to make statements about exact numerical effects of a covariate, so changing the units by scaling or normalizing probably doesn't matter. You can say things about effects on the logit scale, but I think they're not as easy to understand as effects without the logit transformation. The important things about the transformed estimates are the sign of the estimate and its size (relative to its standard error).
Regarding your question about which covariate is more important, I think I would look at the model AIC results to determine that. This means you would construct one model with AltDEM_Avg as the covariate on psi, then construct another model with For07Area as the covariate on psi, then look at which of these models has the lower AIC. If you were going to compare estimates, you would have to take into account the standard error of each estimate. So, you would look at the estimate divided by its standard error.
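As a rough illustration of the estimate-over-standard-error comparison, here is a small Python sketch using the two estimates quoted in the question. The dictionary layout and variable names are hypothetical. The ratio is only an informal guide, not a replacement for the AIC comparison.

```python
# Beta estimates and standard errors as quoted in the question.
estimates = {
    "AltDEM_Avg": (99.49, 45.43),
    "For07Area": (0.12, 0.10),
}

# Estimate divided by its standard error, per covariate.
ratios = {name: beta / se for name, (beta, se) in estimates.items()}

for name, ratio in ratios.items():
    print(f"{name}: estimate/SE = {ratio:.2f}")
```

On this scale the two covariates differ by a factor of about two, not the factor of 825 suggested by comparing the raw beta estimates.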
Cheers,
Jim