MacKenzie & Bailey GOF test in Presence 2.2

Postby gurutzeta » Mon Jul 14, 2008 3:59 am

Hi all,

I am working on my MSc thesis about occupancy modelling for the Alaotran gentle lemur, a little lemur that knows how to hide in the marshes very well... and so goes its detectability :o)

I have just started to play a bit with PRESENCE 2.2 (080702.0916) and I have a couple questions/problems regarding the assessment of model fit:

-- I started with a very simple model for my data (constant detectability and occupancy, "pre-defined -> 1group constant P") and selected the option "assess model fit". I understand that this would run the MacKenzie and Bailey GOF method. However, even though I selected this option, the test apparently was not done (nothing was shown in the output file).
Then I checked what happens with the same data if I select "custom model" instead, even though I am not using covariates at all, and in this case the GOF test did take place.
Any ideas about why it did not work in the first place? Is this a bug in the software or am I misunderstanding something?

-- The second question is regarding the implementation of this GOF method in PRESENCE: what type of correction is used for calculating the test statistic (Pearson chi-square) for the case of small values? As an exercise I programmed the test myself and I see some differences for scenarios where histories have low frequencies...

Thanx for your help in advance!!!

Gurutzeta


PS BTW, I am running on Vista... just in case it makes a difference...
gurutzeta
 
Posts: 11
Joined: Thu Jul 10, 2008 5:29 am
Location: Australia

Postby darryl » Mon Jul 14, 2008 5:30 pm

First, the model fit procedure only works with custom models. I think that tidbit is buried somewhere within the online help.

Second, Jim Hines has been recently trialling some pooling algorithms to make the test more stable with sparse data, but I'm not sure what stage things are at. Hopefully he'll see this thread and can respond.

Darryl
darryl
 
Posts: 498
Joined: Thu Jun 12, 2003 3:04 pm
Location: Dunedin, New Zealand

Postby gurutzeta » Tue Jul 15, 2008 1:44 pm

Thanx a lot darryl! I will wait for Jim's comments!
gurutzeta
 
Posts: 11
Joined: Thu Jul 10, 2008 5:29 am
Location: Australia

Postby tracicastellon » Sat Nov 22, 2008 4:54 pm

hi gurutzeta,

fyi: i'm running vista and i had the same problem as you. i tried it on my old computer (windows xp) - with the exact same data set - and it ran fine.

the data set has 50 sites, 2 observation occasions at each site, 1 site covariate. ran a bootstrap with 1000 sets.

ugghhh... vista!!
tracicastellon
 
Posts: 6
Joined: Wed Oct 01, 2008 12:31 pm

GOF calculations

Postby jota » Fri Mar 27, 2009 2:39 pm

Hi,

I am running a GOF for a single-season single-species occupancy model and in the output file the calculations corresponding to the observed history finish with:

sum(chisq)= 74.0566
sum+N-seensum= 85.6796
Test Statistic = 7.1400

I can see how the first figure is obtained however I am not sure about what the second and third figures mean (and how they are obtained). Any hints?

J
jota
 
Posts: 5
Joined: Mon Feb 23, 2009 12:02 pm

Postby jhines » Fri Mar 27, 2009 8:29 pm

The first number (sum) is the sum of the chi-square values. The 2nd number (sum + N - seensum) is the sum plus the number of sites minus the number of sites where at least one detection occurred. The test statistic is the 2nd number divided by the degrees of freedom.
jhines
 
Posts: 632
Joined: Fri May 16, 2003 9:24 am
Location: Laurel, MD, USA

Postby jota » Mon Mar 30, 2009 6:43 am

Thanks Jim for your answer! A couple of further questions:

a) I am afraid I did not understand how the second number is obtained. If the difference between the 2nd and 1st number is just "the number of sites minus the number of sites where at least one detection occurred" I would have expected it to be an integer and not 85.6796 - 74.0566 = 11.623.

b) In the paper "Assessing the fit of site-occupancy models" by MacKenzie & Bailey (2004) I understood the proposed test statistic to assess fit was Pearson's chi-square statistic (our 'sum' here) and I got the impression that this is what PRESENCE used to use before (?). I would be interested in reading the theoretical background of the alternative implemented now. Could you point me to the relevant sources?

Thanks a lot once again!
jota
 
Posts: 5
Joined: Mon Feb 23, 2009 12:02 pm

Postby jhines » Mon Mar 30, 2009 12:10 pm

I should have said 'expected number of sites with detections' instead of 'number of sites with detections'. This expected value can be (usually is) a non-integer.
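
As a quick arithmetic check on the three figures quoted earlier in the thread, here is a small Python sketch (not PRESENCE code). It assumes the test statistic is the second figure divided by the degrees of freedom, as described above. The output does not print the degrees of freedom, so the value of about 12 below is only an inference from the printed numbers.

```python
# Figures copied from the PRESENCE output quoted earlier in the thread.
sum_chisq = 74.0566   # sum(chisq): Pearson chi-square over the observed histories
sum_plus = 85.6796    # sum+N-seensum
test_stat = 7.1400    # Test Statistic

# N - seensum: the part contributed by histories that were never observed.
unseen_part = sum_plus - sum_chisq

# Degrees of freedom implied by "test statistic = 2nd number / df" (an inference).
implied_df = sum_plus / test_stat

print(round(unseen_part, 4))   # 11.623
print(round(implied_df, 2))    # 12.0
```

The non-integer difference of 11.623 is what you would expect if seensum is an expected count rather than an observed count.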

I'll have to 'pass the buck' to Darryl on the 2nd question. You're right about the sum being the Pearson's chi-square, but I don't remember what Darryl said the additional stuff is.

Cheers,

Jim
jhines
 
Posts: 632
Joined: Fri May 16, 2003 9:24 am
Location: Laurel, MD, USA

Postby darryl » Mon Mar 30, 2009 11:22 pm

Hi There,
In theory (and in the paper) you have to sum over all possible histories, which can snowball quite quickly, and often you're going to have more possible histories than sampling units. Therefore, a shortcut is to break the summation into 2 parts: those histories observed at least once, and those never seen.

For the first part you just calculate expected numbers and apply the Pearson's chi-square formula. I think Jim has labelled this sum(chisq) = Y.

Then rather than calculate the expected numbers for the other (possibly a lot of) histories, as the observed value is 0, the Pearson's chi-square term for each of them is (0 - expected)^2/expected = expected, so the formula just reduces to sum(expected values given history unobserved) [=X]. So rather than calculate X by going through each possible history, we know that X = N - sum(expected values given history was observed), because the expected values over all possible histories sum to N.

So the test statistic should be Y + N - sum(expected values given history was observed). Not sure exactly how this matches up with what Jim has labelled at this point so I'll pass the buck back to him. ;-)
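
The shortcut above can be sketched in Python as follows. This is a minimal illustration, not PRESENCE's implementation: it assumes a single-season model with constant occupancy `psi` and constant detection `p`, no missing observations, and hypothetical function names. `gof_full` sums over every possible history, `gof_shortcut` only over the histories that were observed.

```python
from collections import Counter
from itertools import product


def history_prob(history, psi, p):
    """Probability of one detection history under constant psi and p."""
    detections = sum(history)
    misses = len(history) - detections
    prob = psi * p**detections * (1 - p)**misses
    if detections == 0:
        prob += 1 - psi  # an all-zero history can also come from an unoccupied site
    return prob


def gof_shortcut(histories, psi, p):
    """Y + N - sum(expected counts of the histories that were observed)."""
    n_sites = len(histories)
    y = 0.0              # Pearson chi-square over observed histories only
    expected_seen = 0.0  # sum of expected counts for those same histories
    for history, obs in Counter(histories).items():
        exp = n_sites * history_prob(history, psi, p)
        y += (obs - exp) ** 2 / exp
        expected_seen += exp
    return y + n_sites - expected_seen


def gof_full(histories, psi, p):
    """Pearson chi-square summed over every possible history."""
    n_sites = len(histories)
    observed = Counter(histories)
    total = 0.0
    for history in product((0, 1), repeat=len(histories[0])):
        exp = n_sites * history_prob(history, psi, p)
        total += (observed.get(history, 0) - exp) ** 2 / exp
    return total


# Toy data: 5 sites, 3 occasions (made-up values for illustration only).
toy = [(1, 0, 1), (0, 0, 0), (0, 0, 0), (1, 1, 0), (0, 1, 0)]
print(abs(gof_shortcut(toy, 0.6, 0.4) - gof_full(toy, 0.6, 0.4)) < 1e-9)  # True
```

With K occasions there are 2^K possible histories, so the shortcut loop touches at most N of them while giving the same total.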

Cheers
Darryl
darryl
 
Posts: 498
Joined: Thu Jun 12, 2003 3:04 pm
Location: Dunedin, New Zealand

Postby jota » Tue Mar 31, 2009 9:02 am

Thanks for the explanation, Darryl! Now I see. The labelling is perhaps a bit misleading so I guess I got a bit confused with that.

The thing is that I have this dataset for which an older version of PRESENCE indicated a satisfactory fit for the fixed model while the current version indicates evidence of lack of fit. So that's why I was wondering whether (and why) there had been any updates in the GOF test implemented in PRESENCE.

I attach here the summary of the GOF test outputted by PRESENCE:

version 2.2 <080317.1419>
------------------------------------------------------
Test Statistic = 7.0709
Probability of test statistic >= observed
from 999 parametric bootstraps = 0.3730
Average simulated Test Stat = 7.0833
Median simulated Test Stat = 6.5819
Estimate of c-hat = 0.9982 (=TestStat/AvgTestStat)
Estimate of c-hat = 1.0743 (=TestStat/MedianTestStat)
------------------------------------------------------

version 2.2 <090220.1554>
------------------------------------------------------
Test Statistic (data) = 7.1400
From 999 parametric bootstraps...
Probability of test statistic >= observed = 0.0110
Average simulated Test Stat = 3.7670
Median simulated Test Stat = 3.6365
Estimate of c-hat = 1.8954 (=TestStat/AvgTestStat)
Estimate of c-hat = 1.9634 (=TestStat/MedianTestStat)
------------------------------------------------------
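
For reference, the c-hat lines in both summaries can be reproduced from the printed statistics with a short Python sketch. It uses only the ratios shown in the output (TestStat/AvgTestStat and TestStat/MedianTestStat); the function name `c_hat` is just an illustrative choice.

```python
def c_hat(test_stat, reference_stat):
    """Overdispersion estimate: observed statistic over a bootstrap summary."""
    return test_stat / reference_stat


# (observed, average simulated, median simulated) copied from the two summaries.
runs = {
    "080317.1419": (7.0709, 7.0833, 6.5819),
    "090220.1554": (7.1400, 3.7670, 3.6365),
}

for build, (obs, avg, med) in runs.items():
    print(build, round(c_hat(obs, avg), 4), round(c_hat(obs, med), 4))
# 080317.1419 0.9982 1.0743
# 090220.1554 1.8954 1.9634
```

The observed statistic barely moves between the two builds (7.0709 vs 7.1400). What changes is the bootstrap distribution, whose average drops from 7.0833 to 3.7670.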
jota
 
Posts: 5
Joined: Mon Feb 23, 2009 12:02 pm
