I am analyzing a relatively large data set of ~4,000 study sites. Sampling effort ranges from 3 to 21 occasions per site (mean = 12 visits), so although this is a large data set, there are a number of missing observations. Occurrence data are collected for a number of species ranging from relatively rare to ubiquitous. I have 5 site-level covariates and 5 sampling covariates I am interested in testing. At the moment, I am running single-season occupancy models for multiple species. For one such ubiquitous species, I found a naïve occupancy estimate of 0.97 and a detection probability of 0.60. I am interested in assessing model fit and received the following results using 500 bootstraps for the c-hat estimate:
Test Statistic (data) = 2784554169.5975
From 500 parametric bootstraps...
Probability of test statistic >= observed = 0.0020
Average simulated Test Stat = 7125.4114
Median simulated Test Stat = 4626.2077
Estimate of c-hat = 390792.0533 (=TestStat/AvgTestStat)
Estimate of c-hat = 601908.5872 (=TestStat/MedianTestStat)
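For reference, the arithmetic behind the output above (the bootstrap p-value and the two c-hat ratios) can be sketched as follows; the simulated statistics here are made-up placeholder values, not the actual bootstrap results, and the variable names are my own:

```python
import numpy as np

# Placeholder bootstrap summary: in practice the simulated test statistics
# come from the parametric bootstrap itself; values here are illustrative only.
rng = np.random.default_rng(42)
sim_stats = rng.chisquare(df=50, size=500) * 100  # 500 simulated test statistics
obs_stat = 2784554169.5975                        # observed test statistic (from output above)

# Bootstrap p-value: proportion of simulated statistics >= observed
p_value = np.mean(sim_stats >= obs_stat)

# c-hat as reported: observed statistic divided by the mean
# (or median) of the simulated test statistics
c_hat_mean = obs_stat / np.mean(sim_stats)
c_hat_median = obs_stat / np.median(sim_stats)

print(p_value, c_hat_mean, c_hat_median)
```

This makes the problem visible: c-hat blows up whenever the observed statistic is many orders of magnitude larger than the typical simulated one, regardless of how many bootstraps are run.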
What could lead to such an astronomical estimate of c-hat? Sample size? Missing observations? Model complexity? I came across a past post by Darryl suggesting a minimum of 10,000 bootstraps. Am I using too few bootstraps? Any suggestions and/or thoughts would be very much appreciated. Thank you.