GOF issue with songbird acoustic recording data

questions concerning analysis/theory using program PRESENCE

Postby northernbio » Wed Mar 27, 2013 10:03 am

Hi: I am using PRESENCE to estimate occupancy rates of 13 forest bird species based on interpretation of acoustic recordings. I have a maximum of 4 observations per site over a 2- to 5-day period, I collect site- and survey-specific detection covariates, and have about 290 sites. An example output is below.

For about half the species the GOF is good, but for the other half it is poor. In almost every case the detection histories that differ most from expected are 1111 (detected in every survey) and 1000 (detected only in the first survey, but not the rest). In both cases the observed number of sites with that history is higher than expected. I've tried many different models, and the results are always similar.

What might be the cause of the lack of fit? Could it be the movement pattern of the birds (the detection distance is 100 m, but territories could lie partially outside it)? Given that c-hat is generally < 6, how big an issue is this? Given the lack of fit, is my best bet to use the psi(.)p(.) estimate, the naive estimate, or to use the best model and adjust the occupancy estimate by c-hat?

Thanks in advance for your thoughts on this.

--Rob

Assessing Model Fit for Single-season model:
History(cohort) Observed Expected Chi-square
0000( 0 0) 219.0000 218.318790923 0.00
0010( 0 22) 5.0000 3.657488431 0.49
1000( 0 52) 13.0000 3.657488431 23.86
000-( 1 54) 7.0000 7.871499268 0.10
00--( 2 80) 1.0000 0.814831002 0.04
0100( 0 120) 3.0000 3.657488431 0.12
0011( 0 126) 3.0000 4.148574702 0.32
0001( 0 129) 6.0000 3.657488431 1.50
1011( 0 139) 2.0000 4.705598495 1.56
011-( 1 144) 2.0000 0.313977773 9.05
010-( 1 145) 1.0000 0.276810749 1.89
0111( 0 151) 4.0000 4.705598495 0.11
1010( 0 176) 2.0000 4.148574702 1.11
0110( 0 183) 1.0000 4.148574702 2.39
1101( 0 187) 3.0000 4.705598495 0.62
1111( 0 193) 15.0000 5.337413158 17.49
1100( 0 194) 3.0000 4.148574702 0.32
1110( 0 275) 3.0000 4.705598495 0.62
Test Statistic = 71.6081 min(expect)=2.768107e-001
------------------------------------------------------
Test Statistic (data) = 71.6081
From 10000 parametric bootstraps...
Probability of test statistic >= observed = 0.0004
Estimate of c-hat = 3.0952 (=TestStat/AvgTestStat)
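(For anyone reading along: each line's chi-square column is the usual Pearson term, (Observed - Expected)^2 / Expected. A minimal sketch reproducing the two big contributions above, using the values straight from the output; the full PRESENCE statistic also includes terms for histories that were never observed, which is presumably why the listed contributions sum to less than 71.61.)

```python
# Pearson chi-square contributions for the two poorly fitting histories,
# using the Observed/Expected values from the PRESENCE output above.

def chisq_term(observed, expected):
    """One history's contribution to the Pearson chi-square statistic."""
    return (observed - expected) ** 2 / expected

# History 1000: observed 13, expected ~3.66
print(round(chisq_term(13.0, 3.657488431), 2))   # 23.86, matches the output
# History 1111: observed 15, expected ~5.34
print(round(chisq_term(15.0, 5.337413158), 2))   # 17.49, matches the output
```

Together these two histories contribute roughly 41 of the 71.6 total, which is why they dominate the lack of fit.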

Re: GOF issue with songbird acoustic recording data

Postby jhines » Wed Mar 27, 2013 10:14 am

Hi Rob,

Are you doing the GOF test on the simple model, psi(.)p(.)? It is recommended that you run the GOF test on the most general model in your model set. So, if you have no covariates, I'd suggest doing it on psi(.)p(t). That one should do much better than the one for psi(.)p(.).

Any of the model estimates should be better than the naive estimate, but I'd use the best model and adjust the variances by c-hat (c-hat shouldn't affect the occupancy estimates themselves). You'd want c-hat to be smaller; 6 is pretty high. So, I'd try the psi(.)p(t) model and/or a model with covariates, which should fit better.
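If it helps to see the machinery, here is a minimal sketch of the psi(.)p(t) cell probabilities that sit behind a GOF table like the one above. The parameter values are made up for illustration, not estimates from Rob's data:

```python
from itertools import product

def history_prob(h, psi, p):
    """Probability of detection history h (tuple of 0/1) under psi(.)p(t).
    Any detection implies the site is occupied; the all-zero history is
    occupied-but-missed plus unoccupied."""
    detect = 1.0
    for ht, pt in zip(h, p):
        detect *= pt if ht else (1.0 - pt)
    if any(h):
        return psi * detect
    return psi * detect + (1.0 - psi)

psi = 0.30                     # illustrative occupancy probability
p = [0.40, 0.50, 0.45, 0.50]   # illustrative survey-specific detection probs

# Probabilities over all 16 possible 4-survey histories must sum to 1
total = sum(history_prob(h, psi, p) for h in product((0, 1), repeat=4))
print(round(total, 10))  # 1.0
```

Expected frequencies like those in the table are just these cell probabilities multiplied by the number of sites.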

Jim

Re: GOF issue with songbird acoustic recording data

Postby northernbio » Wed Mar 27, 2013 10:53 am

Hi Jim:

Thanks for your reply. Yes, I have survey-specific detection covariates, including temperature, humidity, wind level, and recording quality, plus site-specific vegetation density. Candidate models include single and multiple covariates. I initially ran the GOF on the least parsimonious of the candidate models. Because the 1st, 2nd, 3rd, and 4th surveys can occur on different days at different sites, I cannot use a survey-specific detection-rate model; instead I use covariates for the day of the survey.

Given that the same problem occurs for about 5 species, I am beginning to suspect some sort of bias in my data. I wonder what could cause higher-than-expected frequencies of the 1111 history.

Rob

Re: GOF issue with songbird acoustic recording data

Postby jhines » Wed Mar 27, 2013 11:05 am

So, which model did you do the GOF on? Usually it should be the most general one (with the most parameters), but if you have several equally general models with the same number of parameters but different covariates, then I'd suggest using the one with the smallest -2log-likelihood value (i.e., the best-fitting one).

Jim

Re: GOF issue with songbird acoustic recording data

Postby northernbio » Wed Mar 27, 2013 11:23 am

Okay, I hadn't fully figured out how to select the most appropriate model for the GOF test, so I have explored GOF on a few of the models. Of the candidate models I evaluate, there is usually a subset of plausible models (often with a breakpoint in the AIC). I was doing the GOF on the most parameterized of this subset of models. When I had a low GOF I also looked at the least parameterized model to see if I got different results. I like your idea of using the smallest -2log-likelihood.

In several cases, even though GOF fails (probability of test statistic >= observed = 0.0000), c-hat may be in the range of 2-3. At what magnitude of c-hat do you generally adjust your error levels?

I am also thinking that perhaps I should have randomly selected recordings instead of picking the best recordings. Maybe some sort of bias got introduced.

Rob

Re: GOF issue with songbird acoustic recording data

Postby jhines » Wed Mar 27, 2013 11:52 am

A c-hat of 1.0 won't change the variances at all. Any value > 1.0 will inflate the variances, so I would suggest adjusting the variances whenever c-hat > 1. I don't recommend adjusting variances if c-hat < 1.0, though.
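To make the adjustment concrete, here is a small sketch with a hypothetical estimate and standard error: variances scale by c-hat, so standard errors are inflated by sqrt(c-hat), and model selection is done with QAIC in place of AIC:

```python
import math

def adjust_se(se, c_hat):
    """Inflate a standard error for overdispersion; variance scales by c-hat."""
    return se * math.sqrt(c_hat)

def qaic(neg2_loglik, n_params, c_hat):
    """Quasi-AIC: -2log(L) divided by c-hat, plus the usual 2K penalty."""
    return neg2_loglik / c_hat + 2 * n_params

# Hypothetical example: SE = 0.06 on psi-hat, c-hat = 3.10
se_adj = adjust_se(0.06, 3.10)
print(round(se_adj, 4))  # 0.1056, i.e. roughly 1.76x wider intervals
```

With a c-hat of 2-3 the confidence intervals widen by a factor of about 1.4-1.7, which is why fit still matters even when the point estimates don't change.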

In many cases, the model with the most parameters (most general) will be quite far from the best model in the model selection table. You should still use that model for GOF.

I agree that non-randomly selected recordings might be influencing the results.

Jim

Re: GOF issue with songbird acoustic recording data

Postby northernbio » Wed Mar 27, 2013 12:01 pm

Thanks again for your reply! Not sure if this helps the discussion, but I've discovered that for the low-GOF species I get a considerable improvement if I use the 2-group predefined model. In some cases there is a big improvement with the survey-specific p version, in other cases with constant p. Given the big increase in model likelihood, do you think using the 2-group approach would be justified? I am not sure of all the assumptions of that model. Also, I don't think I can calculate a GOF for those models.
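For reference, the 2-group predefined model is a finite-mixture detection model: each occupied site falls in one of two latent classes, each with its own detection probability. A sketch with hypothetical parameter values, showing how a mixture shifts probability toward the all-detection and few-detection histories:

```python
from itertools import product

def mixture_history_prob(h, psi, pi1, p1, p2):
    """Detection-history probability under a 2-group mixture:
    an occupied site is in class 1 (detection prob p1) with probability pi1,
    otherwise in class 2 (detection prob p2); constant p within each class."""
    def cond(p):
        prob = 1.0
        for ht in h:
            prob *= p if ht else (1.0 - p)
        return prob
    occ = pi1 * cond(p1) + (1.0 - pi1) * cond(p2)
    if any(h):
        return psi * occ
    return psi * occ + (1.0 - psi)

psi, pi1, p1, p2 = 0.35, 0.4, 0.9, 0.15   # hypothetical values

probs = {h: mixture_history_prob(h, psi, pi1, p1, p2)
         for h in product((0, 1), repeat=4)}
print(round(sum(probs.values()), 10))  # 1.0 -- sanity check
# The high-p class concentrates mass on 1111, the low-p class on histories
# with few detections, mimicking the observed excess at both extremes.
```

This is why a mixture can soak up exactly the kind of lack of fit seen in Rob's GOF table, though (as discussed below) it doesn't identify the biological source of the heterogeneity.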

Rob

Re: GOF issue with songbird acoustic recording data

Postby murray.efford » Thu Mar 28, 2013 3:31 pm

The elephant in the room here is that it is almost certainly inappropriate to apply simple occupancy methods to data from point recordings. There, someone had to say it! You assert a detection distance of 100 m, but I don't see how you can actually apply that cutoff to sound recordings, and birds are free to move in and out between days, so occupancy is ill-defined on both counts. Even if you could define a radius, there would be substantial heterogeneity of within-day detection for birds within that radius so estimates are subject to heterogeneity bias. Heterogeneity might in principle be addressed with a mixture model, but you have only 4 occasions and that leaves open the larger question of what you are measuring (noting that this will vary between species). Perhaps after thought and simulation you can demonstrate that these factors are negligible in your particular case, but prima facie this is the wrong analysis.
Murray

Re: GOF issue with songbird acoustic recording data

Postby darryl » Fri Mar 29, 2013 6:37 pm

Hi Murray,
I'm going to (partially) disagree with you on this. I don't think it's necessarily inappropriate to use occupancy models with point recordings, depending on how the results are to be interpreted. Firstly, you have to keep in mind that with occupancy approaches we're really sampling landscapes, not wildlife populations. The presence or absence of the target species at different places on the landscape is just some characteristic of those places, like vegetation type, elevation, etc. This twist can take a while for people to appreciate.

Movement of individuals in and out of the detection radius (whatever that distance is) is largely a non-issue provided 'occupancy' is interpreted as 'use', i.e., the bird species may not be there all of the time, but it uses that location at some point during the sampling period. You also have to keep in mind that occupancy is a species-level measure, not an individual-level one.

Within-day heterogeneity among individual birds is also unlikely to be much of an issue because the recordings have been aggregated to species-detected-or-not each day. So it doesn't matter whether particular individuals are more or less likely to be detected at different points during the day; it's the probability of detecting at least one of those individuals at some point during the day that's at issue. If the within-day movement patterns of the different individuals are relatively consistent, could be considered random, or can be explained with covariates (e.g., a day of greater activity because it was warmer), then the day-to-day probability of detecting the species at a point should be OK, and any variation can be dealt with using the usual approaches. Heterogeneity between points, however, is more of a concern because it will cause biases if not dealt with appropriately.

The biggest issue with point-detection sampling (point recordings, point counts, camera traps) is the effective area being sampled by the method or device. I agree that it's often difficult if not impossible to define; while a nominal radius of 100 m may be claimed, for some species it may be quite a bit greater or smaller than that. This is going to be an issue if you want to interpret the measure of occupancy (or use) in terms of an area, because we don't know what area should be assigned to each survey point. But if you're willing to interpret the occupancy measure in terms of 'fraction of points' rather than 'fraction of area', I think occupancy models are still a reasonable approach to take. Yes, there is some fuzziness in terms of interpretation, but as long as users are aware of the limitations, it still beats the alternatives.

Of course, this thought process should have been gone through well before stepping into the field, and if people couldn't live with the interpretations given the field methods they wanted to use, they would need to find alternative analysis or field methods that would give them the information they require.

Cheers
Darryl

Re: GOF issue with songbird acoustic recording data

Postby murray.efford » Fri Mar 29, 2013 8:31 pm

The key point is that when plots are undefined, the parameters occupancy and species detectability are also undefined (plots may be points if animals move).

Yes, occupancy concerns the aggregate presence or absence of a species, but detection relies on detecting individuals, and modelling individual detection and behaviour provides quantitative insight into the species-level detection issues. I assumed, perhaps wrongly, that each of Rob's recordings was relatively short – say 10 minutes. In this case, the effect of individual location on heterogeneity of species detection, and hence bias, can be large (Fig. 5, Efford and Dawson 2012, http://dx.doi.org/10.1890/ES11-00308.1). The problem becomes crippling with large undefined plots, of which sound recordings are an example. We don't really know what happens in longer recordings, but I doubt the problem goes away.

Darryl wrote: "interpret the occupancy measure in terms of 'fractions of points' rather than 'fraction of area'"

Huh? This sounds like a plea to accept at face value an operational measure contaminated by ambient conditions, habitat effects on sound attenuation, species differences in vocalisation, territory size etc. I don’t think that’s what Rob thought he was buying.

Murray

PS 'effective' sampling area is a slippery concept: in distance sampling and SECR it is the notional area that would result in an unbiased estimate of density when divided into the number of individuals detected (i.e. it includes both incomplete detection and spatial extent); in conventional capture-recapture it is the notional area that gives an unbiased estimate of density when divided into N-hat; in occupancy it might also be defined in relation to either the observed presences or the estimated total presences. In none of these does it correspond to a tangible area on the ground, despite our history of drawing lines around trapping grids.