What changes need to be made to the notes on dwc:occurrenceStatus? #238

Closed
baskaufs opened this issue Oct 21, 2019 · 22 comments

@baskaufs

baskaufs commented Oct 21, 2019

@qgroom I was reading the section on dwc:occurrenceStatus in the https://doi.org/10.3897/biss.3.38084 paper and it noted "We propose adding notes to the documentation of dwc:occurenceStatus, to point users to other status fields that might be appropriate for their needs.". However, the paper didn't suggest what the text should be. Suggestions?

Since the notes aren't normative, we won't necessarily need to go through the full change process. Here's what the Vocabulary Maintenance Specification says about this kind of situation:

Because non-normative content provides only supplemental information, the Interest Group may use its discretion to decide the extent to which the community should be involved in implementing changes to non-normative content. For example, relatively cosmetic changes, such as improving figures, changing formatting, minor improvements to examples, etc. can be made without triggering any change process or notification via the TDWG email list [TDWG-CONTENT]. More significant changes or improvements to non-normative content may warrant notification of the community via the TDWG email list [TDWG-CONTENT]. If the Interest Group determines that proposed changes to non-normative content are significant enough, it may chose to invoke the full change process. Substantive changes to non-normative content will usually trigger a version change for the affected document.

My guess is that, since we aren't really telling people how to use the term differently (just how to use it correctly), an appropriate course of action would be to add the clarifying notes, then inform the community via tdwg-content. The change process (public comment, executive review, etc.) would probably not be necessary, but that would depend somewhat on exactly what the new comments say.

@baskaufs baskaufs added this to the TDWG 2019 milestone Oct 21, 2019
@peterdesmet peterdesmet removed this from the TDWG 2019 milestone Oct 21, 2019
@qgroom
Member

qgroom commented Oct 27, 2019

@baskaufs Below are the clarifications I would like to make about dwc:occurrenceStatus

  • dwc:occurrenceStatus has no meaning unless it is combined with clear spatial and temporal boundaries. Something has to be present or absent somewhere and over a period.
    -- Therefore, absence has no meaning for point observations with a coordinateUncertaintyInMeters.
    -- Spatial boundaries can be provided by several Darwin Core terms including waterBody, island, country, county, stateProvince, municipality, locality and footprintWKT
    -- Temporal boundaries are perhaps best provided by eventDate. ISO 8601 supports date ranges.
  • Breeding status can be encoded under dwc:reproductiveCondition
  • IUCN threat status of the organism, or some other red-listing, can be encoded using the Species Distribution extension (http://rs.gbif.org/extension/gbif/1.0/distribution.xml).
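
A minimal sketch of the first clarification, in Python. It checks whether a record marked "absent" states both where and when the absence applies. The function name `has_absence_bounds`, the sample values, and the rule that a temporal boundary is an ISO 8601 interval written as start/end in eventDate are assumptions made for illustration; they are not part of Darwin Core.

```python
# Darwin Core terms that the list above names as possible spatial boundaries.
SPATIAL_TERMS = (
    "waterBody", "island", "country", "county", "stateProvince",
    "municipality", "locality", "footprintWKT",
)


def has_absence_bounds(record):
    """Return True if an 'absent' record states where and when (hypothetical check).

    Assumption: a temporal boundary is an ISO 8601 interval in eventDate,
    written as start/end, e.g. "2019-06-01/2019-06-30".
    """
    if record.get("occurrenceStatus", "").lower() != "absent":
        return True  # the check only concerns absence records
    has_space = any(record.get(term) for term in SPATIAL_TERMS)
    start, sep, end = record.get("eventDate", "").partition("/")
    has_time = bool(sep and start and end)
    return has_space and has_time


bounded = {
    "occurrenceStatus": "absent",
    "island": "Example Island",  # made-up value
    "eventDate": "2019-06-01/2019-06-30",
}
unbounded = {
    "occurrenceStatus": "absent",
    "decimalLatitude": "50.828",
    "decimalLongitude": "4.579",
    "coordinateUncertaintyInMeters": "30",
    "eventDate": "2019-06-01",
}
print(has_absence_bounds(bounded))    # True
print(has_absence_bounds(unbounded))  # False
```

The second record fails only because, under the clarification as proposed, coordinates plus an uncertainty radius are not treated as a searched area.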

@baskaufs
Author

Thanks for this @qgroom. I'm assuming you meant to say:

"Temporal boundaries are perhaps best provided by eventDate. ISO 8601 supports date ranges." rather than "Spatial boundaries..."

@qgroom
Member

qgroom commented Oct 28, 2019

I've fixed it now

@albenson-usgs

I disagree with this statement: "Therefore, absence has no meaning for point observations with a coordinateUncertaintyInMeters." Modelers need information about when researchers look for a species and don't find it. This can and does happen at the point observation level. The species may not be absent from an entire waterBody, stateProvince, etc., but it might be "absent" (not detected) at that very specific location, and this is important to know. Take, for instance, a coral reef monitoring program that is looking for staghorn coral (or at least uses a methodology that would detect staghorn coral at the locations surveyed). They use the point line intercept method for their survey and detect staghorn coral at 5 points along the transect but not at the other 5. Documenting and sharing all ten point observations, and especially the ones where staghorn coral is absent, is critical.
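
To make the transect example concrete, here is a rough sketch in Python of how the ten point observations might be laid out as records. The eventID, the occurrenceID pattern, and the order of detections are invented for illustration; only the Darwin Core term names are real.

```python
# Ten points along one hypothetical transect; the species was detected at
# the first five points and not at the other five (values are invented).
detected = [True] * 5 + [False] * 5

records = [
    {
        "eventID": "transect-1",  # hypothetical identifier
        "occurrenceID": f"transect-1:point-{i + 1}",
        "vernacularName": "staghorn coral",
        "occurrenceStatus": "present" if seen else "absent",
    }
    for i, seen in enumerate(detected)
]

# The non-detections are kept as records of their own instead of being dropped.
absences = [r for r in records if r["occurrenceStatus"] == "absent"]
print(len(records), len(absences))  # 10 5
```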

@qgroom
Member

qgroom commented Dec 26, 2019

So what does an absence of a point observation mean? Could it be present 1 m away? Is it absent within the dwc:coordinateUncertaintyInMeters? Is it absent at that moment in time, and could it be present the day before or the day after?
A point observation is just a moment in time and space. It can't be used to predict absence more generally, which would be useful for modelling.

In the example you give the absences and presences are useful for estimating an abundance, but the whole survey has boundaries. If staghorn coral is absent from every point in the transect then that doesn't mean that coral is absent more extensively, you only know it is less abundant than the sensitivity of the method.
Yes, it is critical to document this information, but it only makes sense in the context of a bounded survey and your transect is such a bounded survey.

@albenson-usgs

I have not conducted species distribution modeling myself, but my understanding is that when you use presence-only data, you select pseudo-absences (also points) randomly throughout the area where the species was not seen. It seems to me that points where a species could have been detected but was not seen would be a better predictor of species distribution than pseudo-absences. But modelers can only do this if non-detections are reported. We of course never have perfect detection of species and have to make educated guesses for their distribution. I still posit that having documented point locations for non-detections is better than not including that information when we have it. Yes, the methodology needs to be documented extremely well also. But you can still have a point, in time and space, where you did not see a species if your methodology could have detected it.

@robgur

robgur commented Dec 26, 2019 via email

@tucotuco
Member

Am I missing something, or is the statement actually supposed to be, "Therefore, absence has no meaning for point observations without a coordinateUncertaintyInMeters"?

@qgroom
Member

qgroom commented Dec 26, 2019

Am I missing something, or is the statement actually supposed to be, "Therefore, absence has no meaning for point observations without a coordinateUncertaintyInMeters"?

No, the coordinateUncertaintyInMeters does not describe an area that was searched. It doesn't delimit the boundaries of an observation; it delimits the uncertainty of the coordinates associated with a point observation. The organism was presumably observed somewhere in that circle, but you don't know where, and you don't know where the observer was looking for that organism.

@qgroom
Member

qgroom commented Dec 26, 2019

BTW: It is worth noting that a "point" such as 50°49'41"N 4°34'43"E on the earth's surface actually describes a quadrangle with a width of about 26 m. Because the coordinates describe only the southwest corner of that quadrangle, the actual location of the organism can be beyond the coordinateUncertaintyInMeters from this corner. If the coordinate uncertainty is large and the precision is small this has little consequence, but this is not always the case.

@ArthurChapman

You are correct in saying that "I didn't see this at this point", but that point does still have an uncertainty. A point is NEVER just a point. Every point has an uncertainty and often an extent associated with it (though it may be very small). So in reality you are not saying that it doesn't occur in the totality of the area covered by the coordinateUncertaintyInMeters; what you are saying is that "I didn't detect the species at this point, however that point could be anywhere in the area covered by coordinateUncertaintyInMeters". Your coordinateUncertaintyInMeters may be very small if you are using a Differential GPS, or using PPP methodology, etc., but it all depends on how accurate/uncertain the point you are recording is.

@ArthurChapman

I agree that recording "absences" does require an area component (and a time component). But in reality, absences may be recorded/noted using any one of a number of methodologies, and the methodology used should also be recorded: a transect, for example, or a shape around a transect. Others have used methods whereby they search for and record presences of a species and note that there were no other species of that genus in the area where they collected, surmising that, as experts in that genus, they would have noticed and noted any other species of that genus present. I remember a paper by Winston Ponder on this subject many years ago.

@tucotuco
Member

tucotuco commented Dec 27, 2019 via email

@Tasilee

Tasilee commented Dec 27, 2019

I agree with @tucotuco. One issue that has not been stated explicitly is the discoverability of true absences (with some form of spatial extent for the sites) only after a suite of survey sites has been ‘evaluated’. As in something like “I recorded all tree species in a series of 10m plots positioned randomly across an ecosystem, and after evaluation, species present in some sites were noted as absent in others.”

Also note that some analytical methods (e.g., some SDMs) value TRUE (observed) absences over pseudo-absences. An SDM like MaxEnt will only deliver true probabilities of occurrence with observed absences.

@qgroom
Member

qgroom commented Dec 27, 2019

I think we all agree that to describe an absence you need clearly defined boundaries and preferably an explicit methodology, rather than one inferred from the observation coordinates.

@tucotuco
Member

tucotuco commented Dec 27, 2019 via email

@albenson-usgs

Ok question (not sure where else would be better to pose this question so apologies if this isn't the right place). I have a dataset (https://www.gbif.org/dataset/f56fb306-32e4-4b96-a381-6b87c186ad0f). It uses a stationary point count method for assessing reef fish (https://esajournals.onlinelibrary.wiley.com/doi/full/10.1002/fee.2144?campaign=wolearlyview). There are no absence records associated with this dataset as it's currently published. However, there is one event where no fish were seen. As it stands now this is documented as an event with no occurrences but I believe in effect this information will be lost to the data users. What would the recommendation be for how best to represent this information to an end user in GBIF?

@tucotuco
Member

tucotuco commented Jan 8, 2020 via email

@albenson-usgs

I'll put it over there. Sorry for the tardy response. Just discovered a bunch of Github notifications going to my spam folder in my old email system O_O

@tucotuco
Member

Ok question (not sure where else would be better to pose this question so apologies if this isn't the right place). I have a dataset (https://www.gbif.org/dataset/f56fb306-32e4-4b96-a381-6b87c186ad0f). It uses a stationary point count method for assessing reef fish (https://esajournals.onlinelibrary.wiley.com/doi/full/10.1002/fee.2144?campaign=wolearlyview). There are no absence records associated with this dataset as it's currently published. However, there is one event where no fish were seen. As it stands now this is documented as an event with no occurrences but I believe in effect this information will be lost to the data users. What would the recommendation be for how best to represent this information to an end user in GBIF?

Since this question doesn't seem to have surfaced anywhere else, I'll offer the following, especially following the recommended clarifications for individualCount and organismQuantity/organismQuantityType. I would generate one or more Occurrence records for the Event (as many as needed to capture the scope of the taxonomic target of observation) in which the individualCount is 0, the organismQuantity is 0, the organismQuantityType is "individuals", and the occurrenceStatus is "absent".
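
As a rough illustration of that recommendation, the following Python sketch builds one zero-count Occurrence record per target taxon for an Event in which nothing was seen. The function name `absence_records`, the event identifier, the occurrenceID pattern, and the use of "Actinopterygii" as the taxonomic scope are all assumptions made for the example; the four field values come from the recommendation above.

```python
def absence_records(event_id, target_taxa):
    """Build one zero-count Occurrence per target taxon for an Event with no sightings.

    The function name and the occurrenceID pattern are hypothetical; the
    count, quantity, quantity type and status follow the comment above.
    """
    return [
        {
            "eventID": event_id,
            "occurrenceID": f"{event_id}:{taxon}:absent",
            "scientificName": taxon,
            "individualCount": 0,
            "organismQuantity": 0,
            "organismQuantityType": "individuals",
            "occurrenceStatus": "absent",
        }
        for taxon in target_taxa
    ]


# "Actinopterygii" stands in for the taxonomic scope of a reef-fish survey (assumption).
rows = absence_records("event-42", ["Actinopterygii"])
print(rows[0]["occurrenceStatus"], rows[0]["individualCount"])  # absent 0
```

Passing several taxa yields several records, which matches the suggestion to create as many as are needed to capture the taxonomic target of the observation.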

@albenson-usgs

I did add it to the DwC Q&A: tdwg/dwc-qa#151

@tucotuco
Member

The usage notes recommended in this issue were added to term change proposal Issue #339. Closing this issue.
