Change term - occurrenceStatus #339
Comments
To facilitate use with the controlled vocabulary proposed in #342, Change Due to the simplicity of the CV, I would leave the examples so as to not require people who know English to follow the link to the CV page. However, note that this term exists in both the |
For those who wonder why defining a controlled vocabulary is necessary for a term as simple as this, it is only simple for those whose native language is English. The establishment of a formal controlled vocabulary provides the framework for linking labels and definitions in many languages to the controlled value terms. Establishing the controlled vocabulary has no effect on the recommended controlled value strings, which are identical to those already in common use. |
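A minimal sketch of the idea in the comment above: the controlled value strings stay fixed while labels in other languages map back to them. The `LABELS` table, the function name, and the Spanish labels are assumptions made for illustration; they are not taken from the vocabulary proposed in #342.

```python
# Hypothetical label table: controlled value string -> label per language code.
# The non-English labels are illustrative assumptions, not part of any published vocabulary.
LABELS = {
    "present": {"en": "present", "es": "presente"},
    "absent": {"en": "absent", "es": "ausente"},
}


def to_controlled_value(label: str, lang: str) -> str:
    """Return the controlled value string whose label in `lang` matches `label`."""
    wanted = label.strip().lower()
    for value, labels in LABELS.items():
        if labels.get(lang, "").lower() == wanted:
            return value
    raise ValueError(f"no controlled value for {label!r} in language {lang!r}")
```

Under these assumptions, a data entry form could show localized labels while the stored value remains one of the two recommended strings.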
Has an existing |
I am sort of okay with the change in definition, given that the Darwin Core definition of Organism can mean all members of a Taxon in a given area (and 'absent' only makes sense in that way), but I do not agree with the usage comments or the extremely restricted vocabulary (I will address that in #342). The comment that threat status is better dealt with in the Species Distribution Extension, while I agree with it, is neither here nor there, as the Species Distribution Extension goes with a Taxon Core and, when using a Taxon Core, |
@nielsklazenga The vocabulary proposal here is only a formalization of what the precise vocabulary restriction was meant to be from inception. The term was only ever meant to be to distinguish presence from absence. I substituted Organism for Occurrence, because it isn't an Occurrence that was present (or absent), but more precisely an Organism was present, or any instance of an organism of a given Taxon was absent. The bottom line is that the vocabulary isn't really open for dispute unless you propose to change the semantics of the term. The usage comments were to nail down how to deal with things that people might confuse as occurrenceStatus. |
@tucotuco, does not the definition of Organism in Darwin Core make any entry in a species checklist or flora etc. more or less an Occurrence? I agree that the terms related to breeding status and threat status, which were explicitly excluded in the usage comments should indeed not be in |
I commented on this here. |
Sorry, but there is so much to catch up on here. Discussion relevant to this proposal has developed in two distinct issues, this one and the proposal for the controlled vocabulary for Is the proposed definition of occurrenceStatus satisfactory? Controversial To me (also beyond the normative definition), a The actual definition of a Darwin Core I think this definition is an underlying source of a real problem, because it reflects the superficial way in which we talk about what a The rest of the controversy seems to be around the limits on the scope of an The definition of the term An aside: To accept the entire continuum of scope as valid has a side benefit that it alleviates the arbitrary distinction of an Concretely, I suggest to amend the proposed definition of If this is an acceptable definition, the concept The concept of doubt as another vocabulary term is controversial in discussions. This concept (along with all others that are not There are many ways in which the specific nature of doubt can be expressed independent of a statement about presence or absence. The nature or source of the doubt would be lost if all that was left of it was the value
Independent doubt about presence is a strange concept. If I don't know if I detected something or not, why am I bothering to say so? What knowledge am I imparting? I think the concept that is missing is not doubt at all, but rather a measure of certainty (likelihood). I can see that it might be useful to know that it is unlikely that something was at a given place and time, especially for large scales of place and time, because it can impart broadly useful expert knowledge. We don't have another way to capture that in Darwin Core that isn't essentially an I don't see categories of likelihoods as good candidates for values in a controlled vocabulary. No one would really know what they mean. It seems more the stuff of summary analysis and would be a confounding value for On to the rest of the summary... Should the values 'present' and 'absent' be in the controlled vocabulary? 'Yes' Is the proposed definition of Is the proposed definition of Should there be additional values in the controlled vocabulary? Controversial |
Many thanks, @tucotuco. One note of protocol for epic posts: always best to warn @timrobertson100 to go get a cup of tea up at the top. I like how you addressed the relationship between
I agree this is true in principle, but maybe not so much in practice. We wrestled with this when
Hence my earlier comment that we should treat the scope of an That aside...
You addressed the |
To clarify, the proposal when the public review opened was:
which, based on discussion so far, I proposed to amend to the following:
I maintain that a Taxon is a theoretical construct and an Organism is not, so that there is no way to have a continuum from one to another and no issue about where to draw the line in that respect. With respect to the scope of Organism, however, I don't feel there is a defensible line that can be drawn, nor can I imagine what is to be gained by making one. I guess I am not very creative. Clearly if I was a taxonomist I would be a lumper. |
Thanks @tucotuco , I am a taxonomist and I have no problem at all with your characterisation of a Taxon. Our disagreement is in whether or not there can be an occurrence (of an Organism/Taxon, does not matter) independent of an (dwc:) Event.
The last proposed definition:
..., which I agree reflects one side of the discussion, mainly in issue #342, feels to me less like clarification and more like hijacking of a term that works perfectly well elsewhere (and does not need extra clarity) for a different purpose. |
What worries me (deeply) is precisely that "for a different purpose". If it is really a different purpose, then the semantics must necessarily be different, and as soon as we go into a semantic framework the term will break down. The different purpose has different semantics and should have a distinct term. |
@tucotuco :
With this statement:
I certainly agree with the latter, but I guess the problem I have with the former statement is that ultimately "wolf pack" and "taxon" are at two different ends of a continuum of "organisms implicitly or explicitly included within a circumscribed set". The main difference is that an Perhaps your thinking is that I have no problem using instances of The problem I have is in representing a statement like "this taxon is introduced in Hawaii". Such a statement is bounded in space ("Hawaii"). It's also bounded in time (i.e., the time range starting when the first organism of that taxon occurred in Hawaii, and ending at the time when the assertion is made). But the trouble I have is how to represent the I guess there are several questions in play here; namely how, using dwc terms, could one represent statements such as:
|
Hi @deepreef , I think this goes with something I have seen you write in another issue, that taxa cannot be observed, and with what I said, that an Organism will never become a Taxon. A taxon is a human construct that you superimpose over the organisms that are observed. The same organism can, to different observers, belong to different taxa and the same taxon name can be used for different Organisms with an organism scope of 'taxon'. With populations you do not have this problem, even if people will have different opinions on what populations are, as they do not have the taxonomic baggage. |
That is actually exactly the point I was trying to make. I think both use cases fit the semantics of I have never said that I do not mind the change from 'Taxon' to 'Organism' in the original proposal. I do not think it is necessary, but I can see where it is coming from. I would also like a controlled vocabulary on I also think we can improve the definitions of 'present' and 'absent', but I suggest we go with the dictionary definitions rather than something that skews the meaning toward some use cases at the expense of others. |
Perhaps, but it's still a continuum along a scale of "sets of circumscribed organisms", with the only difference being whether or not a |
@deepreef , yes exactly. It is never going to be easy. |
Now that I've had several showers, traffic jams, and ceiling-staring sessions to contemplate some of the points raised by @tucotuco, I'd like to explore this statement a bit more:
As noted here and elsewhere, I'm on board with the "every member of a Taxon and any scale" part. What I've been thinking more about is the "recommended practice would be to provide the scope explicitly" part. Specifically, I'm wondering whether we might want to clarify the definition, comments and examples for I actually think the definition is fine as is, so perhaps all that would be needed is (non-normative?) alterations to the Comments and examples. In the Comments, I think the statement "This term is not intended to be used to specify a type of taxon." is correct, but potentially could be misinterpreted. Perhaps something more like this would be better:
Also, perhaps another example or two could help clarify what sorts of terms might be included on a controlled vocabulary. At the very least I would add "population". I'm also a little uneasy with the inclusion of "multicellular organism", "virus" and "clone" in the list of Examples. I'm sure there was some rationale/use-case for including those terms (it's entirely possible I either proposed or actively supported their inclusion -- I can't remember), but they seem a little out of place in the context of the definition, and aren't mutually exclusive (e.g., could not an I'm not sure if this warrants a new Change Term issue. |
This proposal has been labelled as controversial. If no evidence of consensus can be reached by the 30-day minimum review period, the proposal will be deferred for later consideration. If there is evidence that a consensus can be reached, the review period will be extended for an additional 30 days from the time apparent consensus is established (everyone participating in the discussion expresses their satisfaction with the proposed solution). |
I would like to try to summarize the controversy. I maintain that an Organism and a Taxon are conceptually distinct. One is the manifestation of a theory and the other a manifestation of biological entities. There is no continuum leading from one into the other and one is not semantically a subtype of the other, because theories are not living beings. Stated in another way, the attributes of a Taxon do not apply to an Organism, nor vice versa. It isn't clear that there is agreement on this much. But there is more. Even if everyone accepts that the two classes are semantically distinct, I am of the opinion that the distinctness means that the term should not be a property of both classes. Granted, in Darwin Core we do not make the formal assignment of properties to Classes, we merely annotate the properties to be "organized in" a class. This was done on purpose in the absence of a rigorous community-wide conceptual schema for the Classes we manifest in various contexts. For Darwin Core, the conceptual schema was expected to be a separate exercise, after which formal assignments of properties to Classes could be made in the future. That future still hasn't arrived. Thus, whereas nothing in Darwin Core prevents a term from being used as if it were a property of any Class, nor of it being assigned to no class at all (see Record-level terms), it is not a good way to prepare for a semantically rigorous future. Herein lies my principal objection. It isn't clear if there is any agreement about my position on this. If there is, it suggests that two distinct terms are necessary - one for Occurrence and one for Taxon. Even if there is agreement on that last point we have contention built off the legacy built around the term to date. The name of the term and its organizational placement in Occurrence suggest that the term should apply to Occurrences only. That was indeed the (only) original purpose of the term, and the reason for the two original examples only. 
But the original definition is at odds with all the foregoing, being guilty of laziness in the use of the word Taxon and therefore opening the door for it to be used in a different way. Since the door was open, it was used in a different way. It was incorporated into the Species Distribution Extension, but it was given an entirely different definition ( For my part, I would be perfectly happy to help with the correction of downstream problems that arose with the Species Distribution Extension re-defining the term for its own purposes, but I think the only reasonable way forward with that is to use a different term in that extension, whether or not that term (or its recommended controlled vocabulary) is also incorporated into Darwin Core. Though I feel strongly about all of this, I recognize my part in creating the problem to begin with. I also recognize that I have to put my role as convenor and mediator before my role as a stakeholder in the community, so if I am the only one that has the viewpoints I have expressed, I am happy to sequester my objections from consideration of achieving consensus and let the proposal move forward without them. Even so, I am not sure we have achieved a clear consensus. Thoughts? |
I would like to put forward my agreement with what @tucotuco has laid out and voice my support for |
OK, I could quibble with this, but only in an effort to play Devil's advocate (to wit, both Organism and Taxon classes could be construed as conceptually identical in the sense of "circumscribed set of one or more living things"; the only real difference being the formality of the label). HOWEVER, in the context of DwC (not to mention "common sense"), I am in full support of what @tucotuco asserts above. My only concern is that a single instance of
I'm pretty sure I'm the only one who floated the idea that
I completely agree with this opinion.
I'm still unclear on why we need something like this to be a property of a
Like @albenson-usgs, I'm in agreement with your main points, and am likewise OK with |
Though the pragmatic solution proposed by @timrobertson100 takes occurrenceStatus in the exact opposite direction to the proposed changes in this issue, it would be a non-normative change with no bearing on existing implementations, and could be adopted instead of the original proposal without need for public review. |
If all are fine with the non-normative change proposed by @timrobertson100, I can live with that solution as well. However, I am very strongly in favor of the original direction @tucotuco had been pushing this. I desperately hope we (TDWG community) are moving in the direction of @tucotuco's "bigger dream", and that following the proposal of @timrobertson100 now is understood to be only a pragmatic solution to simplify the path immediately in front of us, and that this issue will need to be resolved more robustly in the (not-too-distant) future. Moreover, I hope that this pragmatic solution does not preclude @tucotuco's suggestion for (and my endorsement of) forming a task group to come up with a more appropriate and robust solution. That's my short response. Much of my long response was already written, but lost when I closed the browser before clicking the "Comment" button yesterday. As I indicated in my previous post, a lot of it relates to the idea of evidence-based checklists, and occurrences being the inversion of existing taxonomic checklists. At the beginning of DwC, the predecessor of what is now called That transformation from This discussion had really given me hope that we had achieved critical mass to take that step. But perhaps we're not quite there yet. If we can follow through with a Task group (perhaps integrated with the proposed task group for #314), then I have at least some hope for keeping the dream alive. @tucotuco : This issue might not have broken the camel's back (yet), but I hope at least the camel is in serious need of the services of a chiropractor. |
Just to make myself really clear, I just want a property that I can use on taxon-area statements; it does not have to be There is an analogy with the "establishment means" terms here as well, as I think this is still a simple choice between one or two terms. I fail to see how the data model comes into it, or how anything @timrobertson100 suggested goes against the "bigger dream" of "Darwin core as a true information model for biodiversity", although I am not sure what 'true' means here. |
Not to belabor, but what I meant was that we come to a clear agreement on what actual entity is meant for an instance of the This sort of imprecision is tolerable so long as DwC is a "bag of terms" that are merely "organized" in classes. What I meant by the "bigger dream" of "Darwin core as a true information model for biodiversity" is that each DwC class can achieve a precise definition in the context of an information model, and each term can be clearly mapped as a property of instances in exactly one of those precisely-defined classes. For example, right now, a unique That degree of ambiguity in the conceptual entity represented by an instance uniquely identified by |
I get that we need to distinguish between the primary occurrence data and the distribution data that is inferred from it, but what has it got to do with |
From my perspective: it's not helpful to have a single term serve to represent a property of more than one class of "thing". As @tucotuco outlined, this particular term has a mixed interpretation (both through its definition history and its actual usage history) for application to both "primary" Occurrence data and taxon distribution data. I think the goal would be to clarify its definition and usage to be more consistent to a single purpose. |
I have already said that I am fine with two terms. My understanding of the nature of absence data is not enough to argue one way or another. It seems to me though that some people who want to make the change are uncomfortable with it. You cannot bring the data model into it though, as then you would have to do the same for several other terms.
A "specimen" is an instance of
That is diametrically opposite to the Standard Maintenance Specification, as well as circular, as you can always create a superclass that a property uniquely applies to. |
I guess we'll just have to agree to disagree on some of these things. But I definitely agree with this:
Keeping in mind, of course, that the metaphorical broken camel's back is a good thing (i.e., progress...), in this context. |
Hoping not to cause yet more trouble, may I ask how an absence occurrence should be interpreted if it contains further properties than just a location, time and "subject"? If the Occurrence specifies a chicken with sex=female and basisOfRecord=PreservedSpecimen, does that mean there can be male chicken specimens around? It would make absence data very hard to use. Should that be something to add to comments? |
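To make the question above concrete, here is a small sketch of an absence record that carries extra properties. The split between "core" terms and narrowing terms, the function name, and the sample values (including the scientific name) are assumptions made for illustration; Darwin Core itself defines no such split.

```python
# Terms that state what, where, and when. Treating everything else as a
# qualifier of the absence is an assumption of this sketch, not a Darwin Core rule.
CORE_TERMS = {"occurrenceStatus", "scientificName", "eventDate", "locality"}


def narrowing_terms(record: dict) -> list:
    """List the extra terms that qualify an absence record, sorted by name."""
    if record.get("occurrenceStatus") != "absent":
        return []
    return sorted(term for term in record if term not in CORE_TERMS)


# Placeholder record mirroring the chicken example in the question.
record = {
    "occurrenceStatus": "absent",
    "scientificName": "Gallus gallus",
    "sex": "female",
    "basisOfRecord": "PreservedSpecimen",
}
print(narrowing_terms(record))
```

A consumer reading this record cannot tell, from the data alone, whether the absence applies to the taxon as a whole or only to female preserved specimens, which is the ambiguity the question raises.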
|
Just to elaborate a bit, the underlying assumption is that when an instance of I also fully agree that a |
@timrobertson100 removed duplicate post |
...and one more minor elaboration. A preserved specimen that is no longer in a collection can be indicated using the term disposition. That is a different kind of absence, not in the Occurrence, but in the material in a physical collection. |
So... before I had a better idea of the distinction between The concept actually worked extremely elegantly as a way of tracking a specimen through space and time -- no different than, say, a satellite tag affixed to a cougar or whale or shark or something. Each Both of these weird ways of looking at things:
dramatically simplified the data model and provided some very cool ways of parsing and representing patterns in the data. However, of the two, I think the first still has merit, but the second breaks down when we clarify the distinction between Again, I digress... |
We see two cases where
I agree with @mdoering that expanding the comments to guide users when combining other terms is needed. We've identified (Aside - I feel much of the discussion on this thread belongs elsewhere to attract the visibility it deserves. There are good remarks that will help DwC but this thread should focus on comments, and ideally proposals, that help refine |
For 1 the How did it die? Task Group and a possible new term
As a relative newcomer to TDWG, where is that exactly? I'm not trying to be difficult and I don't disagree that this has veered far from the original purpose. I've just had a hard time figuring out where the core discussions happen around the standard. Some discussions seem to happen in the GBIF Github, some discussions in the OBIS Github, some (but fewer) discussions in the TDWG Q & A repo, very few on the listserve (or maybe I'm not on the right one). Actually, the most discussion about the standard I've seen is on issues where changes are proposed. Where should this discussion happen to get the most visibility? |
@albenson-usgs This sort of discussion used to happen on the tdwg-content email list, and that list was once considered the "official" place to discuss issues related to proposed changes to standards. However, there were complaints that long threads like this one overwhelmed people's inboxes and it was suggested that they would be better documented in issue tracker comments like this, which also allow people to opt in or out of following them. I think that using GitHub issues comments like this is a significant improvement over using the email list. If I am too busy on a given day to read everything that comes in as notifications, at a later time I can just go to the issue tracker page and scroll through the thread. However, I agree that there are problems with the system once discussions veer away from direct comments about the proposal at hand. To some extent, that should probably be handled by opening a new issue more directly related to the other issues being discussed. But that still has the problem that discussions only include people who are paying attention to that repo or who are tagged in the thread. For example, I don't watch either the OBIS or GBIF repos, so I am unaware of discussions taking place there. I would like to see some more attention be paid to how people can opt in to a more general system for discussion. That used to be tdwg-content, but with Slack, Twitter, the TDWG newsletter, and GitHub issues, the platforms that are used for communication are much more diffuse, with many people preferentially following some but not all of the venues. Despite the negative aspects of tdwg-content that I mentioned, it did have the advantage that everyone knew about it, everyone had email, and anyone could sign up for it. That isn't really the case for GitHub, Twitter, and Slack, which all require signing up for an account. I'm actually not sure how one gets subscribed to the TDWG newsletter. 
I get it, but don't know how I got subscribed, and one can't just post to it as one can in the other platforms, so it isn't really a discussion venue. |
I really appreciate the rigor with which you have all been hammering out this change. I am very strongly supportive of how this change season has been put on here in the GitHub Issues for all to engage with and I thank you all for putting this crucial work in. I've been reading this thread specifically, to see where consensus might emerge, and whether I can still see a fit for my use case in the proposed solutions. That use case is basically: An animal acoustically detected and identified to the individual level by an electronic code recorded at a series of listening stations around the world. (MOTUS will have a similar model on the bird side of things) Since these listening stations can sometimes decode random noise into a numeric code, and that code can match to a tag deployed on another animal somewhere in the world, there are filters that researchers apply to assess the likelihood that any given ping is truly of their tagged individual (who has for the sake of my use case, been identified taxonomically and issued an organismID at the point of their being tagged). Since those filters are often not conclusive, I've got a range of possibilities when reporting (or not reporting) these detections that are flagged by filters. The occurrenceStatus field, if not one of the ways I should accomplish this reporting of imprecision (though not temporal, spatial, or taxonomic imprecision!), is at least a place where I have to make absolutely sure I don't mislead anyone. I wasn't sure how to apply the originally proposed changes to my niche example, and I figure there are possibly other ways to make my records speak clearly about false detections, but if there's a task team coming together and looking for edge cases to explore, ✋ . |
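One way to picture the use case above is a sketch that keeps the filter outcome separate from any presence or absence statement. Everything here is hypothetical: the `Detection` fields, the `passed_filter` flag, and the function name are assumptions for illustration and do not describe any actual telemetry pipeline or Darwin Core term.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    tag_code: str        # electronic code decoded at a listening station
    station: str         # hypothetical station identifier
    passed_filter: bool  # outcome of a hypothetical false-detection filter


def split_detections(detections):
    """Separate detections that passed the filter from those flagged as possibly false."""
    accepted = [d for d in detections if d.passed_filter]
    flagged = [d for d in detections if not d.passed_filter]
    return accepted, flagged
```

In this sketch the flagged detections are neither reported as present nor as absent; how, or whether, to publish them is exactly the open question raised in the comment.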
@jdpye Fantastic eye-opening use case for Organisms (individuals even) that may or may not be there, and why you would care to say something if you weren't sure they were there. This suggests that the controlled vocabulary proposal (#342) is insufficient even without the issues about distinctions between Organism and Taxon, and the limits on the scope of Organism. This also supports the need for a Task Group. |
The TDWG Darwin Core Q&A repository (and associated Darwin Core Hour of live seminars) was developed exactly for the purpose you described - for uncertainties about the standard to generate discussions that can be summarized in answers, documented, and inform changes (if necessary, and embodied in change requests here in this repository) to the standard. |
At this point in the review process, my assessment is that we will not achieve consensus on this issue as proposed. A more comprehensive solution (potentially involving two terms and their respective vocabularies) will be required and that is a perfect job for a task group. |
I am tired of kicking cans on everything and allowing issues to pile up. I don't really have massive amounts of free time to devote to all of this, but I think it needs to be addressed sooner rather than later. |
@Jegelewicz I understand the frustration, but we are following a community-defined and accepted practice and consensus is a part of that process. We can seek consensus, but we can't force it. The reason for deferring the proposal and recommending the task group for it is so it doesn't hold up the 39 other proposals in this massive cleanup effort for which there is consensus. |
This proposal has been labeled as 'Controversial' and in need of a task group for resolution. It is no longer part of an active public review. |
Change term
Current Term definition: https://dwc.tdwg.org/list/#dwc_occurrenceStatus
Proposed new attributes of the term:
present, absent
Discussion leading up to this change proposal can be found in Issue #238.