What or who is the agent? #29
Comments
A great point. In the great majority of my own experience, the determination was made by the individual doing the tagging, but we are certainly not being as explicit as we could be!
The primary way we are differentiating human observations versus machine observations is by using |
@jdpye @albenson-usgs Gotcha re: |
and basisOfRecord is also a required field. But I always try to convince people to fill out identifiedBy with the person or algorithm that did the identification.
@dshorthouse I guess I'm not sure that someone would NEED to know who captured the animal for conducting downstream analyses. I do agree it's nice to have if you can get it but I don't know that not having it prevents future work from happening? I'm not a biologging expert though so hopefully @peggynewman or maybe @sarahcd can confirm. I can see how you might need to know what machine made the observation because it might help with uncertainty (maybe?). But that information might be better laid out in something like |
Actually after considering this further, I could see the case for making |
Interesting. I agree with the approach that recordedBy and identifiedBy in biologging data go alongside the Human Observation record, and that broadly our approach is to group by organismId. We're likely to see these fields used more in repositories thanks to the kind of work that @dshorthouse is doing with Bionomia. For biologging, however, it's the machine that's doing the observing, not recording or identifying, so that information doesn't belong in those fields. We are describing the machine capture mechanisms in the Event and MoF.
@peggynewman In the context of biologging, it doesn't make much sense to ascribe credit for effort as might be assumed in the spirit of |
Taking from other best practices: a 'scientificName' should be linked to a 'scientificNameID', which is defined as: An identifier for the nomenclatural (not taxonomic) details of a scientific name. This gives some protection against slippage, e.g. if the scientific name changes, the accepted and unaccepted names can still be linked. The best practice is to use a globally unique identifier, for instance a Life Sciences Identifier (LSID). For marine species we use the World Register of Marine Species, for instance Aptenodytes forsteri Gray, 1844.
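As a minimal sketch of the scientificName / scientificNameID pairing described above: the helper below builds a WoRMS-style LSID from an AphiaID. The function name `worms_lsid` and the AphiaID value are illustrative assumptions; the number is a placeholder, not the real AphiaID for this species, and should be looked up at marinespecies.org.

```python
def worms_lsid(aphia_id: int) -> str:
    """Build a WoRMS taxon-name LSID from an AphiaID (hypothetical helper)."""
    if aphia_id <= 0:
        raise ValueError("AphiaID must be a positive integer")
    return f"urn:lsid:marinespecies.org:taxname:{aphia_id}"


# Placeholder only: NOT the real AphiaID for Aptenodytes forsteri.
PLACEHOLDER_APHIA_ID = 123456

record = {
    "scientificName": "Aptenodytes forsteri Gray, 1844",
    "scientificNameID": worms_lsid(PLACEHOLDER_APHIA_ID),
}
```

Keeping the identifier next to the name means a later name change can be resolved through the identifier rather than by string matching.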
Oh, we've had a couple of 'fun' taxon shakeups with manta rays and Atlantic torpedo/tetronarce, as well as some ambiguous identifiers with things like sixgill/sevengill sharks. At my institution we run things through marinespecies.org and back to the researcher with any discrepancies from their field reporting, and we identify the marinespecies.org entries as the authority, as @Antonarctica has detailed. (We also track cases where the researcher is adamant that the taxonomic database has it wrong, though I don't know what to do with this information yet!) This grants us some ability to cross-reference via TSN and AphiaID, now and in the future.
@jdpye reach out to WoRMS on the cases where the researcher is not in agreement (info at marinespecies.org). They are really responsive and helpful.
Definitely. They're great at accepting new colloquial names, and I feel like the marine/brackish/fresh distinctions are maybe a little bit my fault, because I made them add American alligators once upon a time.
I agree, a scientificNameID belongs with scientificName. In the situation where an algorithm has provided the species identification, I've been thinking that is more MoF lines. Is a persistent identifier for an algorithm a DOI on a publication or are there other options?
Hi all. I'm trying to choose a clear way to indicate which software algorithm provided a taxon ID. I agree with a comment above by @jdpye that "identifiedBy" seems appropriate, though it would need its definition changing to encompass machines (not just people or groups) as the agent. On this question, there's a lot more discussion in the Attribution group here: tdwg/attribution#38
That is a great discussion, @danstowell, thanks for that link! I think we could potentially 'get away with' a lot because of the 'freetext identifiers separated by pipes' nature of the field in HumanObservations, but I'd love to see the MachineObservation side of things make use of that field. If we did that, you're absolutely right, definitions would need a bit of updating. For algorithms/implementations of identifying software, in order to be complete, we'd be looking to record a program name/version number or, better, a git URI and commit hash. What I haven't done yet, and can do, is look through some of the other tdwg communities and their conversations to find out what other determinations have been handed down on this specific subject in the past.
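To make the "program name/version number, or better, a git URI and commit hash" idea concrete, here is a rough sketch of how a software agent could be written into a pipe-separated free-text field. The label layout (`name vVERSION (URI@COMMIT)`), the function names, and the example repository are all assumptions for illustration, not an agreed convention.

```python
import re
from typing import Iterable, Optional

# A git commit hash, abbreviated (7+) or full (40) hex characters.
_COMMIT_RE = re.compile(r"^[0-9a-f]{7,40}$")


def software_agent(name: str,
                   version: Optional[str] = None,
                   repo_uri: Optional[str] = None,
                   commit: Optional[str] = None) -> str:
    """Format one software agent as a free-text label (hypothetical layout)."""
    label = name.strip()
    if not label or "|" in label:
        raise ValueError("name must be non-empty and must not contain '|'")
    if version:
        label += f" v{version}"
    if repo_uri and commit:
        commit = commit.lower()
        if not _COMMIT_RE.match(commit):
            raise ValueError("commit must be 7-40 hex characters")
        label += f" ({repo_uri}@{commit})"
    return label


def join_agents(agents: Iterable[str]) -> str:
    """Join several agents with the ' | ' separator used for list-valued fields."""
    return " | ".join(agents)


identified_by = join_agents([
    "Jane Doe",  # hypothetical human identifier
    software_agent("example-classifier", version="1.2.0",
                   repo_uri="https://example.org/example-classifier.git",
                   commit="abcdef1"),
])
```

Recording the commit hash rather than only a version string pins the exact code that made the determination, which is the point raised in the comment above.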
I see that scientificName is a required term. But I do not see either recordedBy or identifiedBy. I assume that many determination events will be executed by a human sometime after a captured event but that other determination events will happen in near real-time by a trained machine. Should you require scientificName without also requiring an agent who/that made the assertion? Is this meant to be captured in samplingProtocol and will content there be sufficiently machine-readable so as to differentiate determinations made by a human from those made by a machine?
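The concern raised here can be sketched as a simple record check: flag any record that asserts a scientificName without naming an agent, and read the human/machine distinction from basisOfRecord. This is only an illustration under assumptions: it treats identifiedBy as the agent field and basisOfRecord as the differentiator, neither of which the thread has settled, and the function names are hypothetical.

```python
HUMAN = "HumanObservation"
MACHINE = "MachineObservation"


def missing_agent(record: dict) -> bool:
    """True when scientificName is asserted but identifiedBy is empty or absent."""
    has_name = bool((record.get("scientificName") or "").strip())
    has_agent = bool((record.get("identifiedBy") or "").strip())
    return has_name and not has_agent


def observation_kind(record: dict) -> str:
    """Classify a record as 'human', 'machine', or 'unknown' from basisOfRecord."""
    basis = (record.get("basisOfRecord") or "").strip()
    if basis == HUMAN:
        return "human"
    if basis == MACHINE:
        return "machine"
    return "unknown"


example = {
    "scientificName": "Aptenodytes forsteri Gray, 1844",
    "basisOfRecord": MACHINE,
    "identifiedBy": "",  # no agent recorded, so this record gets flagged
}
needs_review = missing_agent(example)
```

A check like this only shows how the observation was made; it does not by itself say whether the later determination came from a person or an algorithm, which is the gap the question points at.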