Question: Identifier mapping #1431

Open
sharifX opened this issue Jun 26, 2024 · 4 comments

Comments

@sharifX

sharifX commented Jun 26, 2024

I was wondering if you can give a short explanation of these different identifiers. For example:

Abax ater (Villers, 1789) coming from Nederlands Soortenregister:

Does ChecklistBank maintain a mapping between 145CAE57C83 and 0AHCYFBQVMRK?

Could this mapping be provided as a list? I am looking for a list of NSR IDs that link to ChecklistBank IDs.

The GBIF species page uses the Wikidata page to generate a list. Example JSON query from Wikidata: https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q1303390&props=claims&format=json

Maybe I can get this via API?
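
For reference, a minimal sketch of how the external identifiers could be pulled out of a `wbgetentities` response like the one linked above. The function name `external_ids` is hypothetical, the sample dict is hand-written in the shape that API returns, and the value `1234567` is a placeholder rather than real data. `P846` is assumed here to be the GBIF taxon ID property; check the Wikidata property page for the identifier you actually need.

```python
from typing import Any


def external_ids(response: dict[str, Any], entity_id: str, property_id: str) -> list[str]:
    """Return the values of one property from a wbgetentities JSON response."""
    claims = response.get("entities", {}).get(entity_id, {}).get("claims", {})
    values = []
    for statement in claims.get(property_id, []):
        snak = statement.get("mainsnak", {})
        # "novalue" and "somevalue" snaks carry no datavalue, so skip them.
        if snak.get("snaktype") == "value":
            values.append(snak["datavalue"]["value"])
    return values


# Hand-written sample in the response shape; the identifier value is a placeholder.
sample = {
    "entities": {
        "Q1303390": {
            "claims": {
                "P846": [  # assumed property ID, verify before use
                    {
                        "mainsnak": {
                            "snaktype": "value",
                            "property": "P846",
                            "datavalue": {"value": "1234567", "type": "string"},
                        }
                    }
                ]
            }
        }
    }
}

print(external_ids(sample, "Q1303390", "P846"))
```

In practice the dict would come from parsing the JSON returned by the URL above; the parsing step stays the same.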

@mdoering
Member

I am not sure I understand your exact needs, but in CLB we distinguish between names and name usages, where a name usage is either a taxon or a synonym. Each has its own identifiers. The most natural ones to work with are the taxon identifiers.

All identifiers are used as the source defines them. CLB does not generate new identifiers (except in rare cases when there are none in the source), but reuses the original ones.

In the case of the Dutch species register, it seems that the identifiers used on their own site are not exported into the data we see, e.g. Abax ater has id=91411 in NSR, but 145CAE57C83 in their Darwin Core export.

So I am afraid the problem lies with NSR, who do not supply their actual integer IDs. @olafbanki could we ask NSR to change that or even publish ColDP data?

@mdoering
Member

mdoering commented Jun 26, 2024

Let me use an example from ITIS. The ITIS TSN 932346 represents Abax parallelepipedus, which can be found under the same ID in ChecklistBank, just scoped under the ITIS dataset key 2144:
https://www.checklistbank.org/dataset/2144/taxon/932346
https://api.checklistbank.org/dataset/2144/nameusage/932346
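
The scoping shown in these two URLs can be sketched as a pair of small helpers. This only mirrors the URL pattern visible in the ITIS example; the function names are hypothetical, and it is an assumption that the same pattern holds for every dataset and identifier.

```python
from urllib.parse import quote

CLB_WEB = "https://www.checklistbank.org"
CLB_API = "https://api.checklistbank.org"


def taxon_page_url(dataset_key: int, usage_id: str) -> str:
    """Web page for a taxon, scoped by its dataset key."""
    # Percent-encode the source ID, since source-defined IDs may contain odd characters.
    return f"{CLB_WEB}/dataset/{dataset_key}/taxon/{quote(str(usage_id), safe='')}"


def name_usage_api_url(dataset_key: int, usage_id: str) -> str:
    """API resource for the same name usage."""
    return f"{CLB_API}/dataset/{dataset_key}/nameusage/{quote(str(usage_id), safe='')}"


print(taxon_page_url(2144, "932346"))
print(name_usage_api_url(2144, "932346"))
```

The point is that a source identifier is only meaningful together with its dataset key, so any mapping list needs to carry both.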

@sharifX
Author

sharifX commented Jun 26, 2024

@mdoering thanks for the response.
The ITIS example is helpful. I am trying to see if we can construct a list of all related identifiers in CLB that have an NSR ID.

I can get a mapping from this Wikidata SPARQL query, but it only gives the CLB taxon identifier.

For example, https://www.nederlandsesoorten.nl/linnaeus_ng/app/views/species/nsr_taxon.php?id=120589 links to https://www.catalogueoflife.org/data/taxon/8VVK7 (according to Wikidata)

but within the dataset scope we have another ID: https://www.checklistbank.org/dataset/2014/name/xLW8

I will check with NSR to see how they are exporting this.

@mdoering
Member

Identifiers in CLB with an x prefix are usually generated identifiers. This happens most often because the incoming data had "flat" records that carry a higher classification, which then needs to be translated into a normalised form with an identifier for each higher taxon. You can identify such records by origin=denormed classification: https://www.checklistbank.org/dataset/2014/taxon/xLW9

In this case the genus Betula did not exist as an explicit record of its own in NSR, but was given in some species records like this one:
https://www.checklistbank.org/dataset/2014/taxon/DXQ2YTVB8Y4
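
When building a mapping list, the x prefix can serve as a rough first filter to separate generated identifiers from source ones. The sketch below is only that heuristic, with hypothetical function names: a source could legitimately issue IDs that start with x, so the origin value on the record remains the reliable check.

```python
def looks_generated(identifier: str) -> bool:
    """Heuristic only: CLB-generated identifiers usually start with 'x'."""
    return identifier.startswith("x")


def split_by_origin(identifiers: list[str]) -> tuple[list[str], list[str]]:
    """Split IDs into (probably generated, probably from the source)."""
    generated = [i for i in identifiers if looks_generated(i)]
    from_source = [i for i in identifiers if not looks_generated(i)]
    return generated, from_source


# Identifiers taken from the URLs in this thread (dataset 2014).
generated, from_source = split_by_origin(["xLW8", "xLW9", "DXQ2YTVB8Y4"])
print(generated)
print(from_source)
```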
