This repository has been archived by the owner on Aug 26, 2022. It is now read-only.

Add web pages describing the various failure codes #28

Open
ansell opened this issue Jun 9, 2017 · 2 comments

@ansell

ansell commented Jun 9, 2017

Jan sent the ALA a report from a validator (this one, I think) in which the majority of the datasets were rejected with "BASIS_OF_RECORD_INVALID". Because the Darwin Core spec doesn't give an exhaustive list of valid basisOfRecord values, I had to look through the codebase here to find a link to the list that may be causing the validation failures, which I had not seen before:

http://rs.gbif.org/vocabulary/dwc/basis_of_record.xml

In the basisOfRecord case, the ALA has been offering GBIF the verbatim data from data providers. The majority of providers do not include basisOfRecord themselves, so if GBIF requires it to be present, it is understandable that many of the ALA datasets won't be suitable in their current form. (Whether the ALA should offer its processed data to GBIF instead is a discussion for another day.)

It would be ideal if there were a web page somewhere (the wiki here could work) that describes the reasons for failures and links to any controlled vocabularies that this validator uses in addition to the Darwin Core spec.

@kbraak
Contributor

kbraak commented Jun 9, 2017

Thanks @ansell, I fully agree with you that better documentation is needed. In the meantime, the reasons for failures are listed here, with descriptions partially filled in: https://github.com/gbif/gbif-data-validator/blob/master/doc/evaluation_types.md

After Jan sent ALA the report from the validator last week, I added the description for "BASIS_OF_RECORD_INVALID", referencing the vocabulary http://rs.gbif.org/vocabulary/dwc/basis_of_record.xml, which complies with the Darwin Core Type Vocabulary. @cgendreau tells me the GBIF Data Validator actually uses the vocabulary from the GBIF API. Be aware, however, that this vocabulary includes non-accepted values that shouldn't be used; see gbif/gbif-api#14
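
For anyone who wants to pre-check records before submitting them, the kind of check behind "BASIS_OF_RECORD_INVALID" can be roughly sketched as below. This is not the validator's actual code: the function name `check_basis_of_record`, the three-term vocabulary subset, the case-insensitive comparison, and the choice to treat a missing value as invalid are all assumptions made for illustration. The authoritative term list is the vocabulary linked above.

```python
# Illustrative subset only (an assumption for this sketch); the authoritative
# list is the basis_of_record vocabulary linked in this thread.
ILLUSTRATIVE_VOCABULARY = {"PreservedSpecimen", "HumanObservation", "MachineObservation"}


def check_basis_of_record(record, vocabulary=ILLUSTRATIVE_VOCABULARY):
    """Return 'BASIS_OF_RECORD_INVALID' if the value is missing or unknown, else None.

    Hypothetical helper: treating an empty value as invalid and comparing
    case-insensitively are assumptions, not confirmed validator behaviour.
    """
    value = (record.get("basisOfRecord") or "").strip()
    accepted = {term.lower() for term in vocabulary}
    if value.lower() not in accepted:
        return "BASIS_OF_RECORD_INVALID"
    return None


if __name__ == "__main__":
    for rec in ({"basisOfRecord": "HumanObservation"}, {"basisOfRecord": ""}, {}):
        print(rec, "->", check_basis_of_record(rec))
```

Passing a different `vocabulary` set makes it possible to compare the result against the rs.gbif.org list and the GBIF API list separately.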

@ansell
Author

ansell commented Jun 9, 2017

Hi @kbraak, Thanks, evaluation_types.md is what I was looking for.

My analysis of the basisOfRecord issue so far is that we have typically filled in basisOfRecord using default values, but those defaults don't modify the "basisOfRecord" field that we send to GBIF. Instead, they are stored in a "processed" field next to basisOfRecord, called "basisOfRecord.p". I have filed an issue on biocache-store to track that part: AtlasOfLivingAustralia/biocache-store#212
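
As a rough sketch of one possible fix, an export step could fall back to the processed value when the verbatim one is empty. The function name `basis_of_record_for_export` and the flat dict record shape are hypothetical and do not reflect how biocache-store is actually structured; only the two field names come from the description above.

```python
def basis_of_record_for_export(record):
    """Pick the basisOfRecord value to send downstream.

    Hypothetical helper: prefers the provider's verbatim "basisOfRecord" and
    falls back to the processed "basisOfRecord.p" default when it is empty.
    Returns None if neither field has a value.
    """
    verbatim = (record.get("basisOfRecord") or "").strip()
    if verbatim:
        return verbatim
    processed = (record.get("basisOfRecord.p") or "").strip()
    return processed or None


if __name__ == "__main__":
    # Provider gave no value, so the processed default is used.
    print(basis_of_record_for_export({"basisOfRecord": "", "basisOfRecord.p": "HumanObservation"}))
```

Preferring the verbatim value keeps the provider's own data intact wherever it exists, and the default only fills the gap.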
