All measured sampling log sheets: fields where values can be lists need to have their format changed from "automatic" to "plain text" #36
Comments
For the point "also in the sheet above there is only a float (0.7537) entered so what pigment is it?": that is an error that @melanthia, @melinalou or @isanti have to ask the station to fix. The station needs to give a type. If there is only one correctly formatted value there, then for the QC we do here I asked Bram to treat that as a 1-element list. So the QC will do a formatting check on the elements of the list by splitting the list on the ";"s ("string : value ; string : value ; etc"), and if there is just a single "string : value" it will assume this is a list of 1 element. In the triple store this will be recorded as a measurement of type "pigment" with subtypes "string1 = value1", "string2 = value2", etc.
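A minimal sketch of the splitting step described above, in case it helps. This is not Bram's actual QC code; the function name `parse_measurement_list` and the choice to raise `ValueError` are assumptions made for illustration.

```python
def parse_measurement_list(raw: str) -> dict[str, float]:
    """Split "name: value; name: value" into {name: value}.

    Hypothetical helper: a single "name: value" is treated as a
    1-element list, and a bare number such as "0.7537" is rejected
    because it does not say which pigment it refers to.
    """
    entries: dict[str, float] = {}
    for item in raw.split(";"):
        item = item.strip()
        if not item:  # tolerate a trailing ";"
            continue
        name, sep, value = item.partition(":")
        if not sep or not name.strip():
            raise ValueError(f"expected 'name: value', got {item!r}")
        entries[name.strip()] = float(value)  # ValueError if not a number
    if not entries:
        raise ValueError("no entries found")
    return entries


if __name__ == "__main__":
    print(parse_measurement_list("caroten: 0.125;phycoerythrin: 0.121"))
    print(parse_measurement_list("caroten: 0.125"))
```

With this shape, the lone "0.7537" case fails with a message about the missing name rather than a type error.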
Yeah, I think it's important to separate what the validation and the QC are doing. Validation checks types only: not whether the values given are correct, just that they are of the correct type. QC checks the correctness of the values entered, even if the type is correct. The problem here is that the recorders were entering floats, and because the format of the field is set to "automatic", the sheet exports them as floats rather than the strings they should be. If the format were changed to "plain text", then even a floating point number would be exported as a string. It would then pass validation (it is correctly a string) but fail QC, because the string does not follow the expected format "pigment: value;" (or whatever it is...)
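To make the distinction concrete, here is a rough sketch of the two stages. It uses only the standard library rather than the project's real validator, and the names `passes_validation`, `passes_qc` and the `ENTRY` pattern are assumptions, not the agreed format.

```python
import re

# Assumed shape of one entry: a name starting with a letter, a colon,
# then a non-negative number. The real expected format may differ.
ENTRY = re.compile(r"\s*[A-Za-z][\w\- ]*\s*:\s*\d+(\.\d+)?\s*")


def passes_validation(value: object) -> bool:
    """Validation: is the exported value of the right type (a string)?"""
    return isinstance(value, str)


def passes_qc(value: str) -> bool:
    """QC: does the string follow "pigment: value;pigment: value"?"""
    items = [item for item in value.split(";") if item.strip()]
    return bool(items) and all(ENTRY.fullmatch(item) for item in items)


if __name__ == "__main__":
    # Format "automatic": exported as a float, so validation fails.
    print(passes_validation(0.7537))
    # Format "plain text": exported as a string, so validation passes
    # but QC fails because no pigment is named.
    print(passes_validation("0.7537"), passes_qc("0.7537"))
    # Correctly entered list: passes both.
    print(passes_qc("caroten: 0.125;phycoerythrin: 0.121"))
```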
That is something only @cpavloud or @isanti has permission to do, I think -> or @melanthia / @melinalou can you do it?
Added to my to-do list (along with all the other issues that need solving).
Sorry for the delayed answer.
I think that is what Cymon wants, yes. For us at VLIZ it makes no difference (@bulricht can you confirm this?)
Example sheet: https://docs.google.com/spreadsheets/d/1xfrqraPa0auQ1O-C9RUo68RhxrPCDWkVMCAUbj79AZI
For fields that can have multiple listed entries:
e.g. the "pigments" example in definitions_updated is "caroten: 0.125;phycoerythrin: 0.121", i.e. a string
Concentration of pigments other than chlorophyll-a; can include multiple pigments separated by ";". Look at the example for how to enter multiple values
also in the sheet above there is only a float (0.7537) entered so what pigment is it?
It's throwing an error because the validator correctly expects a string, but the spreadsheet is formatting the value as a float (because the format is "automatic")... it should instead be throwing an error because the value doesn't say which pigment it refers to...
Update
"phaeopigments" in this sheet is another "list" that is being formatted as float:
Processing observatory_id='OSD74' - sampling_strategy='water_column' - sheet_type='measured'
Sample sheet link:
ValidationError: 1 validation error for Model
phaeopigments
Input should be a valid string [type=string_type, input_value=1.83, input_type=float]
For further information visit https://errors.pydantic.dev/2.8/v/string_type