Check and correct term definitions #4

Open
cpavloud opened this issue Jul 12, 2024 · 9 comments
@cpavloud

cpavloud commented Jul 12, 2024

@09012000-tosca
Check the logsheet_schema_extended.csv

-- Review the definitions tab of the spreadsheets (water, sediment, ARMS), because sometimes the example does not follow the definition. The definitions tabs are identical across observatories for all Wa logsheets and for all So logsheets, so you can review one Wa and one So; but if any corrections are necessary, they have to be made manually in all of them.

-- Check the official checklists on the GSC website for water and sediment, as well as on the ENA website for water and sediment, for the definitions of these terms:
comm_samp
scientific_name
time_fi
tidal_stage
store_person
size_frac
ship_date
sampl_person
samp_mat_process
samp_size_mass
sample_collect_device
sample accession number
project accession number
or just accession number
project name
organization
all the loc_XX_XX in the logsheets

-- fetch the URL of the terms.
E.g. the term "env_local_scale" has the URL https://genomicsstandardsconsortium.github.io/mixs/0000013/

-- find terms in the BODC vocabulary
biomass
n_alkanes
organism_count
samp_collect_device
samp_size_mass

-- find ENVO or BODC or similar terms for
phytoplankton
diatoms
dinoflagellates
coccolithophores
other flagellates
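
For the "fetch the URL of the terms" task above, a minimal sketch of building a MIxS term URL from its ID. It assumes the URL pattern of the single `env_local_scale` example generalises to other terms; the function name `mixs_term_url` and the `MIXS:` prefix handling are hypothetical helpers, not part of any GSC tooling, and the result still has to be checked by opening the page.

```python
# Base URL taken from the env_local_scale example in this issue.
MIXS_BASE_URL = "https://genomicsstandardsconsortium.github.io/mixs/"


def mixs_term_url(mixs_id: str) -> str:
    """Build a term page URL from an ID such as '0000013' or 'MIXS:0000013'.

    Assumption: every MIxS term page follows the same pattern as the example.
    """
    digits = mixs_id.split(":")[-1].strip()
    if not (digits.isdigit() and len(digits) == 7):
        raise ValueError(f"expected a 7-digit MIxS ID, got {mixs_id!r}")
    return f"{MIXS_BASE_URL}{digits}/"


# Only the ID quoted in this issue; other IDs must be looked up by hand.
TERM_IDS = {"env_local_scale": "MIXS:0000013"}

for term, mixs_id in TERM_IDS.items():
    print(term, mixs_term_url(mixs_id))
```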

@cpavloud added the enhancement (New feature or request) label Jul 12, 2024
@09012000-tosca

09012000-tosca commented Jul 16, 2024

  • comm_samp: the definition matches the example, but there is no definition in GSC or ENA.
  • scientific_name: the definition DOES NOT match the example (marine plankton metagenome???). This term is missing from the sediment logsheet (I do not know if that is normal).
  • time_fi: the definition matches the example, but there is no definition in GSC or ENA.
  • tidal_stage: the definition matches the example and the GSC or ENA definition.
  • store_person: the definition matches the example, but there is no definition in GSC or ENA.
  • size_frac: the definition matches the example, but there is no definition in GSC or ENA.
  • ship_date: the definition matches the example, but there is no definition in GSC or ENA.
  • sampl_person: the definition matches the example. In ARMS the definition is not correct: "Name of person who sampled the seawater".
  • samp_mat_process: ok
  • samp_size_mass: this term is not in the water and ARMS sheets, probably because they use a sample size volume instead? The definition matches the example. There is no definition in GSC or ENA.
  • sample_collect_device: not present in ARMS. The definition matches the example and is very similar to the ENA definition.
  • sample accession number: there is the ENA accession number. Is that okay, or do we also need a sample accession number for Genoscope, for example?
  • project accession number or just accession number: there is the ENA accession number. Is that okay, or do we also need a sample accession number for Genoscope, for example?
  • project name: okay
  • organization: okay

@cpavloud @kmexter
Question: a lot of these term definitions are quite specific to EMOBON. For example, for scientific name we look at NCBI. Do we want a corresponding URL from the NERC vocabulary for these terms too, even if it is not that specific?

@kmexter
Contributor

kmexter commented Jul 16, 2024

Yes, a lot of the terms are specific to us. For example, a "person" can be defined not via BODC but via schema.org as Person (see https://schema.org/Person); however, a "sampling person" will never have a definition as it is so specific, so for that one we do not expect a URL (if there is one, great; if not, that is fine).
accession numbers: if you can find a term for "accession number" that is sufficient.

@09012000-tosca

-- find terms in the BODC vocabulary

@laurianvm
Contributor

laurianvm commented Jul 17, 2024

Thanks a lot for the URIs of those terms! :)

Not sure I am following the definition-example check 100%: does the definition of 'scientific_name' need to be changed, or will the example be changed?

@09012000-tosca

Hi, I think the example should be changed :)

@cpavloud
Author

The scientific name definition is correct and the example is correct.
It is just missing from the sediment logsheets.
And a Y should be added in columns MiXS_mandatory_(Y/N) and ENA_water_checklist_mandatory_(Y/N)
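
The column change described above could be sketched as follows. The two mandatory-column names come from this comment; the layout of logsheet_schema_extended.csv is otherwise assumed, and the `term` column name, the `mark_mandatory` helper and the sample rows are hypothetical. This only sets the "Y" flags; adding the missing term to the sediment logsheets remains a manual step.

```python
import csv
import io

# Column names quoted in this issue.
MANDATORY_COLUMNS = ["MiXS_mandatory_(Y/N)", "ENA_water_checklist_mandatory_(Y/N)"]


def mark_mandatory(csv_text: str, term: str, term_column: str = "term") -> str:
    """Return the CSV text with 'Y' set in the mandatory columns for one term.

    Assumption: the schema has one row per term and a column (here 'term',
    a hypothetical name) that holds the term identifier.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    for row in rows:
        if row[term_column] == term:
            for column in MANDATORY_COLUMNS:
                row[column] = "Y"
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Made-up sample rows, only to show the shape of the change.
sample = (
    "term,MiXS_mandatory_(Y/N),ENA_water_checklist_mandatory_(Y/N)\n"
    "scientific_name,,\n"
    "tidal_stage,N,N\n"
)
updated = mark_mandatory(sample, "scientific_name")
print(updated)
```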

@kmexter
Contributor

kmexter commented Aug 22, 2024

* biomass:
  http://vocab.nerc.ac.uk/collection/P02/current/FIBM/ is specific to fish, but it is possible to make it more general: Parameters quantifying the mass in total or by species per unit area or per unit volume in any body of fresh or salt water expressed in any form (e.g. wet weight, dry weight, carbon, nitrogen, etc.)
  Other option: the one proposed in the ENA vocabulary

--> the BODC one does not work, @09012000-tosca can you give me the link to the ENA one?
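
When collecting NERC/BODC links like the one quoted above, a small sanity check on the URI shape can catch malformed entries before they go into the logsheets. This is a sketch under the assumption that concept URIs follow the `collection/<collection>/current/<concept>/` pattern seen in the FIBM link; `parse_nerc_uri` is a hypothetical helper, and it does not check whether the link actually resolves or whether the concept is the right one.

```python
from urllib.parse import urlparse


def parse_nerc_uri(uri: str) -> tuple[str, str]:
    """Split a NERC vocabulary concept URI into (collection, concept).

    Assumption: the path looks like /collection/<collection>/current/<concept>/,
    as in the FIBM link quoted in this issue.
    """
    parsed = urlparse(uri)
    parts = [p for p in parsed.path.split("/") if p]
    if (
        parsed.netloc != "vocab.nerc.ac.uk"
        or len(parts) != 4
        or parts[0] != "collection"
        or parts[2] != "current"
    ):
        raise ValueError(f"not a recognised NERC concept URI: {uri!r}")
    return parts[1], parts[3]


collection, concept = parse_nerc_uri(
    "http://vocab.nerc.ac.uk/collection/P02/current/FIBM/"
)
print(collection, concept)
```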

@kmexter
Contributor

kmexter commented Aug 22, 2024

As for the recommended changes to the definitions/examples (see the comments from Tosca at the very top): that is for EMOBON HQ to do in the logsheets.
