Metadata standards #22

Open
beatrizserrano opened this issue Sep 17, 2019 · 4 comments

Comments

@beatrizserrano
Member

I’d like to point out some issues that I’ve found related to the lack of metadata standards:

  • The number of columns is heterogeneous, e.g. "Gene.Symbol", "Gene.Identifier", "Control.Comments"... are not present in all the datasets.
  • The negative control labels are in at least two different columns of the metadata (Control.Type, Control.Comments).
  • The negative controls are labelled differently across the datasets. Labels such as "DMSO drug control", "untreated cells (EMPTY)", "non-targetting siRNA", "scrambled siRNA" and "negative control" represent the negative controls in different formats.
  • For the compound screen idr0016, the label “POS” in the column “Compound.Group” refers to the positive controls.
  • The positive controls are sometimes in certain wells that are not described in the metadata but only in the original publication. For instance, MitoCheck (idr0013) and the secretion screen (idr0009) share the same layout, in which the reagent ids 14851 and 28431 are positive controls, together with the wells 1, 4, 49, 52, 290, 301, 338, 349, 363, 333, 336, 381 and 384. In CellMorph, the positions 4I and 4J are also positive controls.
  • The plate content is redundant sometimes (e.g. idr0033, plateName vs plateName_illum_corrected). The image ids are different in both plates, although the illumination of the images seems to have been corrected.
  • The quality controls are represented with different codes: pass/fail, TRUE/FALSE, ""/pass...
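
To make the points above concrete, here is a rough sketch of how a consumer might harmonise the control labels and quality-control codes before comparing studies. It is not an IDR API or an agreed standard. The function names are hypothetical, the labels are matched exactly (case-insensitively) against the strings quoted above, the "POS" rule is only known to hold for idr0016, and treating an empty quality-control value as "unknown" is my assumption. How the well number and reagent id are exposed in each annotation file is also not specified here, so they are passed in as plain arguments.

```python
# Negative-control labels quoted in this issue, lower-cased for matching.
NEGATIVE_LABELS = {
    "dmso drug control",
    "untreated cells (empty)",
    "non-targetting sirna",
    "scrambled sirna",
    "negative control",
}

# The labels were seen in at least these two metadata columns.
CONTROL_COLUMNS = ("Control.Type", "Control.Comments")

# Layout shared by idr0013 and idr0009, taken from the publications as listed above.
POSITIVE_REAGENT_IDS = {14851, 28431}
POSITIVE_WELLS = {1, 4, 49, 52, 290, 301, 338, 349, 363, 333, 336, 381, 384}

# Assumption: anything not listed here (including "") is treated as unknown.
QC_CODES = {"pass": True, "true": True, "fail": False, "false": False}


def classify_control(row):
    """Return 'negative', 'positive' or None for one metadata row (a dict)."""
    for column in CONTROL_COLUMNS:
        value = (row.get(column) or "").strip().lower()
        if value in NEGATIVE_LABELS:
            return "negative"
    # Only known to apply to the compound screen idr0016.
    if (row.get("Compound.Group") or "").strip().upper() == "POS":
        return "positive"
    return None


def is_layout_positive(well_number, reagent_id=None):
    """Check a well against the idr0013/idr0009 positive-control layout."""
    return well_number in POSITIVE_WELLS or reagent_id in POSITIVE_REAGENT_IDS


def normalize_qc(value):
    """Map a raw quality-control code to True, False or None (unknown)."""
    return QC_CODES.get((value or "").strip().lower())
```

Exact matching keeps the mapping auditable, but any label not listed here falls through as `None`, so the set would have to grow as new studies are checked.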

Maybe some of the keys have been standardized in the API but I couldn't find them (see issue #21).
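
For the column heterogeneity in the first point, a small survey of which columns each study lacks could look like the sketch below. The function name, the study keys and the header lists are all made up for illustration (including the "Well" column); only "Gene.Symbol", "Control.Type" and "Control.Comments" come from the observations above.

```python
def column_presence(headers_by_study):
    """Return {column: studies lacking it} for columns not shared by all studies."""
    all_columns = set()
    for columns in headers_by_study.values():
        all_columns.update(columns)

    missing = {}
    for column in sorted(all_columns):
        absent = sorted(
            study
            for study, columns in headers_by_study.items()
            if column not in columns
        )
        if absent:
            missing[column] = absent
    return missing


# Illustrative headers only; not copied from the real annotation files.
example = {
    "study_a": ["Well", "Gene.Symbol", "Control.Type"],
    "study_b": ["Well", "Control.Type", "Control.Comments"],
}
print(column_presence(example))
```

In practice the header lists would be read from each study's annotation file rather than typed in by hand.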

@sbesson
Member

sbesson commented Sep 18, 2019

Thanks for initiating the discussion @beatrizserrano. As you probably know, the IDR vocabulary is constantly maturing, both as we receive new submissions and as we receive feedback like this from consumers of the published studies.
Let us review and discuss your various points internally and we will update this thread with some clarifications and suggestions.

@smrgit

smrgit commented Dec 20, 2019

I am also new to looking at these published studies. @sbesson you mention the "IDR vocabulary" -- could you please point me towards details about the IDR data model, terms, controlled-vocabularies, and how these are defined? I have been trying to dig through documentation, presentations, github, etc, but so far I have not been able to find what I am looking for. This particular issue is now making me wonder if there are no strict metadata definitions?

@joshmoore
Member

@smrgit Can you explain what you are trying to achieve? That will likely help to point you in the right direction. The information certainly still needs gathering together for end-user consumption as it is still most clearly defined on the submission side. Have you come across https://github.com/idr/vocabulary, https://idr.openmicroscopy.org/about/submission.html, and the linked templates?

~Josh

Note: the team is going into holiday mode. Apologies if response times are slow until the first full week of January.

@smrgit

smrgit commented Dec 20, 2019

Perfect, thank you @joshmoore, I think that first one is what I was looking for -- clearly I should have been able to find it! My goals right now are somewhat vague: I'm trying to do a "survey" of imaging metadata, which is why I don't really have concrete questions. Many thanks again and happy holidays!
