v3 schema-revision: contribution.schema.json #82

Closed
lzehl opened this issue Oct 13, 2020 · 15 comments
Labels
request any request or update for schemas

Comments

@lzehl
Member

lzehl commented Oct 13, 2020

The contribution.schema.json will be used to name a contributor (person or organization) and to state how that contributor took part in producing the research products or their versions.

According to the current documentation, this schema will have the following properties:

  • @type [expects: constant ("https://openminds.ebrains.eu/core/contribution"), count: 1]
  • @id [expects: free text, count: 1]
  • contributor [expects: person|organization.schema.json, count: 1]
  • contributorType [expects: controlledTerm.schema.json, count: 1-N]

Note for controlledTerm.schema.json for...

  • contributorType (contributorType): expects JSON-LDs for contact person, data collector, hosting institution, etc. (cf. DataCite metadata list for contributorType)
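
As a rough illustration of the property list above, a contribution instance might look like the following Python sketch. Only the property names, the `@type` constant, and the counts come from the list; the `@id` values and the `check_contribution` helper are hypothetical and made up for this example.

```python
import json

CONTRIBUTION_TYPE_IRI = "https://openminds.ebrains.eu/core/contribution"

# Hypothetical instance; all @id values are placeholders, not real identifiers.
contribution = {
    "@type": CONTRIBUTION_TYPE_IRI,
    "@id": "example-contribution-001",
    "contributor": {"@id": "example-person-001"},
    "contributorType": [{"@id": "example-term-data-collector"}],
}


def check_contribution(instance):
    """Return a list of violations of the counts listed above (empty if none)."""
    problems = []
    if instance.get("@type") != CONTRIBUTION_TYPE_IRI:
        problems.append("@type must be the contribution constant")
    if not isinstance(instance.get("@id"), str) or not instance["@id"]:
        problems.append("@id must be non-empty free text")
    if not isinstance(instance.get("contributor"), dict):
        problems.append("contributor must be exactly one person or organization")
    types = instance.get("contributorType")
    if not isinstance(types, list) or len(types) < 1:
        problems.append("contributorType needs 1 to N controlled terms")
    return problems


print(json.dumps(contribution, indent=2))
print(check_contribution(contribution))
```

This is only a hand-written check, not the schema itself; the actual constraints would live in contribution.schema.json.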
@lzehl added the request (any request or update for schemas) and revision labels Oct 13, 2020
@apdavison
Member

Could we also add a contributionType property? The use case I have in mind is where we have an adapted version of a model that was originally published elsewhere, and we would like to distinguish between the original authors and the authors of the adaptation, while listing all of them as contributors.

This has some overlap with contributorType, e.g. contributorType => data collector and contributionType => data collection would be equivalent.

@lzehl
Member Author

lzehl commented Oct 19, 2020

Mh. I'm not sure I understand the use case where only the contributionType and not the contributorType would hold the information. Also: should an adapted version of a model not get its own digitalIdentifier and thereby be associated with its own group of "authors"/developers?

@apdavison Could you explain a bit more why you think it is necessary to have contributionType and contributorType? Why is one of them not sufficient?

@apdavison
Member

For most cases, I think contributionType would be sufficient (and I prefer it to contributorType, as the same person can have different roles in different contributions).

The only place where I thought contributorType would be better is with hosting institution, although even there contributionType = "hosting" might be better.

(I don't think either option works for contact person, which is semantically not really a contribution.)

@lzehl
Member Author

lzehl commented Oct 20, 2020

I'm okay replacing contributorType with contributionType.

I only chose the first because this is how DataCite is doing it (they do not use contributionType).
Also, in DataCite contact person is a valid controlledTerm for contributorType, although I see your point about the semantics here.

I don't think we have to stick to DataCite here. I suggest using contributionType only.

Note: hosting institution or actually hostedBy is covered elsewhere (cf. research product versions). I understand "hosted by" as information on where the files of the research product version are stored, but maybe this is a wrong assumption?

@olinux
Member

olinux commented Nov 4, 2020

I wonder if we could make contributionType count: 1 instead. If we have, e.g., a person who is both an author and a custodian, we need to be able to sort the contribution for "author" into the right authorship position. This would be easy if there were two contribution instances with the same person, but hard if there is only one with multiple contribution types...
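
To make the sorting argument concrete, here is a minimal Python sketch assuming one contribution instance per person and contribution type (count: 1), with list order carrying the ranking. The person identifiers, the plain-string contribution types, and the `ordered_contributors` helper are hypothetical and used only for illustration.

```python
# One instance per (person, contributionType); list order carries the ranking.
# All identifiers are placeholders.
contributions = [
    {"contributor": "person-A", "contributionType": "custodian"},
    {"contributor": "person-B", "contributionType": "author"},
    {"contributor": "person-A", "contributionType": "author"},
]


def ordered_contributors(instances, contribution_type):
    """Return contributors of one type, preserving the order of the instances."""
    return [
        c["contributor"]
        for c in instances
        if c["contributionType"] == contribution_type
    ]


print(ordered_contributors(contributions, "author"))     # ['person-B', 'person-A']
print(ordered_contributors(contributions, "custodian"))  # ['person-A']
```

With a single instance for person-A holding both types, that instance would occupy one slot in the list, so person-A could not be second author and first custodian at the same time.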

@lzehl
Member Author

lzehl commented Nov 4, 2020

I think we should discuss this. I have another idea that might work as well without introducing several contribution instances for one and the same person.

@lzehl
Member Author

lzehl commented Nov 11, 2020

@apdavison & @olinux I need help setting up the correct contribution types

Side note for @apdavison: Oli and I discussed that we separate at least "authors", "custodians", "developers", and "funding" from "otherContributions". The first three because they need to be applied in order (order of authors, etc.), and "funding" because the metadata we ask for there do not match the contribution schema.
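
A rough Python sketch of that separation, assuming the five property names from the side note sit directly on a research product version. All identifiers and the `rank_of` helper are hypothetical placeholders.

```python
# Hypothetical research product version; every identifier is a placeholder.
research_product_version = {
    # Ordered lists: position in the list is the rank (first author first).
    "authors": ["person-A", "person-B", "person-C"],
    "custodians": ["person-C"],
    "developers": ["person-B", "person-A"],
    # Funding is kept apart because its metadata do not fit the contribution schema.
    "funding": ["funding-001"],
    # Everything else goes through contribution instances; order is irrelevant.
    "otherContributions": ["contribution-001", "contribution-002"],
}

ORDERED_PROPERTIES = ("authors", "custodians", "developers")


def rank_of(version, prop, person):
    """Return the 1-based position of a person in an ordered property, or None."""
    if prop not in ORDERED_PROPERTIES:
        raise ValueError(f"{prop} is not an ordered property")
    people = version[prop]
    return people.index(person) + 1 if person in people else None


print(rank_of(research_product_version, "authors", "person-B"))     # 2
print(rank_of(research_product_version, "developers", "person-B"))  # 1
```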

These are the contributor types available for DataCite:

  1. ContactPerson
  2. DataCollector
  3. DataCurator
  4. DataManager
  5. Distributor
  6. Editor
  7. HostingInstitution
  8. Producer
  9. ProjectLeader
  10. ProjectManager
  11. ProjectMember
  12. RegistrationAgency
  13. RegistrationAuthority
  14. RelatedPerson
  15. Researcher
  16. ResearchGroup
  17. RightsHolder
  18. Sponsor
  19. Supervisor
  20. WorkPackageLeader

Questions are:
A) Which ones should we cover?
B) How should we translate those to a contribution type?
C) Which ones are missing?

@jcolomb
Contributor

jcolomb commented Nov 12, 2020

Especially concerning question C: a reminder about openMetadataInitiative/openMINDS_controlledTerms#49 and the use of other contribution terminologies/ontologies.

Also, what I saw in practice is people using the same list of authors for the paper and the attached data, which speaks for an integration of the CRediT taxonomy here?

@lzehl
Member Author

lzehl commented Nov 12, 2020

@jcolomb yes! Thanks for the reminder about openMetadataInitiative/openMINDS_controlledTerms#49; the DataCite terms are captured above. I have not yet compared those to the CRO ontology.

Regarding paper and attached data: using the same authors is often the case, but it is not a rule. I know several cases where the authors differ (in order or in number; for the latter, the data publication usually holds only a subset of the authors listed on the research publication).

@jcolomb
Contributor

jcolomb commented Nov 12, 2020

In most cases, the authors should be different, but copy-paste means no need to ask and is therefore easier ;)

See data2health/contributor-role-ontology#125. I think they did it but never updated the issue, nor added the info in that repository.

@lzehl
Member Author

lzehl commented Nov 19, 2020

@jcolomb unfortunately these are "ContributorTypes" not "ContributionTypes"... but I agree with you. I think we should switch back to ContributorTypes, because these are used by DataCite and are much easier to define.

@apdavison what do you think? If you would like to keep the contributionTypes, I need help defining them properly for the different contributor types. Here is what I came up with (including comments):

| contributorType (DataCite) | contributionType | comments |
| --- | --- | --- |
| ContactPerson | correspondence | |
| DataCollector | collecting data | |
| DataCurator | curating data | would we expect local curators here, or the responsible EBRAINS curator? |
| DataManager | managing data | |
| Distributor | distributing | not sure if this is needed in our case |
| Editor | editing | not sure if we would need to be more specific here... or maybe I don't get when to use it |
| HostingInstitution | hosting | is captured separately (cf. hostedBy in fileRepository schema) |
| Producer | producing | not sure if that is used in our context |
| ProjectLeader | project coordination | I would probably get rid of the "project" because we use this as a schema |
| ProjectManager | project management | I would probably get rid of the "project" because we use this as a schema |
| ProjectMember | project participation | I would probably get rid of the "project" because we use this as a schema |
| RegistrationAgency | registration | do we need this in our context? |
| RegistrationAuthority | registration | do we need this in our context? |
| RelatedPerson | collaboration | not sure if I interpreted that correctly |
| Researcher | research | |
| ResearchGroup | research | |
| RightsHolder | holding rights | captured elsewhere (cf. copyright schema) |
| Sponsor | funding | captured elsewhere (cf. funding schema) |
| Supervisor | supervision | |
| WorkPackageLeader | lead of work package | |

I have the feeling that this does not capture all of what we need...
Authors for the dataset/model/software publication, will be explicitly stated, because they need to be ordered.
Developers for the dataset/model/software publication, will be explicitly stated, because they need to be ordered.
Custodians for the dataset/model/software publication, will be explicitly stated, because they need to be ordered.

I kind of miss contributions like (only some examples):

  • analyzing data
  • experiment development
  • concept development
  • information technology support
  • laboratory assistance
    etc.
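
The contributorType-to-contributionType proposal in this comment can be sketched as a simple lookup in Python. The pairs are copied from the table in the comment (reading "hosting" as the contributionType for HostingInstitution, which is an interpretation); the `to_contribution_type` helper and the collision check are hypothetical additions for illustration.

```python
# Proposed translation from DataCite contributorType to contributionType,
# copied from the table in this comment.
CONTRIBUTOR_TO_CONTRIBUTION = {
    "ContactPerson": "correspondence",
    "DataCollector": "collecting data",
    "DataCurator": "curating data",
    "DataManager": "managing data",
    "Distributor": "distributing",
    "Editor": "editing",
    "HostingInstitution": "hosting",  # interpretation; captured separately anyway
    "Producer": "producing",
    "ProjectLeader": "project coordination",
    "ProjectManager": "project management",
    "ProjectMember": "project participation",
    "RegistrationAgency": "registration",
    "RegistrationAuthority": "registration",
    "RelatedPerson": "collaboration",
    "Researcher": "research",
    "ResearchGroup": "research",
    "RightsHolder": "holding rights",
    "Sponsor": "funding",
    "Supervisor": "supervision",
    "WorkPackageLeader": "lead of work package",
}


def to_contribution_type(contributor_type):
    """Translate a DataCite contributorType; raises KeyError for unknown terms."""
    return CONTRIBUTOR_TO_CONTRIBUTION[contributor_type]


# Group contributor types by their proposed contribution type to spot overlaps.
by_contribution = {}
for contributor_type, contribution_type in CONTRIBUTOR_TO_CONTRIBUTION.items():
    by_contribution.setdefault(contribution_type, []).append(contributor_type)

collisions = {k: v for k, v in by_contribution.items() if len(v) > 1}
print(to_contribution_type("DataCollector"))  # collecting data
print(collisions)
```

The collision check shows that two contribution types ("registration" and "research") each cover two contributor types, so the translation cannot be reversed without losing information.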

@jcolomb
Contributor

jcolomb commented Nov 19, 2020

CRO/CRediT list roles, so something closer to contribution types. It seems to have what you feel is missing. (It is best to navigate the ontology here: https://www.ebi.ac.uk/ols/ontologies/cro/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FCRO_0000000&viewMode=All&siblings=false)

@lzehl
Member Author

lzehl commented Nov 19, 2020

Thanks @jcolomb, this indeed looks closer to what we planned to integrate. I really appreciate your feedback and contributions 🙂

@jcolomb
Contributor

jcolomb commented Nov 20, 2020

The pleasure is mine. I think the people behind CRO are busy with COVID projects at the moment, but I am quite sure they would be happy to see their work being useful for a big project like this one, and to get feedback on their initiative.

@lzehl
Member Author

lzehl commented Nov 23, 2020

I've moved the discussion to controlledTerms: https://github.com/HumanBrainProject/openMINDS_controlledTerms/issues/9

@lzehl lzehl closed this as completed Nov 23, 2020