Detect possibly obsolete or wrong mappings after vocabulary updates #91

Open
stefandesu opened this issue Mar 31, 2020 · 6 comments
Labels
mappings (related to mapping-management), question (Further information is requested)

Comments

@stefandesu
Member

This issue can be used to discuss how we could implement updates for vocabularies, and possibly versioning them, or at least allow access to the differences between versions.

The objective is that it should be possible to determine possibly obsolete or wrong mappings where one of the concepts has been changed since the mapping was last updated.

@stefandesu stefandesu added the question Further information is requested label Mar 31, 2020
@nichtich
Member

This could be solved (at least to some degree) by making use of the date fields created, issued, and modified. We could generate lists of concepts for which the latest of these values is at or after a given date, for instance via a query parameter since (since=X meaning created>=X OR issued>=X OR modified>=X).
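To make the proposed `since` filter concrete, here is a minimal sketch in Python. It is not jskos-server code: the helper names `latest_date` and `changed_since` are hypothetical, the concepts are plain dictionaries, and the date fields are assumed to be ISO 8601 strings that are compared at day granularity.

```python
from datetime import date

DATE_FIELDS = ("created", "issued", "modified")

def latest_date(concept):
    """Return the latest of created/issued/modified, or None if none is set."""
    # Assumption: values are ISO 8601 strings; only the YYYY-MM-DD part is used.
    values = [date.fromisoformat(concept[f][:10]) for f in DATE_FIELDS if concept.get(f)]
    return max(values) if values else None

def changed_since(concepts, since):
    """Concepts with created >= since OR issued >= since OR modified >= since."""
    threshold = date.fromisoformat(since)
    result = []
    for concept in concepts:
        latest = latest_date(concept)
        if latest is not None and latest >= threshold:
            result.append(concept)
    return result

# Made-up example data
concepts = [
    {"uri": "http://example.org/c/1", "created": "2019-01-10"},
    {"uri": "http://example.org/c/2", "created": "2018-05-01", "modified": "2020-04-02T09:30:00Z"},
    {"uri": "http://example.org/c/3"},  # no date fields at all
]
recent = [c["uri"] for c in changed_since(concepts, "2020-01-01")]
```

Checking only the latest of the three dates is equivalent to the OR condition, since any date at or after X makes the maximum at or after X. Concepts without any date field are left out here, which is one possible choice among several.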

Deletions require a query to list all mappings involving unknown concepts.
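A query for mappings involving unknown concepts could look roughly like the following sketch. It assumes that the set of known concept URIs is already available, and that mappings follow the JSKOS shape with `from` and `to` bundles holding concepts under `memberSet`, `memberList`, or `memberChoice`. The function names are hypothetical.

```python
def mapping_concept_uris(mapping):
    """Collect the URIs of all concepts on both sides of a mapping."""
    uris = set()
    for side in ("from", "to"):
        bundle = mapping.get(side) or {}
        for key in ("memberSet", "memberList", "memberChoice"):
            for concept in bundle.get(key) or []:
                # Skip empty entries and concepts without a URI
                if concept and concept.get("uri"):
                    uris.add(concept["uri"])
    return uris

def mappings_with_unknown_concepts(mappings, known_uris):
    """Mappings that reference at least one concept URI not in known_uris."""
    return [m for m in mappings if mapping_concept_uris(m) - known_uris]

# Made-up example data
mappings = [
    {"uri": "http://example.org/m/1",
     "from": {"memberSet": [{"uri": "http://example.org/c/1"}]},
     "to": {"memberSet": [{"uri": "http://example.org/c/2"}]}},
    {"uri": "http://example.org/m/2",
     "from": {"memberSet": [{"uri": "http://example.org/c/1"}]},
     "to": {"memberChoice": [{"uri": "http://example.org/c/9"}]}},
]
known = {"http://example.org/c/1", "http://example.org/c/2"}
orphaned = [m["uri"] for m in mappings_with_unknown_concepts(mappings, known)]
```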

@stefandesu
Member Author

The problem with all this: The mapping registry doesn't necessarily also contain the concepts used in a mapping. For example, Wikidata, GND, and RVK are all hosted separately from our main jskos-server instance. jskos-server doesn't necessarily have a means of accessing other registries, or even know where to get information about a concept. Also, doing this for a live query would be basically impossible because we would need to query all concepts in all mappings. (Even if all those concepts were on the server itself, this would be a huge operation.)

Two options that I could imagine:

  1. We give jskos-server a way of accessing concept information from other registries (for example by using cocoda-sdk) and create some regular checks to go through mappings and determine whether they contain outdated/invalid concepts. These would then be marked in some way.
  2. We move this operation to the client (i.e. Cocoda). This wouldn't work if we want to query a list of mappings with possibly outdated concepts though, only to determine whether mappings that are already loaded (for example in Mapping Browser) contain outdated concepts.

In retrospect, having versions for vocabularies (and maybe even other entities like mappings) is a completely separate issue from querying mappings that possibly contain outdated or deleted concepts.

@nichtich nichtich changed the title Vocabulary updates/versions Detect possibly obsolete or wrong mappings after vocabulary updates May 14, 2020
@nichtich
Member

How about checking the mappings via a cronjob and tagging mappings that require inspection? The tag could also be removed or overridden if the change to the concepts does not make the mapping invalid. Sounds like annotations, no? The client only needs to get a list of mappings with this specific kind of annotation.

@stefandesu
Member Author

Short summary of yesterday's conversation:

  • Run a scheduled job to check mappings (cronjob or something internal)
    • I would suggest having it run regularly, but only check a small number of mappings per run
    • The actual check: Check whether the mapping's modified date is older than the modified date of one of the contained concepts
  • Tag mappings via special annotations
    • Yet to be decided what those annotations will look like
  • Mappings are still requested via /mappings with a specific parameter, but internally we will check for existence of such annotations
  • Modifying a mapping removes or confirms the annotation
    • Question: Should this be done implicitly or should we require explicit confirmation?
    • Idea to update modified date without changing the mapping: Send an empty PATCH request for that mapping.
  • Requires completion of cocoda-sdk which will be used to query concept data
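The check and the batching summarized above could be sketched as follows. This is only an illustration under stated assumptions: concept data is passed in as a dictionary keyed by URI (in practice it would be fetched, for example through cocoda-sdk), only `memberSet` is inspected, dates are ISO 8601 strings compared at day granularity, and the names `possibly_outdated` and `check_batch` are hypothetical.

```python
from datetime import date

def to_date(value):
    # Assumption: ISO 8601 string; only the YYYY-MM-DD part is compared
    return date.fromisoformat(value[:10])

def concept_uris(mapping):
    """URIs of the concepts in a mapping (simplified: memberSet only)."""
    uris = []
    for side in ("from", "to"):
        for concept in (mapping.get(side) or {}).get("memberSet") or []:
            uris.append(concept["uri"])
    return uris

def possibly_outdated(mapping, concepts_by_uri):
    """True if a contained concept was modified after the mapping was."""
    if not mapping.get("modified"):
        return False  # no date to compare against; left unflagged in this sketch
    mapping_modified = to_date(mapping["modified"])
    for uri in concept_uris(mapping):
        concept = concepts_by_uri.get(uri)
        if concept and concept.get("modified") and to_date(concept["modified"]) > mapping_modified:
            return True
    return False

def check_batch(mappings, concepts_by_uri, offset, batch_size):
    """Check one small batch per run; return flagged URIs and the next offset."""
    batch = mappings[offset:offset + batch_size]
    flagged = [m["uri"] for m in batch if possibly_outdated(m, concepts_by_uri)]
    next_offset = offset + batch_size
    if next_offset >= len(mappings):
        next_offset = 0  # wrap around so the next run starts over
    return flagged, next_offset

# Made-up example data
concepts_by_uri = {
    "http://example.org/c/1": {"modified": "2020-05-01"},
    "http://example.org/c/2": {"modified": "2019-01-01"},
}
def make_mapping(uri, source, target, modified):
    return {"uri": uri, "modified": modified,
            "from": {"memberSet": [{"uri": source}]},
            "to": {"memberSet": [{"uri": target}]}}
mappings = [
    make_mapping("http://example.org/m/1", "http://example.org/c/1", "http://example.org/c/2", "2020-01-15"),
    make_mapping("http://example.org/m/2", "http://example.org/c/2", "http://example.org/c/2", "2020-01-15"),
    make_mapping("http://example.org/m/3", "http://example.org/c/1", "http://example.org/c/2", "2020-06-01"),
]
first_run = check_batch(mappings, concepts_by_uri, 0, 2)
second_run = check_batch(mappings, concepts_by_uri, first_run[1], 2)
```

A scheduled job would persist the returned offset between runs, so that each run only touches a small slice of all mappings.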

Am I missing something, @nichtich ?

@nichtich
Member

Some thoughts:

  • The feature might be part of general workflow features. Annotation of possibly invalid mappings by a script is similar to annotation of mappings as to-be-reviewed (a kind of annotation which is not supported yet). This might be "questioning" or "highlighting" from the Web Annotation Data Model.
  • We already have confirmed (aka moderated) mappings.

Implementing this feature with tagged annotations would be more flexible but also more complex. A solution with minimal additional features could be:

  1. A bot regularly searches for possibly invalid mappings (based on modified). For each of these mappings:
    • downvote the mapping with a simple downvote annotation
    • delete existing confirmation annotations
  2. Users can browse the list of mappings downvoted by the bot.
  3. A user can confirm the mapping. When a confirmation annotation is added, jskos-server automatically checks whether the mapping has been downvoted by the bot. In this case:
    • delete the existing downvote by the bot
    • change the modified timestamp of the mapping
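The minimal workflow above can be simulated in memory to see how the pieces interact. The sketch below is not jskos-server's implementation. It assumes that a downvote is an annotation with motivation "assessing" and body value "-1" and that a confirmation uses motivation "moderating", it simplifies annotations to flat dictionaries with a plain string creator, and the bot URI and function names are hypothetical.

```python
BOT_URI = "http://example.org/bot"  # hypothetical identifier for the bot

def is_bot_downvote(annotation, mapping_uri=None):
    """True for a downvote created by the bot (optionally for one mapping)."""
    return (annotation["creator"] == BOT_URI
            and annotation["motivation"] == "assessing"
            and annotation["bodyValue"] == "-1"
            and (mapping_uri is None or annotation["target"] == mapping_uri))

def bot_flag(mapping, annotations):
    """Step 1: delete existing confirmations, then downvote the mapping."""
    annotations[:] = [a for a in annotations
                      if not (a["target"] == mapping["uri"] and a["motivation"] == "moderating")]
    if not any(is_bot_downvote(a, mapping["uri"]) for a in annotations):
        annotations.append({"target": mapping["uri"], "motivation": "assessing",
                            "bodyValue": "-1", "creator": BOT_URI})

def downvoted_by_bot(mappings, annotations):
    """Step 2: the list of mappings that users can browse for review."""
    flagged = {a["target"] for a in annotations if is_bot_downvote(a)}
    return [m for m in mappings if m["uri"] in flagged]

def confirm(mapping, annotations, user_uri, today):
    """Step 3: add a confirmation; if the bot downvoted, clean up and touch modified."""
    annotations.append({"target": mapping["uri"], "motivation": "moderating",
                        "bodyValue": None, "creator": user_uri})
    if any(is_bot_downvote(a, mapping["uri"]) for a in annotations):
        annotations[:] = [a for a in annotations if not is_bot_downvote(a, mapping["uri"])]
        mapping["modified"] = today

# Made-up example data
mapping = {"uri": "http://example.org/m/1", "modified": "2020-01-15"}
annotations = [{"target": mapping["uri"], "motivation": "moderating",
                "bodyValue": None, "creator": "http://example.org/user/a"}]

bot_flag(mapping, annotations)
to_review = [m["uri"] for m in downvoted_by_bot([mapping], annotations)]
confirm(mapping, annotations, "http://example.org/user/b", "2020-05-20")
still_flagged = downvoted_by_bot([mapping], annotations)
```

Updating the modified timestamp on confirmation matters because the bot's check compares that timestamp against the concepts' modified dates, so a confirmed mapping is not flagged again until a concept changes once more.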

@stefandesu
Member Author

Sounds good to me. I like that this works basically by only adding the bot and the ability to query mappings by annotations, but without any other changes. 👍
