Testing TASKS tool in the project 3: error 500, etc. #1380

Open
yroskov opened this issue Nov 22, 2024 · 12 comments

@yroskov

yroskov commented Nov 22, 2024

Testing Tasks tool in the project 3 (CoL draft) at https://www.checklistbank.org/catalogue/3/tasks

I have chosen the WCVP checklist (id 2232). I expect to get a report on duplicated names only inside the WCVP sectors included in project 3.

@yroskov
Author

yroskov commented Nov 22, 2024

Problem 1: I got an error 500 message:

image

@yroskov
Author

yroskov commented Nov 22, 2024

Problem 2: I got a report on duplicated names between WCVP and all other checklists, whereas I need a report on duplicated names only inside the WCVP sectors included in project 3.

https://www.checklistbank.org/dataset/3/duplicates?authorshipDifferent=true&category=binomial&limit=100&minSize=2&mode=STRICT&sourceDatasetKey=2232&status=accepted

image

@yroskov
Author

yroskov commented Nov 22, 2024

Problem 3: The report on duplicated names has no option (i.e. check-boxes) for applying decisions:

image

@yroskov
Author

yroskov commented Nov 22, 2024

Quite possibly I did not understand how to use the tool...

yroskov changed the title from "Tasks tool in the project 3: error 500" to "Testing TASKS tool in the project 3: error 500, etc." on Nov 22, 2024
@mdoering
Member

  1. 500 responses are nearly always a bug; I will look into that on Monday.

  2. Reporting duplicates across all names in the project, as long as at least 1 name comes from the chosen source, is the expected behavior. This was the initial requirement when we first introduced the duplicate tool to projects: we also wanted to search for duplicates across sources, something you could not do in the workbenches of the sources alone. If you desire to look only for duplicates within a source in a project we need to add another parameter that restricts the search to names just in the source. I can do that, it should not be difficult.

  3. I have to defer to @thomasstjerne. There is an identifier issue with applying decisions through the project, but we should be able to deal with it. We just need to get hold of the original source id, which we keep in the verbatim source record for all synced records, unless it did not exist in the source: e.g. genera and species can be created during a sync if they are missing in a source that only has species and subspecies, respectively.

@mdoering
Member

I do not get a 500 now. Do you still see that? I did deploy the backend many times some 10-12h ago which might have caused problems occasionally...

@yroskov
Author

yroskov commented Nov 25, 2024

I confirm: no error 500 for me today, 2024-11-25

@yroskov
Author

yroskov commented Nov 25, 2024

Sorry for the confusion. Let's clarify:

Functionality which was implemented before: Reporting duplicates across all names & GSDs in the project.

Functionality of today: Reporting duplicates between selected GSD and all other GSDs in the project.

Additional functionality which I need: Reporting duplicates inside selected GSD, i.e. inside its part (= CoL sectors) included in the project.
(= "If you desire to look only for duplicates within a source in a project we need to add another parameter that restricts the search to names just in the source. I can do that, it should not be difficult.")

(Again, sorry for the confusion. I thought that I was testing a new implementation: reports inside a GSD included in the project.)

All 3 functionalities are necessary. Perhaps a switch such as "across whole project / selected checklist vs others / inside checklist" (or something like that) could be helpful.
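The three scopes described in this comment can be sketched as a filter over groups of duplicate names. This is only an illustration of the semantics discussed in the thread, not ChecklistBank's actual implementation: the `Scope` enum, the `NameUsage` record and `filter_groups` are hypothetical names, and it assumes the candidate duplicate groups have already been computed for the whole project.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    """Hypothetical switch for the three kinds of report."""
    WHOLE_PROJECT = "across whole project"
    SOURCE_VS_OTHERS = "selected checklist vs others"
    INSIDE_SOURCE = "inside checklist"


@dataclass(frozen=True)
class NameUsage:
    name: str
    source_key: int  # key of the source dataset the name was synced from


def filter_groups(groups, scope, source_key=None, min_size=2):
    """Keep only the duplicate groups that match the requested scope."""
    result = []
    for group in groups:
        if scope is Scope.WHOLE_PROJECT:
            # Every duplicate group in the project qualifies.
            kept = list(group)
        elif scope is Scope.SOURCE_VS_OTHERS:
            # At least one name must come from the chosen source.
            has_source = any(u.source_key == source_key for u in group)
            kept = list(group) if has_source else []
        else:
            # Only names from the chosen source are considered at all.
            kept = [u for u in group if u.source_key == source_key]
        if len(kept) >= min_size:
            result.append(kept)
    return result
```

With a group holding two WCVP names and one name from another source, the "inside checklist" scope would still report the two WCVP names, while a group with a single WCVP name would only show up in the other two scopes.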

@mdoering
Member

I am adding a new boolean sourceOnly parameter that requires all considered names to come from the same source, rather than just one record in each duplicate group. There will be a new switch in the task board & duplicate tool for that.
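As a rough sketch of how the new switch might show up in a request, the snippet below rebuilds the duplicates URL posted earlier in this thread and optionally appends the flag. The parameter name `sourceOnly` comes from the comment above; that it is passed as `sourceOnly=true` on this particular URL is an assumption, and `duplicates_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Project duplicates page quoted earlier in the thread (project 3).
BASE = "https://www.checklistbank.org/dataset/3/duplicates"


def duplicates_url(source_dataset_key, source_only=False):
    """Build the duplicates URL for one source inside the project."""
    params = {
        "authorshipDifferent": "true",
        "category": "binomial",
        "limit": 100,
        "minSize": 2,
        "mode": "STRICT",
        "sourceDatasetKey": source_dataset_key,
        "status": "accepted",
    }
    if source_only:
        # Assumed spelling of the new boolean flag as a query parameter.
        params["sourceOnly"] = "true"
    return f"{BASE}?{urlencode(params)}"


print(duplicates_url(2232))
print(duplicates_url(2232, source_only=True))
```

Without the flag, the helper reproduces the link from the original report, so the two URLs differ only in the extra parameter.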

@mdoering
Member

@thomasstjerne I have added the new parameter to the API, which will be deployed today.
Could you update the frontend by adding the new checkboxes?
See CatalogueOfLife/checklistbank#1504

@thomasstjerne

> Additional functionality which I need: Reporting duplicates inside selected GSD, i.e. inside its part (= CoL sectors) included in the project.

I actually thought that duplicates within a single source GSD in a project were handled here: https://www.checklistbank.org/catalogue/3/dataset/2232/duplicates?catalogueKey=3&limit=50&offset=0

i.e., select the source, and then select duplicates

@mdoering
Member

That shows duplicates directly in the source, not in the project after the data has been synced.
Often there is not much difference, but synced project data has decisions applied and, most importantly, sometimes contains far less data because only some parts/sectors have been used, e.g. from ITIS.
