Get more reliable ways to detect DOI or other precise information about a paper without using full-paper GPT request #91

Open
markwhiting opened this issue Aug 30, 2024 · 0 comments

@markwhiting
Member

To help us deduplicate papers and support various other tasks, it would be great to establish a paper's formal identifiers with high certainty and as cheaply as possible. For example, can we get a paper's DOI with high reliability, even when the paper shares its title, filename, authors, or other properties with other papers in our corpus?
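One cheap first pass, sketched here as an assumption rather than something the project already does, is to scan whatever text we have already extracted (e.g., the first page) for DOI-shaped strings before calling any model or external service. The function names `normalize_doi` and `candidate_dois` are hypothetical, and the pattern follows Crossref's commonly cited guidance for modern DOIs, so it will miss some legacy forms.

```python
import re
from collections import Counter

# Covers most modern DOIs; some legacy DOIs use characters outside this set.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[-._;()/:a-z0-9]+", re.IGNORECASE)


def normalize_doi(raw: str) -> str:
    """Lowercase a DOI and strip resolver/label prefixes and trailing punctuation."""
    doi = raw.strip().lower()
    doi = re.sub(r"^(https?://(dx\.)?doi\.org/|doi:\s*)", "", doi)
    # Assumption: trailing punctuation is sentence residue, not part of the DOI.
    # This would truncate the rare DOI that legitimately ends in ")".
    return doi.rstrip(".,;:)")


def candidate_dois(text: str) -> list[tuple[str, int]]:
    """Return (doi, count) pairs found in text, most frequent first."""
    counts = Counter(normalize_doi(m.group(0)) for m in DOI_PATTERN.finditer(text))
    return counts.most_common()
```

A paper's own DOI often appears in the header or footer while cited DOIs appear in the reference list, so restricting `text` to the first page, or preferring the most repeated candidate, may help. Either way the result is only a candidate that still needs to be checked against other metadata.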

We currently make requests to services like Crossref, Altmetric, or OpenAlex for this, but even those require some estimation. So we might want a general-purpose feature that finds and checks basic bibliometrics more robustly.

I suspect this will need some iteration internally. For example, if the title matches, does the abstract, the DOI, or other metadata match too? How far do we go, and how many instances of disagreement does it take to convince us that two records are different papers?
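To make the iteration question concrete, here is a minimal sketch of a field-by-field agreement check. Everything in it is an assumption for discussion: the `Record` fields, the similarity thresholds, the rule that two present DOIs are decisive, and the `max_disagreements` knob are all placeholders, not settled answers.

```python
import re
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class Record:
    """Hypothetical metadata record; any field may be missing."""
    doi: Optional[str] = None
    title: Optional[str] = None
    abstract: Optional[str] = None
    year: Optional[int] = None


def _norm(text: str) -> str:
    # Lowercase and collapse punctuation/whitespace so formatting differences are ignored.
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def _similar(a: str, b: str, threshold: float) -> bool:
    return SequenceMatcher(None, _norm(a), _norm(b)).ratio() >= threshold


def compare(a: Record, b: Record, title_threshold: float = 0.9,
            abstract_threshold: float = 0.8) -> dict[str, bool]:
    """Report agreement per field, only for fields present in both records."""
    checks: dict[str, bool] = {}
    if a.doi and b.doi:
        checks["doi"] = a.doi.lower() == b.doi.lower()
    if a.title and b.title:
        checks["title"] = _similar(a.title, b.title, title_threshold)
    if a.abstract and b.abstract:
        checks["abstract"] = _similar(a.abstract, b.abstract, abstract_threshold)
    if a.year is not None and b.year is not None:
        checks["year"] = a.year == b.year
    return checks


def same_paper(a: Record, b: Record, max_disagreements: int = 0) -> str:
    """Return 'same', 'different', or 'unknown' under the placeholder rules above."""
    checks = compare(a, b)
    if "doi" in checks:  # assumption: when both DOIs are present, they decide
        return "same" if checks["doi"] else "different"
    if not checks:
        return "unknown"
    disagreements = sum(not ok for ok in checks.values())
    if disagreements > max_disagreements:
        return "different"
    # Require at least two agreeing fields before declaring a match.
    return "same" if len(checks) >= 2 else "unknown"
```

Treating DOIs as decisive is itself debatable, since a preprint and its published version can carry different DOIs for what we might want to count as one paper. Returning `"unknown"` keeps ambiguous pairs visible, so they can be escalated to a more expensive check instead of being silently merged or split.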
