Get more reliable ways to detect DOI or other precise information about a paper without using full-paper GPT request #91

markwhiting · 2024-08-30T18:25:15Z

To help us deduplicate papers and support various other tasks, it would be great to establish a paper's formal identifiers with high certainty and as cheaply as possible. For example, can we get a paper's DOI with high reliability, even when the paper shares its title, filename, authors, or other properties with other papers in our corpus?
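One cheap first pass, before calling any external service or GPT, might be to scan the text we already have (e.g., the first page) for DOI-like strings. The sketch below is only an illustration: the function names `find_dois` and `normalize_doi` are hypothetical, and the pattern follows the commonly cited form for modern DOIs (`10.` prefix, 4 to 9 digit registrant code, then a suffix), so it will miss some older or unusual DOIs.

```python
import re

# Assumption: modern DOIs look like "10.<4-9 digits>/<suffix>".
# This will not catch every legacy DOI format.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Z0-9]+", re.IGNORECASE)


def normalize_doi(raw: str) -> str:
    """Strip trailing sentence punctuation and lowercase (DOIs are case-insensitive).

    Caveat: a DOI that legitimately ends in ")" would be truncated here.
    """
    return raw.rstrip(".,;:)").lower()


def find_dois(text: str) -> list[str]:
    """Return unique DOI candidates in order of first appearance."""
    found: list[str] = []
    for match in DOI_PATTERN.finditer(text):
        doi = normalize_doi(match.group(0))
        if doi not in found:
            found.append(doi)
    return found
```

If this returns exactly one candidate, that is a strong hint; zero or several candidates (e.g., DOIs from the reference list) would still need a fallback check.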

We currently make requests to services like Crossref, Altmetric, or OpenAlex for this, but even those involve some estimation. So we might want a general-purpose feature that aims to find and check some basic bibliometrics more robustly.

I suspect this will need some iteration internally. For example, if the title matches, does the abstract, the DOI, or other metadata match too? How far do we go, and how many disagreements are enough to convince us that two papers are different?
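To make the questions above concrete, here is a minimal sketch of a field-by-field agreement check between two candidate records. Everything in it is an assumption for discussion, not a settled design: the record shape (dicts with `doi`, `title`, `abstract`, `authors` as strings), the 0.9 similarity threshold, the rule that two present DOIs are decisive, and the `max_disagreements` cutoff.

```python
import re
from difflib import SequenceMatcher


def _norm(value: str) -> str:
    """Lowercase and collapse punctuation/whitespace so trivial differences are ignored."""
    return re.sub(r"[^a-z0-9]+", " ", value.lower()).strip()


def field_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, _norm(a), _norm(b)).ratio()


def compare_records(a, b, fields=("title", "abstract", "authors"),
                    threshold=0.9, max_disagreements=1):
    """Return "same", "different", or "uncertain" for two metadata dicts."""
    doi_a, doi_b = a.get("doi"), b.get("doi")
    if doi_a and doi_b:
        # Assumption: when both DOIs are known, they settle the question.
        return "same" if doi_a.lower() == doi_b.lower() else "different"

    agree = disagree = 0
    for name in fields:
        va, vb = a.get(name), b.get(name)
        if not va or not vb:
            continue  # a missing field is neither agreement nor disagreement
        if field_similarity(va, vb) >= threshold:
            agree += 1
        else:
            disagree += 1

    if disagree > max_disagreements:
        return "different"
    if agree >= 2 and disagree == 0:
        return "same"
    return "uncertain"
```

The "uncertain" outcome is where a more expensive step (another service lookup, or a GPT request) could be reserved, so the cheap checks handle the easy cases first.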
