Augment reference list with DOIs automatically? #93

@LukasWallrich

The following code uses Crossref's Simple Text Query form to retrieve DOIs for a reference list. It may be worth including this in the Annotator rather than only linking to the tool ...

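# Requires: httr, rvest, xml2, stringr
# Note: GET(), POST(), handle(), user_agent(), timeout(), stop_for_status()
# and content() are httr functions, not httr2.
library(httr)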
library(rvest)
library(xml2)
library(stringr)

stq_post <- function(refs, email, include_pm = FALSE, multihit = FALSE, timeout_sec = 60) {
  stopifnot(is.character(refs), length(refs) >= 1, nzchar(email))
  
  url <- "https://apps.crossref.org/SimpleTextQuery"
  ua  <- user_agent(sprintf("R httr; mailto:%s", email))
  h   <- handle("https://apps.crossref.org")  # keeps cookies across requests
  
  # 1) GET to start a session (needed; direct POST fails)
  invisible(GET(url, handle = h, ua, timeout(30)))
  
  # 2) POST the form
  body <- list(
    email    = email,
    command  = "Submit",
    freetext = if (length(refs) > 1) paste(refs, collapse = "\n") else refs
  )
  if (include_pm) body$includePM <- "on"
  if (multihit)   body$multihit  <- "on"
  
  resp <- POST(url, handle = h, ua, body = body, encode = "form", timeout(timeout_sec))
  stop_for_status(resp)
  html <- content(resp, as = "text", encoding = "UTF-8")
  
  # Extract DOIs from returned page
  doc   <- read_html(html)
  hrefs <- html_attr(html_elements(doc, "a"), "href")
  dois  <- hrefs[grepl("doi\\.org/", hrefs, ignore.case = TRUE)]
  dois  <- unique(str_replace(tolower(dois), "^https?://(dx\\.)?doi\\.org/", ""))
  
  list(dois = dois, html = html)
}


# Example:
refs <- c(
  "Wallrich, L., Opara, V., Wesołowska, M., Barnoth, D., & Yousefi, S. (2024). The relationship between team diversity and team performance: reconciling promise and reality through a comprehensive meta-analysis registered report. Journal of Business and Psychology, 39(6), 1303-1354.",
  "Knowles MR, Boucher RC (2002) Mucus clearance as a primary innate defense mechanism for mammalian airways. J Clin Investig 109: 571–577."
)
res <- stq_post(refs, include_pm = TRUE, multihit = FALSE, email = "you@uni.ac.uk")
str(res$dois)
