Commit

Tidy JSON string before parsing
seancarmody committed Jan 8, 2022
1 parent ed1d60d commit ae3884f
Showing 6 changed files with 19 additions and 10 deletions.
6 changes: 3 additions & 3 deletions DESCRIPTION
@@ -1,8 +1,8 @@
Package: ngramr
Type: Package
Title: Retrieve and Plot Google n-Gram Data
-Version: 1.7.4
-Date: 2021-05-13
+Version: 1.7.5
+Date: 2022-01-08
Authors@R: c(
person("Sean", "Carmody", email = "[email protected]", role = c("aut", "cre", "cph"))
)
@@ -29,7 +29,7 @@ Imports:
URL: https://github.com/seancarmody/ngramr
BugReports: https://github.com/seancarmody/ngramr/issues
License: GPL (>=2)
-RoxygenNote: 7.1.1
+RoxygenNote: 7.1.2
Roxygen: list(markdown = TRUE)
Encoding: UTF-8
Suggests:
5 changes: 5 additions & 0 deletions NEWS
@@ -1,3 +1,8 @@
+ngramr 1.7.5
+------------
+* Tidied fromJSON call
+* Started to use lifecycle in documentation (ngrami)
+
ngramr 1.7.4
------------
* Imposed version dependency for dplyr to ensure relocate available
8 changes: 4 additions & 4 deletions R/ngram.R
@@ -182,17 +182,17 @@ ngram_check_warnings <- function(html) {
return(warnings)
}

-ngram_fetch_data <- function(html, debug = FALSE) {
+ngram_fetch_data <- function(html) {
corpus <- xml2::xml_find_first(html, "//select[@id='form-corpus']/option")
corpus <- as.integer(xml2::xml_attr(corpus, "value"))
json <- xml2::xml_find_first(html, "//div[@id='chart']/following::script")
json <- xml2::xml_text(json)
json <- stringr::str_split(json, "\n")[[1]]
json <- stringr::str_trim(json)
years <- as.integer(stringr::str_split(grep("drawD3Chart", json, value = TRUE), ",")[[1]][2:3])
-if (debug) return(list(json = json, years = years, corpus = corpus))
-json <- grep("ngrams.data", json, value = TRUE)
-data <- rjson::fromJSON(sub(".*?=", "", json))
+json <- grep("ngrams.data =", json, value = TRUE)
+json <- stringr::str_match(json, "ngrams.data = (.*);")[2]
+data <- rjson::fromJSON(json)
if (length(data) == 0) return(NULL)
data <- lapply(data,
function(x) tibble::add_column(tibble::as_tibble(x),
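A minimal sketch of what the `R/ngram.R` change does to the string handed to `rjson::fromJSON`. The sample `line` is made up for illustration (the real page content is an assumption), and base R's `regexec`/`regmatches` stand in for `stringr::str_match` so the snippet runs without extra packages.

```r
# Hypothetical script line, shaped like the one the hunk above greps for.
line <- 'ngrams.data = [{"ngram": "hacker", "timeseries": [0.1, 0.2]}];'

# Old approach: drop everything up to and including the first "=".
# The leading space and the trailing ";" both survive.
old <- sub(".*?=", "", line)

# New approach: keep only the capture group between "ngrams.data = " and
# the final ";". regmatches() returns the full match followed by each
# capture group, so element 2 is the group, mirroring str_match(...)[2].
new <- regmatches(line, regexec("ngrams.data = (.*);", line))[[1]][2]

print(old)
print(new)
```

Here `new` holds only the JSON array, whereas `old` still carries the JavaScript statement terminator, which is presumably what the commit title "Tidy JSON string before parsing" refers to.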
3 changes: 3 additions & 0 deletions R/ngrami.R
@@ -3,6 +3,9 @@
#' @param phrases vector of phrases
#' @param aggregate sum up each of the terms
#' @param ... remaining parameters passed to ngram
+#' @description
+#' `r lifecycle::badge("stable")`
+#' This function is a simple wrapper of `ngram` for case insensitive searches.
#' @export

ngrami <- function(phrases, aggregate = TRUE, ...){
4 changes: 2 additions & 2 deletions R/ngramr-package.R
@@ -38,8 +38,8 @@
#' millions of digitized books." \emph{Science} 331, No. 6014 (2011): 176--182.
#'
#' @keywords internal
-#' @import dplyr tidyr ggplot2
-#' @importFrom rlang .data
+#' @import dplyr tidyr ggplot2
+#' @importFrom rlang .data
#' @docType package
#' @name ngramr
#' @aliases ngramr ngramr-package
3 changes: 2 additions & 1 deletion man/ngrami.Rd

Some generated files are not rendered by default.