
Commit

Add back default support for parquet ref #315
chainsawriot committed Jul 15, 2024
1 parent 894bd2e commit fb3bc51
Showing 12 changed files with 16 additions and 12 deletions.
3 changes: 2 additions & 1 deletion DESCRIPTION
@@ -53,7 +53,8 @@ Imports:
 writexl,
 lifecycle,
 R.utils,
-readr
+readr,
+nanoparquet
 Suggests:
 datasets,
 bit64,
2 changes: 2 additions & 0 deletions NEWS.md
@@ -1,6 +1,8 @@
# rio 1.1.1.999 (development)

* Fix lintr issues #434 (h/t @bisaloo Hugo Gruson)
* Drop support for R < 4.0.0 see #436
* Add support for parquet in the import tier using `nanoparquet` see rio 1.0.1 below.

Bug fixes

2 changes: 1 addition & 1 deletion R/export.R
@@ -32,7 +32,7 @@
 #' \item Weka Attribute-Relation File Format (.arff), using [foreign::write.arff()]
 #' \item Fixed-width format data (.fwf), using [utils::write.table()] with `row.names = FALSE`, `quote = FALSE`, and `col.names = FALSE`
 #' \item [CSVY](https://github.com/csvy) (CSV with a YAML metadata header) using [data.table::fwrite()].
-#' \item Apache Arrow Parquet (.parquet), using [arrow::write_parquet()]
+#' \item Apache Arrow Parquet (.parquet), using [nanoparquet::write_parquet()]
 #' \item Feather R/Python interchange format (.feather), using [arrow::write_feather()]
 #' \item Fast storage (.fst), using [fst::write.fst()]
 #' \item JSON (.json), using [jsonlite::toJSON()]. In this case, `x` can be a variety of R objects, based on class mapping conventions in this paper: [https://arxiv.org/abs/1403.2805](https://arxiv.org/abs/1403.2805).
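With this change, a `.parquet` extension dispatches to nanoparquet by default, with no Suggests-tier package needed. A minimal sketch of the user-facing round trip, assuming a rio build that includes this commit and the nanoparquet package; the temp-file name is illustrative:

```r
# Round-trip a data frame through Parquet via rio; after this commit the
# .parquet extension is handled by nanoparquet rather than arrow.
library(rio)

tmp <- tempfile(fileext = ".parquet")
export(mtcars, tmp)   # dispatches to nanoparquet::write_parquet()
d <- import(tmp)      # dispatches to nanoparquet::read_parquet()

identical(dim(d), dim(mtcars))
```

Note that Parquet has no concept of row names, so `mtcars`'s row names do not survive the round trip even though the dimensions and column data do.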
2 changes: 1 addition & 1 deletion R/export_methods.R
@@ -282,7 +282,7 @@ export_delim <- function(file, x, fwrite = lifecycle::deprecated(), sep = "\t",

 #' @export
 .export.rio_parquet <- function(file, x, ...) {
-    .docall(arrow::write_parquet, ..., args = list(x = x, sink = file))
+    .docall(nanoparquet::write_parquet, ..., args = list(x = x, file = file))
 }

#' @export
2 changes: 1 addition & 1 deletion R/import.R
@@ -42,7 +42,7 @@
 #' \item Fortran data (no recognized extension), using [utils::read.fortran()]
 #' \item Fixed-width format data (.fwf), using a faster version of [utils::read.fwf()] that requires a `widths` argument and by default in rio has `stringsAsFactors = FALSE`
 #' \item [CSVY](https://github.com/csvy) (CSV with a YAML metadata header) using [data.table::fread()].
-#' \item Apache Arrow Parquet (.parquet), using [arrow::read_parquet()]
+#' \item Apache Arrow Parquet (.parquet), using [nanoparquet::read_parquet()]
 #' \item Feather R/Python interchange format (.feather), using [arrow::read_feather()]
 #' \item Fast storage (.fst), using [fst::read.fst()]
 #' \item JSON (.json), using [jsonlite::fromJSON()]
4 changes: 2 additions & 2 deletions R/import_methods.R
@@ -413,8 +413,8 @@ extract_html_row <- function(x, empty_value) {

 #' @export
 .import.rio_parquet <- function(file, which = 1, ...) {
-    .check_pkg_availability("arrow")
-    .docall(arrow::read_parquet, ..., args = list(file = file, as_data_frame = TRUE))
+    #.check_pkg_availability("arrow")
+    .docall(nanoparquet::read_parquet, ..., args = list(file = file, options = nanoparquet::parquet_options(class = "data.frame")))
 }

#' @export
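The replacement import method uses `nanoparquet::parquet_options(class = "data.frame")` to keep returning a plain data.frame, which is what the old arrow call achieved with `as_data_frame = TRUE`. A hedged sketch of the direct calls the new methods delegate to, assuming nanoparquet is installed; the sample data and temp-file name are illustrative:

```r
# Write and read Parquet with nanoparquet directly, mirroring the new rio
# methods; note write_parquet() takes `file =` where arrow used `sink =`.
library(nanoparquet)

x <- data.frame(id = 1:3, name = c("a", "b", "c"))
tmp <- tempfile(fileext = ".parquet")
write_parquet(x, file = tmp)

# class = "data.frame" drops nanoparquet's default tibble-style "tbl" class
# from the result, yielding a plain data.frame as rio's importers promise.
d <- read_parquet(tmp, options = parquet_options(class = "data.frame"))
is.data.frame(d)
```

Because the import method no longer touches arrow, the `.check_pkg_availability("arrow")` guard is commented out rather than replaced: nanoparquet sits in Imports, so it is always available.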
Binary file modified R/sysdata.rda
2 changes: 1 addition & 1 deletion README.md
@@ -133,6 +133,7 @@ The full list of supported formats is below:
 | Gzip | gz / gzip | base | base | Default | |
 | Zip files | zip | utils | utils | Default | |
 | Ambiguous file format | dat | data.table | | Default | Attempt as delimited text data |
+| Apache Arrow (Parquet) | parquet | nanoparquet | nanoparquet | Default | |
 | CSVY (CSV + YAML metadata header) | csvy | data.table | data.table | Default | |
 | Comma-separated data | csv | data.table | data.table | Default | |
 | Comma-separated data (European) | csv2 | data.table | data.table | Default | |
@@ -159,7 +160,6 @@ The full list of supported formats is below:
 | Text Representations of R Objects | dump | base | base | Default | |
 | Weka Attribute-Relation File Format | arff / weka | foreign | foreign | Default | |
 | XBASE database files | dbf | foreign | foreign | Default | |
-| Apache Arrow (Parquet) | parquet | arrow | arrow | Suggest | |
 | Clipboard | clipboard | clipr | clipr | Suggest | default is tsv |
 | EViews | eviews / wf1 | hexView | | Suggest | |
 | Fast Storage | fst | fst | fst | Suggest | |
6 changes: 3 additions & 3 deletions data-raw/single.json
@@ -2,10 +2,10 @@
 {
     "input": "parquet",
     "format": "parquet",
-    "type": "suggest",
+    "type": "import",
     "format_name": "Apache Arrow (Parquet)",
-    "import_function": "arrow::read_parquet",
-    "export_function": "arrow::write_parquet",
+    "import_function": "nanoparquet::read_parquet",
+    "export_function": "nanoparquet::write_parquet",
     "note": ""
 },
 {
2 changes: 1 addition & 1 deletion man/export.Rd

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion man/import.Rd


1 change: 1 addition & 0 deletions man/rio.Rd

