Multipart get_object() #383

Open · 1 of 3 tasks

gdbassett opened this issue Jan 5, 2021 · 0 comments
Before filing an issue, please make sure you are using the latest development version which you can install using install.packages("aws.s3",repo="https://rforge.net") (see README) since the issue may have been fixed already. Also search existing issues first to avoid duplicates.

Please specify whether your issue is about:

  • a possible bug
  • a question about package functionality
  • a suggested code or documentation change, improvement to the code, or feature request

Both data.table::fread() and readr::read_csv() are able to load CSVs using aws.s3::get_object():
dt <- data.table::fread(aws.s3::get_object(filename, bucket=bucket, as = "text"))

However, if the full file is larger than R's maximum string size (2^31 - 1 bytes), the read fails for that reason. I assume this is happening because fread and read_csv cannot read the file in chunks and are instead reading it in as a single character vector before splitting out the columns.
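For context, a disk-based workaround seems to sidestep the limit today: save_object() appears to stream the response to a file, so no single R string ever has to hold the whole CSV, and fread() can then read it from disk. A rough sketch with placeholder variable names:

# Sketch of a disk-based workaround (filename/bucket as in the example above).
library(aws.s3)
library(data.table)

tmp <- tempfile(fileext = ".csv")
aws.s3::save_object(object = filename, bucket = bucket, file = tmp)  # write straight to disk
dt <- data.table::fread(tmp)                                         # fread reads from the file
unlink(tmp)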

Would it be possible to provide the CSV to fread/read_csv in chunks to avoid this hard limit?
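Something along these lines might work, assuming get_object() forwards extra request headers (its headers argument) so an HTTP Range request returns only part of the object, and assuming head_object() exposes the object size as a "content-length" attribute. Bucket/object names and the chunk size below are illustrative, and the raw chunks would still need to be re-aligned on newline boundaries before parsing rows:

# Rough sketch of chunked retrieval via HTTP Range requests.
# Each call returns only the requested byte span as a raw vector,
# so nothing has to fit into a single R string.
library(aws.s3)

chunk_size <- 64 * 1024^2                                    # 64 MiB per request (arbitrary)
meta       <- head_object("large.csv", bucket = "my-bucket") # hypothetical names
total      <- as.numeric(attr(meta, "content-length"))       # object size in bytes (assumed attribute)

starts <- seq(0, total - 1, by = chunk_size)
chunks <- lapply(starts, function(start) {
  end <- min(start + chunk_size - 1, total - 1)
  get_object("large.csv", bucket = "my-bucket",
             headers = list(Range = sprintf("bytes=%.0f-%.0f", start, end)))
})
# `chunks` is a list of raw vectors; a chunk-aware reader would still need to
# stitch them back together on line boundaries before handing rows to fread()/read_csv().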
