diff --git a/NAMESPACE b/NAMESPACE index a2e51f2..683736c 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -16,6 +16,7 @@ export(get_unit_code_info) export(get_unit_info) export(load_core_metadata) export(load_data_package) +export(load_data_package_deprecated) export(load_data_packages) export(load_domains) export(load_pkg_metadata) diff --git a/docs/404.html b/docs/404.html index dc40cd6..3949753 100644 --- a/docs/404.html +++ b/docs/404.html @@ -32,7 +32,7 @@ NPSutils - 0.3.1 + 0.3.2 diff --git a/docs/LICENSE-text.html b/docs/LICENSE-text.html index e0c578d..3db8c14 100644 --- a/docs/LICENSE-text.html +++ b/docs/LICENSE-text.html @@ -17,7 +17,7 @@ NPSutils - 0.3.1 + 0.3.2 diff --git a/docs/LICENSE.html b/docs/LICENSE.html index 4e75229..1ad2900 100644 --- a/docs/LICENSE.html +++ b/docs/LICENSE.html @@ -17,7 +17,7 @@ NPSutils - 0.3.1 + 0.3.2 diff --git a/docs/articles/NPSutils.html b/docs/articles/NPSutils.html index 7bbf9e3..6d63125 100644 --- a/docs/articles/NPSutils.html +++ b/docs/articles/NPSutils.html @@ -32,7 +32,7 @@ NPSutils - 0.3.1 + 0.3.2 diff --git a/docs/articles/index.html b/docs/articles/index.html index 7587471..42a4eb9 100644 --- a/docs/articles/index.html +++ b/docs/articles/index.html @@ -17,7 +17,7 @@ NPSutils - 0.3.1 + 0.3.2 diff --git a/docs/authors.html b/docs/authors.html index 2402ff8..b3dad51 100644 --- a/docs/authors.html +++ b/docs/authors.html @@ -17,7 +17,7 @@ NPSutils - 0.3.1 + 0.3.2 @@ -74,13 +74,13 @@

Citation

Baker R, DeVivo J, Patterson J (2024). NPSutils: Collection of Functions to read and manipulate information from the NPS DataStore. -R package version 0.3.1, https://nationalparkservice.github.io/NPSutils/. +R package version 0.3.2, https://nationalparkservice.github.io/NPSutils/.

@Manual{,
   title = {NPSutils: Collection of Functions to read and manipulate information from the NPS DataStore},
   author = {Robert Baker and Joe DeVivo and Judd Patterson},
   year = {2024},
-  note = {R package version 0.3.1},
+  note = {R package version 0.3.2},
   url = {https://nationalparkservice.github.io/NPSutils/},
 }
diff --git a/docs/index.html b/docs/index.html index 4d69894..35b3854 100644 --- a/docs/index.html +++ b/docs/index.html @@ -33,7 +33,7 @@ NPSutils - 0.3.1 + 0.3.2 diff --git a/docs/news/index.html b/docs/news/index.html index ff54202..3d53506 100644 --- a/docs/news/index.html +++ b/docs/news/index.html @@ -17,7 +17,7 @@ NPSutils - 0.3.1 + 0.3.2 @@ -51,8 +51,10 @@

Changelog

- -
diff --git a/docs/reference/check_is_data_package.html b/docs/reference/check_is_data_package.html index 1bc4bf8..e3c199e 100644 --- a/docs/reference/check_is_data_package.html +++ b/docs/reference/check_is_data_package.html @@ -17,7 +17,7 @@ NPSutils - 0.3.1 + 0.3.2 diff --git a/docs/reference/check_new_version.html b/docs/reference/check_new_version.html index b27c1b0..610ff5a 100644 --- a/docs/reference/check_new_version.html +++ b/docs/reference/check_new_version.html @@ -18,7 +18,7 @@ NPSutils - 0.3.1 + 0.3.2 diff --git a/docs/reference/check_ref_exists.html b/docs/reference/check_ref_exists.html index e3f8daf..7d0d038 100644 --- a/docs/reference/check_ref_exists.html +++ b/docs/reference/check_ref_exists.html @@ -17,7 +17,7 @@ NPSutils - 0.3.1 + 0.3.2 diff --git a/docs/reference/get_data_packages.html b/docs/reference/get_data_packages.html index a2f6384..f02241d 100644 --- a/docs/reference/get_data_packages.html +++ b/docs/reference/get_data_packages.html @@ -17,7 +17,7 @@ NPSutils - 0.3.1 + 0.3.2 diff --git a/docs/reference/get_new_version_id.html b/docs/reference/get_new_version_id.html index e4c44eb..7c70310 100644 --- a/docs/reference/get_new_version_id.html +++ b/docs/reference/get_new_version_id.html @@ -17,7 +17,7 @@ NPSutils - 0.3.1 + 0.3.2 diff --git a/docs/reference/get_park_code.html b/docs/reference/get_park_code.html index 12f797f..048af29 100644 --- a/docs/reference/get_park_code.html +++ b/docs/reference/get_park_code.html @@ -17,7 +17,7 @@ NPSutils - 0.3.1 + 0.3.2 diff --git a/docs/reference/get_park_taxon_citations.html b/docs/reference/get_park_taxon_citations.html index 3a7a1e8..c29de14 100644 --- a/docs/reference/get_park_taxon_citations.html +++ b/docs/reference/get_park_taxon_citations.html @@ -19,7 +19,7 @@ NPSutils - 0.3.1 + 0.3.2 diff --git a/docs/reference/get_park_taxon_refs.html b/docs/reference/get_park_taxon_refs.html index ea0426e..63575cd 100644 --- a/docs/reference/get_park_taxon_refs.html +++ 
b/docs/reference/get_park_taxon_refs.html @@ -21,7 +21,7 @@ NPSutils - 0.3.1 + 0.3.2 diff --git a/docs/reference/get_park_taxon_url.html b/docs/reference/get_park_taxon_url.html index 5db45c9..a714250 100644 --- a/docs/reference/get_park_taxon_url.html +++ b/docs/reference/get_park_taxon_url.html @@ -17,7 +17,7 @@ NPSutils - 0.3.1 + 0.3.2 diff --git a/docs/reference/get_ref_info.html b/docs/reference/get_ref_info.html index af6ad7b..4fcdb79 100644 --- a/docs/reference/get_ref_info.html +++ b/docs/reference/get_ref_info.html @@ -18,7 +18,7 @@ NPSutils - 0.3.1 + 0.3.2 diff --git a/docs/reference/get_unit_code.html b/docs/reference/get_unit_code.html index 38b2279..7f00270 100644 --- a/docs/reference/get_unit_code.html +++ b/docs/reference/get_unit_code.html @@ -17,7 +17,7 @@ NPSutils - 0.3.1 + 0.3.2 diff --git a/docs/reference/get_unit_code_info.html b/docs/reference/get_unit_code_info.html index dc325e7..2d75bfe 100644 --- a/docs/reference/get_unit_code_info.html +++ b/docs/reference/get_unit_code_info.html @@ -17,7 +17,7 @@ NPSutils - 0.3.1 + 0.3.2 diff --git a/docs/reference/get_unit_info.html b/docs/reference/get_unit_info.html index 71f2333..424e864 100644 --- a/docs/reference/get_unit_info.html +++ b/docs/reference/get_unit_info.html @@ -17,7 +17,7 @@ NPSutils - 0.3.1 + 0.3.2 diff --git a/docs/reference/index.html b/docs/reference/index.html index 183fbbc..a7e1335 100644 --- a/docs/reference/index.html +++ b/docs/reference/index.html @@ -17,7 +17,7 @@ NPSutils - 0.3.1 + 0.3.2 @@ -110,13 +110,13 @@

All functions

Gets common EML metadata elements and puts them in a dataframe #' `r lifecycle::badge('experimental')`

-

load_data_package()

+

load_data_packages() load_data_package()

-

Read contents of data package and constructs a list of tibbles based on the data file(s)

+

Read contents of data package(s) and return a list of tibbles based on the data file(s). Can use metadata to specify data types.

-

load_data_packages()

+

load_data_package_deprecated()

-

Read contents of data package(s) and return a tibble with a tibble for each data file.

+

Read contents of data package and constructs a list of tibbles based on the data file(s)

load_domains()

diff --git a/docs/reference/load_data_package.html b/docs/reference/load_data_package.html deleted file mode 100644 index 923a297..0000000 --- a/docs/reference/load_data_package.html +++ /dev/null @@ -1,105 +0,0 @@ - -Read contents of data package and constructs a list of tibbles based on the data file(s) — load_data_package • NPSutils - - -
-
- - - -
-
- - -
-

load_data_package reads the data file(s) from a package and loads it into a list of tibbles. Current implementation only supports .csv data files.

-
- -
-
load_data_package(reference_id)
-
- -
-

Arguments

- - -
reference_id
-

is a 6-7 digit number corresponding to the reference ID of the data package.

- -
-
-

Value

-

a list of one or more tibbles contained within the data package to the global environment.

-
- -
-

Examples

-
if (FALSE) { # \dontrun{
-load_data_package(2272461)
-} # }
-
-
-
- -
- - -
- -
-

Site built with pkgdown 2.1.0.

-
- -
- - - - - - - - diff --git a/docs/reference/load_data_packages.html b/docs/reference/load_data_packages.html index 1e7afae..d9e69d5 100644 --- a/docs/reference/load_data_packages.html +++ b/docs/reference/load_data_packages.html @@ -1,5 +1,5 @@ -Read contents of data package(s) and return a tibble with a tibble for each data file. — load_data_packages • NPSutilsRead contents of data package(s) and return a list of tibbles list of tibbles based on the data file(s). Can use metadata to specify data types. — load_data_packages • NPSutils @@ -17,7 +17,7 @@ NPSutils - 0.3.1 + 0.3.2 @@ -46,19 +46,26 @@
-

`load_data_packages()` loads one to may data packages and returns a tibble of tibbles where each data package is a tibble and within that each data file is it's own tibble. `load_data_packages()` will only work with .csv data files and EML metadata. `load_data_packages()` can also utilize the metadata to assign attributes to each data column.

+

`load_data_packages()` loads one to many data packages and returns a list. If only one data package is loaded, the list will be a list of tibbles where each tibble is a data (.csv) file from the data package. If multiple data packages are loaded, the list will be a list of lists where each nested list contains a list of tibbles and each tibble is a data file (.csv). See `simplify` below for details on handling these lists.

load_data_packages(
   reference_id,
-  directory = here::here(),
+  directory = here::here("data"),
+  assign_attributes = FALSE,
+  simplify = TRUE
+)
+
+load_data_package(
+  reference_id,
+  directory = here::here("data"),
   assign_attributes = FALSE,
   simplify = TRUE
 )
@@ -69,19 +76,19 @@

Arguments

reference_id
-

is a list of 6-7 digit numbers corresponding to the DataStore reference ID of the datapackage(s) to load. Alternatively, you can set `reference_id` to "load_all", which will load all the data packages in your /data folder.

+

the immediate directory/directories where your data packages reside. For data packages downloaded from DataStore using `get_data_package()` or `get_data_packages()` default settings, this is the DataStore reference ID for your data package(s). Alternatively, you can set `reference_id` to "`load_all`", which will load all the data packages in the directory specified via `directory` (typically ./data).

directory
-

is the location of a folder, 'data' (created during `get_data_packages()`) which contains sub-directories where each sub-directory is the DataStore referenceId of the data package. Again, this file structure is all set up using `get_data_packages()`. Defaults to the current working directory (which is the default location for `get_data_packages()`).

+

is the location of a folder that contains all of the data packages (where data packages are a folder containing .csv data files and a single .xml EML metadata file). If these data packages were downloaded from DataStore using the default settings for `get_data_packages`, this folder is "./data" and you can use the default settings for `directory`.

assign_attributes
-

Logical. Defaults to FALSE. Data will be loaded using `readr::read_csv()` guessing algorithm for calling column types. If set to TRUE, column types will be set using metadata attributes via the yet-to-be written `load_metadata()` function. `r lifecycle::badge('experimental')`

+

Logical. Defaults to FALSE. Data will be loaded using the `readr::read_csv()` guessing algorithm for calling column types. If you set `assign_attributes = TRUE`, column types will be set using the data types specified in the metadata. Currently supported data types include string, dateTime, float, double, integer, and categorical (factor in R). This assignment is very stringent: for instance, if you did not specify date-time formats using ISO-8601 notation (i.e. "YYYY", not "yyyy"), your data will import as NAs. If you have undefined missing values or blank cells, your data will not import at all. If you run into problems, consider using the default settings and letting `read_csv` guess the column types.

simplify
-

Logical. Defaults to TRUE. If there is only a single data package loaded, the function will return a simple list of tibbles (where each tibble reflects a data file from within the data package). If set to FALSE, the function will return a list that contains a list of tibbles. This structure mirrors the object structure returned if multiple data packages are simultaneously loaded (a list of data packages with each data package containing a list of tibbles where each tibble corresponds to a data file in the given data package).

+

Logical. Defaults to TRUE. If `simplify = TRUE`, the function will return a list of tibbles where each tibble is a data file from the data package(s) specified. The tibbles are named using the following format: "pkg_<reference_id.filename" (without the filename extension). If you want to load each individual data file into R for further processing, use `simplify = TRUE` and then run `list2env(x, envir=.GlobalEnv)`. If you set `simplify = FALSE`, the object returned will either be a list of tibbles identical to that returned by `simplify = TRUE` (if only one data package is loaded) or will be a list of lists where each nested list contains one tibble for each data file in each data package. Setting `simplify = FALSE` may make it easier to do post-processing on a package-by-package level rather than a tibble-by-tibble level.
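
The two return shapes described for `simplify` can be sketched in base R without calling NPSutils at all. This is only an illustration under stated assumptions: plain data frames stand in for tibbles, the reference IDs (`1234567`, `7654321`) and file names are made up, and the flattening loop merely imitates the "pkg_" naming pattern mentioned above rather than reproducing the package's actual code.

```r
# Base-R stand-ins: data.frame instead of tibble; IDs and file names are hypothetical.
pkg_a <- list(
  sites  = data.frame(site = c("A", "B")),
  counts = data.frame(site = c("A", "B"), n = c(3L, 5L))
)
pkg_b <- list(
  visits = data.frame(site = "A", year = 2024L)
)

# Shape described for simplify = FALSE with several packages: a list of lists.
nested <- list("1234567" = pkg_a, "7654321" = pkg_b)

# Shape described for simplify = TRUE: one flat list of tables.
# The names only imitate the "pkg_..." pattern; this is not the package's code.
flat <- list()
for (id in names(nested)) {
  for (file in names(nested[[id]])) {
    flat[[paste0("pkg_", id, ".", file)]] <- nested[[id]][[file]]
  }
}

names(flat)

# Copy each table into its own object. A fresh environment is used here
# instead of .GlobalEnv so the sketch leaves the workspace untouched.
env <- new.env()
list2env(flat, envir = env)
ls(env)
```

With the nested shape, per-package work is a matter of iterating over `nested` (for example `lapply(nested, names)`), while the flat shape suits `list2env()` when each data file should become its own object.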

@@ -90,8 +97,7 @@

Value

Details

-

`r lifecycle::badge("experimental")`

-

currently `load_data_packages()` only supports EML metadata and .csv files. To take advantage of the default settings in load_data_packages, use the default settings in `get_data_package()` or `get_data_packages()`. Archived (.zip) files must be extracted before `load_data_packages()` will work properly. Again, `get_data_package()` or `get_data_packages()` will accomplish this for you. +

currently `load_data_packages()` only supports EML metadata and .csv files. The reference_id '

diff --git a/docs/reference/load_domains.html b/docs/reference/load_domains.html index 91b6a08..ef92238 100644 --- a/docs/reference/load_domains.html +++ b/docs/reference/load_domains.html @@ -17,7 +17,7 @@ NPSutils - 0.3.1 + 0.3.2
diff --git a/docs/reference/load_pkg_metadata.html b/docs/reference/load_pkg_metadata.html index 421cc7a..d199c11 100644 --- a/docs/reference/load_pkg_metadata.html +++ b/docs/reference/load_pkg_metadata.html @@ -19,7 +19,7 @@ NPSutils - 0.3.1 + 0.3.2
diff --git a/docs/reference/map_wkt.html b/docs/reference/map_wkt.html index 22a6292..70771c7 100644 --- a/docs/reference/map_wkt.html +++ b/docs/reference/map_wkt.html @@ -17,7 +17,7 @@ NPSutils - 0.3.1 + 0.3.2 diff --git a/docs/reference/rm_local_packages.html b/docs/reference/rm_local_packages.html index db5ee0a..ed0de13 100644 --- a/docs/reference/rm_local_packages.html +++ b/docs/reference/rm_local_packages.html @@ -17,7 +17,7 @@ NPSutils - 0.3.1 + 0.3.2 diff --git a/docs/reference/validate_data_package.html b/docs/reference/validate_data_package.html index 314bd5b..84fe751 100644 --- a/docs/reference/validate_data_package.html +++ b/docs/reference/validate_data_package.html @@ -17,7 +17,7 @@ NPSutils - 0.3.1 + 0.3.2 diff --git a/docs/sitemap.xml b/docs/sitemap.xml index 0911b3e..80cfae3 100644 --- a/docs/sitemap.xml +++ b/docs/sitemap.xml @@ -22,8 +22,8 @@ /reference/get_unit_info.html /reference/index.html /reference/load_core_metadata.html -/reference/load_data_package.html /reference/load_data_packages.html +/reference/load_data_package_deprecated.html /reference/load_domains.html /reference/load_pkg_metadata.html /reference/map_wkt.html diff --git a/man/load_data_package.Rd b/man/load_data_package_deprecated.Rd similarity index 59% rename from man/load_data_package.Rd rename to man/load_data_package_deprecated.Rd index ff5ba7b..55fefad 100644 --- a/man/load_data_package.Rd +++ b/man/load_data_package_deprecated.Rd @@ -1,10 +1,10 @@ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/load_data_package.R -\name{load_data_package} -\alias{load_data_package} +\name{load_data_package_deprecated} +\alias{load_data_package_deprecated} \title{Read contents of data package and constructs a list of tibbles based on the data file(s)} \usage{ -load_data_package(reference_id) +load_data_package_deprecated(reference_id) } \arguments{ \item{reference_id}{is a 6-7 digit number corresponding to the reference ID of the data package.} @@ -13,7 
+13,10 @@ load_data_package(reference_id) a list of one or more tibbles contained within the data package to the global environment. } \description{ -\code{load_data_package} reads the data file(s) from a package and loads it into a list of tibbles. Current implementation only supports .csv data files. +`load_data_package_deprecated()` reads the data file(s) from a package and loads it into a list of tibbles. Current implementation only supports .csv data files. +} +\details{ +`r lifecycle::badge("deprecated")` } \examples{ \dontrun{ diff --git a/man/load_data_packages.Rd b/man/load_data_packages.Rd index 5e2e58e..6fd978c 100644 --- a/man/load_data_packages.Rd +++ b/man/load_data_packages.Rd @@ -2,34 +2,40 @@ % Please edit documentation in R/load_data_packages.R \name{load_data_packages} \alias{load_data_packages} -\title{Read contents of data package(s) and return a tibble with a tibble for each data file.} +\alias{load_data_package} +\title{Read contents of data package(s) and return a list of tibbles list of tibbles based on the data file(s). Can use metadata to specify data types.} \usage{ load_data_packages( reference_id, - directory = here::here(), + directory = here::here("data"), + assign_attributes = FALSE, + simplify = TRUE +) + +load_data_package( + reference_id, + directory = here::here("data"), assign_attributes = FALSE, simplify = TRUE ) } \arguments{ -\item{reference_id}{is a list of 6-7 digit numbers corresponding to the DataStore reference ID of the datapackage(s) to load. Alternatively, you can set `reference_id` to "load_all", which will load all the data packages in your /data folder.} +\item{reference_id}{the immediate directory/directories where your data packages reside. For data packages downloaded from DataStore using `get_data_package()` or `get_data_packages()` default settings, this is the DataStore reference ID for your data package(s). 
Alternatively, you can set `reference_id` to "`load_all`", which will load all the data packages in the directory specified in via `directory` (typically ./data).} -\item{directory}{is the location of a folder, 'data' (created during `get_data_packages()`) which contains sub-directories where each sub-directory is the DataStore referenceId of the data package. Again, this file structure is all set up using `get_data_packages()`. Defaults to the current working directory (which is the default location for `get_data_packages()`).} +\item{directory}{is the location of a folder that contains all of the data packages (where data packages are a folder containing .csv data files and a single .xml EML metadata file). If these data packages were downloaded from DataStore using the default settings for `get_data_packages`, this folder is "./data" and you can use the default settings for `directory`.} -\item{assign_attributes}{Logical. Defaults to FALSE. Data will be loaded using `readr::read_csv()` guessing algorithm for calling column types. If set to TRUE, column types will be set using metadata attributes via the yet-to-be written `load_metadata()` function. `r lifecycle::badge('experimental')`} +\item{assign_attributes}{Logical. Defaults to FALSE. Data will be loaded using `readr::read_csv()` guessing algorithm for calling column types. If you set to `assign_attributes = TRUE`, column types will be set using the data types specified in the metadata. Currently supported data types include string, dateTime, float, double, integer, and categorical (factor in R). This assignment is very stringent: for instance if you did not specify date-time formats using ISO-8601 notation (i.e. "YYYY", not "yyyy"), your data will import as NAs. If you have undefined missing values or blank cells, your data will not import at all. If you run into problems consider using the default settings and letting `read_csv` guess the column types.} -\item{simplify}{Logical. Defaults to TRUE. 
If there is only a single data package loaded, the function will return a simple list of tibbles (where each tibble reflects a data file from within the data package). If set to FALSE, the function will return a list that contains a list of tibbles. This structure mirrors the object structure returned if multiple data packages are simultaneously loaded (a list of data packages with each data package containing a list of tibbles where each tibble corresponds to a data file in the given data package).} +\item{simplify}{Logical. Defaults to TRUE. If `simplify = TRUE`, the function will return a list of tibbles where each tibble is a data file from the data package(s) specified. The tibbles are named using the following format: "pkg_