diff --git a/.Rbuildignore b/.Rbuildignore new file mode 100644 index 0000000..53d2e54 --- /dev/null +++ b/.Rbuildignore @@ -0,0 +1,4 @@ +^cepiigravity\.Rproj$ +^\.Rproj\.user$ +^LICENSE\.md$ +^dev$ diff --git a/.Rhistory b/.Rhistory new file mode 100644 index 0000000..0aaa824 --- /dev/null +++ b/.Rhistory @@ -0,0 +1,11 @@ +install() +install() +cepiigeodist::dist_cepii +install() +install() +check() +check() +library(cepiigravity) +check() +use_readme_md() +use_news_md() diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..5bd2208 --- /dev/null +++ b/.gitignore @@ -0,0 +1,2 @@ +.Rproj.user +dev/*.dta diff --git a/DESCRIPTION b/DESCRIPTION new file mode 100644 index 0000000..3580157 --- /dev/null +++ b/DESCRIPTION @@ -0,0 +1,29 @@ +Package: cepiigravity +Title: The CEPII Gravity Database +Version: 0.0.0.9000 +Authors@R: + c( + person("Mauricio", "Vargas Sepulveda", , "hello+r@pacha.dev", + role = c("aut", "cre"), + comment = c(ORCID = "0000-0003-1017-7574")), + person(family = "Centre d'études prospectives et d'informations + internationales (CEPII)", + role = "dtc") + ) +Description: The Gravity database aims at gathering in a single place a set of + variables that could be useful to researchers or practitioners willing to + understand the determinants of international trade. Each observation + corresponds to a combination of exporter-importer-year (i.e. + origin-destination-year), for which we provide trade flows, as well as + geographic, cultural, trade facilitation and macroeconomic variables. 
+License: CC0 +Encoding: UTF-8 +Roxygen: list(markdown = TRUE) +RoxygenNote: 7.2.1 +Depends: + R (>= 2.10) +LazyData: false +URL: https://pacha.dev/cepiigravity +BugReports: https://github.com/pachamaltese/cepiigravity/issues +Suggests: + gravity diff --git a/LICENSE.md b/LICENSE.md new file mode 100644 index 0000000..139c68e --- /dev/null +++ b/LICENSE.md @@ -0,0 +1,43 @@ +## creative commons + +# CC0 1.0 Universal + +CREATIVE COMMONS CORPORATION IS NOT A LAW FIRM AND DOES NOT PROVIDE LEGAL SERVICES. DISTRIBUTION OF THIS DOCUMENT DOES NOT CREATE AN ATTORNEY-CLIENT RELATIONSHIP. CREATIVE COMMONS PROVIDES THIS INFORMATION ON AN "AS-IS" BASIS. CREATIVE COMMONS MAKES NO WARRANTIES REGARDING THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED HEREUNDER, AND DISCLAIMS LIABILITY FOR DAMAGES RESULTING FROM THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED HEREUNDER. + +### Statement of Purpose + +The laws of most jurisdictions throughout the world automatically confer exclusive Copyright and Related Rights (defined below) upon the creator and subsequent owner(s) (each and all, an "owner") of an original work of authorship and/or a database (each, a "Work"). + +Certain owners wish to permanently relinquish those rights to a Work for the purpose of contributing to a commons of creative, cultural and scientific works ("Commons") that the public can reliably and without fear of later claims of infringement build upon, modify, incorporate in other works, reuse and redistribute as freely as possible in any form whatsoever and for any purposes, including without limitation commercial purposes. These owners may contribute to the Commons to promote the ideal of a free culture and the further production of creative, cultural and scientific works, or to gain reputation or greater distribution for their Work in part through the use and efforts of others. 
+ +For these and/or other purposes and motivations, and without any expectation of additional consideration or compensation, the person associating CC0 with a Work (the "Affirmer"), to the extent that he or she is an owner of Copyright and Related Rights in the Work, voluntarily elects to apply CC0 to the Work and publicly distribute the Work under its terms, with knowledge of his or her Copyright and Related Rights in the Work and the meaning and intended legal effect of CC0 on those rights. + +1. __Copyright and Related Rights.__ A Work made available under CC0 may be protected by copyright and related or neighboring rights ("Copyright and Related Rights"). Copyright and Related Rights include, but are not limited to, the following: + + i. the right to reproduce, adapt, distribute, perform, display, communicate, and translate a Work; + + ii. moral rights retained by the original author(s) and/or performer(s); + + iii. publicity and privacy rights pertaining to a person's image or likeness depicted in a Work; + + iv. rights protecting against unfair competition in regards to a Work, subject to the limitations in paragraph 4(a), below; + + v. rights protecting the extraction, dissemination, use and reuse of data in a Work; + + vi. database rights (such as those arising under Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection of databases, and under any national implementation thereof, including any amended or successor version of such directive); and + + vii. other similar, equivalent or corresponding rights throughout the world based on applicable law or treaty, and any national implementations thereof. + +2. 
__Waiver.__ To the greatest extent permitted by, but not in contravention of, applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and unconditionally waives, abandons, and surrenders all of Affirmer's Copyright and Related Rights and associated claims and causes of action, whether now known or unknown (including existing as well as future claims and causes of action), in the Work (i) in all territories worldwide, (ii) for the maximum duration provided by applicable law or treaty (including future time extensions), (iii) in any current or future medium and for any number of copies, and (iv) for any purpose whatsoever, including without limitation commercial, advertising or promotional purposes (the "Waiver"). Affirmer makes the Waiver for the benefit of each member of the public at large and to the detriment of Affirmer's heirs and successors, fully intending that such Waiver shall not be subject to revocation, rescission, cancellation, termination, or any other legal or equitable action to disrupt the quiet enjoyment of the Work by the public as contemplated by Affirmer's express Statement of Purpose. + +3. __Public License Fallback.__ Should any part of the Waiver for any reason be judged legally invalid or ineffective under applicable law, then the Waiver shall be preserved to the maximum extent permitted taking into account Affirmer's express Statement of Purpose. 
In addition, to the extent the Waiver is so judged Affirmer hereby grants to each affected person a royalty-free, non transferable, non sublicensable, non exclusive, irrevocable and unconditional license to exercise Affirmer's Copyright and Related Rights in the Work (i) in all territories worldwide, (ii) for the maximum duration provided by applicable law or treaty (including future time extensions), (iii) in any current or future medium and for any number of copies, and (iv) for any purpose whatsoever, including without limitation commercial, advertising or promotional purposes (the "License"). The License shall be deemed effective as of the date CC0 was applied by Affirmer to the Work. Should any part of the License for any reason be judged legally invalid or ineffective under applicable law, such partial invalidity or ineffectiveness shall not invalidate the remainder of the License, and in such case Affirmer hereby affirms that he or she will not (i) exercise any of his or her remaining Copyright and Related Rights in the Work or (ii) assert any associated claims and causes of action with respect to the Work, in either case contrary to Affirmer's express Statement of Purpose. + +4. __Limitations and Disclaimers.__ + + a. No trademark or patent rights held by Affirmer are waived, abandoned, surrendered, licensed or otherwise affected by this document. + + b. Affirmer offers the Work as-is and makes no representations or warranties of any kind concerning the Work, express, implied, statutory or otherwise, including without limitation warranties of title, merchantability, fitness for a particular purpose, non infringement, or the absence of latent or other defects, accuracy, or the present or absence of errors, whether or not discoverable, all to the greatest extent permissible under applicable law. + + c. 
Affirmer disclaims responsibility for clearing rights of other persons that may apply to the Work or any use thereof, including without limitation any person's Copyright and Related Rights in the Work. Further, Affirmer disclaims responsibility for obtaining any necessary consents, permissions or other rights required for any use of the Work. + + d. Affirmer understands and acknowledges that Creative Commons is not a party to this document and has no duty or obligation with respect to this CC0 or use of the Work. diff --git a/NAMESPACE b/NAMESPACE new file mode 100644 index 0000000..6ae9268 --- /dev/null +++ b/NAMESPACE @@ -0,0 +1,2 @@ +# Generated by roxygen2: do not edit by hand + diff --git a/NEWS.md b/NEWS.md new file mode 100644 index 0000000..9205129 --- /dev/null +++ b/NEWS.md @@ -0,0 +1,3 @@ +# cepiigravity 0.0.0.9000 + +* Added a `NEWS.md` file to track changes to the package. diff --git a/R/cepiigravity-package.R b/R/cepiigravity-package.R new file mode 100644 index 0000000..e4b8ad3 --- /dev/null +++ b/R/cepiigravity-package.R @@ -0,0 +1,167 @@ +#' @keywords internal +"_PACKAGE" + +#' @title The Countries dataset: Static country-level information +#' @name countries +#' @docType data +#' @author CEPII, adapted from the World Bank and other sources +#' @format A data frame with 257 rows and 8 columns: +#' |variable |description | +#' |:--------------------|:-----------------------------------| +#' |iso3 |ISO3 alphabetic | +#' |iso3num |ISO3 numeric | +#' |country |Country name | +#' |countrylong |Country official name | +#' |first_year |First year of territorial existence | +#' |last_year |Last year of territorial existence | +#' |countrygroup_iso3 |Country group (ISO3 alphabetic) | +#' |countrygroup_iso3num |Country group (ISO3 numeric) | +#' @description Countries is the dataset that includes static country-level +#' variables, allowing for a full identification of each country included in +#' Gravity and, if relevant, for a tracking of its territorial 
changes (splits +#' and merges). Some of the variables provided in Countries are also included +#' in the main Gravity dataset. +#' Countries includes one observation for each territorial configuration, +#' mapping the full set of territorial changes that are accounted for in +#' Gravity. For example, Countries includes one observation for West Germany, +#' one for East Germany and one for the unified Germany. Similarly, it includes +#' one observation for Sudan before the split of South Sudan, one observation +#' for South Sudan, and one observation for Sudan after the split of South +#' Sudan. +#' @details There are differences with respect to the original Stata version. +#' ISO3 alphabetic codes of length zero were converted to NAs and the +#' attributes (i.e., column descriptions), when missing, were added after +#' reading the original documentation. +#' The universe of Countries (and of the Gravity dataset) is based on +#' CEPII's GeoDist dataset (Mayer and Zignago 2011). This dataset is augmented +#' with some countries and territories that either appear in the World Bank's +#' World Integrated Trade Solution (WITS) or that are necessary to construct +#' the full chain of territorial changes that have led to the creation of +#' countries appearing in the GeoDist dataset. In addition, some names are +#' updated, as well as ISO3 alphabetic and numeric codes, by comparing the GeoDist +#' dataset with the WITS dataset and with the official source for ISO country +#' codes. Countries' official names also come from the WITS dataset, augmented +#' by Wikipedia for countries or territories that are not present in the WITS +#' dataset but that appear in GeoDist. +#' Countries (and the Gravity dataset) carefully tracks territorial changes, +#' i.e. the country's previous membership (in case of a split) and the +#' country's new membership (in case of a unification of two territories).
We +#' only take into account the modifications that occurred over the time span +#' of the database, i.e. 1948-2019. This is done using the CIA World Factbook +#' and Wikipedia. +#' @keywords data +NULL + +#' @title The Gravity dataset +#' @name gravity +#' @docType data +#' @author CEPII, adapted from the World Bank and other sources +#' @format A data frame with 4,428,288 rows and 79 columns: +#' |variable |description | +#' |:----------------------|:--------------------------------------------------------------------------------| +#' |year |Year | +#' |iso3_o |Origin ISO3 alphabetic | +#' |iso3_d |Destination ISO3 alphabetic | +#' |iso3num_o |Origin ISO3 numeric | +#' |iso3num_d |Destination ISO3 numeric | +#' |country_exists_o |1 = Origin country exists | +#' |country_exists_d |1 = Destination country exists | +#' |gmt_offset_2020_o |Origin GMT offset (hours) | +#' |gmt_offset_2020_d |Destination GMT offset (hours) | +#' |contig |1 = Contiguity | +#' |dist |Distance between most populated cities, in km | +#' |distw |Population-weighted distance between most populated cities, in km | +#' |distcap |Distance between capitals, in km | +#' |distwces |Population-weighted distance between most populated cities, in km, using CES for | +#' |dist_source |Distance source | +#' |comlang_off |1 = Common official or primary language | +#' |comlang_ethno |1 = Language is spoken by at least 9% of the population | +#' |comcol |1 = Common colonizer post 1945 | +#' |comrelig |Common religion index | +#' |col45 |1 = Pair in colonial relationship post 1945 | +#' |legal_old_o |Origin legal system before transition | +#' |legal_old_d |Destination legal system before transition | +#' |legal_new_o |Origin legal system after transition | +#' |legal_new_d |Destination legal system after transition | +#' |comleg_pretrans |1 = Common legal origins before transition | +#' |comleg_posttrans |1 = Common legal origins after transition | +#' |transition_legalchange |1 = Common legal origin
changed since transition | +#' |heg_o |1 = Origin is current or former hegemon of destination | +#' |heg_d |1 = Destination is current or former hegemon of origin | +#' |col_dep_ever |1 = Pair ever in colonial or dependency relationship | +#' |col_dep |1 = Pair currently in colonial or dependency relationship | +#' |col_dep_end_year |Independence date, if col_dep = 1 | +#' |col_dep_end_conflict |1 = Independence involved conflict, if col_dep_ever = 1 | +#' |empire |Hegemon if sibling = 1 and year < sever_year | +#' |sibling_ever |1 = Pair ever in sibling relationship | +#' |sibling |1 = Pair currently in sibling relationship | +#' |sever_year |Severance year for pairs if sibling == 1 | +#' |sib_conflict |1 = Pair ever in sibling relationship and conflict with hegemon | +#' |pop_o |Origin Population, total in thousands | +#' |pop_d |Destination Population, total in thousands | +#' |gdp_o |Origin GDP (current thousands US$) | +#' |gdp_d |Destination GDP (current thousands US$) | +#' |gdpcap_o |Origin GDP per cap (current thousands US$) | +#' |gdpcap_d |Destination GDP per cap (current thousands US$) | +#' |pop_source_o |Origin Population source | +#' |pop_source_d |Destination Population source | +#' |gdp_source_o |Origin GDP source | +#' |gdp_source_d |Destination GDP source | +#' |gdp_ppp_o |Origin GDP, PPP (current thousands international $) | +#' |gdp_ppp_d |Destination GDP, PPP (current thousands international $) | +#' |gdpcap_ppp_o |Origin GDP per cap, PPP (current thousands international $) | +#' |gdpcap_ppp_d |Destination GDP per cap, PPP (current thousands international $) | +#' |pop_pwt_o |Origin Population, total in thousands (PWT) | +#' |pop_pwt_d |Destination Population, total in thousands (PWT) | +#' |gdp_ppp_pwt_o |Origin GDP, current PPP (2011 thousands US$) (PWT) | +#' |gdp_ppp_pwt_d |Destination GDP, current PPP (2011 thousands US$) (PWT) | +#' |gatt_o |Origin GATT membership | +#' |gatt_d |Destination GATT membership | +#' |wto_o |Origin WTO 
membership | +#' |wto_d |Destination WTO membership | +#' |eu_o |1 = Origin is a EU member | +#' |eu_d |1 = Destination is a EU member | +#' |rta |1 = RTA (source: WTO) | +#' |rta_coverage |Coverage of RTA (source: WTO) | +#' |rta_type |Type of RTA (source: WTO) | +#' |entry_cost_o |Origin Cost of business start-up procedures (% of GNI per capita) | +#' |entry_cost_d |Destination Cost of business start-up procedures (% of GNI per capita) | +#' |entry_proc_o |Origin Start-up procedures to register a business (number) | +#' |entry_proc_d |Destination Start-up procedures to register a business (number) | +#' |entry_time_o |Origin Time required to start a business (days) | +#' |entry_time_d |Destination Time required to start a business (days) | +#' |entry_tp_o |Origin Days + procedures to start a business | +#' |entry_tp_d |Destination Days + procedures to start a business | +#' |tradeflow_comtrade_o |Trade flows as reported by the origin, 1000 Current USD (source: UNSD) | +#' |tradeflow_comtrade_d |Trade flows as reported by the destination, 1000 Current USD (source: UNSD) | +#' |tradeflow_baci |Trade flow, 1000 USD (source: BACI) | +#' |manuf_tradeflow_baci |Trade flow of manufactured goods, 1000 USD (source: BACI) | +#' |tradeflow_imf_o |Trade flows as reported by the origin, 1000 Current USD (source: IMF) | +#' |tradeflow_imf_d |Trade flows as reported by the destination, 1000 Current USD (source: IMF) | +#' @description In Gravity, each observation is uniquely identified by the +#' combination of the country_id of the origin country, the country_id of the +#' destination country and the year. Gravity is “squared”, meaning that each +#' country pair appears every year, even if one of the countries actually does +#' not exist. However, based on the territorial changes tracked in the +#' Countries dataset, we set to missing all variables for country pairs in +#' which at least one of the countries does not exist in a given year. 
+#' Furthermore, we provide two dummy variables indicating whether the origin +#' and the destination countries exist. These dummies allow users wishing to drop +#' non-existing country pairs from the dataset to do so easily. Users looking +#' for a more detailed account of country existence should turn to the +#' Countries dataset. +#' A few caveats on the identification of countries through country_id must be +#' noted. Firstly, when countries merge, it is the new country or territorial +#' configuration that exists during the transition year but not the old country or +#' territorial configuration. As an example DEU.1 (West Germany) has 1989 as +#' last year, not 1990, while DEU.2 (the unified Germany) has 1990 as first +#' year. This is consistent with the construction of underlying variables that +#' vary over time, such as GDP, population and trade. Secondly, since the +#' dataset is square in terms of country_id, there exist cases in which two +#' configurations of the same alphabetic ISO3 code appear bilaterally, e.g. +#' DEU.1 and DEU.2. While DEU.1 and DEU.2 never existed simultaneously, we +#' still keep these null observations to ensure that the final dataset is +#' square. +#' @details The details are the same as for the Countries dataset. +#' @keywords data +NULL diff --git a/README.md b/README.md new file mode 100644 index 0000000..a331cdd --- /dev/null +++ b/README.md @@ -0,0 +1,23 @@ +# cepiigravity + + + + +The goal of cepiigravity is to provide the data from [Gravity](http://www.cepii.fr/CEPII/en/bdd_modele/bdd_modele_item.asp?id=8) +ready to be used in R (e.g. with the [gravity](https://pacha.dev/gravity) +package). + +The package provides data on countries and distance measures alongside dummy +variables indicating whether two countries are contiguous, share a common +language or a colonial relationship, and others.
+ +`cepiigravity` can be installed by running + +``` +# install.packages("remotes") +remotes::install_github("pachamaltese/cepiigravity") +``` + +The main source to obtain the data in this package is: + +Conte, M., Cotterlaz, P. & Mayer, T. (2021). *The CEPII Gravity Database*. CEPII Working Paper 2022-05. diff --git a/cepiigravity.Rproj b/cepiigravity.Rproj new file mode 100644 index 0000000..69fafd4 --- /dev/null +++ b/cepiigravity.Rproj @@ -0,0 +1,22 @@ +Version: 1.0 + +RestoreWorkspace: No +SaveWorkspace: No +AlwaysSaveHistory: Default + +EnableCodeIndexing: Yes +UseSpacesForTab: Yes +NumSpacesForTab: 2 +Encoding: UTF-8 + +RnwWeave: Sweave +LaTeX: pdfLaTeX + +AutoAppendNewline: Yes +StripTrailingWhitespace: Yes +LineEndingConversion: Posix + +BuildType: Package +PackageUseDevtools: Yes +PackageInstallArgs: --no-multiarch --with-keep.source +PackageRoxygenize: rd,collate,namespace diff --git a/data/countries.rda b/data/countries.rda new file mode 100644 index 0000000..660cd9d Binary files /dev/null and b/data/countries.rda differ diff --git a/data/gravity.rda b/data/gravity.rda new file mode 100644 index 0000000..e72f337 Binary files /dev/null and b/data/gravity.rda differ diff --git a/dev/Gravity_documentation.pdf b/dev/Gravity_documentation.pdf new file mode 100644 index 0000000..be5310b Binary files /dev/null and b/dev/Gravity_documentation.pdf differ diff --git a/dev/get-data.R b/dev/get-data.R new file mode 100644 index 0000000..967807a --- /dev/null +++ b/dev/get-data.R @@ -0,0 +1,98 @@ +library(purrr) +library(haven) +library(janitor) +library(dplyr) + +# download ---- + +# http://www.cepii.fr/DATA_DOWNLOAD/gravity/legacy/202102/Gravity_rds_V202102.zip +# doesn't work :( +url_data <- "http://www.cepii.fr/DATA_DOWNLOAD/gravity/legacy/202102/Gravity_dta_V202102.zip" +url_docs <- "http://www.cepii.fr/DATA_DOWNLOAD/gravity/legacy/202102/Gravity_documentation.pdf" + +zip_data <- gsub(".*/", "dev/", url_data) +pdf_docs <- gsub(".*/", "dev/", url_docs) + +map2( +
c(url_data, url_docs), + c(zip_data, pdf_docs), + function(x, y) { + if (!file.exists(y)) { + try( + download.file(x, y) + ) + } + } +) + +finp <- list.files("dev", pattern = "\\.dta$", full.names = TRUE) + +if (length(finp) == 0L) { + unzip(zip_data, exdir = "dev") +} + +# tidy ---- + +finp <- list.files("dev", pattern = "\\.dta$", full.names = TRUE) + +countries <- read_stata(finp[1]) %>% + clean_names() %>% + mutate(iso3 = tolower(iso3)) %>% + mutate(iso3 = ifelse(nchar(iso3) == 0L, NA, iso3)) + +unique(nchar(countries$iso3)) + +gravity <- read_stata(finp[2]) %>% + clean_names() %>% + mutate( + iso3_o = tolower(iso3_o), + iso3_d = tolower(iso3_d) + ) %>% + mutate( + iso3_o = ifelse(nchar(iso3_o) == 0L, NA, iso3_o), + iso3_d = ifelse(nchar(iso3_d) == 0L, NA, iso3_d) + ) + +unique(nchar(gravity$iso3_o)) +unique(nchar(gravity$iso3_d)) + +# descriptions ----- + +attr(countries[[1]], "label") <- "ISO3 alphabetic" + +countries_desc <- tibble( + variable = colnames(countries), + description = map_chr( + seq_along(colnames(countries)), + function(x) { + y <- attr(countries[[x]], "label") + if (is.null(y)) y <- NA_character_ + return(y) + } + ) +) + +attr(gravity[[2]], "label") <- "Origin ISO3 alphabetic" +attr(gravity[[3]], "label") <- "Destination ISO3 alphabetic" + +gravity_desc <- tibble( + variable = colnames(gravity), + description = map_chr( + seq_along(colnames(gravity)), + function(x) { + y <- attr(gravity[[x]], "label") + if (is.null(y)) y <- NA_character_ + return(y) + } + ) +) + +knitr::kable(countries_desc) + +knitr::kable(gravity_desc) + +# export ---- + +usethis::use_data(countries, compress = "xz", overwrite = TRUE) + +usethis::use_data(gravity, compress = "xz", overwrite = TRUE) diff --git a/man/cepiigravity-package.Rd b/man/cepiigravity-package.Rd new file mode 100644 index 0000000..3ae3801 --- /dev/null +++ b/man/cepiigravity-package.Rd @@ -0,0 +1,28 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/cepiigravity-package.R +\docType{package}
+\name{cepiigravity-package} +\alias{cepiigravity} +\alias{cepiigravity-package} +\title{cepiigravity: The CEPII Gravity Database} +\description{ +The Gravity database aims at gathering in a single place a set of variables that could be useful to researchers or practitioners willing to understand the determinants of international trade. Each observation corresponds to a combination of exporter-importer-year (i.e. origin-destination-year), for which we provide trade flows, as well as geographic, cultural, trade facilitation and macroeconomic variables. +} +\seealso{ +Useful links: +\itemize{ + \item \url{https://pacha.dev/cepiigravity} + \item Report bugs at \url{https://github.com/pachamaltese/cepiigravity/issues} +} + +} +\author{ +\strong{Maintainer}: Mauricio Vargas Sepulveda \email{hello+r@pacha.dev} (\href{https://orcid.org/0000-0003-1017-7574}{ORCID}) + +Other contributors: +\itemize{ + \item Centre d'études prospectives et d'informations internationales (CEPII) [data contributor] +} + +} +\keyword{internal} diff --git a/man/countries.Rd b/man/countries.Rd new file mode 100644 index 0000000..366ac95 --- /dev/null +++ b/man/countries.Rd @@ -0,0 +1,60 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/cepiigravity-package.R +\docType{data} +\name{countries} +\alias{countries} +\title{The Countries dataset: Static country-level information} +\format{ +A data frame with 257 rows and 8 columns:\tabular{ll}{ + variable \tab description \cr + iso3 \tab ISO3 alphabetic \cr + iso3num \tab ISO3 numeric \cr + country \tab Country name \cr + countrylong \tab Country official name \cr + first_year \tab First year of territorial existence \cr + last_year \tab Last year of territorial existence \cr + countrygroup_iso3 \tab Country group (ISO3 alphabetic) \cr + countrygroup_iso3num \tab Country group (ISO3 numeric) \cr +} +} +\description{ +Countries is the dataset that includes static country-level +variables, allowing for a full 
identification of each country included in +Gravity and, if relevant, for a tracking of its territorial changes (splits +and merges). Some of the variables provided in Countries are also included +in the main Gravity dataset. +Countries includes one observation for each territorial configuration, +mapping the full set of territorial changes that are accounted for in +Gravity. For example, Countries includes one observation for West Germany, +one for East Germany and one for the unified Germany. Similarly, it includes +one observation for Sudan before the split of South Sudan, one observation +for South Sudan, and one observation for Sudan after the split of South +Sudan. +} +\details{ +There are differences with respect to the original Stata version. +ISO3 alphabetic codes of length zero were converted to NAs and the +attributes (i.e., column descriptions), when missing, were added after +reading the original documentation. +The universe of Countries (and of the Gravity dataset) is based on +CEPII's GeoDist dataset (Mayer and Zignago 2011). This dataset is augmented +with some countries and territories that either appear in the World Bank's +World Integrated Trade Solution (WITS) or that are necessary to construct +the full chain of territorial changes that have led to the creation of +countries appearing in the GeoDist dataset. In addition, some names are +updated, as well as ISO3 alphabetic and numeric codes, by comparing the GeoDist +dataset with the WITS dataset and with the official source for ISO country +codes. Countries' official names also come from the WITS dataset, augmented +by Wikipedia for countries or territories that are not present in the WITS +dataset but that appear in GeoDist. +Countries (and the Gravity dataset) carefully tracks territorial changes, +i.e. the country's previous membership (in case of a split) and the +country's new membership (in case of a unification of two territories).
We +only take into account the modifications that occurred over the time span +of the database, i.e. 1948-2019. This is done using the CIA World Factbook +and Wikipedia. +} +\author{ +CEPII, adapted from the World Bank and other sources +} +\keyword{data} diff --git a/man/gravity.Rd b/man/gravity.Rd new file mode 100644 index 0000000..2e6eb41 --- /dev/null +++ b/man/gravity.Rd @@ -0,0 +1,123 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/cepiigravity-package.R +\docType{data} +\name{gravity} +\alias{gravity} +\title{The Gravity dataset} +\format{ +A data frame with 4,428,288 rows and 79 columns:\tabular{ll}{ + variable \tab description \cr + year \tab Year \cr + iso3_o \tab Origin ISO3 alphabetic \cr + iso3_d \tab Destination ISO3 alphabetic \cr + iso3num_o \tab Origin ISO3 numeric \cr + iso3num_d \tab Destination ISO3 numeric \cr + country_exists_o \tab 1 = Origin country exists \cr + country_exists_d \tab 1 = Destination country exists \cr + gmt_offset_2020_o \tab Origin GMT offset (hours) \cr + gmt_offset_2020_d \tab Destination GMT offset (hours) \cr + contig \tab 1 = Contiguity \cr + dist \tab Distance between most populated cities, in km \cr + distw \tab Population-weighted distance between most populated cities, in km \cr + distcap \tab Distance between capitals, in km \cr + distwces \tab Population-weighted distance between most populated cities, in km, using CES for \cr + dist_source \tab Distance source \cr + comlang_off \tab 1 = Common official or primary language \cr + comlang_ethno \tab 1 = Language is spoken by at least 9\% of the population \cr + comcol \tab 1 = Common colonizer post 1945 \cr + comrelig \tab Common religion index \cr + col45 \tab 1 = Pair in colonial relationship post 1945 \cr + legal_old_o \tab Origin legal system before transition \cr + legal_old_d \tab Destination legal system before transition \cr + legal_new_o \tab Origin legal system after transition \cr + legal_new_d \tab Destination legal
system after transition \cr + comleg_pretrans \tab 1 = Common legal origins before transition \cr + comleg_posttrans \tab 1 = Common legal origins after transition \cr + transition_legalchange \tab 1 = Common legal origin changed since transition \cr + heg_o \tab 1 = Origin is current or former hegemon of destination \cr + heg_d \tab 1 = Destination is current or former hegemon of origin \cr + col_dep_ever \tab 1 = Pair ever in colonial or dependency relationship \cr + col_dep \tab 1 = Pair currently in colonial or dependency relationship \cr + col_dep_end_year \tab Independence date, if col_dep = 1 \cr + col_dep_end_conflict \tab 1 = Independence involved conflict, if col_dep_ever = 1 \cr + empire \tab Hegemon if sibling = 1 and year < sever_year \cr + sibling_ever \tab 1 = Pair ever in sibling relationship \cr + sibling \tab 1 = Pair currently in sibling relationship \cr + sever_year \tab Severance year for pairs if sibling == 1 \cr + sib_conflict \tab 1 = Pair ever in sibling relationship and conflict with hegemon \cr + pop_o \tab Origin Population, total in thousands \cr + pop_d \tab Destination Population, total in thousands \cr + gdp_o \tab Origin GDP (current thousands US$) \cr + gdp_d \tab Destination GDP (current thousands US$) \cr + gdpcap_o \tab Origin GDP per cap (current thousands US$) \cr + gdpcap_d \tab Destination GDP per cap (current thousands US$) \cr + pop_source_o \tab Origin Population source \cr + pop_source_d \tab Destination Population source \cr + gdp_source_o \tab Origin GDP source \cr + gdp_source_d \tab Destination GDP source \cr + gdp_ppp_o \tab Origin GDP, PPP (current thousands international $) \cr + gdp_ppp_d \tab Destination GDP, PPP (current thousands international $) \cr + gdpcap_ppp_o \tab Origin GDP per cap, PPP (current thousands international $) \cr + gdpcap_ppp_d \tab Destination GDP per cap, PPP (current thousands international $) \cr + pop_pwt_o \tab Origin Population, total in thousands (PWT) \cr + pop_pwt_d \tab 
Destination Population, total in thousands (PWT) \cr + gdp_ppp_pwt_o \tab Origin GDP, current PPP (2011 thousands US$) (PWT) \cr + gdp_ppp_pwt_d \tab Destination GDP, current PPP (2011 thousands US$) (PWT) \cr + gatt_o \tab Origin GATT membership \cr + gatt_d \tab Destination GATT membership \cr + wto_o \tab Origin WTO membership \cr + wto_d \tab Destination WTO membership \cr + eu_o \tab 1 = Origin is a EU member \cr + eu_d \tab 1 = Destination is a EU member \cr + rta \tab 1 = RTA (source: WTO) \cr + rta_coverage \tab Coverage of RTA (source: WTO) \cr + rta_type \tab Type of RTA (source: WTO) \cr + entry_cost_o \tab Origin Cost of business start-up procedures (\% of GNI per capita) \cr + entry_cost_d \tab Destination Cost of business start-up procedures (\% of GNI per capita) \cr + entry_proc_o \tab Origin Start-up procedures to register a business (number) \cr + entry_proc_d \tab Destination Start-up procedures to register a business (number) \cr + entry_time_o \tab Origin Time required to start a business (days) \cr + entry_time_d \tab Destination Time required to start a business (days) \cr + entry_tp_o \tab Origin Days + procedures to start a business \cr + entry_tp_d \tab Destination Days + procedures to start a business \cr + tradeflow_comtrade_o \tab Trade flows as reported by the origin, 1000 Current USD (source: UNSD) \cr + tradeflow_comtrade_d \tab Trade flows as reported by the destination, 1000 Current USD (source: UNSD) \cr + tradeflow_baci \tab Trade flow, 1000 USD (source: BACI) \cr + manuf_tradeflow_baci \tab Trade flow of manufactured goods, 1000 USD (source: BACI) \cr + tradeflow_imf_o \tab Trade flows as reported by the origin, 1000 Current USD (source: IMF) \cr + tradeflow_imf_d \tab Trade flows as reported by the destination, 1000 Current USD (source: IMF) \cr +} +} +\description{ +In Gravity, each observation is uniquely identified by the +combination of the country_id of the origin country, the country_id of the +destination country and the 
year. Gravity is “squared”, meaning that each +country pair appears every year, even if one of the countries actually does +not exist. However, based on the territorial changes tracked in the +Countries dataset, we set to missing all variables for country pairs in +which at least one of the countries does not exist in a given year. +Furthermore, we provide two dummy variables indicating whether the origin +and the destination countries exist. These dummies allow users wishing to drop +non-existing country pairs from the dataset to do so easily. Users looking +for a more detailed account of country existence should turn to the +Countries dataset. +A few caveats on the identification of countries through country_id must be +noted. Firstly, when countries merge, it is the new country or territorial +configuration that exists during the transition year but not the old country or +territorial configuration. As an example DEU.1 (West Germany) has 1989 as +last year, not 1990, while DEU.2 (the unified Germany) has 1990 as first +year. This is consistent with the construction of underlying variables that +vary over time, such as GDP, population and trade. Secondly, since the +dataset is square in terms of country_id, there exist cases in which two +configurations of the same alphabetic ISO3 code appear bilaterally, e.g. +DEU.1 and DEU.2. While DEU.1 and DEU.2 never existed simultaneously, we +still keep these null observations to ensure that the final dataset is +square. +} +\details{ +The details are the same as for the Countries dataset. +} +\author{ +CEPII, adapted from the World Bank and other sources +} +\keyword{data}
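Usage note on the `country_exists_o` and `country_exists_d` dummies documented in `man/gravity.Rd`: the sketch below shows one way to drop country pairs in which a country does not exist. It is only an illustration under stated assumptions. `toy_gravity` is a hypothetical stand-in with made-up values that mimics a few `gravity` columns, so the snippet runs without installing the package; with the real data the same filter would presumably be applied to `cepiigravity::gravity`.

```r
# Hypothetical toy stand-in for cepiigravity::gravity (values are made up).
toy_gravity <- data.frame(
  year = c(1989L, 1989L, 1990L, 1990L),
  iso3_o = c("deu", "deu", "deu", "fra"),
  iso3_d = c("fra", "deu", "fra", "deu"),
  country_exists_o = c(1, 1, 1, 1),
  country_exists_d = c(1, 0, 1, NA),
  stringsAsFactors = FALSE
)

# Keep only the pairs in which both countries exist in the given year.
# `%in% 1` returns FALSE for NA, so rows with a missing flag are dropped too.
existing_pairs <- toy_gravity[
  toy_gravity$country_exists_o %in% 1 & toy_gravity$country_exists_d %in% 1,
]

print(existing_pairs)
```

Base R subsetting is used here to avoid extra dependencies; `dplyr::filter()` would work equally well and also drops rows where the condition evaluates to `NA`.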