PDH SDD GitHub's template

A general template for SDD/PDH projects, incorporating some good practices for GitHub-based development.

Usage

To use this template, create a new repository from it: click the green "Use this template" button in the top right corner of this page and follow the instructions. This creates a new repository with the same structure as this one. Then clone the new repository to your local machine and start working on your project.

Current status

The code is functional. src/script.R contains a usage example that compares the staging and production versions of a variety of tables.

Known limitations

The current version compares tables across different instances of .Stat (e.g., the base and new data versions are reached through different .Stat URLs) rather than across different spaces (i.e., validate and disseminate). Comparing across spaces should be achievable by changing the agency field.
The current version performs |dataflows| * |indicators| * |geographies| API calls, which is a lot if you are trying to compare many big, dense dataflows. The number can be reduced by performing the groupings at a second stage (eventually, it can be brought down to |dataflows| API calls).
Changes in the DSD schema are not handled, and I suspect they will not be handled gracefully if the dimensions differ between the base and new data versions.
It would be nice to offer the option of generating the .pdf or .md versions of the diff tables directly. This should be possible thanks to {kableExtra}, but Windows is not playing nicely.
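
To make the call-count limitation above concrete, here is a minimal sketch of the arithmetic. It is written in Python purely for illustration (the project's own code is in R). All dataflow, indicator, and geography names are hypothetical, and it assumes for simplicity that every dataflow has the same indicators and geographies, which will not hold in general.

```python
from itertools import product

# Hypothetical identifiers, used only to illustrate the counting.
dataflows = ["DF_A", "DF_B", "DF_C"]
indicators = [f"IND_{i}" for i in range(10)]
geographies = [f"GEO_{i}" for i in range(20)]


def calls_per_combination(dataflows, indicators, geographies):
    """Count one API call per (dataflow, indicator, geography) triple."""
    return sum(1 for _ in product(dataflows, indicators, geographies))


def calls_per_dataflow(dataflows):
    """Count one API call per dataflow, with grouping done locally afterwards."""
    return len(dataflows)


current = calls_per_combination(dataflows, indicators, geographies)
improved = calls_per_dataflow(dataflows)

print(current)   # 600
print(improved)  # 3
```

Under these assumptions, fetching each dataflow once and grouping by indicator and geography locally would replace 600 requests with 3, at the cost of larger individual responses.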

Folder structure

There are four main folders in this repository:

docs: Contains the documentation of the project.
src: Contains the source code of the project.
raw_data: Contains temporary local copies of the raw data used in the project. This folder won't be uploaded to the repository.
output: Contains the temporary output files generated by the project (png, pdfs, small data units). This folder won't be uploaded to the repository.

gitignore

The .gitignore file is configured to ignore the most common temporary development files for Python, R, and Stata. It also ignores most file formats in the /temp/ subdirectories.

PacificCommunity/dotstat-compare-tables
