Skip to content

gnoblet/impactR

Repository files navigation

impactR

Ease IMPACT’s data and database officers woRk

impactR started as a simple project: mainly a reminder of totally-perfectible functions used and made on the go for the Burkina Faso team in 2021. It became broader, aiming now to ease data teams daily R work and to cover most of the research cycle’s tasks.

It is based on three spreadsheets that need to be filled in and coordinated by either assessment officers, data officers or field officers:

  1. To monitor data collection and get a log to fill: a spreadsheet of logical tests based on the questionnaire and the Kobo tool
  2. To clean data: a cleaning log that has been (well-)filled
  3. To analyze data: a data analysis plan

Specs:

  • mainly, it is aimed at data collection with Kobo
  • it extensively uses the tidyverse, and srvyr for survey data analysis
  • since version 0.7.8, it is considered robust enough and has been tested on 3 different
  • it requires R 4.1+ (mostly for the native pipe |>)

Installation

You can install the last version of impactR from GitHub with:

# install.packages("devtools")
devtools::install_github("gnoblet/impactR", build_vignettes = T)

Roadmap

From version 0.6, contributions should go with minimal and complete commits as a good practice. The dev branch will be used from there. Well, in practice, it isn’t much.

Roadmap is as follows:

  • introduce tidy eval wherever it makes sense
  • add (re) count columns post-cleaning for multiple choices columns and simple choice’s other column
  • write more documentation
  • tidy eval to cleaning functions
  • dots not as the last arg, not always at least
  • functions to create a small report of the values that effectively changed or were removed when cleaning thanks to a cleaning log
  • more robust check cleaning log and check check list functions
  • export clean (open)-xlsx files
  • add a grouping arg to make_log_outlier()
  • (ongoing) MSNA analysis tools : roster (education, demography, WGI), weighting functions, analysis functions
  • (ongoing) Split this big mess into several consolidated small packages : a viz one, an analysis one and a cleaning one

Side projects

There will be a Shiny app for cleaning and monitoring (in French for now) whose repo will be collectoR. It is experimental and based on older versions of impactR.

Vignettes

Youpi! some documentation:

In R, use:

vignette("base_de_travail", "impactR")
vignette("main_workflow", "impactR")

Example

These are basics example of daily uses:

# Attach all functions, equivalent to library("impactR")
box::use(impactR[...])

## basic example codes and uses (not run!)

## Import a csv file with clean names and clean types, do guess types on the max number of linse
# import_csv("data.csv")

## Get colnames for sector foodsec whose variables start with "f_"
# tbl_col_start(data, "f_")

## Group split to a named list
# named_group_split(data, admin2)

## Left join many tibbles
# left_joints(tibble_list, id_col)

## Make an outlier log for all numeric variables in the data.frame/tibble
# make_log_outlier(rawdata, survey, id_col = uuid, i_enum_id)

## Make a log based on logical tests, outliers and "other" answers
# make_all_logs(rawdata, 
#               survey, 
#               check_list,
#               other = "other_", 
#               id_col = uuid, 
#               i_enum_id)

## Clean from log
# make_all_logs(rawdata,
#               log,
#               survey, 
#               choices,
#               other = "other_", 
#               id_col = uuid)

## Calculate weigthed proportion for shelter type by group (e.g. administrative areas or population groups)
# svy_prop(design, s_shelter_type, c(admin1, group_pop), na.rm = T, stat_name = "prop", level = 0.95)