This repo contains the functions and run script to generate a dataframe of the costs of installing Air Source Heat Pumps (ASHP) in 8 different types of home at a series of percentiles given by the user. The 8 housing archetypes are: flats; semi-detached & terraced houses and maisonettes; detached houses; and bungalows; with each group split into pre- and post-1950 construction. Costs are adjusted for inflation against a chosen base year.
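As a rough illustration of the idea (not the repo's actual implementation), the sketch below adjusts installation costs to a base year using a price index and then computes cost percentiles per archetype. The function names, archetype labels, index values and costs are all made up for the example, and linear interpolation between ranked values is an assumed choice of percentile method.

```python
from collections import defaultdict


def percentile(values, q):
    """Return the q-th percentile (0-100) of values, using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("cannot take a percentile of an empty list")
    position = (len(ordered) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def adjust_for_inflation(cost, year, base_year, price_index):
    """Express a cost incurred in `year` in `base_year` prices."""
    return cost * price_index[base_year] / price_index[year]


def cost_percentiles(installations, percentiles, base_year, price_index):
    """Group inflation-adjusted costs by archetype and compute the requested percentiles."""
    costs_by_archetype = defaultdict(list)
    for archetype, year, cost in installations:
        adjusted = adjust_for_inflation(cost, year, base_year, price_index)
        costs_by_archetype[archetype].append(adjusted)
    return {
        archetype: {q: percentile(costs, q) for q in percentiles}
        for archetype, costs in costs_by_archetype.items()
    }


# Made-up index values and costs, for illustration only.
toy_price_index = {2021: 100.0, 2022: 110.0, 2023: 120.0}
toy_installations = [
    ("pre_1950_flat", 2021, 8000.0),
    ("pre_1950_flat", 2022, 11000.0),
    ("pre_1950_flat", 2023, 12000.0),
    ("post_1950_bungalow", 2023, 10000.0),
]

result = cost_percentiles(toy_installations, [25, 50, 75], 2023, toy_price_index)
print(result["pre_1950_flat"])
```

The real pipeline works on dataframes and the Consumer Price Index dataset; the plain-Python version here only shows the order of operations (adjust first, then take percentiles within each archetype).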
- Clone the repo
- Meet the data science cookiecutter requirements, in brief:
  - Install `direnv` and `conda`
- Run `make install` to configure the development environment. This will:
  - Set up the conda environment
  - Configure `pre-commit`
- Navigate to your local copy of the repo
- Activate the conda environment with `conda activate asf_heat_pump_affordability`
To recreate the data used in the Q1 2024 cost policy analysis, run the following lines in your terminal:
pip install jupytext
jupytext --to notebook asf_heat_pump_affordability/notebooks/investigate_effect_of_room_number.py
Run all cells of the resulting `investigate_effect_of_room_number.ipynb` notebook. The analytical output dataset will be saved into the `asf-heat-pump-affordability` bucket on S3.
Source datasets:
- MCS-EPC `most_relevant` version featuring data up to and including Q2 2023: `mcs_installations_epc_most_relevant_231009.csv`
- Xoserve off-gas postcode register October 2023
- ONS Postcode Directory (August 2023) (contains 2011 rural-urban classification indicator data)
- Indices of Deprivation 2019: income and employment domains combined for England and Wales
- Scottish Index of Multiple Deprivation 2020v2 - ranks (see further documentation)
- Conversions for rural-urban classification codes to a 2-fold ("rural"/"urban") classification can be found on p19, section 39 in the ONS Postcode Directory (August 2023) User Guide for England and Wales, and in Table 2.3 of Scottish Government Urban Rural Classification 2020 for Scotland.
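A minimal sketch of the 2-fold conversion is shown below. The code lists are an assumption based on the standard 2011 rural-urban classification for England and Wales (A1 to C2 urban, D1 to F2 rural) and the Scottish 8-fold classification (1 to 5 urban, 6 to 8 rural); check them against the two guides cited above before relying on them. The function name and the `country` labels are hypothetical.

```python
# Assumed code lists; verify against the cited ONS and Scottish Government guides.
ENGLAND_WALES_URBAN = {"A1", "B1", "C1", "C2"}
ENGLAND_WALES_RURAL = {"D1", "D2", "E1", "E2", "F1", "F2"}
SCOTLAND_URBAN = {"1", "2", "3", "4", "5"}
SCOTLAND_RURAL = {"6", "7", "8"}


def to_two_fold(code, country):
    """Map a rural-urban classification code to "rural" or "urban".

    `country` is a hypothetical label: "england_wales" or "scotland".
    Returns None when the code is missing or not recognised.
    """
    if code is None:
        return None
    code = str(code).strip().upper()
    if country == "england_wales":
        urban, rural = ENGLAND_WALES_URBAN, ENGLAND_WALES_RURAL
    elif country == "scotland":
        urban, rural = SCOTLAND_URBAN, SCOTLAND_RURAL
    else:
        raise ValueError(f"unknown country label: {country!r}")
    if code in urban:
        return "urban"
    if code in rural:
        return "rural"
    return None


print(to_two_fold("C1", "england_wales"))
print(to_two_fold(7, "scotland"))
```

Returning `None` for unrecognised codes keeps unmatched postcodes visible rather than silently assigning them to one class.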
NB: the URLs used by getter functions to load source data can be found and updated in `asf_heat_pump_affordability/config/base.yaml`.
Key directories and files:
asf_heat_pump_affordability
├───config
│ base.yaml - core variables, including source URLs used by getter functions
│ schema.json - data types for loading flat files
├───getters
│ get_data.py - functions to retrieve data from external sources
├───notebooks
│ identify_and_review_analytical_sample.py - notebook to explore impact of applying exclusion criteria on core MCS-EPC dataset
│ investigate_effect_of_room_number.py - notebook using quantile regression models to estimate installation costs by archetype and room number
├───pipeline
│ archetypes.py - functions to classify housing archetypes
│ generate_cost_percentiles.py - functions to generate cost percentiles by archetype
│ preprocess_cpi.py - functions to preprocess Consumer Price Index dataset
│ preprocess_data.py - functions to preprocess MCS-EPC dataset to produce sample used to calculate cost distributions
│ produce_costs_dataset.py - run script with main() function
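To make the archetype grouping described at the top of this README concrete, here is a hedged sketch of how a property might be assigned to one of the 8 archetypes. It is not the code in `archetypes.py`: the function name, the input values (`property_type`, `built_form`, `construction_year`), the output labels and the treatment of 1950 itself as "post-1950" are all assumptions made for illustration.

```python
# Hypothetical built-form labels treated as semi-detached or terraced.
SEMI_OR_TERRACED_FORMS = {
    "semi-detached",
    "mid-terrace",
    "end-terrace",
}


def classify_archetype(property_type, built_form, construction_year):
    """Return an archetype label such as "pre_1950_flat", or None if unclassifiable."""
    property_type = property_type.strip().lower()
    built_form = built_form.strip().lower()

    if property_type == "flat":
        group = "flat"
    elif property_type == "bungalow":
        group = "bungalow"
    elif property_type == "maisonette":
        # Maisonettes are grouped with semi-detached and terraced houses.
        group = "semi_terraced_house_or_maisonette"
    elif property_type == "house" and built_form == "detached":
        group = "detached_house"
    elif property_type == "house" and built_form in SEMI_OR_TERRACED_FORMS:
        group = "semi_terraced_house_or_maisonette"
    else:
        return None

    # Assumption: a property built in 1950 counts as post-1950.
    era = "pre_1950" if construction_year < 1950 else "post_1950"
    return f"{era}_{group}"


print(classify_archetype("House", "Detached", 1935))
print(classify_archetype("Maisonette", "Mid-Terrace", 1972))
```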
See the technical and working style guidelines.
Project based on Nesta's data science project template.