Commit 5023d6d

GridPath Data Toolkit (#1169)
Pre-release
1 parent bb2ef3c commit 5023d6d

238 files changed

+1073519
-19171
lines changed

.github/workflows/test_gridpath.yml

Lines changed: 3 additions & 0 deletions
@@ -22,6 +22,9 @@ jobs:

     steps:
     - uses: actions/checkout@v3
+    - name: Install sqlite3 3.45.0
+      run: |
+        bash ./.github/workflows/upgrade_sqlite_on_linux.sh
     - name: Set up Python
       uses: actions/setup-python@v3
       with:
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+# required to support UNIXEPOCH
+# installing build: 3.45.0
+wget https://www.sqlite.org/2024/sqlite-autoconf-3450000.tar.gz
+# unzipping build
+tar -xvzf sqlite-autoconf-3450000.tar.gz
+
+# below steps are for installing the build in /usr/local/bin
+cd sqlite-autoconf-3450000 || exit
+./configure
+make
+sudo make install
+
+# remove the previous version
+sudo apt-get remove -y --auto-remove sqlite3
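
The script's first comment says the upgrade is needed for `UNIXEPOCH`. A minimal sketch of how one might confirm that the SQLite library in use actually provides that function, by probing for it rather than comparing version numbers. The helper name `sqlite_supports_unixepoch` is hypothetical and not part of this commit. Note also that Python's `sqlite3` module reports the library Python is linked against, which may differ from the `sqlite3` command-line build installed by the script.

```python
import sqlite3


def sqlite_supports_unixepoch() -> bool:
    """Return True if the SQLite library used by Python provides unixepoch()."""
    conn = sqlite3.connect(":memory:")
    try:
        # Any valid call is enough; the result itself is not inspected.
        conn.execute("SELECT unixepoch('2024-01-15 00:00:00')").fetchone()
        return True
    except sqlite3.OperationalError:
        # Builds without the function raise "no such function: unixepoch".
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    print("SQLite library version:", sqlite3.sqlite_version)
    print("unixepoch() available:", sqlite_supports_unixepoch())
```

Probing keeps the check independent of any assumption about which release first shipped the function.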

.gitignore

Lines changed: 1 addition & 5 deletions
@@ -95,14 +95,10 @@ ENV/
 .idea

 # Don't track example run CSV results
-# Cap-expansion run
 examples/*/results/*.csv
-# Multi-horizon, single-stage prod cost run
 examples/*/*/results/*.csv
-# Multi-horizon, multi-stage prod cost run
 examples/*/*/*/results/*.csv
-# RA iteration runs
-# Weather/hydro/availability iterations + subproblems
+examples/*/*/*/*/results/*.csv
 examples/*/*/*/*/*/results/*.csv

 # Don't track pass-through inputs directory in multi-stage runs

MANIFEST.in

Lines changed: 1 addition & 0 deletions
@@ -1,3 +1,4 @@
 include gridpath/project/operations/operational_types/opchar_param_requirements.csv
 include db/db_schema.sql
 include db/data/*.*
+include gridpath_data_toolkit/raw_data_db_schema.sql

data_toolkit/README.md

Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
+## GridPath Data Toolkit
+
+This is a pre-release of the GridPath Data Toolkit. The Toolkit includes
+previously available functionality from the GridPath RA Data Toolkit that
+generates GridPath input CSV files for use in resource adequacy studies,
+including weather-dependent load profiles as well as wind and solar profiles,
+generator availabilities, and hydro conditions. New functionality takes
+advantage of the public data available in the PUDL database maintained by
+Catalyst Cooperative.
+
+GridPath can currently utilize the following open datasets available from PUDL:
+* **Form EIA-860**: generator-level specific information about existing and
+planned generators
+* **Form EIA-930**: hourly operating data about the high-voltage bulk electric
+power grid in the Lower 48 states collected from the electricity balancing authorities (BAs) that operate the grid
+* **EIA AEO** *Table 54 (Electric Power Projections by Electricity Market
+Module Region)*: fuel price forecasts
+* **GridPath RA Toolkit** variable generation profiles created for the 2026
+Western RA Study: these include hourly wind profiles by WECC BA based on
+assumed 2026 wind buildout for weather years 2007-2014 and hourly solar
+profiles by WECC BA based on assumed 2026 buildout (as of 2021) for weather
+years 1998-2019
+
+## Usage
+### Download data from PUDL
+
+```bash
+gridpath_get_pudl_data
+```
+Downloads data to *./pudl_download* by default.
+This will download the *pudl.sqlite* database as well as the RA Toolkit
+wind and solar profiles Parquet file, and the EIA930 hourly interchange
+data Parquet file. See *--help* menu for options. Note these are relatively
+large files and the download process may take a few minutes depending on
+your internet speed.
+
+### Get subset of raw data for GridPath from downloaded PUDL data
+
+```bash
+gridpath_pudl_to_gridpath_raw
+```
+Gets subset of the downloaded PUDL data and converts it to GridPath raw data format.
+This will create the following files in the user-specified raw data directory:
+* pudl_eia860_generators.csv
+* pudl_eia930_hourly_interchange.csv
+* pudl_eiaaeo_fuel_prices.csv
+* pudl_ra_toolkit_var_profiles.csv
+
+### Get other GridPath RA Toolkit data not yet on PUDL
+
+```bash
+gridpath_get_ra_toolkit_data_raw
+
+```
+Also get the load data and hydro data from the GridPath RA Toolkit dataset.
+Note that this is the same dataset but in a changed format from what is on the
+GridLab RA Toolkit website and is currently stored on Blue Marble's Google Drive.
+* ra_toolkit_load.csv
+* ra_toolkit_hydro.csv
+
+
+### Process the data with the GridPath Data Toolkit
+
+```bash
+gridpath_run_data_toolkit --settings_csv PATH/TO/SETTINGS
+```
+
+See the *Using the GridPath Data Toolkit* section of the GridPath documentation.
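
The README lists four commands in the order they are meant to be run. A rough sketch of chaining them from Python, assuming the four entry points are installed on `PATH` and that running them without extra options is acceptable. The helper names `build_toolkit_commands` and `run_toolkit_steps` are hypothetical, and the settings path is the README's own placeholder. The sketch defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess


def build_toolkit_commands(settings_csv):
    """Return the README's commands, in README order, as argument lists."""
    return [
        ["gridpath_get_pudl_data"],
        ["gridpath_pudl_to_gridpath_raw"],
        ["gridpath_get_ra_toolkit_data_raw"],
        ["gridpath_run_data_toolkit", "--settings_csv", settings_csv],
    ]


def run_toolkit_steps(settings_csv, dry_run=True):
    """Print each command (dry run) or execute it, stopping at the first failure."""
    for cmd in build_toolkit_commands(settings_csv):
        if dry_run:
            print(shlex.join(cmd))
        else:
            # check=True raises CalledProcessError if a step exits non-zero.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_toolkit_steps("PATH/TO/SETTINGS", dry_run=True)
```

Stopping at the first failing step matters here because each later command reads files produced by the earlier ones.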

data_toolkit/__init__.py

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+"""
+The **GridPath Data Toolkit** provides functionality to create GridPath scenario
+inputs from raw data. The user may provide their own data and use the
+Toolkit to convert the data to GridPath CSV input format for use in building a
+GridPath database. The Toolkit also includes functionality to download raw data
+from `PUDL <https://catalyst.coop/pudl/>`__ and from the
+`GridPath RA Toolkit <https://gridlab.org/gridpathratoolkit/>`__.
+"""

data_toolkit/common_methods.py

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+# Copyright 2016-2025 Blue Marble Analytics LLC.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""
+
+"""
+
+import os.path
+
+
+def create_csv_generic(
+    filename,
+    df,
+    overwrite,
+):
+    """ """
+    if not os.path.exists(filename) or overwrite:
+        df.to_csv(
+            filename,
+            mode="w",
+            index=False,
+        )
+    else:
+        raise ValueError(
+            f"The file {filename} already exists and overwrite has not been "
+            "indicated."
+        )
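
`create_csv_generic` has an empty docstring, so its contract is only visible from the body: write when the file is absent or `overwrite` is truthy, otherwise raise `ValueError`. A small sketch exercising that behavior, assuming pandas is installed. The function is copied here (condensed) so the example runs on its own, and the `demo` helper and sample values are made up for illustration.

```python
import os
import tempfile

import pandas as pd


def create_csv_generic(filename, df, overwrite):
    """Condensed copy of the function in the diff, so this sketch is standalone."""
    if not os.path.exists(filename) or overwrite:
        df.to_csv(filename, mode="w", index=False)
    else:
        raise ValueError(
            f"The file {filename} already exists and overwrite has not been "
            "indicated."
        )


def demo():
    # Illustrative values only, not real fuel prices.
    df = pd.DataFrame({"fuel": ["Gas", "Coal"], "price": [3.5, 2.1]})
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.csv")
        create_csv_generic(path, df, overwrite=False)  # file absent: written
        try:
            create_csv_generic(path, df, overwrite=False)  # file present: refused
            refused = False
        except ValueError:
            refused = True
        create_csv_generic(path, df, overwrite=True)  # explicit overwrite: written
        with open(path) as f:
            header = f.readline().strip()
    return refused, header


if __name__ == "__main__":
    print(demo())
```

The existence check and the write are two separate steps, so callers that may race on the same path would need their own coordination.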
File renamed without changes.
Lines changed: 151 additions & 0 deletions
@@ -0,0 +1,151 @@
+# Copyright 2016-2024 Blue Marble Analytics LLC.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""
+EIA AEO Fuel Prices
+*******************
+
+Create GridPath fuel price inputs (fuel_scenario_id) based on the EIA AEO.
+
+.. warning:: The user is responsible for ensuring that all prices and costs in
+    their model are in a consistent real currency year.
+
+=====
+Usage
+=====
+
+>>> gridpath_run_data_toolkit --single_step eiaaeo_fuel_price_input_csvs --settings_csv PATH/TO/SETTINGS/CSV
+
+===================
+Input prerequisites
+===================
+
+This module assumes the following raw input database tables have been
+populated:
+* raw_data_eiaaeo_fuel_prices
+* user_defined_eiaaeo_region_key
+
+=========
+Settings
+=========
+* database
+* output_directory
+* model_case
+* report_year
+* fuel_price_id
+
+"""
+
+import csv
+from argparse import ArgumentParser
+import os.path
+import pandas as pd
+import sys
+
+from db.common_functions import connect_to_database
+
+
+def parse_arguments(args):
+    """
+    :param args: the script arguments specified by the user
+    :return: the parsed known argument values (<class 'argparse.Namespace'>
+        Python object)
+
+    Parse the known arguments.
+    """
+    parser = ArgumentParser(add_help=True)
+
+    parser.add_argument("-db", "--database", default="../../db/open_data_raw.db")
+
+    parser.add_argument(
+        "-o",
+        "--output_directory",
+        default="../../db/csvs_open_data/fuels/fuel_prices",
+    )
+    parser.add_argument("-fuel_price_id", "--fuel_price_scenario_id", default=1)
+    parser.add_argument(
+        "-case",
+        "--model_case",
+        default="aeo2022",
+    )
+    parser.add_argument("-r_yr", "--report_year", default=2023)
+
+    parser.add_argument("-q", "--quiet", default=False, action="store_true")
+
+    parsed_arguments = parser.parse_known_args(args=args)[0]
+
+    return parsed_arguments
+
+
+def get_fuel_prices(
+    conn, output_directory, subscenario_id, subscenario_name, report_year, model_case
+):
+    """ """
+
+    sql = f"""
+    SELECT gridpath_generic_fuel || '_' || fuel_region as fuel, projection_year as period,
+    fuel_cost_real_per_mmbtu_eiaaeo as fuel_price_per_mmbtu
+    FROM raw_data_eiaaeo_fuel_prices
+    JOIN (SELECT DISTINCT gridpath_generic_fuel, fuel_type_eiaaeo FROM user_defined_eia_gridpath_key) USING (fuel_type_eiaaeo)
+    JOIN user_defined_eiaaeo_region_key using (
+    electricity_market_module_region_eiaaeo)
+    WHERE report_year = {report_year}
+    AND model_case_eiaaeo = '{model_case}'
+    ORDER BY fuel, period
+    """
+
+    df = pd.read_sql(sql, conn)
+    month_df_list = []
+    for month in range(1, 13):
+        month_df = df
+        month_df["month"] = month
+        cols = month_df.columns.tolist()
+        cols = cols[:2] + [cols[3]] + [cols[2]]
+        month_df = month_df[cols]
+
+        month_df_list.append(month_df)
+
+    final_df = pd.concat(month_df_list)
+
+    final_df.to_csv(
+        os.path.join(output_directory, f"{subscenario_id}_" f"{subscenario_name}.csv"),
+        index=False,
+    )
+
+
+def main(args=None):
+    if args is None:
+        args = sys.argv[1:]
+
+    parsed_args = parse_arguments(args=args)
+
+    if not parsed_args.quiet:
+        print("Creating fuel prices...")
+
+    os.makedirs(parsed_args.output_directory, exist_ok=True)
+
+    conn = connect_to_database(db_path=parsed_args.database)
+
+    get_fuel_prices(
+        conn=conn,
+        output_directory=parsed_args.output_directory,
+        subscenario_id=parsed_args.fuel_price_scenario_id,
+        subscenario_name=parsed_args.model_case,
+        report_year=parsed_args.report_year,
+        model_case=parsed_args.model_case,
+    )
+
+
+if __name__ == "__main__":
+    main()
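
`get_fuel_prices` interpolates `report_year` and `model_case` directly into the SQL text, then repeats the annual prices for each of the 12 months by re-tagging and re-slicing the same DataFrame in a loop. A minimal sketch of an alternative shape for the same two steps, under stated assumptions: the single `prices` table and its rows are hypothetical stand-ins for the raw data tables and joins in the diff, the values are invented, and `make_demo_db` and `annual_to_monthly` are not GridPath functions. It binds the filters as query parameters and builds the month dimension with a cross join.

```python
import sqlite3

import pandas as pd


def make_demo_db():
    """Build an in-memory table with made-up rows (not real AEO data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE prices ("
        "fuel TEXT, period INTEGER, report_year INTEGER, "
        "model_case TEXT, fuel_price_per_mmbtu REAL)"
    )
    conn.executemany(
        "INSERT INTO prices VALUES (?, ?, ?, ?, ?)",
        [
            ("Gas_West", 2026, 2023, "aeo2022", 3.5),
            ("Gas_West", 2027, 2023, "aeo2022", 3.6),
            ("Gas_West", 2026, 2022, "aeo2022", 9.9),  # filtered out below
        ],
    )
    return conn


def annual_to_monthly(conn, report_year, model_case):
    """Query annual prices with bound parameters and repeat them for 12 months."""
    sql = """
        SELECT fuel, period, fuel_price_per_mmbtu
        FROM prices
        WHERE report_year = ? AND model_case = ?
        ORDER BY fuel, period
    """
    # The sqlite3 driver substitutes the "?" placeholders from params.
    annual = pd.read_sql(sql, conn, params=(report_year, model_case))
    months = pd.DataFrame({"month": range(1, 13)})
    # Cross join: every month paired with every annual row, month-major order.
    monthly = months.merge(annual, how="cross")
    return monthly[["fuel", "period", "month", "fuel_price_per_mmbtu"]]


if __name__ == "__main__":
    print(annual_to_monthly(make_demo_db(), 2023, "aeo2022").head())
```

The cross join leaves the queried frame unmodified and yields the same column order the loop produces (`fuel`, `period`, `month`, `fuel_price_per_mmbtu`). `how="cross"` requires a reasonably recent pandas release.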
