Opennem facilities (#4)
* Removed spurious index

* Added opennem facilities snippet

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: Dylan McConnell <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
3 people committed Jul 17, 2023
1 parent 295e423 commit 6d2c59a
Showing 3 changed files with 218 additions and 0 deletions.
1 change: 1 addition & 0 deletions _quarto.yml
@@ -19,6 +19,7 @@ website:
- text: "Snippets"
menu:
- aemo_data.qmd
- opennem_facilities.qmd
- text: Contributing
href: contributing.qmd
right:
68 changes: 68 additions & 0 deletions opennem_facilities.qmd
@@ -0,0 +1,68 @@
---
title: openNEM facility data
format:
  html:
    code-fold: true
    code-overflow: wrap
---

## Very basic module for downloading and parsing openNEM facility data

This is a simple set of functions for downloading and parsing station and DUID metadata from openNEM.

The module works as follows:

- gets the master list of stations from openNEM
- iteratively downloads and saves the json data for each of the stations within this list (about 400)
- parses the downloaded data into a flat dataframe

The JSON data is stored locally so that you do not have to re-download every station each time you adapt the parser or change the data you want to record.
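
As a minimal sketch of this caching idea (not the module's actual code), the following checks for a local file before fetching. The `load_or_download` helper and the `fake_fetch` stand-in are hypothetical names used only for illustration; in the real module the fetch step is an HTTP request to openNEM.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def load_or_download(code: str, cache_dir: Path, fetch: Callable[[str], dict]) -> dict:
    """Return station JSON from cache_dir, calling fetch(code) only on a cache miss."""
    # Same idea as station_filename(): "/" is not safe in a filename.
    path = cache_dir / f"{code.replace('/', '_')}.json"
    if path.exists():
        return json.loads(path.read_text())
    data = fetch(code)  # hypothetical stand-in for the HTTP request
    path.write_text(json.dumps(data, indent=2))
    return data


calls = []


def fake_fetch(code: str) -> dict:
    """Pretend download that records how often it is called."""
    calls.append(code)
    return {"code": code, "name": "Example"}


with tempfile.TemporaryDirectory() as tmp:
    first = load_or_download("APPIN", Path(tmp), fake_fetch)
    second = load_or_download("APPIN", Path(tmp), fake_fetch)

print(len(calls))  # 1: the second call was served from the local file
```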

The JSON is validated with pydantic (to deal with missing fields and other irregularities in the openNEM JSON). There is probably a smarter way to flatten the validated data to pandas than what I have now, but it does the job.
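
To illustrate how the validation copes with missing fields, here is a small sketch using the same model definitions as the module. The input record is made up for the example; it omits the coordinates and the registered capacity, which the `Optional[...] = None` fields tolerate.

```python
from typing import List, Optional

from pydantic import BaseModel, ValidationError


class DispatchUnit(BaseModel):
    network_region: str
    code: str
    fueltech: str
    capacity_registered: Optional[float] = None  # may be absent in the JSON


class Location(BaseModel):
    lat: Optional[float] = None
    lng: Optional[float] = None


class Station(BaseModel):
    name: str
    code: str
    location: Location
    facilities: List[DispatchUnit]


# Made-up record: no coordinates and no registered capacity.
raw = {
    "name": "Example Station",
    "code": "EXAMPLE",
    "location": {},
    "facilities": [
        {"network_region": "NSW1", "code": "EXAMPLE1", "fueltech": "wind"}
    ],
}

station = Station(**raw)
print(station.location.lat)  # None
print(station.facilities[0].capacity_registered)  # None

# A required field that is missing (here `code`) still raises an error.
try:
    Station(**{"name": "No code", "location": {}, "facilities": []})
    missing_code_rejected = False
except ValidationError:
    missing_code_rejected = True
```

Fields without a default, such as `name` and `code`, remain required, so a badly malformed station fails loudly instead of producing a silently incomplete row.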

Note that two stations (`MWPS` and `SLDCBLK`) are skipped by the parser because they are missing or have another issue.

### Requirements

Written using Python 3.11. Uses `pandas`, `requests`, `simplejson` and `pydantic` (for json data validation).

### Usage

Before using the module, set the global variable `LOCALDIR` to specify where the station JSON data is stored.

To download all the station json:

```python
import opennem_facilities
opennem_facilities.download_all_stations()
```

To parse the station data:

```python
import opennem_facilities
df = opennem_facilities.parse_station_data()
```

This should return a dataframe like the following (the `code` column here is the DUID):

| | network_region | code | fueltech | capacity_registered | lat | lon | station_name | station_code |
|---:|:-----------------|:---------|:-----------------|----------------------:|---------:|--------:|:---------------|:---------------|
| 0 | NSW1 | APPIN | gas_wcmg | 55 | -34.2109 | 150.793 | Appin | APPIN |
| 1 | NSW1 | AVLSF1 | solar_utility | 245 | -34.9191 | 146.61 | Avonlie | AVLSF |
| 2 | NSW1 | AWABAREF | bioenergy_biogas | 1 | -33.0233 | 151.551 | Awaba | AWABAREF |
| 3 | NSW1 | BANGOWF2 | wind | 84.8 | -34.7672 | 148.921 | Bango | BANGOWF |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |



### Extending / adapting

To parse additional details or metadata, adapt the `Station` pydantic model (i.e. add the fields you want to parse) and adapt the function that flattens the data to pandas accordingly.
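
As a rough sketch of what such an extension might look like, the example below adds one extra field to the dispatch unit model and carries it through to the flattened rows. The `status` field is a hypothetical name chosen for illustration (check the downloaded JSON for the keys that actually exist), the models are trimmed down, and `flatten_rows` is a simplified stand-in for the module's `flatten_station` that returns plain dictionaries rather than a dataframe.

```python
from typing import Dict, List, Optional

from pydantic import BaseModel


class DispatchUnit(BaseModel):
    network_region: str
    code: str
    fueltech: str
    capacity_registered: Optional[float] = None
    status: Optional[str] = None  # hypothetical extra field


class Station(BaseModel):
    name: str
    code: str
    facilities: List[DispatchUnit]


def flatten_rows(station: Station) -> List[Dict[str, object]]:
    """One row per dispatch unit, with station details repeated on each row."""
    return [
        {
            "code": du.code,
            "fueltech": du.fueltech,
            "status": du.status,  # the new field has to be carried through here too
            "station_name": station.name,
            "station_code": station.code,
        }
        for du in station.facilities
    ]


# Made-up input for illustration only.
station = Station(
    name="Example Station",
    code="EXAMPLE",
    facilities=[
        {"network_region": "NSW1", "code": "EX1", "fueltech": "wind", "status": "operating"},
        {"network_region": "NSW1", "code": "EX2", "fueltech": "wind"},
    ],
)
rows = flatten_rows(station)
```

Passing `rows` to `pd.DataFrame` would then give a dataframe with the extra column.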

### Code

The code can be downloaded from here: [opennem_facilities.py](snippets/aemo_data/opennem_facilities.py), and is also shown below:

```python {include="snippets/aemo_data/opennem_facilities.py"}
```
149 changes: 149 additions & 0 deletions snippets/aemo_data/opennem_facilities.py
@@ -0,0 +1,149 @@
# Basic python script to download and restructure DUID and station data
# from the openNEM facilities dataset
#
# Copyright (C) 2023 Dylan McConnell
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.

import os
from typing import List, Optional

import pandas as pd
import requests
import simplejson
from pydantic import BaseModel

GEOJSON = "https://data.opennem.org.au/v3/geo/au_facilities.json"
LOCALDIR = "/path/to/local/dir/"
STATION_URL = "https://api.opennem.org.au/station/au/NEM/{}"


def get_master():
    """
    Download master geojson file from openNEM, returning JSON
    """
    response = requests.get(GEOJSON)
    return simplejson.loads(response.content)


def get_station(station_code: str = "LIDDELL"):
    """
    Download and store station json from openNEM
    """
    response = requests.get(STATION_URL.format(station_code))
    json = simplejson.loads(response.content)

    filename = station_filename(json["code"])
    with open(os.path.join(LOCALDIR, filename), "w") as f:
        simplejson.dump(json, f, indent=2)


def station_filename(code: str):
    """
    Simple function to replace problematic characters in station codes
    and return a filename
    """
    clean_code = code.replace("/", "_")
    return f"{clean_code}.json"


def load_station(station_code: str):
    """
    Load station json from local directory
    """
    filename = station_filename(station_code)
    with open(os.path.join(LOCALDIR, filename), "r") as f:
        return simplejson.load(f)


def station_generator(master_json):
    """
    Generator that yields the station code for every station in the NEM
    """
    for station in master_json["features"]:
        if station["properties"]["network"] == "NEM":
            yield station["properties"]["station_code"]


def download_all_stations():
    """
    Downloads all the station json data from the master list.
    """
    master_json = get_master()
    for station_code in station_generator(master_json):
        if station_code != "SLDCBLK":
            try:
                load_station(station_code)
            except FileNotFoundError:
                print("downloading ", station_code)
                get_station(station_code)


"""
Some pydantic models for validating openNEM data
"""


class DispatchUnit(BaseModel):
    network_region: str
    code: str
    fueltech: str
    capacity_registered: Optional[float] = None


class Location(BaseModel):
    lat: Optional[float] = None
    lng: Optional[float] = None


class Station(BaseModel):
    name: str
    code: str
    location: Location
    facilities: List[DispatchUnit]


def parse_station_data():
    """
    Parses all station data from the master list.
    Assumes all station json already downloaded.
    """
    master_json = get_master()
    data = []

    for station_code in station_generator(master_json):
        if station_code not in ["MWPS", "SLDCBLK"]:
            station_json = load_station(station_code)
            valid_station = Station(**station_json)
            data.append(flatten_station(valid_station))

    return pd.concat(data).reset_index(drop=True)


def flatten_station(valid_station: Station):
    """
    Simple function to convert a validated station to pandas dataframe
    (probably could be done neater / cleaner with pd.json_normalize)
    """
    d = []
    station_dict = valid_station.dict()
    for du in valid_station.facilities:
        data = du.dict()
        data["lat"] = station_dict["location"]["lat"]
        data["lon"] = station_dict["location"]["lng"]
        data["station_name"] = station_dict["name"]
        data["station_code"] = station_dict["code"]
        d.append(data)

    return pd.DataFrame(d)
