refactor to use catchment id
Uses catchment id as the default rather than wb_id to reduce confusion. Now when a catchment is selected, its outflow nexus is determined and everything upstream of that nexus is subset, rather than just what is upstream of the inflow to that catchment.
JoshCu committed Aug 9, 2024
1 parent fe06490 commit 3fc4e69
Showing 18 changed files with 309 additions and 271 deletions.
46 changes: 23 additions & 23 deletions README.md
@@ -92,28 +92,28 @@ Once all the steps are finished, you can run NGIAB on the folder shown underneath
## Arguments
- `-h`, `--help`: Show the help message and exit.
- `-i INPUT_FILE`, `--input_file INPUT_FILE`: Path to a CSV or TXT file containing a list of waterbody IDs, lat/lon pairs, or gage IDs; or a single waterbody ID (e.g., `wb-5173`), a single lat/lon pair, or a single gage ID.
- `-l`, `--latlon`: Use latitude and longitude instead of waterbody IDs. When used with `-i`, the file should contain lat/lon pairs.
- `-g`, `--gage`: Use gage IDs instead of waterbody IDs. When used with `-i`, the file should contain gage IDs.
- `-s`, `--subset`: Subset the hydrofabric to the given waterbody IDs, locations, or gage IDs.
- `-f`, `--forcings`: Generate forcings for the given waterbody IDs, locations, or gage IDs.
- `-r`, `--realization`: Create a realization for the given waterbody IDs, locations, or gage IDs.
- `-i INPUT_FILE`, `--input_file INPUT_FILE`: Path to a CSV or TXT file containing a list of catchment IDs, lat/lon pairs, or gage IDs; or a single catchment ID (e.g., `cat-5173`), a single lat/lon pair, or a single gage ID.
- `-l`, `--latlon`: Use latitude and longitude instead of catchment IDs. When used with `-i`, the file should contain lat/lon pairs.
- `-g`, `--gage`: Use gage IDs instead of catchment IDs. When used with `-i`, the file should contain gage IDs.
- `-s`, `--subset`: Subset the hydrofabric to the given catchment IDs, locations, or gage IDs.
- `-f`, `--forcings`: Generate forcings for the given catchment IDs, locations, or gage IDs.
- `-r`, `--realization`: Create a realization for the given catchment IDs, locations, or gage IDs.
- `--start_date START_DATE`: Start date for forcings/realization (format YYYY-MM-DD).
- `--end_date END_DATE`: End date for forcings/realization (format YYYY-MM-DD).
- `-o OUTPUT_NAME`, `--output_name OUTPUT_NAME`: Name of the subset to be created (default is the first waterbody ID in the input file).
- `-o OUTPUT_NAME`, `--output_name OUTPUT_NAME`: Name of the subset to be created (default is the first catchment ID in the input file).
## Examples
`-l`, `-g`, `-s`, `-f`, `-r` can be combined like normal CLI flags. For example, to subset, generate forcings, and create a realization, you can use `-sfr` or `-s -f -r`.
1. Subset hydrofabric using waterbody IDs:
1. Subset hydrofabric using catchment IDs:
```
python -m ngiab_data_cli -i waterbody_ids.txt -s
python -m ngiab_data_cli -i catchment_ids.txt -s
```
2. Generate forcings using a single waterbody ID:
2. Generate forcings using a single catchment ID:
```
python -m ngiab_data_cli -i wb-5173 -f --start_date 2023-01-01 --end_date 2023-12-31
python -m ngiab_data_cli -i cat-5173 -f --start_date 2023-01-01 --end_date 2023-12-31
```
3. Create realization using lat/lon pairs from a CSV file:
Expand All @@ -138,22 +138,22 @@ Once all the steps are finished, you can run NGIAB on the folder shown underneat
## File Formats
### 1. Waterbody ID input:
- CSV file: A single column of waterbody IDs, or a column named 'wb_id', 'waterbody_id', or 'divide_id'.
- TXT file: One waterbody ID per line.
### 1. Catchment ID input:
- CSV file: A single column of catchment IDs, or a column named 'cat_id', 'catchment_id', or 'divide_id'.
- TXT file: One catchment ID per line.
Example CSV (waterbody_ids.csv):
Example CSV (catchment_ids.csv):
```
wb_id,soil_type
wb-5173,some
wb-5174,data
wb-5175,here
cat_id,soil_type
cat-5173,some
cat-5174,data
cat-5175,here
```
Or:
```
wb-5173
wb-5174
wb-5175
cat-5173
cat-5174
cat-5175
```
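The two input formats above can be handled with a small amount of parsing. The sketch below is a hypothetical helper (`read_catchment_ids` is not part of `ngiab_data_cli`) that accepts either a headed CSV with one of the named ID columns or a plain one-ID-per-line TXT file:

```python
import csv
from pathlib import Path

# Column names the README says may carry the catchment ID in a CSV.
ID_COLUMNS = {"cat_id", "catchment_id", "divide_id"}

def read_catchment_ids(path: str) -> list[str]:
    """Hypothetical reader for the CSV/TXT formats described above."""
    lines = Path(path).read_text().strip().splitlines()
    if "," in lines[0]:
        rows = list(csv.reader(lines))
        header = [h.strip() for h in rows[0]]
        # use the named ID column if one exists, otherwise the first column
        idx = next((i for i, h in enumerate(header) if h in ID_COLUMNS), 0)
        body = rows[1:] if header[idx] in ID_COLUMNS else rows
        return [row[idx].strip() for row in body if row]
    # single-column TXT (or headerless single-column CSV): one ID per line
    return [line.strip() for line in lines if line.strip()]
```

For the example CSV above this yields `['cat-5173', 'cat-5174', 'cat-5175']`, the same as the plain TXT variant.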
### 2. Lat/Lon input:
@@ -195,6 +195,6 @@ Or:
## Output
The script creates an output folder named after the first waterbody ID in the input file, the provided output name, or derived from the first lat/lon pair or gage ID. This folder will contain the results of the subsetting, forcings generation, and realization creation operations.
The script creates an output folder named after the first catchment ID in the input file, the provided output name, or derived from the first lat/lon pair or gage ID. This folder will contain the results of the subsetting, forcings generation, and realization creation operations.
</details>
14 changes: 7 additions & 7 deletions modules/data_processing/create_realization.py
@@ -130,7 +130,7 @@ def make_noahowp_config(


def configure_troute(
wb_id: str, config_dir: Path, start_time: datetime, end_time: datetime
cat_id: str, config_dir: Path, start_time: datetime, end_time: datetime
) -> int:
with open(file_paths.template_troute_config(), "r") as file:
troute = yaml.safe_load(file) # Use safe_load for loading
@@ -140,7 +140,7 @@ def configure_troute(
network_topology = troute["network_topology_parameters"]
supernetwork_params = network_topology["supernetwork_parameters"]

geo_file_path = f"/ngen/ngen/data/config/{wb_id}_subset.gpkg"
geo_file_path = f"/ngen/ngen/data/config/{cat_id}_subset.gpkg"
supernetwork_params["geo_file_path"] = geo_file_path

troute["compute_parameters"]["restart_parameters"]["start_datetime"] = start_time.strftime(
@@ -177,10 +177,10 @@ def make_ngen_realization_json(
json.dump(realization, file, indent=4)


def create_realization(wb_id: str, start_time: datetime, end_time: datetime):
def create_realization(cat_id: str, start_time: datetime, end_time: datetime):
# quick wrapper to get the cfe realization working
# without having to refactor this whole thing
paths = file_paths(wb_id)
paths = file_paths(cat_id)

# make cfe init config files
cfe_atts_path = paths.config_dir() / "cfe_noahowp_attributes.csv"
Expand All @@ -191,7 +191,7 @@ def create_realization(wb_id: str, start_time: datetime, end_time: datetime):
make_noahowp_config(paths.config_dir(), cfe_atts_path, start_time, end_time)

# make troute config files
num_timesteps = configure_troute(wb_id, paths.config_dir(), start_time, end_time)
num_timesteps = configure_troute(cat_id, paths.config_dir(), start_time, end_time)

# create the realization
make_ngen_realization_json(paths.config_dir(), start_time, end_time, num_timesteps)
Expand All @@ -200,9 +200,9 @@ def create_realization(wb_id: str, start_time: datetime, end_time: datetime):


if __name__ == "__main__":
wb_id = "wb-1643991"
cat_id = "cat-1643991"
start_time = datetime(2010, 1, 1, 0, 0, 0)
end_time = datetime(2010, 1, 2, 0, 0, 0)
# output_interval = 3600
# nts = 2592
create_realization(wb_id, start_time, end_time)
create_realization(cat_id, start_time, end_time)
10 changes: 5 additions & 5 deletions modules/data_processing/file_paths.py
@@ -11,15 +11,15 @@ class file_paths:

config_file = Path("~/.NGIAB_data_preprocess").expanduser()

def __init__(self, wb_id: str):
def __init__(self, cat_id: str):
"""
Initialize the file_paths class with a catchment ID.
The following functions require a catchment ID:
config_dir, forcings_dir, geopackage_path, cached_nc_file
Args:
wb_id (str): Water body ID.
cat_id (str): Catchment ID.
"""
self.wb_id = wb_id
self.cat_id = cat_id

@staticmethod
def get_working_dir() -> Path:
@@ -101,7 +101,7 @@ def template_noahowp_config() -> Path:
return file_paths.data_sources() / "noah-owp-modular-init.namelist.input"

def subset_dir(self) -> Path:
return file_paths.root_output_dir() / self.wb_id
return file_paths.root_output_dir() / self.cat_id

def config_dir(self) -> Path:
return file_paths.subset_dir(self) / "config"
Expand All @@ -110,7 +110,7 @@ def forcings_dir(self) -> Path:
return file_paths.subset_dir(self) / "forcings"

def geopackage_path(self) -> Path:
return self.config_dir() / f"{self.wb_id}_subset.gpkg"
return self.config_dir() / f"{self.cat_id}_subset.gpkg"

def cached_nc_file(self) -> Path:
return file_paths.subset_dir(self) / "merged_data.nc"
10 changes: 5 additions & 5 deletions modules/data_processing/forcings.py
@@ -190,8 +190,8 @@ def compute_zonal_stats(
)


def setup_directories(wb_id: str) -> file_paths:
forcing_paths = file_paths(wb_id)
def setup_directories(cat_id: str) -> file_paths:
forcing_paths = file_paths(cat_id)
for folder in ["by_catchment", "temp"]:
os.makedirs(forcing_paths.forcings_dir() / folder, exist_ok=True)
return forcing_paths
@@ -220,8 +220,8 @@ def create_forcings(start_time: str, end_time: str, output_folder_name: str) ->
# Example usage
start_time = "2010-01-01 00:00"
end_time = "2010-01-02 00:00"
output_folder_name = "wb-1643991"
# looks in output/wb-1643991/config for the geopackage wb-1643991_subset.gpkg
# puts forcings in output/wb-1643991/forcings
output_folder_name = "cat-1643991"
# looks in output/cat-1643991/config for the geopackage cat-1643991_subset.gpkg
# puts forcings in output/cat-1643991/forcings
logger.basicConfig(level=logging.DEBUG)
create_forcings(start_time, end_time, output_folder_name)
18 changes: 9 additions & 9 deletions modules/data_processing/gpkg_utils.py
@@ -121,22 +121,22 @@ def blob_to_centroid(blob: bytes) -> Point:
return Point(x, y)


def get_wbid_from_point(coords):
def get_catid_from_point(coords):
"""
Retrieves the watershed boundary ID (wbid) of the watershed that contains the given point.
Retrieves the catchment ID (catid) of the catchment that contains the given point.
Args:
coords (dict): A dictionary containing the latitude and longitude coordinates of the point.
Example: {"lat": 40.7128, "lng": -74.0060}
Returns:
int: The watershed boundary ID (wbid) of the watershed containing the point.
int: The catchment ID (catid) of the catchment containing the point.
Raises:
IndexError: If no watershed boundary is found for the given point.
"""
logger.info(f"Getting wbid for {coords}")
logger.info(f"Getting catid for {coords}")
q = file_paths.conus_hydrofabric()
d = {"col1": ["point"], "geometry": [Point(coords["lng"], coords["lat"])]}
point = gpd.GeoDataFrame(d, crs="EPSG:4326")
@@ -297,7 +297,7 @@ def get_table_crs(gpkg: str, table: str) -> str:
return crs


def get_wb_from_gage_id(gage_id: str, gpkg: Path = file_paths.conus_hydrofabric()) -> str:
def get_cat_from_gage_id(gage_id: str, gpkg: Path = file_paths.conus_hydrofabric()) -> str:
"""
Get the catchment ids associated with a gage id.
Expand All @@ -312,13 +312,13 @@ def get_wb_from_gage_id(gage_id: str, gpkg: Path = file_paths.conus_hydrofabric(
"""
gage_id = "".join([x for x in gage_id if x.isdigit()])
logger.info(f"Getting wbid for {gage_id}, in {gpkg}")
logger.info(f"Getting catid for {gage_id}, in {gpkg}")
with sqlite3.connect(gpkg) as con:
sql_query = f"SELECT id FROM hydrolocations WHERE hl_uri = 'Gages-{gage_id}'"
nex_id = con.execute(sql_query).fetchone()[0]
sql_query = f"SELECT id FROM network WHERE toid = '{nex_id}'"
wb_id = con.execute(sql_query).fetchall()
wb_ids = [str(x[0]) for x in wb_id]
cat_id = con.execute(sql_query).fetchall()
cat_ids = [str(x[0]) for x in cat_id]
if nex_id is None:
raise IndexError(f"No nexus found for gage ID {gage_id}")
return wb_ids
return cat_ids
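The gage lookup above chains two queries: `hydrolocations.hl_uri` resolves the gage to its nexus, and `network.toid` resolves the nexus to the catchments draining into it. The sketch below reproduces that chain against a toy in-memory database containing only the columns those queries touch (the real tables live inside the hydrofabric geopackage, and the sample IDs here are invented):

```python
import sqlite3

# Toy stand-in for the hydrofabric geopackage tables used by the lookup.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE hydrolocations (id TEXT, hl_uri TEXT)")
con.execute("CREATE TABLE network (id TEXT, toid TEXT)")
con.execute("INSERT INTO hydrolocations VALUES ('nex-10', 'Gages-01234567')")
con.executemany(
    "INSERT INTO network VALUES (?, ?)",
    [("cat-5173", "nex-10"), ("cat-5174", "nex-10"), ("cat-9999", "nex-99")],
)

gage_id = "01234567"
# step 1: gage -> nexus
nex_id = con.execute(
    "SELECT id FROM hydrolocations WHERE hl_uri = ?", (f"Gages-{gage_id}",)
).fetchone()[0]
# step 2: nexus -> catchments that drain into it
cat_ids = [
    row[0]
    for row in con.execute("SELECT id FROM network WHERE toid = ?", (nex_id,))
]
# cat_ids now holds the two catchments upstream of nex-10
```

Note the sketch uses parameterized queries rather than the f-string interpolation in the function above; that is a deliberate substitution, since it avoids SQL-injection and quoting issues for free.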
32 changes: 32 additions & 0 deletions modules/data_processing/graph_utils.py
@@ -84,6 +84,36 @@ def get_graph() -> ig.Graph:
return network_graph


def get_outlet_id(wb_or_cat_id: str) -> str:
"""
Retrieves the ID of the node downstream of the given node in the hydrological network.
Given a node name, this function identifies the downstream node in the network, effectively tracing the water flow
towards the outlet.
When finding the upstreams of a 'wb' waterbody or 'cat' catchment, what we actually want is everything upstream of that waterbody's outlet.
Args:
wb_or_cat_id (str): The waterbody or catchment ID of the node.
Returns:
str: The ID of the node downstream of the specified node.
"""
# all the waterbody and catchment IDs are the same, but the graph nodes are named wb-<id>
# remove everything that isn't a digit, then prepend wb- to get the graph node name
stem = "".join(filter(str.isdigit, wb_or_cat_id))
name = f"wb-{stem}"
graph = get_graph()
node_index = graph.vs.find(name=name).index
# this returns the current node, and every node downstream of it in order
downstream_node = graph.subcomponent(node_index, mode="OUT")
if len(downstream_node) >= 2:
# if there is more than one node in the list,
# then the second is the downstream node of the first
return graph.vs[downstream_node[1]]["name"]
return None
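The commit's key idea is this one-hop lookup: in the flow graph the node immediately downstream of a `wb-`/`cat-` node is its outflow nexus. A dict-based sketch of the same hop (with the hydrofabric graph reduced to an invented toy adjacency map, and `get_outlet_id_sketch` a hypothetical stand-in for the igraph-backed function above) makes the behaviour easy to see:

```python
from typing import Optional

# Toy flow network: each node maps to the single node it drains into.
DOWNSTREAM = {
    "wb-1": "nex-10",   # wb-1 drains into nexus nex-10
    "nex-10": "wb-2",
    "wb-2": "nex-20",   # nex-20 is this toy network's outlet
}

def get_outlet_id_sketch(wb_or_cat_id: str) -> Optional[str]:
    # wb- and cat- IDs share numeric stems, so normalise to the graph's wb- name
    stem = "".join(filter(str.isdigit, wb_or_cat_id))
    name = f"wb-{stem}"
    # the node immediately downstream is the catchment's outflow nexus
    return DOWNSTREAM.get(name)

print(get_outlet_id_sketch("cat-1"))  # nex-10
```

As in the real function, an ID with nothing downstream of it yields `None`.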


def get_upstream_ids(names: Union[str, List[str]]) -> Set[str]:
"""
Retrieves IDs of all nodes upstream of the given nodes in the hydrological network.
@@ -102,6 +132,8 @@ def get_upstream_ids(names: Union[str, List[str]]) -> Set[str]:
names = [names]
parent_ids = set()
for name in names:
if "wb" in name or "cat" in name:
name = get_outlet_id(name)
if name in parent_ids:
continue
node_index = graph.vs.find(name=name).index
4 changes: 2 additions & 2 deletions modules/data_processing/subset.py
@@ -66,10 +66,10 @@ def subset_parquet(ids: List[str], paths: file_paths) -> None:


def subset(
wb_ids: List[str], hydrofabric: str = file_paths.conus_hydrofabric(), subset_name: str = None
cat_ids: List[str], hydrofabric: str = file_paths.conus_hydrofabric(), subset_name: str = None
) -> str:

upstream_ids = get_upstream_ids(wb_ids)
upstream_ids = get_upstream_ids(cat_ids)

if not subset_name:
# if the name isn't provided, use the first upstream id
Expand Down
1 change: 1 addition & 0 deletions modules/map_app/__main__.py
@@ -80,6 +80,7 @@ def open_browser():

def set_logs_to_warning():
logging.getLogger("werkzeug").setLevel(logging.WARNING)
console_handler.setLevel(logging.DEBUG)


if __name__ == "__main__":
20 changes: 10 additions & 10 deletions modules/map_app/static/css/colors.css
@@ -1,12 +1,12 @@
/* colourblind safe taken from https://personal.sron.nl/~pault/ */
:root {
--selected-wb-outline: rgba(238, 51, 119, 0.7);
--selected-wb-fill: rgba(238, 51, 119, 0.316);
--selected-cat-outline: rgba(238, 51, 119, 0.7);
--selected-cat-fill: rgba(238, 51, 119, 0.316);

--upstream-wb-outline: rgba(238, 119, 51, 0.7);
--upstream-wb-fill: rgba(238, 119, 51, 0.278);
--upstream-cat-outline: rgba(238, 119, 51, 0.7);
--upstream-cat-fill: rgba(238, 119, 51, 0.278);

--flowline-to-wb-outline: rgba(0, 153, 136, 1);
--flowline-to-cat-outline: rgba(0, 153, 136, 1);
--flowline-to-nexus-outline: rgba(0, 119, 187, 1);

--nexus-outline: rgba(1, 1, 1, 0.5);
Expand All @@ -15,13 +15,13 @@
}

.high-contrast {
--selected-wb-outline: rgba(0, 68, 136, 1);
--selected-wb-fill: rgba(0, 68, 136, 0.316);
--selected-cat-outline: rgba(0, 68, 136, 1);
--selected-cat-fill: rgba(0, 68, 136, 0.316);

--upstream-wb-outline: rgba(221, 170, 51, 1);
--upstream-wb-fill: rgba(221, 170, 51, 0.278);
--upstream-cat-outline: rgba(221, 170, 51, 1);
--upstream-cat-fill: rgba(221, 170, 51, 0.278);

--flowline-to-wb-outline: #000000;
--flowline-to-cat-outline: #000000;
--flowline-to-nexus-outline: #BB5566;

--nexus-outline: rgba(1, 1, 1, 0.5);
10 changes: 5 additions & 5 deletions modules/map_app/static/css/legend.css
@@ -49,16 +49,16 @@
margin: 5px;
}

#legend_selected_wb_layer_icon {
background-color: var(--selected-wb-outline);
#legend_selected_cat_layer_icon {
background-color: var(--selected-cat-outline);
}

#legend_upstream_layer_icon {
background-color: var(--upstream-wb-outline);
background-color: var(--upstream-cat-outline);
}

#legend_to_wb_icon {
background-color: var(--flowline-to-wb-outline);
#legend_to_cat_icon {
background-color: var(--flowline-to-cat-outline);
}

#legend_to_nexus_icon {