Release 2.1.5 #633

Merged: 16 commits merged into uat from release-2.1.5 on Feb 23, 2024
Conversation

nickchadwick-noaa
Collaborator

No description provided.

nickchadwick-noaa and others added 16 commits February 21, 2024 09:57
I found an issue in UAT - this was my fault for missing this in my
testing. Sorry!
… (#558)

This PR migrates the ensemble-based Rapid Onset Flooding Probability
products from the Viz EC2 to the Viz Max Values lambda function (now
called Python Preprocessing). It broadly encapsulates the following
changes:

**Changes**
- New rapid_onset_flooding product script in the python_preprocessing
lambda function that supports both 12-hour SRF and 5-day GFS MRF
configurations.
- Removal of rapid_onset_flooding pipeline and product files in the
source-aws_loosa library

**Deployment Considerations:**
- I'm not sure whether we should include this in the 2.1.4 release. I'm
good to test/fix quickly and thoroughly next week if you want to include
it; otherwise it's fine to go in the next one (wait to merge for now).
- We will need to include an ingest schema db dump when deploying to UAT.

---------

Co-authored-by: CoreyKrewson-NOAA <[email protected]>
This PR migrates the Anomaly product from the Viz EC2 to the Viz Python
Preprocessing lambda function. It broadly encapsulates the following
changes:

Changes:
- New anomaly product script in the python_preprocessing lambda function
that supports both 7-day and 14-day configurations.
- Removal of anomaly pipeline and product files in the source-aws_loosa
library
- A second python_preprocessing lambda function, which uses the same
source code, but with more RAM (I still need to add logic to choose the
correct function, based on the anomaly config. This will be much easier
to do once we deploy this to TI next week, so I will do it then).

Deployment Considerations:
- I may need to resolve merge conflicts after we merge Part 2 - Rapid
Onset Flooding
- We will need to include an ingest schema db dump when deploying to UAT.

---------

Co-authored-by: CoreyKrewson-NOAA <[email protected]>
Co-authored-by: NickChadwick-NOAA <[email protected]>
This contains a fix to the ROF layer names for HI and PRVI. This was
manually fixed on production during deployment on 11/8/2023.
The boto3 S3 client function `list_objects_v2` has a max result count of
1,000 objects. Many of the `viz_cache` folders being searched contain more
objects than that, and we've already had one case of matching objects being
missed because they weren't included in those 1,000 results. This PR replaces
the basic `list_objects_v2` call with a paginator-wrapped version that keeps
fetching 1,000-object batches until every object has been checked.
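
A minimal sketch of that paginator pattern, with a placeholder bucket and prefix (the actual bucket names and matching logic live in the lambda code):

```python
import boto3

s3 = boto3.client("s3")

def list_all_objects(bucket, prefix):
    """List every object under a prefix, not just the first 1,000."""
    keys = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys

# e.g. list_all_objects("example-viz-bucket", "viz_cache/")  # placeholder names
```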
… recalculate (#603)

The RFC 5-Day Max Downstream Streamflow service was completely
recalculating its status by comparing the output RnR flows to
streamflow thresholds where available. Because streamflow thresholds are
not available for every site, certain sites could not be
categorized. This is silly, since the flood status of every relevant
point is already pre-calculated as the starting point for the RnR
WRF-Hydro inputs. This fix therefore writes those statuses to the db during
pre-processing and then joins to that table to re-assign the status at
post-processing.
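
A rough sketch of what that post-processing join looks like; the table, column, and connection names here are assumptions for illustration, not the actual schema:

```python
import psycopg2

# Assumed table/column names for illustration only; the real schema differs.
REASSIGN_STATUS_SQL = """
UPDATE publish.rfc_based_5day_max_downstream_streamflow AS svc
SET flood_status = pre.flood_status
FROM ingest.rnr_preprocessed_status AS pre
WHERE svc.feature_id = pre.feature_id;
"""

# Placeholder connection info.
with psycopg2.connect(host="viz-db-host", dbname="vizprocessing") as conn:
    with conn.cursor() as cur:
        cur.execute(REASSIGN_STATUS_SQL)
```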

These changes fix a number of issues reported by WPOD:
https://vlab.noaa.gov/redmine/issues/124216
https://vlab.noaa.gov/redmine/issues/124212
https://vlab.noaa.gov/redmine/issues/124202
https://vlab.noaa.gov/redmine/issues/124165

The changes herein were hotfixed in-place, manually, on Production.
This PR wraps up the migration of the remaining EC2 VPP services
(ensemble-based probability services + ana anomaly) to the serverless
step functions. It includes:

- A couple of minor bug fixes to previous PRs
- Initialize Pipeline / Step Function logic to choose the correct
python_preprocessing lambda function during python preprocessing steps,
based on a flag in the product config files (a sketch of this selection
logic follows below).
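
A hedged sketch of that selection logic; the config key (`use_high_memory`), the YAML format, and the function names are illustrative assumptions, not the actual flag or lambda names:

```python
import yaml

def choose_preprocessing_function(product_config_path):
    """Return the name of the python_preprocessing lambda to invoke for a product."""
    with open(product_config_path) as f:
        config = yaml.safe_load(f)
    if config.get("use_high_memory", False):
        return "viz_python_preprocessing_hi_mem"  # assumed name of the larger-RAM function
    return "viz_python_preprocessing"             # assumed name of the standard function
```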

---------

Co-authored-by: Nick Chadwick <[email protected]>
This PR, if merged, will replace the existing "rfc_max_forecast" service
with a new version of the service whose backend data is now produced by
direct access to the WRDS RFC Forecast database as opposed to the WRDS
API. As a result:

* The `every_five_minute` EventBridge rule that triggered the
`viz-wrds-api-handler` Lambda has been modified to instead trigger the
`initialize-viz-pipeline` Lambda directly. The viz-wrds-api-handler code
is now deprecated and can be eventually removed.
* The `initialize-viz-pipeline` Lambda code had to be modified to allow
the "rfc" pipeline to be kicked off without referencing any input files
- as there are none. This allows it to act on more of a "cron" basis and
less of a "file-driven-event" basis.
* The new `products/rfc/rfc_max_forecast.sql` query relies upon a few
new database structures:
* Two new ENUM types - `flood_status` and `forecast_ts` - are used for
assigning trend and prioritizing "duplicate" forecasts. These are now
created in the RDS Bastion `postgresql_setup.sh.tftpl` script.
* Two new database views - `rnr.stage_thresholds` and
`rnr.flow_thresholds` - reorganize data in the external/foreign WRDS
Ingest DB threshold table (external.threshold) for more efficient use
here and eventually in other places (e.g. RnR). The SQL for these views
was committed for the record in the
`Core/LAMBDA/rnr_functions/rnr_domain_generator/sql/dba_stuff.sql` file,
but I believe these views should get copied over with the dump of the RnR
schema on deployments and thus should not need to be recreated manually.
* Forecasts that are distributed as flow-only (i.e. have no associated
rating curve to produce stage) are now also included as a value-added
win (addressing issue #312). As a result of this:
* The DB table produced by `rfc_max_forecast.sql` has new/modified
column names, replacing every occurrence of "stage" with "value" (e.g.
`max_stage_timestamp` to `max_value_timestamp`).
* The `rfc_max_forecast.mapx` file was thus also modified to replace
every occurrence of "stage" with "value" in both the "fieldName" and
"alias" fields (e.g. `max_stage_timestamp` to `max_value_timestamp` and
"Forecast Min Stage Timestamp" to "Forecast Min Value Timestamp"); see
the sketch below.

This work will also allow the Replace and Route to be somewhat
redesigned to be completely in-sync with this new rfc_max_forecast
service - likely using the underlying `publish.rfc_max_forecast` table
as its starting point.
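
As an illustration of the "stage" to "value" rename described above, here is a rough sketch of the kind of edit applied to the .mapx file. It assumes the .mapx file is plain JSON (as is typical for ArcGIS Pro map files) and is not the actual script used:

```python
import json

def rename_stage_to_value(mapx_path):
    """Walk the .mapx JSON and swap "stage"/"Stage" for "value"/"Value" in
    the "fieldName" and "alias" entries."""
    with open(mapx_path) as f:
        mapx = json.load(f)

    def walk(node):
        if isinstance(node, dict):
            for key, val in node.items():
                if key in ("fieldName", "alias") and isinstance(val, str):
                    node[key] = val.replace("stage", "value").replace("Stage", "Value")
                else:
                    walk(val)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(mapx)
    with open(mapx_path, "w") as f:
        json.dump(mapx, f, indent=2)
```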
All is pretty much in the title. In the past 6 months or so I have
received two requests for schism (coastal) FIM data. I was foolish and
didn't formalize the work I had done on the first request, so when this
second one came in (see ticket #609), I made sure to save the work. So
here it is.

This script could be made more user-friendly for general use, but as it
stands now, a user would only need to modify lines 546 through 564 with the
specifics of the request.
This change was accidentally left out of my #621 PR. It has already been
hard-coded directly on TI to confirm it corrects my issue (so there is no
need for an immediate redeploy). Everything is now working as expected.
The rfc_max_forecast service was just re-designed to rely directly upon
the WRDS database rather than the API. This came with some value-added
updates and bug-fixes. With those in place, it made sense to also
re-design the Replace and Route workflow to now rely fully upon the
rfc_max_forecast service data rather than essentially recreating that
foundation itself.

This completely aligns the RnR domain with the RFC Max Forecast domain,
although the RnR domain will still be a subset of it. Previously, in the
case of flow-only forecasts, a point could be in RnR but not in RFC Max
Forecast.

A new column, rating_curve_source, is also added to the
rfc_based_5day_max_forecast service, which will indicate which rating
curve, if any, was used to produce the flows at each forecast point.
This is a super simple bug fix that changes the way the ANA Anomaly product
loops through files in the python_preprocessing lambda function.
Downloaded NWM files are now deleted as their data is loaded into a dataframe
in memory, so that ephemeral storage does not run out. RAM still seems
stable at ~7 GB used (of 10).

This was done because recent heavier weather was causing this function
to fail with a "no space left on device" error.
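
A minimal sketch of that delete-as-you-go pattern; the bucket, variable name ("streamflow"), and file handling here are placeholders, not the actual lambda code:

```python
import os
import boto3
import pandas as pd
import xarray as xr

s3 = boto3.client("s3")

def load_and_cleanup(bucket, keys, download_dir="/tmp"):
    """Download each NWM file, load its data into memory, then delete the file
    before fetching the next one, so ephemeral storage never fills up."""
    frames = []
    for key in keys:
        local_path = os.path.join(download_dir, os.path.basename(key))
        s3.download_file(bucket, key, local_path)
        with xr.open_dataset(local_path) as ds:
            frames.append(ds["streamflow"].to_dataframe())  # variable name assumed
        os.remove(local_path)  # free /tmp space before the next download
    return pd.concat(frames)
```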
Various small fixes to allow deployment, plus some hotfixes that were
applied manually.
nickchadwick-noaa merged commit cf420b6 into uat on Feb 23, 2024
1 check passed
nickchadwick-noaa deleted the release-2.1.5 branch on February 26, 2024 15:55