Posting timeseries data directly to a WRES web‐service as inputs for a WRES job
- Which steps of Instructions for Programmatic Interaction with a WRES web-service are modified?
- Using fully functional bash and Python scripts, provided in WRES Scripts Usage Guide, is recommended
- Prepare the declaration by omitting the sources for data to be sent directly
- Post the declaration to https://.../job as usual, but add the form parameter postInput=true
- Post to .../job/{jobId}/input with form parameter postInputDone=true
- Monitor the job as usual, get output as usual, clean up as usual
- Is it possible to post the data so that it is kept after the evaluation completes successfully?
- Job state diagram for status shown at .../job/{jobId}/status
A WRES web-service instance supports posting timeseries (evaluation inputs) directly to the service. This capability is only available when executing an evaluation from the command-line or programmatically (e.g., via a script or other software), as described in Instructions for Programmatic Interaction with a WRES web-service.
NOTE: In all of the examples below, the WRES web-service URL is omitted and replaced with "...". To apply the example to your use case, replace the "..." in the examples with the appropriate web-service URL.
Which steps of Instructions for Programmatic Interaction with a WRES web-service are modified?
Steps 1 and 3 of those instructions are modified as follows, with details provided below:
In Step 1, the sources tags for data to be posted directly should be omitted from the declaration, as explained below. Otherwise, this step is unchanged.
3. “POST” the evaluation project declaration to the web-service using the web-service CA .pem file and record the server’s response.
When POSTing the declaration, include postInput=true as another form parameter; see the example below. Then post the evaluation data to the observed, predicted, or baseline inputs. Finally, when done posting data, signal to the web-service that all data has been posted and the evaluation can begin by posting another form parameter, postInputDone=true; again, see the example below. All steps are described in detail below.
After posting the signal that all timeseries have been posted, the job will be processed and should be monitored, output obtained, and cleaned up per Steps 4 - 8 in Instructions for Programmatic Interaction with a WRES web-service.
All posted data will be removed upon successful completion of the evaluation unless the keepInput flag was set, as described below.
Using fully functional bash and Python scripts, provided in WRES Scripts Usage Guide, is recommended
The scripts are described with examples in the WRES Scripts Usage Guide wiki. It is recommended that you use these scripts when interacting with a WRES web-service programmatically. The options used to post data are -l, -p, and -b. However, if you need to make the HTTP requests directly, perhaps because you are using another programming language, then the sections below describe how to do so.
Prepare the declaration by omitting the sources for data to be sent directly
For example, your declaration could look like this if you wish to post the observed dataset and the predicted dataset directly to the web-service for this job:
observed:
  variable: 00060
predicted:
  variable: streamflow
A declaration may combine declared sources with posted data. Data posted directly to a WRES web-service will be added to the declaration, one sources entry per post. Sources already present will remain and be processed as usual.
Post the declaration to https://.../job as usual, but add the form parameter postInput=true
Use the projectConfig form parameter for the project declaration, as usual, but add the form parameter postInput with the value true to tell the web-service to wait for timeseries data before sending a job to a worker to be executed. For example, note the use of &postInput=true in the following curl command, where the declaration file has the name test2_config.yml:
curl -v --cacert [web-service CA .pem] -d "projectConfig=$(cat test2_config.yml)&postInput=true" https://.../job/
Check the response code for 200 or 201, as usual. If successful (200 or 201), then the job will have the status AWAITING_POSTS_OF_DATA. To verify, you can navigate to the job status URL, .../job/{jobId}/status, under the job location URL (see Step 5 of Instructions for Programmatic Interaction with a WRES web-service).
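For callers that are not using curl, the form body for this request can be assembled with the Python standard library. The following is a minimal sketch, not part of the WRES scripts: the function name build_declaration_form is hypothetical, and the sketch only builds the URL-encoded body without sending it. Sending the body (for example, with the requests library and the web-service CA .pem file) is left to the caller.

```python
from urllib.parse import urlencode


def build_declaration_form(declaration: str, post_input: bool = True) -> bytes:
    """Build the application/x-www-form-urlencoded body for posting a declaration.

    `declaration` is the text of the project declaration (for example, the
    contents of test2_config.yml). When `post_input` is True, postInput=true
    is added so that the service waits for timeseries data.
    """
    fields = {"projectConfig": declaration}
    if post_input:
        fields["postInput"] = "true"
    # urlencode percent-encodes characters such as '&' and newlines that
    # would otherwise corrupt the form body.
    return urlencode(fields).encode("utf-8")


# Example declaration text (hypothetical content, for illustration only).
body = build_declaration_form("observed:\n  variable: 00060\n")
print(body.decode("utf-8"))
```

Encoding the declaration this way avoids problems when the declaration text itself contains characters such as & or =.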
While the job status is AWAITING_POSTS_OF_DATA, the web-service will successfully accept posts of data for that particular job.
Post data to .../job/{jobId}/input/left, .../job/{jobId}/input/right, and/or .../job/{jobId}/input/baseline
It is highly recommended that data files be gzipped prior to posting, to save bandwidth and time during the posting process.
For each timeseries document (or blob), post to the corresponding dataset under the job’s input URL as multipart/form-data, with the data in the form variable named data.
- For observed data, use .../job/{jobId}/input/left
- For predicted data, use .../job/{jobId}/input/right
- For (optional) baseline data, use .../job/{jobId}/input/baseline
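The mapping in the list above can be captured in a small helper. This is an illustrative sketch only: the names SIDE_FOR_DATASET and input_url are hypothetical, and job_location is assumed to be the job location URL returned when the declaration was posted.

```python
# Mapping from the dataset name in the declaration to the input "side"
# used in the URL, as listed in this guide.
SIDE_FOR_DATASET = {
    "observed": "left",
    "predicted": "right",
    "baseline": "baseline",
}


def input_url(job_location: str, dataset: str) -> str:
    """Return the URL to which data for `dataset` should be posted."""
    try:
        side = SIDE_FOR_DATASET[dataset]
    except KeyError:
        raise ValueError(f"Unknown dataset: {dataset!r}") from None
    # Tolerate a trailing slash on the job location.
    return job_location.rstrip("/") + "/input/" + side
```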
For example, the following curl commands will post the file test2_data/DRRC2QINE.xml.gz (gzipped XML) for the observed (or left) source and test2_data/right_data.tgz (a gzipped tarball) for the predicted (or right) source (note that the -F option posts the data as multipart/form-data):
curl -v --cacert [web-service CA .pem] -F data=@test2_data/DRRC2QINE.xml.gz https://.../job/{jobId}/input/left
curl -v --cacert [web-service CA .pem] -F data=@test2_data/right_data.tgz https://.../job/{jobId}/input/right
To post data using Python and the requests library, be sure to post the data using the files parameter to ensure it is posted as multipart/form-data. For example:
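import requests

# job_location (the job URL) and wres_ca_file (path to the CA .pem file) are defined earlier.
input_file = "input_data.csv.gz"
with open(input_file, "rb") as right:
    data_post_response = requests.post(
        url=job_location + "/input/right",
        verify=wres_ca_file,
        files={"data": right},
    )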
Any simple format supported by the core WRES, as described in Instructions for Using WRES#Available-Evaluation-Data, may be posted. For the purpose of this guide, a simple format has one or more timeseries per document (or blob).
As stated above, any posted data should be gzipped prior to sending. The WRES is able to read individually gzipped files and gzipped tarballs when ingesting data. Using gzip will reduce the amount of data sent "over the wire", sometimes dramatically if it is plain ASCII text, such as XML, and that in turn could reduce the time spent waiting for data to be posted.
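If a data file is not already compressed, it can be gzipped with the Python standard library before posting. The following is a minimal sketch: the function name gzip_file and the sample content are hypothetical and only illustrate the size reduction for repetitive plain-text data.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def gzip_file(source: Path) -> Path:
    """Write a gzipped copy of `source` next to it and return the new path."""
    target = source.with_name(source.name + ".gz")
    with open(source, "rb") as src, gzip.open(target, "wb") as dst:
        # Stream the content so that large files are not read into memory.
        shutil.copyfileobj(src, dst)
    return target


# Demonstration with a small, repetitive XML-like file (made-up content).
sample_content = b"<value>1.0</value>\n" * 1000
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.xml"
    sample.write_bytes(sample_content)
    compressed = gzip_file(sample)
    original_size = sample.stat().st_size
    compressed_size = compressed.stat().st_size
    restored = gzip.decompress(compressed.read_bytes())

print(f"original: {original_size} bytes, gzipped: {compressed_size} bytes")
```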
The order in which data is posted does not matter. You can post in any order, and multiple files can be posted to each input “side”. For example, you can post document A to the right, followed by document B to the left, followed by document C to the right, followed by document D to the left, etc.
Data may also be posted concurrently, but this should be limited to no more than 3 concurrent posts.
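One way to respect the limit of 3 concurrent posts is a fixed-size thread pool. The following is a sketch under stated assumptions: post_all and post_one are hypothetical names, and post_one stands for whatever caller-supplied function performs a single HTTP post (for example, with requests) and returns the response status code.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Tuple

MAX_CONCURRENT_POSTS = 3  # limit stated in this guide


def post_all(
    items: Iterable[Tuple[str, str]],
    post_one: Callable[[str, str], int],
) -> List[int]:
    """Post each (side, path) pair using at most three worker threads.

    Returns the status codes in the same order as `items`.
    """
    work = list(items)
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_POSTS) as pool:
        # Executor.map() returns results in input order, even though the
        # posts themselves may finish in any order.
        return list(pool.map(lambda item: post_one(*item), work))
```

Because the order of posts does not matter to the service, the pool is free to interleave left, right, and baseline documents.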
Post to .../job/{jobId}/input with form parameter postInputDone=true
Post the parameter postInputDone, set to the value true, to the job's input URL, .../job/{jobId}/input, using the MIME/Content-Type application/x-www-form-urlencoded. This tells the web-service that you have no more data to post to this job. For example:
curl -v --cacert [web-service CA .pem] -d postInputDone=true https://.../job/{jobId}/input
The response from the service will include the complete declaration YAML prepared for the evaluation project, with the posted data included. An XML declaration will be migrated to YAML as part of this process.
NOTE: In order to add your posted data sources to the declaration, the declaration must be validated and parsed. If the declaration fails to validate, then the evaluation will fail, the status at .../job/{jobId}/status will be FAILED_BEFORE_IN_QUEUE, and the validation failure explanation will be returned as the HTTP response for your request (e.g., curl will output that response to the terminal for the user to read).
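The postInputDone request described above can also be built with the Python standard library. The following is a minimal sketch: the function name build_done_request is hypothetical, and the function only constructs the request without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_done_request(job_location: str) -> Request:
    """Build (but do not send) the POST signalling that all data has been posted."""
    body = urlencode({"postInputDone": "true"}).encode("ascii")
    return Request(
        url=job_location.rstrip("/") + "/input",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

To send it, the request could be passed to urllib.request.urlopen together with an ssl context created from the web-service CA .pem file (for example, via ssl.create_default_context(cafile=...)). The response body should then be read and kept, since it carries either the completed declaration or the validation failure explanation.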
Monitor the job as usual, get output as usual, clean up as usual
Proceed with the usual workflow documented at Instructions for Programmatic Interaction with a WRES web-service beginning with Step 6, “Monitor…”
If something went wrong in the previous step, and the job does not get worked on by a WRES worker instance, then it may have the state FAILED_BEFORE_IN_QUEUE, so you should monitor for this state in addition to the COMPLETED... states shown at .../job/{jobId}/status.
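A monitoring loop that treats FAILED_BEFORE_IN_QUEUE as a final state, alongside the COMPLETED... states, might look like the sketch below. The names wait_for_terminal_status and get_status are hypothetical. get_status stands for a caller-supplied function that fetches .../job/{jobId}/status and returns the status name as text. How the status is extracted from the HTTP response is an assumption left to that function.

```python
import time
from typing import Callable


def wait_for_terminal_status(
    get_status: Callable[[], str],
    poll_seconds: float = 5.0,
    max_polls: int = 1000,
) -> str:
    """Poll until the job reaches a state after which no further work occurs."""
    for _ in range(max_polls):
        status = get_status().strip()
        # A job that never reached a worker ends in FAILED_BEFORE_IN_QUEUE;
        # a job that was worked on ends in one of the COMPLETED... states.
        if status == "FAILED_BEFORE_IN_QUEUE" or status.startswith("COMPLETED"):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("Job did not reach a terminal status in time")
```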
Is it possible to post the data so that it is kept after the evaluation completes successfully?
Yes. To do so, add the form parameter keepInput with the value true to the request when posting the declaration. The web-service will then NOT remove any of the posted data locally after the evaluation completes. The user may then copy the sources added for the posted data from the returned declaration in order to reuse them later. For example:
curl -v --cacert [web-service CA .pem] -d "projectConfig=$(cat test2_config.yml)&postInput=true&keepInput=true" https://.../job
The added sources will have file names that are randomly generated and stored in a local directory. For example (where the path is partially omitted and replaced with "..."):
sources: file:///.../input_data/815463329142625139_1556163733477951741
Simply copying that from the declaration returned in the response to a new evaluation declaration will allow that data to be reused in that evaluation.
Data will be kept so long as the web-service admin does not remove it OR the web-service does not remove it due to the disk space filling up. If disk space is filling up, the web-service will remove posted data to save space, starting with the oldest.
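When keepInput is used, the file URIs of the kept data can be pulled out of the returned declaration text for later reuse. The sketch below is only an illustration: the function name kept_source_uris is hypothetical, and it assumes that the added sources appear in the returned text as file:// URIs, as in the example above. The exact layout of the returned YAML is not specified in this guide.

```python
import re
from typing import List

# Matches a file:// URI up to the next whitespace, quote, comma or bracket.
FILE_URI = re.compile(r"file://[^\s\"',\]]+")


def kept_source_uris(returned_declaration: str) -> List[str]:
    """Return the file:// URIs found in the declaration text, in order."""
    return FILE_URI.findall(returned_declaration)
```

The URIs found this way can then be copied into the sources of a new declaration, as described above.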
Job state diagram for status shown at .../job/{jobId}/status
┌───────────┐
│           │
│  CREATED  │
│           │
└───┬───┬───┘
    │   └────────────────────┐
    │                        │
    │                        ▼
    │           ┌──────────────────────────┐
    │           │                          │
    │           │  AWAITING_POSTS_OF_DATA  │
    │           │                          │
    │           └────────┬───────┬─────────┘
    │                    │       └────────────────────┐
    │                    │                            │
    │                    ▼                            │
    │       ┌─────────────────────────┐               │
    │       │                         │               │
    │       │  NO_MORE_POSTS_OF_DATA  │               │
    │       │                         │               │
    │       └────────┬──────┬─────────┘               │
    │     ┌──────────┘      └─────────────┐           │
    │     │                               │           │
    ▼     ▼                               ▼           ▼
┌────────────┐                    ┌──────────────────────────┐
│            │                    │                          │
│  IN_QUEUE  │                    │  FAILED_BEFORE_IN_QUEUE  │
│            │                    │                          │
└──────┬─────┘                    └──────────────────────────┘
       │
       │
       ▼
┌───────────────┐
│               │
│  IN_PROGRESS  │
│               │
└─────┬───┬─────┘
      │   └──────────────────────────────────────┐
      │                                          │
      ▼                                          ▼
┌──────────────────────────────┐  ┌──────────────────────────────┐
│                              │  │                              │
│  COMPLETED_REPORTED_SUCCESS  │  │  COMPLETED_REPORTED_FAILURE  │
│                              │  │                              │
└──────────────────────────────┘  └──────────────────────────────┘
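For client code that needs to reason about job states, the transitions in the diagram above can be written as a lookup table. This is a sketch based on a reading of the diagram, not code from the WRES itself: the names TRANSITIONS, can_transition and is_terminal are hypothetical.

```python
# Allowed next states for each job state, as read from the diagram.
TRANSITIONS = {
    "CREATED": {"AWAITING_POSTS_OF_DATA", "IN_QUEUE"},
    "AWAITING_POSTS_OF_DATA": {"NO_MORE_POSTS_OF_DATA", "FAILED_BEFORE_IN_QUEUE"},
    "NO_MORE_POSTS_OF_DATA": {"IN_QUEUE", "FAILED_BEFORE_IN_QUEUE"},
    "IN_QUEUE": {"IN_PROGRESS"},
    "IN_PROGRESS": {"COMPLETED_REPORTED_SUCCESS", "COMPLETED_REPORTED_FAILURE"},
    "FAILED_BEFORE_IN_QUEUE": set(),
    "COMPLETED_REPORTED_SUCCESS": set(),
    "COMPLETED_REPORTED_FAILURE": set(),
}


def can_transition(current: str, nxt: str) -> bool:
    """Return True if the diagram shows an arrow from `current` to `nxt`."""
    return nxt in TRANSITIONS.get(current, set())


def is_terminal(state: str) -> bool:
    """Return True if the diagram shows no arrows leaving `state`."""
    return state in TRANSITIONS and not TRANSITIONS[state]
```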