As a user, I want the example scripts (bash and python) to demonstrate use of the data-direct/streaming capability #274
Comments
Original Redmine Comment I'm going to try to work up an example this morning, since I've already completed the meeting agenda. I'm going to work from @/home/ISED/wres/wresTestData/issue95993@ to develop and test the script, starting from the example that is in the repository. I'll then put my changes back into the repo once I'm done, but will leave the directory in place as an area to test it. I'm also going to modify the script to use the proxy URL, since that is where we are going to direct our users, eventually. Thanks, Hank |
Original Redmine Comment I note that I am going for this from #94510-148:
If implemented in a single script, this will complicate it significantly. Specifically, the .sh will need to include an if-clause that either loads the declaration from an internal statement or from a file. But more, the declaration will necessarily be different between the data-generated-in-memory (which I take to mean data posted directly) and data-from-file. So there are four total possibilities for specifying the declaration:
I'm going to start with 1 and 2 and see how convoluted it becomes to put two declarations in the XML. Hank |
Original Redmine Comment Jesse, When you have a few minutes, can you take a look at this script and give me your thoughts? I'm attempting to accomplish these two in one script:
The script is long, but that's primarily because of the verbose descriptions of each step. The script works when tested and it uses the proxy URL. Thanks, Hank |
Original Redmine Comment Oops... failed to provide the location of the script: @/home/ISED/wres/wresTestData/issue95993/wres_http_example.sh@ To be clear, this is not a high priority, so no rush. Thanks, Hank |
Original Redmine Comment Taking a look |
Original Redmine Comment For the multiple "right" dataset posts, I think it could be a list and iterated over. I would pick the same order for the conditionals regarding "posting data", e.g. always have @if [ $POST_DATA_DIRECTLY = "true" ]@ first and the @else@ second. What follows is a bit of a bigger departure, but I would eventually like to see the data in a variable or generated by the calling script on the fly to emphasize that files are not needed. This would be easier in python than bash. Maybe heredocs would be fine for that too. Also a bigger departure would be to have the conditional "use files" versus "use direct data", such that you could see the example of using the heredoc directly versus having it read the data, for both the project declaration and data files. |
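A minimal sketch of the suggestion above, showing the "post data directly" branch first and the file branch second, with the direct data held in a heredoc. Only the @POST_DATA_DIRECTLY@ variable comes from the thread; the file name, the variable @data_file@, and the XML payload are placeholders, not the contents of the real example script.

```bash
#!/usr/bin/env bash
# Sketch only: POST_DATA_DIRECTLY is named in the thread; everything else is hypothetical.
POST_DATA_DIRECTLY="${POST_DATA_DIRECTLY:-true}"
data_file="${DATA_FILE:-observations.xml}"   # hypothetical file name

if [ "$POST_DATA_DIRECTLY" = "true" ]
then
    # Quoting 'EOF' stops the shell from expanding anything inside the heredoc.
    observation_data=$(cat <<'EOF'
<?xml version="1.0"?>
<example>placeholder, not real WRES data</example>
EOF
)
else
    # Same variable, but filled from a file on disk.
    observation_data=$(cat "$data_file")
fi

printf '%s\n' "$observation_data"
```

Either branch leaves the payload in the same variable, so the later posting step does not need to know where the data came from.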
Original Redmine Comment Jesse wrote:
I was wondering about that and whether it's clearer to post them explicitly one at a time. Small change, regardless.
Good point. Thanks for catching.
Again, good point. When I move onto Python (which will be an adventure given my limited experience), I'll see if I can work up that example. I can also add a comment in the script mentioning that the data need not be posted from a file.
To make sure I'm understanding what you mean by heredoc in this context, is the script treatment of the declaration, where it is embedded in the script, considered a heredoc? From what I'm seeing when Googling the terminology, it is, but checking to make sure we have the same understanding. Anyway, including the data, itself, in the script would make it even longer, which might be fine given it's already pretty long. Thanks, Hank |
Original Redmine Comment Hank wrote:
Changes have been made to the script to address the above. As for the other comments, I'm waiting to make sure I understand what is meant by heredocs in this context. I believe you are saying you want the content of the data files included directly in the script and referenced via a variable, following how the declaration is currently handled. This will make the script longer and uglier, but it will also make it completely self contained and not tied to external files (hopefully making it clear that data can be posted directly to the service without actually creating it as a local file, first). Thanks, Hank |
Original Redmine Comment I am pretty sure that is the intention, yes. I would just make the example use a tiny amount of data, say 2 pairs, or break the data generation into a separate data generating function - it could be fake data. |
Original Redmine Comment It currently is fake data. The files are ones we use in system testing: 1985043012_DRRC2FAKE1_forecast.xml I'd like to continue the theme of using data that is vetted, but that's 277 lines of data. I could shrink it, I guess. Another possibility would be to include the file contents at the bottom of the script, in an appendix so to speak, so it's not in the way. But I'm not sure if @bash@ allows for that. I'll look it up. Hank |
Original Redmine Comment I see, then I would probably just create a data generating function that creates that data, else choose a different example (agree that it is nice to re-use data, though). Anyway, I probably wouldn't inline all that crap into the function that does the api interaction. I don't think there's any way to return a string from a bash function, only an integer, so that creates some ugliness with setting a globally-scoped variable or something. Will be much cleaner in python. |
Original Redmine Comment James wrote:
I would just call a function to set a variable, just as is done with the current declaration, and then refer to that variable in the @curl@ calls. I have no problem with that approach if I can get the data out of the way, pushed to the bottom of the script. However, I think scripts are processed sequentially, top-to-bottom, so that isn't possible. Again, I'll do some internet searching to confirm, just haven't had time yet.
Agreed. Hank |
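On returning strings from a function: @return@ in bash only sets an integer exit status, but anything the function writes to stdout can be captured with command substitution, which avoids a global set inside the function. The sketch below assumes a hypothetical @generate_observations@ helper and a placeholder payload; it is not taken from the example script.

```bash
#!/usr/bin/env bash
# Hypothetical helper: writes a tiny placeholder payload to stdout.
generate_observations() {
    printf '%s\n' '<observations>' '  <value>1.0</value>' '</observations>'
}

# Command substitution captures everything the function prints.
observation_data=$(generate_observations)

printf '%s\n' "$observation_data"
```

Setting a variable inside the function, as proposed above, works just as well; capturing stdout simply keeps the data generator free of side effects.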
Original Redmine Comment Hmmm, perhaps I am not following, but don't you just want a composition of functions? I don't see where ordering comes in, providing you call the composition after it is defined. @script.sh@
I mean, you couldn't put the last @main@ upfront. |
Original Redmine Comment I would expect the above to produce:
|
Original Redmine Comment
That's the trick. If my goal is to move the variables holding the data out of the way, then I need those variables defined in a function at the bottom of the script. I would then refer to that function at the top of the script. So, indeed, I would need to put the main upfront or my goal (of pushing the clutter to the end) is not achieved. I'm just worried a viewer of the example will open it, see the clutter of tons of data, be annoyed, not want to scroll down to find the start of the actual example, and then close it. Perhaps that worry is unfounded. Hank |
Original Redmine Comment Oh, I see what you are saying. Put the interesting stuff in the main, then define the data, then call main at the bottom. Got it, Hank |
Original Redmine Comment Let me edit the script to follow that design and include the data as a heredoc. Hank |
Original Redmine Comment Right, the last call that kicks off the sequence is just a detail, not really important for understanding, although you could make a note upfront, if that helps. The body of work is inside each function. With suitably named functions that describe what they are doing, I think it would work. |
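The layout agreed on above can be sketched roughly as follows: the interesting steps sit in @main@ at the top, the bulky data is defined in a function further down, and the only top-level statement is the call to @main@ on the last line. The function names and payload here are hypothetical and much shorter than the real script.

```bash
#!/usr/bin/env bash
# Layout sketch; function names and payload are hypothetical.

# The steps a reader cares about come first.
main() {
    load_data
    echo "Would post ${#observation_data} characters of data here."
}

# The bulky data lives below main, out of the reader's way.
load_data() {
    observation_data=$(cat <<'EOF'
<example>placeholder</example>
EOF
)
}

# Nothing runs until this line, by which point every function is defined.
main "$@"
```

Because bash only resolves @load_data@ when @main@ actually runs, defining it after @main@ is fine as long as the final call comes last.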
Original Redmine Comment Having a hard time structuring the @curl@ command. Before, this command would post the contents of the referenced file:
Great. Now I'm trying to post the content of a variable. I defined the variable @observation_data@ and tried this:
I know the contents of the file is within the variable, because I see the following message (note XML snippet which is the beginning of the content of the file):
I'm not sure why it's trying to open it, since I don't use '@'. I need to figure out how to post the contents of the variable to the COWRES. I've already done quite a bit of internet searching, but will continue. If anyone spots the problem, let me know. Thanks, Hank |
Original Redmine Comment Might be a single-quote/double-quote thing. Using double quotes, the @curl@ command tried to process the content of ${observation_data}, which starts with the opening XML '<'. That tells it to load data from a file; hence the reported file not found error. When I switch to single-quotes, a different error occurs that I am now investigating. Hank |
Original Redmine Comment Ah, with single-quotes, nothing is evaluated, including the content of the variable. The file, production side, looks like this:
So how to get it to process ${observation_data} without @curl@ then attempting to interpret the contents of the variable? Hmmm... Gotta love bash! Hank |
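The quoting behaviour described in the last few comments can be reproduced without @curl@ at all. In this sketch the payload is a placeholder, and the field name @data@ is an assumption rather than the name used in the real script.

```bash
#!/usr/bin/env bash
# Placeholder payload that starts with '<', like the XML in the thread.
observation_data='<?xml version="1.0"?>'

# Single quotes: the shell leaves ${observation_data} as literal text.
single='data=${observation_data}'

# Double quotes: the shell substitutes the variable's content.
double="data=${observation_data}"

printf '%s\n' "$single" "$double"
```

Double quotes are therefore required for the shell to expand the variable; the special handling of the leading '<' happens afterwards, inside @curl@'s @-F@ option, which treats a value beginning with '<' as a file to read.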
Original Redmine Comment If I include a space before the content of the variable,
the data is posted correctly, but that space causes WRES to not recognize it:
So I need the first character in the file to be '<' but in such a way that @curl@ doesn't attempt to interpret the '<'. Hank |
Original Redmine Comment Found it. Sheesh. Had to change @-F@ to @--form-string@: @post_result=$(curl -i --cacert ...@ Use of @--form-string@ prevents @curl@ from parsing the contents. I'm going to clean it up a bit and then wait for Jesse to review. I'm not sure I should post it to this ticket given some host information embedded in it. Thanks, Hank |
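A rough sketch of the fix, wrapped in a function so that nothing is sent when the snippet is sourced. The wrapper name @post_inline_data@, the field name @data@, and the URL are all hypothetical, and the certificate option from the real command is left out because its argument is not shown in this ticket.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper; the field name "data" and any URL passed in are placeholders.
post_inline_data() {
    local url=$1
    local payload=$2
    # --form-string sends the value literally, so a leading '<' or '@'
    # is not interpreted as a file reference the way it is with -F.
    curl --silent --show-error --form-string "data=${payload}" "$url"
}

# Example call, commented out so the sketch makes no network request:
# post_inline_data "https://localhost/placeholder" "$observation_data"
```

The variable is still expanded by the shell inside double quotes; @--form-string@ only changes what @curl@ does with the resulting value.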
Original Redmine Comment Jesse: Can you please review the example, again? /home/ISED/wres/wresTestData/issue95993/wres_http_example.sh I believe it satisfies what you recommended here: Jesse wrote:
However, it does not satisfy this and I don't know that I want it to:
I like having the example script be completely self-contained. I know I originally went with files, but that was because I wasn't comfortable including the data in the script. Now that I've included it at the end of the script, not the beginning, I'm more comfortable. I mention that files can be referred to, instead, but I don't think having it run from files is necessary. Thoughts? Hank |
Original Redmine Comment Thanks for helping with this example, Hank. I am taking another look. |
Original Redmine Comment I think the current leading comment could be removed or re-worked, and the original leading comment promoted to the top. The contents of the main function are important to the example so to say otherwise is confusing. Indentation. The @for@ loop doesn't need a counter, it can be @for timeseries in ${array}@ or whatever, then refer to @${timeseries}@ in the body. I agree that it's become pretty long. And I agree when things become long then it is nicer to split them up into functions. Taking that to the logical conclusion, there are probably other blocks that can be split out into functions, such that the steps have names, and then the function calls at the bottom are an outline of the steps in order. I don't know if that's necessary though. |
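For the loop suggestion, a counter-free version might look like the sketch below. The array name and its elements are hypothetical. One detail worth noting: in bash a bare @${array}@ expands to the first element only, so iterating over every element needs @"${array[@]}"@.

```bash
#!/usr/bin/env bash
# Hypothetical list of payload names; the real script's values are not shown in this ticket.
timeseries_list=("forecast_1" "forecast_2" "forecast_3")

# "${timeseries_list[@]}" expands to every element, each as its own word.
for timeseries in "${timeseries_list[@]}"
do
    echo "Posting ${timeseries}"
done
```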
Original Redmine Comment I'll make the comment change and check out options for structuring the for loop, though I don't think avoiding indexing is critical. As long as it works, is documented, and is somewhat understandable given it's in @bash@. I assume the indentation comment is a reference to the stuff in main not being indented an additional level. I still view main() as just noise, to be frank, and something I would rather not have done to avoid confusing the reader. That's why I didn't indent: I was hoping to deemphasize it. I'll go ahead and add it; just not sure it adds value for most readers. I'd rather not break everything into functions. Just enough to move the data to the end is good enough, imho. Thanks for looking it over! Getting closer, Hank |
Original Redmine Comment Changes made. Can't test it due to #96161. Hank |
Original Redmine Comment Almost forgot about this... Proposed final version of the example script can be found here: @/home/ISED/wres/wresTestData/issue95993/wres_http_example.sh@ If there are no objections, I'll push it to the repo tomorrow. I'll also use this as the example script that the RFCs use to confirm access. Note that it's written to use the proxy and a .pem file located in a cacerts directory (as is the case in the repo, I believe). I'll remove the cacerts directory when I share it with the field, and just give them instructions to have the .pem file located in the same directory as the script. Something like that. Thanks, Hank |
Original Redmine Comment Pushed in commit:28cc52f7165131d9fcb94ef7fbf6a201afd8e1ac. Checking the box. Leaving the ticket to the Backlog since there is still a Python script to update. I might take this as an opportunity to learn some Python, but not until after the training, I'm guessing, when I'll have some time to play around with it. Thanks, Hank |
Original Redmine Comment The wiki was updated to include a link to the bash example: Hank |
Original Redmine Comment Hank: Change the default hostname in the example to localhost. Hank |
Original Redmine Comment Another thing I remembered is maybe it should have only the csv2 output selected. Then the curl command can be piped to gunzip or something to display results. |
Author Name: Jesse (Jesse)
Original Redmine Issue: 95993, https://vlab.noaa.gov/redmine/issues/95993
Original Date: 2021-09-08
Original Assignee: Hank