Merge branch 'issue934_2' into development
petersilva committed Feb 17, 2024
2 parents 5124ee7 + 1b43aac commit 0fd462c
Showing 47 changed files with 236 additions and 231 deletions.
2 changes: 1 addition & 1 deletion docs/source/Contribution/AMQPprimer.rst
@@ -84,7 +84,7 @@ Topic-based Exchanges
~~~~~~~~~~~~~~~~~~~~~

Topic-based exchanges are used exclusively. AMQP supports many other types of exchanges,
- but sr_post have the topic sent in order to support server side filtering by using topic
+ but sr3_post have the topic sent in order to support server side filtering by using topic
based filtering. In AMQP 1.0, topic-based exchanges (indeed, all exchanges) are no
longer defined. Server-side filtering allows for much fewer topic hierarchies to be used,
and for much more efficient subscriptions.
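The server-side filtering described above rests on AMQP topic-exchange wildcard matching: routing keys are dot-separated words, `*` matches exactly one word, and `#` matches zero or more. A minimal sketch of the matching rule (illustrative only; real brokers such as rabbitmq implement this internally, and the example topic strings are made up):

```python
# Sketch of AMQP topic-exchange matching: '*' = exactly one word,
# '#' = zero or more words. Illustrative, not broker code.
def topic_matches(pattern: str, topic: str) -> bool:
    def match(p, t):
        if not p:                      # pattern exhausted: topic must be too
            return not t
        if p[0] == '#':                # '#': try consuming 0..len(t) words
            return any(match(p[1:], t[i:]) for i in range(len(t) + 1))
        if not t:
            return False
        if p[0] in ('*', t[0]):        # '*' or exact word match
            return match(p[1:], t[1:])
        return False
    return match(pattern.split('.'), topic.split('.'))
```

This is why a single exchange can serve many subscribers: each queue binds with its own pattern and the broker discards non-matching messages before they cross the network.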
22 changes: 14 additions & 8 deletions docs/source/Contribution/Design.rst
@@ -125,13 +125,13 @@ as is provided by many free brokers, such as rabbitmq, often referred to as 0.8,
0.9 brokers are also likely to inter-operate well.

In AMQP, many different actors can define communication parameters. To create a clearer
- security model, sarracenia constrains that model: sr_post clients are not expected to declare
+ security model, sarracenia constrains that model: sr3_post clients are not expected to declare
Exchanges. All clients are expected to use existing exchanges which have been declared by
broker administrators. Client permissions are limited to creating queues for their own use,
using agreed upon naming schemes. Queue for client: qc_<user>.????

Topic-based exchanges are used exclusively. AMQP supports many other types of exchanges,
- but sr_post have the topic sent in order to support server side filtering by using topic
+ but sr3_post have the topic sent in order to support server side filtering by using topic
based filtering. The topics mirror the path of the files being announced, allowing
straightforward server-side filtering, to be augmented by client-side filtering on
message reception.
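Since the topics mirror the announced file's path, the mapping from path to routing key can be sketched as below. The `v03` prefix and the plain dot-joining are assumptions for illustration; sr3's real topic scheme has more details (configurable prefix, character encoding), so treat this as the idea only:

```python
# Illustrative only: derive a dot-separated AMQP topic from a file's
# relative path, so subscribers can filter server-side with '*'/'#'.
# The "v03" prefix is an assumption, not a normative encoding.
import posixpath

def topic_for(relpath: str, prefix: str = "v03") -> str:
    dirpart = posixpath.dirname(relpath)          # topic covers directories only
    words = [w for w in dirpart.split("/") if w]  # drop empty segments
    return ".".join([prefix] + words)
```

A subscriber interested in one subtree can then bind with, e.g., `v03.WXO-DD.#` and let the broker drop everything else.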
@@ -492,6 +492,12 @@ interaction with many layers, including the application. Disks are either dedicated
or a cluster file system is to be used. The application is expected to deal with those two
cases.

+ Most of the cluster management is taken care of by the sr3_tools project:
+
+ https://github.com/MetPX/sr3_tools
+
+ A review of that project, which manages deployments regardless of topology, would be helpful.

Some document short-hand:

Bunny
@@ -610,25 +616,25 @@ Broker clustering is considered mature technology, and therefore relatively trusted.
DD: Data Dissemination Configuration (AKA: Data Mart)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- The sr deployment configuration is more of an end-point configuration. Each node is expected to
+ The sr3 deployment configuration is more of an end-point configuration. Each node is expected to
have a complete copy of all the data downloaded by all the nodes. Giving a unified view makes
it much more compatible with a variety of access methods, such as a file browser (over http,
or sftp) rather than being limited to AMQP notification messages. This is the type of view presented by
dd.weather.gc.ca.

Given this view, all files must be fully reassembled on receipt, prior to announcing downstream
- availability. files may have been fragmented for transfer across intervening pumps.
+ availability. Files may have been fragmented for transfer across intervening pumps.

There are multiple options for achieving this end user visible effect, each with tradeoffs.
In all cases, there is a load balancer in front of the nodes which distributes incoming
connection requests to a node for processing.

- multiple server nodes. Each standalone.

-   - sr - load balancer, just re-directs to a sr node?
+   - sr3 - load balancer, just re-directs to a sr3 node?
    dd1,dd2,

-   broker on sr node has connection thereafter.
+   broker on sr3 node has connection thereafter.


Independent DD
30 changes: 15 additions & 15 deletions docs/source/Contribution/Development.rst
@@ -638,7 +638,7 @@ Install a minimal localhost broker and configure rabbitmq test users::
sudo wget http://localhost:15672/cli/rabbitmqadmin
sudo chmod 755 rabbitmqadmin

-     sr --users declare
+     sr3 --users declare

.. Note::

@@ -766,7 +766,7 @@ and defines some fixed test clients that will be used during self-tests::
Starting flow_post on: /home/peter/sarra_devdocroot, saving pid in .flowpostpid
Starting up all components (sr start)...
done.
- OK: sr start was successful
+ OK: sr3 start was successful
Overall PASSED 4/4 checks passed!
blacklab%

@@ -814,7 +814,7 @@ Then check how it went with flow_check.sh::
test 4 success: max shovel (1022) and subscriber t_f30 (1022) should have about the same number of items
test 5 success: count of truncated headers (1022) and subscribed messages (1022) should have about the same number of items
test 6 success: count of downloads by subscribe t_f30 (1022) and messages received (1022) should be about the same
- test 7 success: downloads by subscribe t_f30 (1022) and files posted by sr_watch (1022) should be about the same
+ test 7 success: downloads by subscribe t_f30 (1022) and files posted by watch (1022) should be about the same
test 8 success: posted by watch(1022) and sent by sr_sender (1022) should be about the same
test 9 success: 1022 of 1022: files sent with identical content to those downloaded by subscribe
test 10 success: 1022 of 1022: poll test1_f62 and subscribe q_f71 run together. Should have equal results.
@@ -835,7 +835,7 @@ thorough, it is good to know the flows are working.

Note that the *fclean* subscriber looks at files and keeps them around long enough for them to go through all the other
tests. It does this by waiting a reasonable amount of time (45 seconds, the last time checked), then it compares the files
- that have been posted by sr_watch to the files created by downloading from it. As the *sample now* count proceeds,
+ that have been posted by watch to the files created by downloading from it. As the *sample now* count proceeds,
it prints "OK" if the files downloaded are identical to the ones posted by sr_watch. The addition of fclean and
the corresponding cfclean for the cflow_test, are broken. The default setup which uses *fclean* and *cfclean* ensures
that only a few minutes worth of disk space is used at a given time, and allows for much longer tests.
@@ -877,9 +877,9 @@ between each run of the flow test::
2018-02-10 14:17:34,353 [INFO] info: report option not implemented, ignored.
2018-02-10 09:17:34,837 [INFO] sr_poll f62 cleanup
2018-02-10 09:17:34,845 [INFO] deleting exchange xs_tsource_poll (tsource@localhost)
- 2018-02-10 09:17:35,115 [INFO] sr_post shim_f63 cleanup
+ 2018-02-10 09:17:35,115 [INFO] sr3_post shim_f63 cleanup
  2018-02-10 09:17:35,122 [INFO] deleting exchange xs_tsource_shim (tsource@localhost)
- 2018-02-10 09:17:35,394 [INFO] sr_post test2_f61 cleanup
+ 2018-02-10 09:17:35,394 [INFO] sr3_post test2_f61 cleanup
2018-02-10 09:17:35,402 [INFO] deleting exchange xs_tsource_post (tsource@localhost)
2018-02-10 09:17:35,659 [INFO] sr_report tsarra_f20 cleanup
2018-02-10 09:17:35,659 [INFO] AMQP broker(localhost) user(tfeed) vhost(/)
@@ -941,7 +941,7 @@ between each run of the flow test::
2018-02-10 09:17:39,927 [INFO] deleting queue q_tsource.sr_subscribe.u_sftp_f60.81353341.03950190 (tsource@localhost)
2018-02-10 09:17:40,196 [WARNING] option url deprecated please use post_base_url
2018-02-10 09:17:40,196 [WARNING] use post_broker to set broker
- 2018-02-10 09:17:40,197 [INFO] sr_watch f40 cleanup
+ 2018-02-10 09:17:40,197 [INFO] watch f40 cleanup
2018-02-10 09:17:40,207 [INFO] deleting exchange xs_tsource (tsource@localhost)
2018-02-10 09:17:40,471 [INFO] sr_winnow t00_f10 cleanup
2018-02-10 09:17:40,471 [INFO] AMQP broker(localhost) user(tfeed) vhost(/)
@@ -1043,7 +1043,7 @@ While it is running, one can run flow_check.sh at any time::
test  4 success: max shovel (100008) and subscriber t_f30 (99953) should have about the same number of items
test  5 success: count of truncated headers (100008) and subscribed messages (100008) should have about the same number of items
test  6 success: count of downloads by subscribe t_f30 (99953) and messages received (100008) should be about the same
- test 7 success: same downloads by subscribe t_f30 (199906) and files posted (add+remove) by sr_watch (199620) should be about the same
+ test 7 success: same downloads by subscribe t_f30 (199906) and files posted (add+remove) by watch (199620) should be about the same
test  8 success: posted by watch(199620) and subscribed cp_f60 (99966) should be about half as many
test  9 success: posted by watch(199620) and sent by sr_sender (199549) should be about the same
test 10 success: 0 messages received that we don't know what happened.
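The "should be about the same" wording in these checks is a tolerance comparison between two counters; a minimal sketch of the idea (the 10% default is an assumption for illustration, not the flow test's actual threshold):

```python
# Sketch of the flow test's "about the same" style of check: two
# counters pass when they differ by at most a tolerance fraction.
# The 10% default is an assumption, not the real threshold.
def about_same(a: int, b: int, tolerance: float = 0.10) -> bool:
    return abs(a - b) <= tolerance * max(a, b, 1)

def about_half(a: int, b: int, tolerance: float = 0.10) -> bool:
    # for checks like "subscribed ... should be about half as many"
    return about_same(2 * a, b, tolerance)
```

Tolerances are needed because the components run concurrently: a snapshot of the counters is taken while messages are still in flight, so exact equality would produce spurious failures.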
@@ -1092,14 +1092,14 @@ Sometimes flow tests (especially for large numbers) get stuck because of problem
To recover from this state without discarding the results of a long test, do::

^C to interrupt the flow_check.sh 100000
- blacklab% sr stop
+ blacklab% sr3 stop
blacklab% cd ~/.cache/sarra
blacklab% ls */*/*retry*
shovel/pclean_f90/sr_shovel_pclean_f90_0001.retry shovel/pclean_f92/sr_shovel_pclean_f92_0001.retry subscribe/t_f30/sr_subscribe_t_f30_0002.retry.new
shovel/pclean_f91/sr_shovel_pclean_f91_0001.retry shovel/pclean_f92/sr_shovel_pclean_f92_0001.retry.state
shovel/pclean_f91/sr_shovel_pclean_f91_0001.retry.state subscribe/q_f71/sr_subscribe_q_f71_0004.retry.new
blacklab% rm */*/*retry*
- blacklab% sr start
+ blacklab% sr3 start
blacklab%
blacklab% ./flow_check.sh 100000
Sufficient!
@@ -1129,9 +1129,9 @@ To recover from this state without discarding the results of a long test, do::
test 4 success: sr_subscribe (98068) should have the same number of
items as sarra (98075)
| watch routing |
- test 5 success: sr_watch (397354) should be 4 times subscribe t_f30 (98068)
+ test 5 success: watch (397354) should be 4 times subscribe t_f30 (98068)
  test 6 success: sr_sender (392737) should have about the same number
- of items as sr_watch (397354)
+ of items as watch (397354)
test 7 success: sr_subscribe u_sftp_f60 (361172) should have the same
number of items as sr_sender (392737)
test 8 success: sr_subscribe cp_f61 (361172) should have the same
@@ -1142,11 +1142,11 @@ To recover from this state without discarding the results of a long test, do::
test 10 success: sr_subscribe q_f71 (195406) should have about the
same number of items as sr_poll test1_f62(195408)
| flow_post routing |
- test 11 success: sr_post test2_f61 (193541) should have half the same
+ test 11 success: sr3_post test2_f61 (193541) should have half the same
  number of items of sr_sender(196368)
  test 12 success: sr_subscribe ftp_f70 (193541) should have about the
- same number of items as sr_post test2_f61(193541)
- test 13 success: sr_post test2_f61 (193541) should have about the same
+ same number of items as sr3_post test2_f61(193541)
+ test 13 success: sr3_post test2_f61 (193541) should have about the same
  number of items as shim_f63 195055
| py infos routing |
test 14 success: sr_shovel pclean_f90 (97019) should have the same
4 changes: 2 additions & 2 deletions docs/source/Contribution/on_part_assembly.rst
@@ -6,7 +6,7 @@ File Re-assembling
Components
----------

- **sr_watch:** You can use sr_watch to watch a directory for incoming partition files (.Part) from sr_subscribe or sr_sender, both have the ability to send a file in partitions. In the config file for sr_watch the important parameters to include are:
+ **sr_watch:** You can use sr3_watch to watch a directory for incoming partition files (.Part) from sr_subscribe or sr_sender, both have the ability to send a file in partitions. In the config file for sr3_watch the important parameters to include are:

- path <path of directory to watch>
- on_part /usr/lib/python3/dist-packages/sarra/plugins/part_file_assemble.py
@@ -45,7 +45,7 @@ After being triggered by a downloaded part file:
Testing
-------

- Create an sr_watch config file according to the template above
+ Create an sr3_watch config file according to the template above
Start the process by typing the following command: ```sr_watch foreground path/to/config_file.cfg```

Then create a subscriber config file and include ```inplace off``` so the file will be downloaded in parts
16 changes: 8 additions & 8 deletions docs/source/Contribution/v03.rst
@@ -487,7 +487,7 @@ With shovel and winnow replaced by new implementations, it passes
the dynamic flow test, including the Retry module ported to sr3, and
a number of v2 modules used as-is.

- Completed an initial version of the sr_post component now (in sr3: flowcb.gather.file.File)
+ Completed an initial version of the sr3_post component now (in sr3: flowcb.gather.file.File)
Now working on sr_poll, which will take a while because it involves refactoring: sr_file, sr_http,
sr_ftp, sr_sftp into the transfer module

@@ -705,7 +705,7 @@ Probably need to be settled before having anyone else dive in.
likely equivalent to async, and multi-gather.

* think about API by sub-classing flow... and having it auto-integrate
-   with sr entry point... hmm... likely look at this when updating
+   with sr3 entry point... hmm... likely look at this when updating
Programmer's Guide.

* more worklists? rename failed -> retry or deferred. Add a new failed
@@ -726,7 +726,7 @@ FIXME are things left to the side that need to be seen to.


* **RELEASE BLOCKER** hairy. #403
- sr_watch does not batch things. It just dumps an entire tree.
+ watch does not batch things. It just dumps an entire tree.
This will need to be re-worked before release into an iterator style approach.
so if you start in a tree with a million files, it will scan the entire million
and present them as a single in memory worklist. This will have performance
@@ -737,7 +737,7 @@ FIXME are things left to the side that need to be seen to.
impact and delay to producing the first file is still there, but at least
returns one batch at a time.

- * **RELEASE BLOCKER** logs of sr_poll and sr_watch tend to get humungous way too quickly. #389
+ * **RELEASE BLOCKER** logs of sr_poll and watch tend to get humungous way too quickly. #389

* try out jsonfile for building notification messages to post. can build json incrementally, #402
so you do not need to delete the _deleteOnPost elements (can just skip over them)
@@ -784,7 +784,7 @@ Name the package metpx-sarra3 and have the python class directory be sarra3 (ins
retry files have different formats? validate. ) So one can copy configurations from old to
new and run both versions in parallel. The central entry point would be sr3 (rather than
sr), and to avoid confusion the other entry points (sr_subscribe etc...) would be omitted
- so that v2 code would work unchanged. Might require some tweaks to have the sr classes
+ so that v2 code would work unchanged. Might require some tweaks to have the sr3 classes
ignore instances from the other versions.

This is similar to python2 to python3 transition. Allows deployment of sr3 without having
@@ -889,11 +889,11 @@ Features

* properties/options for classes are now hierarchical, so can set debug to specific classes within app.

- * sr ability to select multiple components and configurations to operate on.
+ * sr3 ability to select multiple components and configurations to operate on.

- * sr list examples is now used to display examples separate from the installed ones.
+ * sr3 list examples is now used to display examples separate from the installed ones.

- * sr show is now used to display the parsed configuration.
+ * sr3 show is now used to display the parsed configuration.

* notification messages are acknowledged more quickly, should help with throughput.

2 changes: 1 addition & 1 deletion docs/source/Explanation/CommandLineGuide.rst
@@ -1138,7 +1138,7 @@ The pollUrl option specifies what is needed to connect to the remote server

The *pollUrl* should be set with the minimum required information...
**sr_poll** uses *pollUrl* setting not only when polling, but also
- in the sr_post notification messages produced.
+ in the sr3_post notification messages produced.

For example, the user can set :

2 changes: 1 addition & 1 deletion docs/source/Explanation/DeploymentConsiderations.rst
@@ -261,7 +261,7 @@ Subscribers


Post, Notice, Notification, Advertisement, Announcement
- These are AMQP messages build by sr_post, sr_poll, or sr_watch to let users
+ These are AMQP messages build by sr_post, sr_poll, or sr3_watch to let users
know that a particular file is ready. The format of these AMQP messages is
described by the `sr_post(7) <../Reference/sr3.1.html#post>`_ manual page. All of these
words are used interchangeably. Advertisements at each step preserve the