This repository contains a set of Bash scripts designed to automate the retrieval and organization of event data from SEL-735 meters, the synchronization of SCADA data between directories, and the archival of data to a dedicated remote server.
Each of the following scripts is executed separately and has its own config file.

**`data_pipeline.sh`**

Handles fetching and organizing raw event data from SEL-735 meters via FTP:
- Connects to the meter
- Downloads new event data
- Organizes directory structure and creates metadata
- Adds checksums
- Compresses raw data into a `.zip` archive
- Generates a `.message` file to be ingested by `data-streams-das-mqtt-pub`
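
Conceptually, those steps amount to something like the sketch below. Everything in it is illustrative: the host, credentials, event ID, directory layout, and the contents of the `.message` file are hypothetical stand-ins, not the actual variables or format used by `data_pipeline.sh`.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; values and paths are hypothetical.
set -euo pipefail

METER_HOST="192.0.2.10"     # hypothetical meter address
FTP_USER="meter_user"       # hypothetical credentials
FTP_PASS="meter_password"
EVENT_ID="10042"            # hypothetical event ID
WORK_DIR="working/$EVENT_ID"

mkdir -p "$WORK_DIR" events

# 1. Download new event files from the meter over FTP
lftp -u "$FTP_USER,$FTP_PASS" \
  -e "mirror --only-newer /EVENTS/$EVENT_ID $WORK_DIR; bye" "$METER_HOST"

# 2. Add checksums for each downloaded file
( cd "$WORK_DIR" && md5sum -- * > checksums.md5 )

# 3. Compress the raw data into a .zip
zip -r "events/$EVENT_ID.zip" "$WORK_DIR"

# 4. Emit a .message file for the downstream MQTT publisher (format is a guess)
jq -n --arg id "$EVENT_ID" '{event_id: $id}' > "events/$EVENT_ID.message"
```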
**`sync-scada-data.sh`**

Synchronizes SCADA data from a source directory to a destination directory:
- Supports syncing data over a configurable number of past months
- TODO: Exclude the current day's data to avoid syncing partially written files.
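
A minimal sketch of what a month-window sync can look like, assuming GNU `date`, `rsync`, and a hypothetical `YYYY/MM` directory layout; the real script takes its source, destination, and month count from `scada_config.yml`:

```bash
#!/usr/bin/env bash
# Illustrative sketch only; directory layout and settings are assumptions.
set -euo pipefail

SRC="/data/scada"       # hypothetical source directory
DEST="/backup/scada"    # hypothetical destination directory
NUM_MONTHS=3            # hypothetical "past months" setting

for ((i = 0; i < NUM_MONTHS; i++)); do
  # Anchor at mid-month to avoid GNU date month-end rollover surprises
  month_dir=$(date -d "$(date +%Y-%m-15) -$i month" +%Y/%m)
  mkdir -p "$DEST/$month_dir"
  rsync -av "$SRC/$month_dir/" "$DEST/$month_dir/"
done
```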
**`archive_pipeline.sh`**

Transfers downloaded and processed meter data to a dedicated server:
- Uses `rsync` to transfer data to the remote server
- Automatically triggers a cleanup script if enabled via config
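
Conceptually, the transfer step boils down to an `rsync` call along these lines, where the host, user, and paths are placeholders and the real values come from `archive_config.yml`:

```bash
# Hypothetical host, user, and paths; the real values come from archive_config.yml
rsync -av --exclude 'working/' /data/events/ archive_user@archive-host:/archive/events/
```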
Ensure you have the following before running the pipeline:
- Unix-like environment (Linux, macOS, or a Unix-like Windows terminal)
- FTP credentials for the meter
- Meter Configuration
- Must have installed: `lftp`, `yq`, `zip`, `rsync`, `jq`
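
A quick way to confirm the required tools are on your `PATH` (this loop is just a convenience, not part of the repository's scripts):

```bash
for cmd in lftp yq zip rsync jq; do
  command -v "$cmd" >/dev/null 2>&1 || echo "Missing dependency: $cmd"
done
```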
- Clone the repository:

      git clone git@github.com:acep-uaf/camio-meter-streams.git
      cd camio-meter-streams/cli_meter

Each script uses its own YAML configuration file located in the `config/` directory.

- Navigate to the config directory and copy the example configuration files:

      cd config
      cp config.yml.example config.yml
      cp archive_config.yml.example archive_config.yml
      cp scada_config.yml.example scada_config.yml

- Update each configuration file:
  - `config.yml`: used by `data_pipeline.sh`
  - `archive_config.yml`: used by `archive_pipeline.sh`
  - `scada_config.yml`: used by `sync-scada-data.sh`

- Secure the configuration files:

      chmod 600 config.yml archive_config.yml scada_config.yml
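
After completing these steps, you can sanity-check the result. The permission check assumes GNU `stat`, and the `.ftp.host` key is a made-up example rather than a documented key from this repository's config schema:

```bash
# Expect "600" for each file (GNU stat)
stat -c '%a %n' config/*.yml

# Spot-check a value; key name is hypothetical, syntax is mikefarah/yq v4
yq '.ftp.host' config/config.yml
```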
This pipeline can be used in two ways:
- Manually, by executing the scripts directly from the command line
- Automatically, by running it as a scheduled systemd service managed through Chef
In production environments, each pipeline script is run automatically using a dedicated systemd
service and timer pair, configured through custom default attributes defined in the Chef cookbook.
Each configuration file has a corresponding Chef data bag that defines its values. All configuration data is centrally managed through Chef data bags and vaults. To make changes, update the appropriate Chef-managed data bags and cookbooks.
Cookbooks:
- `acep-camio-streams`: installs and configures the server.
- `acep-devops-chef`
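
On a production host you can inspect the scheduled runs with standard `systemctl` and `journalctl` commands. The unit and timer names below are placeholders; the actual names are defined by the Chef cookbook:

```bash
# Unit and timer names are placeholders, not taken from the cookbook
systemctl list-timers | grep -i camio
systemctl status data-pipeline.service
journalctl -u data-pipeline.service -n 50
```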
To run the data pipeline and then transfer data to the target server:
- **Data Pipeline (Event Data)**

      ./data_pipeline.sh -c config/config.yml

- **Sync SCADA Data**

      ./sync-scada-data.sh -c config/scada_config.yml

- **Archive Pipeline**

      ./archive_pipeline.sh -c config/archive_config.yml

  Note: `rsync` uses the `--exclude` flag to exclude the `working/` directory to ensure only complete files are transferred.

- **Run the Cleanup Process (Conditional)**

  The cleanup script removes outdated event files based on the retention period specified in the configuration file. If `enable_cleanup` is set to `true` in `archive_config.yml`, `cleanup.sh` runs automatically after `archive_pipeline.sh`. Otherwise, you can run it manually:

      ./cleanup.sh -c config/archive_config.yml

  Note: Ensure `archive_config.yml` specifies retention periods for each directory.
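
The conditional behavior described above amounts to a check like the following sketch; the exact logic inside `archive_pipeline.sh` and the `yq` flavor are assumptions:

```bash
# Sketch only; not copied from archive_pipeline.sh
if [ "$(yq '.enable_cleanup' config/archive_config.yml)" = "true" ]; then
  ./cleanup.sh -c config/archive_config.yml
fi
```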
When you need to stop the pipeline:
- **To Stop Safely/Pause Download:**
  - Use `Ctrl+C` to interrupt the process.
  - If interrupting the process doesn't work, try `Ctrl+\` to quit.
  - If you would like to resume the download, rerun the `data_pipeline` command. The download will resume from where it left off, provided the same config file (`-c`) is used.
- **Avoid Using `Ctrl+Z`:**
  - Do not use `Ctrl+Z` to suspend the process, as it may cause the pipeline to end without properly closing the FTP connection.
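
For context on why `Ctrl+C` is the safe choice, a script can catch the interrupt and shut down its FTP session before exiting. The handler below is a generic sketch, not the actual logic in `data_pipeline.sh`, and `LFTP_PID` is a hypothetical variable:

```bash
# Generic SIGINT handler sketch; not this repository's implementation
cleanup_on_interrupt() {
  echo "Interrupted: closing FTP session and exiting..." >&2
  # LFTP_PID is hypothetical: the PID of a backgrounded lftp transfer, if any
  [ -n "${LFTP_PID:-}" ] && kill "$LFTP_PID" 2>/dev/null
  exit 130   # conventional exit status for termination by SIGINT
}
trap cleanup_on_interrupt INT
```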
This repository includes automated tests for the scripts using Bats (Bash Automated Testing System) along with the helper libraries `bats-assert`, `bats-mock`, and `bats-support`. The tests are located in the `test` directory and are automatically run on all pull requests using GitHub Actions to ensure code quality and functionality.
Ensure you have cloned the repository with its required submodules; they should be located under the `test` and `test/test_helper` directories:
- `bats-core`
- `bats-assert`
- `bats-mock`
- `bats-support`

- Clone the repository with submodules:

      git clone --recurse-submodules git@github.com:acep-uaf/camio-meter-streams.git

  If you have already cloned the repository without submodules, you can initialize and update them with:

      git submodule update --init --recursive

- Navigate to the project directory:

      cd /path/to/camio-meter-streams/cli_meter

- Run all the tests:

      bats test
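
Beyond running the full suite, Bats can target a single file or emit TAP output, which helps when debugging a failure. The filename below is only an example, not a file guaranteed to exist in this repository:

```bash
# Run a single test file (example filename)
bats test/data_pipeline_test.bats

# TAP-formatted output for the whole suite
bats --tap test
```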
When making changes to the pipeline, it is essential to add or update tests to cover the new or modified functionality. Follow these steps to add tests:
- **Locate the appropriate test file:**

  Navigate to the `test` directory and identify the test file that corresponds to the functionality you're modifying. If no such file exists, create a new test file using the `.bats` extension (e.g., `my_script_test.bats`).

- **Write your tests:**

  Use the `bats-assert`, `bats-mock`, and `bats-support` helper libraries to write comprehensive tests. Refer to the bats-core documentation.

  If your tests require shared variables or helper functions, define them in `test/test-helper/commons.bash` to ensure consistency and reusability across multiple test files. For example:

      # commons.bash
      MY_VARIABLE="common value"

      function my_helper_function {
        echo "This is a helper function"
      }

  Example test structure:

      @test "description of the test case" {
        # Arrange
        # Set up any necessary environment or input data.

        # Act
        result=$(command-to-test)

        # Assert
        assert_success
        assert_output "expected output"
      }

- **Run your tests locally:**

  Ensure your new tests pass locally by running `bats test`.

- **Commit and push your changes:**

  Include your test updates in the same pull request as the code changes.
All tests in the repository are automatically executed through GitHub Actions on every pull request. This ensures that all contributions meet quality and functionality standards before merging. Ensure your pull request passes all tests to avoid delays in the review process.