fixing report, README updating
rizac committed Jun 25, 2023
1 parent c42c98d commit b85e840
Showing 2 changed files with 61 additions and 93 deletions.
7 changes: 0 additions & 7 deletions .github/workflows/python-app.yml
@@ -40,14 +40,7 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip setuptools wheel
-# pip install flake8 pytest
pip install -e ".[dev]"
-# - name: Lint with flake8
-# run: |
-# stop the build if there are Python syntax errors or undefined names
-# flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
-# exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
-# flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
- name: Test with pytest
run: |
pytest -xvvv ./test
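For reference, the dependency-install and test steps shown above can also be run locally. A minimal sketch, assuming a POSIX shell, a recent Python and the repository root as the working directory (`dev` is the package extra referenced above):

```bash
# Create and activate an isolated environment (any virtualenv tool works)
python -m venv .venv
source .venv/bin/activate

# Same steps as the workflow: upgrade the packaging tools, install the
# package in editable mode with its dev extra, then run the test suite
python -m pip install --upgrade pip setuptools wheel
pip install -e ".[dev]"
pytest -xvvv ./test
```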
147 changes: 61 additions & 86 deletions README.md
@@ -1,19 +1,15 @@
# me-compute

-**DISCLAIMER (06-2023): this project is still undergoing a big refactoring
-and migration from a private repository, please DO NOT CLONE or USE. In case of info, contact
-me or open an issue**

-Program to compute energy Magnitude (Me) from downloaded seismic events. The download
-must be performed via [stream2segment](https://github.com/rizac/stream2segment)
-(shipped with this package) into a custom SQLite or Postgres database (in this case,
+Program to compute energy Magnitude (Me) from downloaded seismic events.
+
+The download is performed via [stream2segment](https://github.com/rizac/stream2segment)
+(included in this package) into a custom SQLite or Postgres database (in this case,
the database has to be setup beforehand).

-Once downloaded, events and their data are fetched to compute each event Me (Me = mean
+Once downloaded, events and their data can be fetched to compute each event Me (Me = mean
of all stations energy magnitudes in the 5-95 percentiles). The computed Me are available
-in several formats: **CSV** (parametric table summarizing all events in rows),
-**HTML** (report to visualize all events and their Me on a map) and
-**QuakeMl** (one file per event, updated with their computed Me).
+in several formats: CSV, HDF, HTML and QuakeMl (see Usage below for details)


## Installation:
@@ -68,7 +64,7 @@ and dataselect (`data_url`) FDSN web services, and all other parameters, if need
The download routine downloads data and metadata from the configured FDSN
event and dataselect web services into the database (Sqlite or Postgres using
-[stream2segment](https://github.com/rizac/stream2segment) (with Postgres,
+[stream2segment](https://github.com/rizac/stream2segment). With Postgres,
the db has to be set up beforehand). Open `download.yaml`
(or a copy of it) and configure `dburl` (ideally, you might want to also set up
`start`, `end`, `events_url` and `data_url`):
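Once the configuration is in place, the download itself can be launched with stream2segment's `s2s` command, as in the cron example further below. A minimal sketch, assuming the virtualenv with this package is active and `download.yaml` is in the working directory (the database URLs below are hypothetical):

```bash
# dburl in download.yaml is an SQLAlchemy-style URL, typically one of:
#   sqlite:////absolute/path/to/me.sqlite      (file-based, created on first use)
#   postgresql://user:password@host/dbname     (the database must already exist)
s2s download -c download.yaml
```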
@@ -84,73 +80,70 @@ To compute the energy magnitude of events within a certain time range from the
data downloaded in the database
```bash
-me-compute -s [START] -e [END] -d [download.yaml] [OUTPUT_DIR]
+me-compute -s [START] -e [END] -d download.yaml [OUTPUT_DIR]
```
-An excerpt of the program usage is available below (type `me-compute --help` for more
-details):
+(Type `me-compute --help` for more details)
+OUTPUT_DIR is the destination root directory. You can use the special characters %S%
+and %E% that will be replaced with the start and end time in ISO format, computed
+from the given parameters. The output directory and its parents will be created if
+they do not exist
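As a concrete illustration of the placeholders above, a one-day run might look like this (dates and the output path are purely illustrative; the exact ISO rendering of %S% and %E% is whatever the program produces):

```bash
# %S% and %E% in the output path are expanded by the program to the start and
# end times in ISO format, yielding something like
# me-result_2023-06-01T00:00:00_2023-06-02T00:00:00
me-compute -s 2023-06-01 -e 2023-06-02 -d download.yaml "me-result_%S%_%E%"
```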
+In the output directory, the following files will be saved:
+- **station-energy-magnitude.hdf** A tabular file where each row represents a
+station/waveform and each column the station computed data and metadata,
+including the station energy magnitude.
+Note that the program assumes that a single channel (the vertical) is
+downloaded per station, so that 1 waveform <=> 1 station
-OUTPUT_DIR: the destination root directory. You can use the special characters %S%
-and %E% that will be replaced with the start and end time in ISO format, computed
-from the given parameters. The output directory and its parents will be created if
-they do not exist
-In the output directory, the following files will be saved:
+- **energy-magnitude.csv** A tabular file (one row per event) aggregating the result
+of the previous file into the final event energy magnitude. The event Me
+is the mean of all station energy magnitudes within the 5-95 percentiles
-- station-energy-magnitude.hdf A tabular files where each row represents a
-station/waveform and each column the station computed data and metadata,
-including the station energy magnitude.
-Note that the program assumes that a single channel (the vertical) is
-downloaded per station, so that 1 waveform <=> 1 station
-- energy-magnitude.csv A tabular file (one row per event) aggregating the result
-of the previous file into the final event energy magnitude. The final event Me
-is the mean of all station energy magnitudes within the 5-95 percentiles
+- **energy-magnitude.html** A report that can be opened in the user browser to
+visualize the computed energy magnitudes on maps and HTML tables
-- energy-magnitude.html A report that can be opened in the user browser to
-visualize the computed energy magnitudes on maps and HTML tables
-- [eventid1].xml, ..., [eventid1].xml All processed events saved in QuakeMl
-format, updated with the information on their energy magnitude
+- **[eventid1].xml, ..., [eventidN].xml** All processed events saved in QuakeMl
+format, updated with the information on their energy magnitude
-- energy-magnitude.log the log file where the info, errors and warnings
-of the routine are stored. The core energy magnitude computation at station
-level (performed via stream2segment utilities) has a separated and more
-detailed log file (see below)
-- station-energy-magnitude.log the log file where the info, errors and warnings
-of the station energy magnitude computation have been stored
+- **energy-magnitude.log** the log file where the info, errors and warnings
+of the routine are stored. The core energy magnitude computation at station
+level (performed via stream2segment utilities) has a separate and more
+detailed log file (see below)
<!--
-### Cron job (schedule downloads+process+report regularly)
+- **station-energy-magnitude.log** the log file where the info, errors and warnings
+of the station energy magnitude computation have been stored
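Purely as an illustration, after a successful run the output directory could therefore contain files along these lines (names as in the list above, one QuakeMl file per processed event):

```bash
ls [OUTPUT_DIR]
# energy-magnitude.csv   energy-magnitude.html   energy-magnitude.log
# station-energy-magnitude.hdf   station-energy-magnitude.log
# [eventid1].xml ... [eventidN].xml
```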
+### Cron job (schedule downloads + Me computation)
-Assuming your Python virtualenv is at `[VEN_PATH]`
+With your Python virtualenv activated (`source [VENV_PATH]/bin/activate`),
type `which me-compute`. You should see something like
-`[VENV_PATH]/bin/me-compute`
+`[VENV_PATH]/bin/me-compute` (same for `which s2s`).
-Then you can set up cron jobs to schedule all above routines.
+With the paths above, you can set up cron jobs to schedule all above routines.
For instance, below is an example file that can be edited via
-`crontab -e` (https://linux.die.net/man/1/crontab) and represents
-a currently working example on a remote server
-(you might need to change it according to your needs):
+`crontab -e` (https://linux.die.net/man/1/crontab) and is taken from
+a currently working example on a remote server.
It downloads events and data of the
previous day each day at five minutes past midnight (the download time span is set in
the download.yaml file) and, after the download is completed (estimated to complete within 5 hours),
it computes the energy magnitude in a
specified directory with start and end time encoded in the directory name:
```bash
# Edit this file to introduce tasks to be run by cron.
#
# Each task to run has to be defined through a single line
# indicating with different fields when the task will be run
# and what command to run for the task
#
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').
#
# Notice that tasks will be started based on the cron's system
# daemon's notion of time and timezones.
#
# Output of the crontab jobs (including errors) is sent through
# email to the user the crontab file belongs to (unless redirected).
...
#
# For example, you can run a backup of all your user accounts
# at 5 a.m every week with:
@@ -159,39 +152,21 @@ a currently working example on a remote server
# For more information see the manual pages of crontab(5) and cron(8)
#
# m h dom mon dow command
-5 0 * * * [VENV_PATH]/bin/python [VENV_PATH]/bin/me-compute download -c /home/download.private.yaml [ROOT_DIR]
-0 4 * * * [VENV_PATH]/bin/python [VENV_PATH]/bin/me-compute process -d [DOWNLOAD_YAML] [START] [END]
-30 7 * * * [VENV_PATH]/bin/python [VENV_PATH]/bin/me-compute report /home/me/mecompute/mecomputed/
+5 0 * * * [VENV_PATH]/bin/python [VENV_PATH]/bin/s2s download -c /home/download.private.yaml
+0 5 * * * [VENV_PATH]/bin/python [VENV_PATH]/bin/me-compute -d [DOWNLOAD_YAML] -s [START] -e [END] "[ROOT_DIR]/me-result_%S%_%E%"
```
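After installing the crontab, one might want to check the schedule and redirect each job's output to a log file instead of relying on the local mail mentioned in the comments above. A sketch, with a hypothetical log path:

```bash
# List the schedule installed for the current user
crontab -l

# Variant of the compute entry with stdout/stderr appended to a log file:
# 0 5 * * * [VENV_PATH]/bin/python [VENV_PATH]/bin/me-compute -d [DOWNLOAD_YAML] -s [START] -e [END] "[ROOT_DIR]/me-result_%S%_%E%" >> [ROOT_DIR]/me-compute-cron.log 2>&1
```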
-->
<!--
-## Misc
-#### Generate test HTML report (to inspect visually):
-Run `test_workflow::test_report_fromfile` and inspect
-`test/data/process.result.multievent.html` `test/data/process.result.singleevent.html`
+### Misc
-#### Change the event URLs (for developers only)
-This is not a foreseen change in the short run but better keep track of it to save a lot
-of time in case.
+#### Generating tests
-For the download and process part, where the program delegates `stream2segment`,
-you can change the event web service by simply changing the parameter `eventws` in the
-`download.private.yaml` file with any valid FDSN event URL.
-The problem is the HTML report: currently, we hard code in the Jinja template (`report.template.html`)
-two URLs, related but not equal to `eventws`:
-1) In each table row, an URL redirects to the event source page
-2) In the map, an URL is queried to get the Moment tensor beach ball (which
-is used as event icon on the map)
+Run:
+```commandline
+pytest ./me-compute/test
+```
-Ideally, one should remove the hard coded URLs and implement Python-side a class
-that, given the `eventws` URL in `download.private.yaml` and an `event_id`,
-returns the two URLs 1) and 2) above, considering the case that any of those URLs
-might not exist, and thus think about fallbacks for the missing anchor in the table
-and the missing icon in the map
-->
+Note that there is only one test routine generating files in a `test/tmp` directory
+(git-ignored). The directory is **not** deleted automatically in order to leave
+developers the ability to perform a further visual test on the generated output
+(e.g. HTML report)
