Commit 8504305

Update Elavon sync DAG READMEs with more details (#2864)
1 parent 2f97ba5 commit 8504305

2 files changed: 8 additions, 2 deletions

airflow/dags/parse_elavon/README.md

Lines changed: 3 additions & 1 deletion
@@ -2,4 +2,6 @@

Type: [Now / Scheduled](https://docs.calitp.org/data-infra/airflow/dags-maintenance.html)

-This DAG orchestrates the parsing of Elavon data. Even though this is a parse job, it handles all of history so it is a "now" type DAG.
+This DAG orchestrates the parsing of Elavon data, part two in the pipeline whose part one is the `sync_elavon` DAG. Starting with the partitioned, zipped, pipe-separated text files that the `sync_elavon` DAG transferred to GCS from Elavon's source SFTP server, elavon_to_gcs_jsonl.py produces JSONL files to be read into external tables, which then are used by downstream dbt models.
+
+Even though this is a parse job, it handles all of history (since each new timestamped partition this DAG picks up contains the full contents of the source SFTP server, mirrored by the `sync_elavon` DAG) so it is a "now" type DAG.
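
For readers unfamiliar with the parse step this README change describes, the following is a minimal sketch of turning a zipped, pipe-separated text file into JSONL. It is not the contents of elavon_to_gcs_jsonl.py, which this commit does not show. The function name `zipped_psv_to_jsonl`, the UTF-8 encoding, and the assumption that each file starts with a header row are all hypothetical, and GCS reads and writes are left out.

```python
import csv
import io
import json
import zipfile


def zipped_psv_to_jsonl(zip_bytes: bytes) -> str:
    """Convert every pipe-separated file in a zip archive into one JSONL string.

    Hypothetical helper: assumes UTF-8 text and a header row in each file.
    """
    lines = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                # Each row becomes one JSON object keyed by the header fields.
                for row in csv.DictReader(text, delimiter="|"):
                    lines.append(json.dumps(row))
    return "\n".join(lines) + ("\n" if lines else "")
```

Because every timestamped partition holds a full mirror of the source server, a job built along these lines would reprocess all of history on each run, which matches the "now" type described in the README text.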

airflow/dags/sync_elavon/README.md

Lines changed: 5 additions & 1 deletion
@@ -2,4 +2,8 @@

Type: [Now / Scheduled](https://docs.calitp.org/data-infra/airflow/dags-maintenance.html)

-This DAG orchestrates the syncing of raw Elavon data.
+This DAG orchestrates the syncing of raw Elavon data. It is part one of the two-part Elavon data pipeline - the second part is the parse_elavon DAG.
+
+elavon_to_gcs_raw.py is a very simple script that mirrors the entire contents of the `data` subfolder in Elavon's provided SFTP server whenever the DAG is run, partitioned by the timestamp at the time of the run. The SFTP server is administered by Elavon, outside the direct control of our ecosystem.
+
+This DAG is context-insensitive (a "now" type DAG), since the source files are not partitioned in any way and prior context can't be accurately reconstructed from the source files in the SFTP server. It will always load the world as it exists at runtime.
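
To illustrate what "partitioned by the timestamp at the time of the run" could look like, here is a small sketch that builds a destination object name for each mirrored file. It is not taken from elavon_to_gcs_raw.py. The function name `partitioned_object_name`, the `ts=` partition label, and the `elavon/raw` prefix are assumptions made for the example, and the SFTP download and GCS upload are omitted.

```python
import posixpath
from datetime import datetime, timezone


def partitioned_object_name(prefix: str, run_ts: datetime, remote_path: str) -> str:
    """Build a destination name under a partition keyed by the run timestamp.

    Hypothetical layout: <prefix>/ts=<ISO 8601 UTC timestamp>/<file name>.
    """
    ts = run_ts.astimezone(timezone.utc).isoformat()
    return posixpath.join(prefix, f"ts={ts}", posixpath.basename(remote_path))


# Two runs mirror the same source file into two separate partitions, so
# each partition is a full snapshot of the server as of that run.
first = partitioned_object_name(
    "elavon/raw", datetime(2023, 8, 1, 12, 0, tzinfo=timezone.utc), "data/file_a.zip"
)
second = partitioned_object_name(
    "elavon/raw", datetime(2023, 8, 2, 12, 0, tzinfo=timezone.utc), "data/file_a.zip"
)
```

Keying the partition on run time rather than on anything in the source files is consistent with the README's point that prior context cannot be reconstructed from the SFTP server.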
