Ingestion Workflow

Stephanie Hong edited this page Jul 14, 2020 · 3 revisions

Adeptia Workflow Steps:

  • Poll each SFTP site's designated directory for new data.
  • Unzip the data packets.
  • Queue up: set up a dataset processing queue so that datasets do not overwrite each other's CDM Native data in the staging area.
  • Load manifest.csv and datacount.csv, located in the root directory.
  • Load the datasets in the datafiles sub-folder.
  • Load data into the CDM Native tables. Note: the data load will fail if the data does not conform to the CDM table structure or data types.
  • Compare the loaded row counts against DataCount.Rowcount. If the counts mismatch, record the status in the manifest table and update the datacount table with the count differences.
  • Native CDM data COVID-19 LOINC code correction: assign valid LOINC codes to unassigned (null) lab_loinc values using the lab text names.
  • Populate the global domain id table with N3C ids, using siteid, domain, domain_sourceid, n3cid (siteid prepended to sourceid), and create date, to prevent data from multiple sites colliding.
  • Map native CDM to the OMOP 5.3.1 target tables and fields, and build value set map tables using static and dynamic mapping crosswalk tables.
  • Locate and fix/remap failed mappings:
      a. Correct mappings / try to re-map fields with 0 concept ids and nulls for COVID-related fields.
      b. Address 0 concept_id when possible.
      c. Ensure the result contains no overtly obsolete or invalid data.
      e. Provide a feedback loop with step 10.
  • OMOP-approved data quality checks: generate the DQOMOP_report and deploy the JSON report file to a known server.
  • Contribute the current instance of the dataset to the N3C data store.
  • Create a Safe Harbor data store using a date shift, a time shift, and a random time fuzzy factor per patient.
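
Three of the steps above (the row count comparison, the N3C id construction, and the per-patient date/time shift) can be illustrated with a minimal Python sketch. This is not the Adeptia implementation; every function name, the plain string concatenation for the id, the `secret` parameter, and the 180-day shift window are assumptions made for illustration only.

```python
import hashlib
from datetime import datetime, timedelta


def compare_row_counts(declared, loaded):
    """Return {table: loaded - declared} for each table whose counts differ.

    `declared` stands in for the counts read from datacount.csv and
    `loaded` for the counts observed after loading (both hypothetical dicts).
    """
    diffs = {}
    for table, expected in declared.items():
        actual = loaded.get(table, 0)
        if actual != expected:
            diffs[table] = actual - expected
    return diffs


def make_n3c_id(site_id, domain_source_id):
    """Prepend the site id to the source id.

    Plain string concatenation is an assumption; a real scheme would need a
    fixed-width site id or a separator so that ids cannot collide.
    """
    return f"{site_id}{domain_source_id}"


def patient_shift(patient_id, secret, max_days=180):
    """Derive one stable date/time offset per patient (hypothetical scheme).

    The offset is taken from a SHA-256 digest so that the same patient always
    gets the same shift; max_days is an illustrative window, not a project value.
    """
    digest = hashlib.sha256(f"{secret}:{patient_id}".encode("utf-8")).digest()
    days = int.from_bytes(digest[:4], "big") % (2 * max_days + 1) - max_days
    seconds = int.from_bytes(digest[4:8], "big") % 86400
    return timedelta(days=days, seconds=seconds)


def shift_timestamp(ts, patient_id, secret):
    """Apply the patient's offset to a single timestamp."""
    return ts + patient_shift(patient_id, secret)


if __name__ == "__main__":
    print(compare_row_counts({"person": 10, "visit": 5}, {"person": 10, "visit": 4}))
    print(make_n3c_id(12, 3456))
    print(shift_timestamp(datetime(2020, 7, 14, 9, 30), "p-001", "demo-secret"))
```

Because the offset is fixed per patient, the intervals between one patient's events are preserved after shifting, while the absolute dates are obscured.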