Ingestion Workflow
Stephanie Hong edited this page Jul 14, 2020 · 3 revisions
Adeptia Workflow Steps:
- Scan each site's SFTP directory to detect new data.
- Unzip the data packets.
- Queue up: set up a dataSet processing queue so that dataSets do not overwrite each other's CDM Native data in the staging area.
- Load manifest.csv and datacount.csv, located in the root directory.
- Load dataSets in the datafiles sub-folder.
- Load data into the CDM Native tables. Note: the data load will fail if the data does not conform to the CDM table structure or data types.
- Compare the loaded row counts against DataCount.Rowcount. If the counts do not match, record a status in the manifest table and update the datacount table with the differing counts.
- Correct COVID-19 LOINC codes in the Native CDM data: where lab_loinc is null (unassigned), set a valid LOINC code based on the lab text name.
- Populate the global domain id table with N3C ids, using siteid, domain, domain_sourceid, n3cid (siteid prepended to sourceid), and create date, to prevent data from multiple sites colliding.
- Map native CDM data to OMOP 5.3.1 target tables and fields, and build value set map tables using static and dynamic mapping cross walk tables.
- Locate and fix or remap failed mappings:
  - a. Correct mappings, or try to re-map, for COVID-related fields that have 0 concept ids or nulls.
  - b. Address 0 concept_id values when possible.
  - c. Confirm that the result contains no overtly obsolete or invalid data.
  - e. Provide a feedback loop with step 10.
- Run the OMOP-approved data quality checks: generate the DQOMOP_report and deploy the JSON report file to a known server.
- Contribute the current instance of the dataSet to the N3C data store.
- Create a Safe Harbor Data Store using dateShift, timeShift, and a random time fuzzy factor per patient.
- The Enclave API now supports an authority property, which indicates the authority source of the concept set.
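The row count comparison step in the list above can be sketched roughly as follows. This is an illustration in Python, not the Adeptia implementation. It assumes datacount.csv carries one row per table with an expected row count. The column names `table_name` and `row_count` and the function name `compare_row_counts` are hypothetical.

```python
import csv
import io


def compare_row_counts(datacount_csv_text, loaded_counts):
    """Return one record per table whose loaded count differs from datacount.csv.

    Column names "table_name" and "row_count" are assumptions for this sketch.
    """
    mismatches = []
    reader = csv.DictReader(io.StringIO(datacount_csv_text))
    for row in reader:
        table = row["table_name"].strip().lower()
        expected = int(row["row_count"])
        # A table missing from the load is treated as zero rows loaded.
        loaded = loaded_counts.get(table, 0)
        if loaded != expected:
            mismatches.append(
                {
                    "table": table,
                    "expected": expected,
                    "loaded": loaded,
                    "diff": loaded - expected,
                }
            )
    return mismatches


# Made-up sample data, for illustration only.
sample = "table_name,row_count\nPERSON,100\nMEASUREMENT,250\n"
result = compare_row_counts(sample, {"person": 100, "measurement": 245})
print(result)
```

In a real pipeline the returned records would drive the status written to the manifest table and the diff counts written back to the datacount table.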
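The global domain id step in the list above builds each N3C id by prepending the site id to the source id. A minimal sketch of that idea in Python follows. The exact concatenation rule is not stated on this page, so the `-` separator, the field types, and the function name `make_global_id_row` are assumptions made for illustration.

```python
from datetime import date


def make_global_id_row(site_id, domain, source_id, create_date):
    """Build one row for a global domain id table (hypothetical layout)."""
    # Assumption: a separator keeps site 12 / source 345 distinct from
    # site 1 / source 2345, which plain concatenation would merge into "12345".
    n3c_id = f"{site_id}-{source_id}"
    return {
        "siteid": site_id,
        "domain": domain,
        "domain_sourceid": source_id,
        "n3cid": n3c_id,
        "create_date": create_date,
    }


# Two sites submit person records with different local ids.
row_a = make_global_id_row(12, "PERSON", 345, date(2020, 7, 14))
row_b = make_global_id_row(1, "PERSON", 2345, date(2020, 7, 14))
print(row_a["n3cid"], row_b["n3cid"])
```

The point of the site prefix is that two sites can reuse the same local identifier without their rows colliding once the data is pooled.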
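The Safe Harbor step in the list above applies a per-patient date shift, time shift, and random factor. The sketch below shows one way a per-patient offset could work in Python. It is not the N3C implementation: the class name `PatientShifter` and the shift bounds are hypothetical, and nothing here is claimed to satisfy Safe Harbor requirements on its own.

```python
import random
from datetime import datetime, timedelta

MAX_DAY_SHIFT = 180       # hypothetical bound on the date shift, in days
MAX_SECOND_SHIFT = 3600   # hypothetical bound on the time shift, in seconds


class PatientShifter:
    """Assigns each patient one random offset and reuses it for all events."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._offsets = {}

    def offset_for(self, patient_id):
        # Draw the offset once per patient so that intervals between a
        # patient's events are preserved after shifting.
        if patient_id not in self._offsets:
            days = self._rng.randint(-MAX_DAY_SHIFT, MAX_DAY_SHIFT)
            seconds = self._rng.randint(-MAX_SECOND_SHIFT, MAX_SECOND_SHIFT)
            self._offsets[patient_id] = timedelta(days=days, seconds=seconds)
        return self._offsets[patient_id]

    def shift(self, patient_id, timestamp):
        return timestamp + self.offset_for(patient_id)


shifter = PatientShifter(seed=0)
admit = datetime(2020, 3, 1, 8, 30)
discharge = datetime(2020, 3, 5, 14, 0)
shifted_admit = shifter.shift("patient-1", admit)
shifted_discharge = shifter.shift("patient-1", discharge)
# The length of stay is unchanged because both events share one offset.
print(shifted_discharge - shifted_admit == discharge - admit)
```

Using a single offset per patient, rather than a fresh one per event, keeps the ordering and spacing of that patient's events intact while hiding the real calendar dates.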
DataSet/Manifest Contents and Structures
- CDM minimum required tables
- OMOP Vocabulary Updates
- Process Overview
- DataSet Submission Format
- N3CID Global Domain ID
- Terminology Value Set Mapping Table Structure Explained
- PCORnet terms cross walk table
- ACT term cross walk table
- ACT: Additional Mapping Notes
- TrinetX: Mapping Notes
- Qualitative Results VS Mapping
- Corrected LOINC when COVID test LOINC is missing
- DEMOGRAPHIC VALUSETS
- COVID-19 LOINC corrected
- Safe Harbor Requirement
- type_concept_id default values
- Immunization value-set mappings
- Lab confirmed positive cohort definition
- Long COVID Specialty clinic visit
- Categorical Answer Concept Maps
- BulkImportConceptSetCreation
- Viral Variant Mapping Information
- CMS Data Integration Guidelines
- N3C Data Ingestion
- N3C Data Transformation
- Value Set Mapping table structure
- measurement unit harmonization
- Data Quality Public Worksheet
- PPRL N3C - CMS Release Notes
- N3C Custom Concept Extensions
- Mitigate Concept ID Misalignment in the OMOP CDM Pipeline and Resolve Terminology Drift
Includes Overview, PreRelease Variable Punchlist, References for creating phenotypes.