ysmx-github/phs_pilot

PHS Pilot Project

Deployment instructions

  1. Create and start a standard interactive cluster (or use an existing one), no Photon (Optional). Install the pyyaml and colorama libraries from PyPI
  2. Create a 2X-Small Serverless warehouse (or use an existing one), 1 Min 1 Max, Preview channel (Optional)
  3. Workspace -> Home -> Create -> Git folder
  4. Git repository URL: https://github.com/ysmx-github/phs_pilot.git -> Create Git Folder
  5. Open the SQL notebook /Workspace/Users/[email protected]/phs_pilot/src/depl/schema_sql
  6. Connect to Serverless
  7. Run Cell 1
  8. Fill in the widgets with the catalog name and the target folder name (dbr_ssa_clinical and dbr_ddl_clinical are used in this example)
  9. Run all
  10. Open the volume: Catalog Explorer -> dbr_ssa_clinical -> clinical_raw -> clinical_data_volume
  11. Download data.zip and emr_ddl_clinical.zip from the shared folder and unzip them. Manually upload the resulting data and emr_ddl_clinical folders to the volume
  12. Open the notebook /Workspace/Users/[email protected]/phs_pilot/src/depl/wf1_create, connect to the cluster or Serverless, Run all
  13. Open the notebook /Workspace/Users/[email protected]/phs_pilot/src/depl/wf2_create, connect to the cluster or Serverless, Run all
  14. Open the notebook /Workspace/Users/[email protected]/phs_pilot/src/depl/wf3_create, connect to the cluster or Serverless, Run all
  15. Open the YAML file /Workspace/Users/[email protected]/phs_pilot/src/wf_common/config.yaml and edit the db_catalog and volume parameters as needed
  16. Open Workflows
  17. Run the phs_wf1 workflow, review the workflow and its results
  18. Run the phs_wf2 workflow, review the workflow and its results
  19. Run the phs_wf3 workflow, review the workflow and its results
  20. Open /Workspace/Users/[email protected]/phs_pilot/src/wf3/wf3_dlt_test.sql
  21. Select all
  22. Copy
  23. Open SQL Editor -> New query
  24. Paste
  25. Select catalog dbr_ssa_clinical and schema clinical_bronze
  26. Run CDC tests on the wf3_dlt pipeline
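
The unzip part of step 11 can be scripted locally before the manual upload. The following is a minimal sketch, not part of the repository: the function name `unzip_archives` and the local directories are hypothetical, and it assumes each archive unpacks to a top-level folder, as step 11 implies.

```python
import zipfile
from pathlib import Path

# Archive names taken from step 11; everything else here is illustrative.
ARCHIVES = ("data.zip", "emr_ddl_clinical.zip")


def unzip_archives(download_dir, out_dir, archives=ARCHIVES):
    """Extract each archive into out_dir; return the top-level names extracted."""
    download_dir, out_dir = Path(download_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    top_level = set()
    for name in archives:
        archive = download_dir / name
        if not archive.is_file():
            raise FileNotFoundError(f"missing archive: {archive}")
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(out_dir)
            # Zip member names always use "/" as the separator.
            top_level.update(m.split("/")[0] for m in zf.namelist() if m)
    return sorted(top_level)
```

Calling `unzip_archives("downloads", "unzipped")` returns the folder names to upload, which makes it easy to confirm that both `data` and `emr_ddl_clinical` are present before opening the volume.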

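Steps 10, 11 and 15 all refer to the same volume. The sketch below shows how its path and the two upload targets are composed, assuming the standard Unity Catalog layout `/Volumes/<catalog>/<schema>/<volume>`. The helper names are illustrative and not part of the repository. The schema and volume defaults come from step 10, and the catalog is whatever was entered in step 8.

```python
from pathlib import PurePosixPath


def volume_path(catalog, schema="clinical_raw", volume="clinical_data_volume"):
    """Compose the volume path; defaults are the names shown in step 10."""
    return PurePosixPath("/Volumes") / catalog / schema / volume


def upload_targets(catalog, folders=("data", "emr_ddl_clinical")):
    """Return the destination path for each folder uploaded in step 11."""
    base = volume_path(catalog)
    return [str(base / folder) for folder in folders]


print(upload_targets("dbr_ssa_clinical"))
# ['/Volumes/dbr_ssa_clinical/clinical_raw/clinical_data_volume/data',
#  '/Volumes/dbr_ssa_clinical/clinical_raw/clinical_data_volume/emr_ddl_clinical']
```

If a different catalog is used, the same value should appear in the `db_catalog` parameter edited in step 15.
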