- Introduction: https://forums.ohdsi.org/t/call-for-volunteers-apac-community-wide-etl-project/22044
- PASAR: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10834714/
- Version: 5.4
- DDL Artifacts: https://github.com/OHDSI/CommonDataModel/tree/main/ddl/5.4/postgresql
- Specific Git Commit: https://github.com/OHDSI/CommonDataModel/commit/c1c8e6a4f04e588d72fa9ae5df56b1631559548b
- Copied the files to `etl/db` and removed the `@cdmDatabaseSchema.` prefix in those files (since ingestion will happen via SQLAlchemy)
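The prefix removal described above can be scripted. The following is a minimal sketch, not necessarily how it was done for this repo: the names `strip_schema_prefix` and `strip_directory` and the sample DDL line are illustrative assumptions; only the `@cdmDatabaseSchema.` placeholder and the `etl/db` folder come from the text above.

```python
from pathlib import Path

# Placeholder that the OHDSI DDL templates put in front of every table name.
PREFIX = "@cdmDatabaseSchema."


def strip_schema_prefix(ddl: str, prefix: str = PREFIX) -> str:
    """Return the DDL text with every occurrence of the placeholder removed."""
    return ddl.replace(prefix, "")


def strip_directory(directory: str) -> None:
    """Rewrite every .sql file in `directory` in place (hypothetical helper)."""
    for path in Path(directory).glob("*.sql"):
        path.write_text(strip_schema_prefix(path.read_text()))


# Illustrative input, not copied from the real DDL files.
sample = "CREATE TABLE @cdmDatabaseSchema.person (person_id integer NOT NULL);"
cleaned = strip_schema_prefix(sample)
print(cleaned)
```

Calling `strip_directory("etl/db")` would apply the same replacement to the copied files; it is deliberately not invoked in the sketch.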
- bash
- Python >= 3.10
Windows users: please adapt the following steps accordingly. We recommend installing Linux on Windows via WSL: https://learn.microsoft.com/en-us/windows/wsl/install
- Navigate to the `etl` folder
- Run the following commands:
- Create and activate virtual environment
python3 -m venv pypasarenv
source pypasarenv/bin/activate
- Install python packages
pip install -r requirements.txt
- Clone .env file from example
cp .env.example .env
- Update the `.env` file with credentials
The Postgres SQL scripts are currently stored in `etl/pypasar/db/sql`; this location is likely to change later based on usage.
- Ensure Docker is installed
- Export the environment variables
source .env
(alternatively, copy the value of `POSTGRES_PORT` from `.env` and substitute it in the next command)
- Run Postgres as a Docker container
docker run -v pg-pasar-data:/var/lib/postgresql/data --env-file .env -d --name pasar-postgres -p ${POSTGRES_PORT}:5432 postgres:16-alpine
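Once the container is running, client code needs a connection URL built from the same `.env` values. The sketch below shows one way to assemble it with the standard library only. It is an assumption-laden illustration: `POSTGRES_PORT` appears in this README, while `POSTGRES_USER`, `POSTGRES_PASSWORD` and `POSTGRES_DB` are the variables the official `postgres` image reads and are assumed to be present in `.env`; the helper names and sample values are hypothetical.

```python
from urllib.parse import quote


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    env: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def postgres_url(env: dict[str, str], host: str = "localhost") -> str:
    """Build a postgresql:// URL, percent-encoding the credentials."""
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    port = env["POSTGRES_PORT"]
    database = env["POSTGRES_DB"]
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"


# Hypothetical values for illustration only; never commit real credentials.
example_env = """
# sample .env content
POSTGRES_USER=pasar
POSTGRES_PASSWORD=p@ss word
POSTGRES_DB=pasar
POSTGRES_PORT=5433
"""

url = postgres_url(parse_env(example_env))
print(url)
```

Percent-encoding the credentials matters when a password contains characters such as `@` or `/`, which would otherwise be misread as URL delimiters.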
If you have an existing R setup and are familiar with the OHDSI packages, you can instead set up OMOP using https://github.com/OHDSI/CommonDataModel/blob/main/README.md
- Ensure
  - Environment variables are set up accordingly in `.env`
  - Current working directory is the `etl` folder
- Create the OMOP schema and tables
  - Run `python . db create_omop_schema`.
  - The schema defined as `POSTGRES_OMOP_SCHEMA` in `.env` will be created and populated with the OMOP tables.
  - Verify through pgAdmin or a psql client
- Drop the OMOP schema and tables
  - Run `python . db drop_omop_schema`.
  - The schema defined as `POSTGRES_OMOP_SCHEMA` in `.env` will be dropped
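The create and drop commands above act on the schema named by `POSTGRES_OMOP_SCHEMA`. The repo's own implementation is not reproduced here; the sketch below only illustrates the kind of Postgres statements such commands would typically issue. The function names are hypothetical, and the use of `IF NOT EXISTS`, `IF EXISTS` and `CASCADE` is an assumption about the desired behaviour rather than a description of the project's code.

```python
def quote_ident(name: str) -> str:
    """Quote a Postgres identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def create_schema_sql(schema: str) -> str:
    """Statement that creates the schema only if it is missing."""
    return f"CREATE SCHEMA IF NOT EXISTS {quote_ident(schema)};"


def drop_schema_sql(schema: str) -> str:
    """Statement that drops the schema and every table inside it."""
    return f"DROP SCHEMA IF EXISTS {quote_ident(schema)} CASCADE;"


# "omop" is an example value for POSTGRES_OMOP_SCHEMA, not a project default.
schema_name = "omop"
print(create_schema_sql(schema_name))
print(drop_schema_sql(schema_name))
```

Quoting the identifier keeps a schema name taken from `.env` from being interpreted as SQL; `CASCADE` removes all tables in the schema, so the drop is destructive.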
- To begin contributing transformations to the various OMOP tables, go to `etl/pypasar/omop` and choose the appropriate Python file
- SQL can be used as well in the Python class
- An example is available for the cdm_source table at `etl/pypasar/omop/cdm_source.py`
- Please feel free to implement in whichever way you choose. The only mandatory requirement is that the `execute` method must be the entrypoint of the respective OMOP class, because `__main__.py` calls the `execute` method of each class.
- Current working directory is the `etl` folder
- Run `python . etl <omop_table_name>`.
  - Example: `python . etl cdm_source`
  - Multiple tables, here cdm_source and concept: `python . etl cdm_source,concept`. NO SPACES BETWEEN COMMA-SEPARATED OMOP TABLES
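To illustrate why `execute` must be the entrypoint and why the table list must not contain spaces, here is a minimal sketch of a dispatcher. It is not the project's `__main__.py`: the classes, the `REGISTRY` mapping and `run_etl` are hypothetical stand-ins, and the real classes in `etl/pypasar/omop` may be constructed and called differently.

```python
class CdmSource:
    """Hypothetical stand-in for the class in etl/pypasar/omop/cdm_source.py."""

    def execute(self) -> str:
        # A real implementation would run the transformation here.
        return "cdm_source done"


class Concept:
    """Hypothetical stand-in for a concept table class."""

    def execute(self) -> str:
        return "concept done"


# Maps the table name given on the command line to its class (assumed layout).
REGISTRY = {"cdm_source": CdmSource, "concept": Concept}


def run_etl(table_arg: str) -> list[str]:
    """Split a comma-separated table list and call execute() on each class."""
    names = table_arg.split(",")
    unknown = [name for name in names if name not in REGISTRY]
    if unknown:
        raise ValueError(f"Unknown OMOP table(s): {unknown}")
    return [REGISTRY[name]().execute() for name in names]


results = run_etl("cdm_source,concept")
print(results)
```

In this sketch the string is split on commas without trimming, so `"cdm_source, concept"` would fail with a `ValueError`. On a real command line, a space would also make the shell pass `concept` as a separate argument.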
deactivate
- Under
etl
, Runrm -rf pypasarenv
- Remove container, Run
docker rm -f pasar-postgres
- Remove volume (CAUTION - ALL DATA WILL BE LOST!!), Run
docker volume rm pg-pasar-data
In the below snippet at Step 3
- Replace `<IP>` based on your group in the below snippet
- Replace `<username>` (the filename of the private key / as mentioned in the ETL Development sheet)
- Copy the snippet to `~/.ssh/config` and save the file
Host pypasar
HostName <IP>
User <username>
IdentityFile ~/.ssh/<username>
ControlMaster auto
ControlPath ~/.ssh/control-%C
ControlPersist yes
- Copy the public and private key files to the `~/.ssh` folder
- Test in a terminal with `ssh pypasar`. You should be able to log in to your home folder; run `pwd` to confirm.
- For remote development, see https://code.visualstudio.com/docs/remote/ssh
- Once you are in your home folder, refer to this document https://ohdsiorg.sharepoint.com/:p:/s/OHDSIAPAC/ESUGOh6Lza9FvxH1TyaoO7oBlMv_9Iq57tLQ-41V2HUFtA?e=QdjNOP and fork this repo to your org account.
- Once the repo clone is done, navigate to the `<username>-pasar` repo
- Enter the following configuration for this repo only (do NOT use `--global`!)
git config user.name "<username>"
git config user.email "<email>"
- Follow the document in step 1 to create a new branch and push to repo.
- How to browse and Install VS Code extensions: https://code.visualstudio.com/docs/editor/extension-marketplace#_browse-for-extensions
- Recommended Postgres GUI extension: https://marketplace.visualstudio.com/items?itemName=ckolkman.vscode-postgres