eICU Knowledge Graph

This repository contains the code and resources for building a Knowledge Graph from the eICU Collaborative Research Database. The project aims to extract meaningful relationships and insights from electronic health records (EHR) and represent them in a structured graph format for advanced analysis and querying.

Introduction

The eICU Knowledge Graph project transforms raw eICU data into a structured knowledge graph, enabling advanced querying, analysis, and visualization of patient health records. The knowledge graph captures relationships between patients, diagnoses, treatments, medications, and other medical entities, making it a powerful tool for healthcare analytics and research.

Features

Data Extraction: Extract and preprocess data from the eICU database.
Entity Recognition: Identify key medical entities (e.g., patients, diagnoses, medications).
Relationship Extraction: Define and extract relationships between entities.
Graph Construction: Build a knowledge graph using a graph database (e.g., Neo4j).
Query Interface: Query the knowledge graph for insights and patterns.
Visualization: Visualize the graph structure and relationships.

Installation

Prerequisites

Python 3.8+
pip installed
eICU Collaborative Research Database access

Steps

Clone the repository:

```
git clone https://github.com/itsNavinSingh/eicuKnowledgeGraph.git
cd eicuKnowledgeGraph
```

Move the dataset directory into the eicuKnowledgeGraph directory.

For Windows

Run the execute_windows.bat file:
```
execute_windows.bat
```

For Linux/macOS

Make the execute_linux_mac.sh file executable and run the script:
```
chmod +x execute_linux_mac.sh
./execute_linux_mac.sh
```

Common command

The script prompts you to enter the input directory path.
Enter the relative path of the dataset directory.
The script then generates all the .ttl files in the result directory.
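
The prompt-and-generate flow above can be pictured with a short Python sketch. It is an illustration only, not the repository's actual code: the helper names `plan_outputs` and `prompt_and_plan`, the one-.ttl-per-CSV naming, and the `result` default are all assumptions.

```python
from pathlib import Path


def plan_outputs(input_dir: str, result_dir: str = "result") -> dict:
    """Map every CSV in input_dir to the .ttl path it would produce.

    Hypothetical helper: the one-.ttl-per-CSV naming is an assumption.
    """
    source = Path(input_dir)
    if not source.is_dir():
        raise NotADirectoryError(f"Input directory not found: {source}")

    target = Path(result_dir)
    target.mkdir(parents=True, exist_ok=True)  # create the result directory if missing

    # For example, patient.csv would map to <result_dir>/patient.ttl.
    return {
        csv_path: target / (csv_path.stem + ".ttl")
        for csv_path in sorted(source.glob("*.csv"))
    }


def prompt_and_plan() -> None:
    """Ask for a relative dataset path, then list the planned outputs."""
    chosen = input("Enter the input directory path: ").strip()
    for csv_path, ttl_path in plan_outputs(chosen).items():
        print(f"{csv_path} -> {ttl_path}")
```

Calling `prompt_and_plan()` from the repository root and entering a relative path such as `dataset` would print one planned output path per CSV file, without converting anything.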

Usage

The script generates .ttl (Turtle) files in the result directory. These files represent the knowledge graph as RDF triples.
To query the graph, load the .ttl files into a graph database. Neo4j does not read Turtle natively, so importing into it requires an RDF plugin such as neosemantics (n10s); after import, Cypher queries can explore relationships between medical entities (e.g., patients, diagnoses, medications).
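
To make the .ttl output more concrete, here is a rough sketch of how a single patient row might be written as Turtle using only the standard library. The namespace `http://example.org/eicu/`, the `eicu:Patient` class, the predicate names, and the helper functions are hypothetical and chosen for illustration; the vocabulary the project's scripts actually emit may differ. The sketch also assumes `patientunitstayid` is a plain numeric identifier, so it can be used directly in a prefixed name.

```python
EICU = "http://example.org/eicu/"  # hypothetical namespace, not the project's


def escape_literal(value: str) -> str:
    """Escape characters that are special inside a Turtle string literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def patient_to_turtle(row: dict) -> str:
    """Serialize one patient row as a single Turtle statement block."""
    subject = f"eicu:patient_{row['patientunitstayid']}"
    lines = [f"{subject} a eicu:Patient"]
    for column in ("gender", "age"):
        value = row.get(column, "")
        if value:  # skip empty cells instead of emitting empty literals
            lines.append(f'    eicu:{column} "{escape_literal(value)}"')
    # Predicates of the same subject are separated by ';' and closed by '.'.
    return " ;\n".join(lines) + " .\n"


def to_turtle_document(rows: list) -> str:
    """Prepend the prefix declaration and join the per-patient blocks."""
    header = f"@prefix eicu: <{EICU}> .\n\n"
    return header + "\n".join(patient_to_turtle(row) for row in rows)


sample = [{"patientunitstayid": "141168", "gender": "Female", "age": "70"}]
print(to_turtle_document(sample))
```

Each row becomes one subject with a type statement and one literal per non-empty column, which is the shape an RDF importer would turn into a node with properties.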

Data Pipeline

The eICU Knowledge Graph uses a data pipeline to extract, preprocess, and transform raw eICU data into a structured knowledge graph. This section outlines the steps and components involved in that pipeline.

`Data Ingestion`

The first step in the pipeline is the ingestion of raw data from the eICU Collaborative Research Database. The data consists of various tables containing information on patient demographics, diagnoses, treatments, medications, vital signs, and more. This raw data is in CSV format.

Source Data: The eICU dataset consists of multiple CSV files, one per table.
Data Access: The eICU Collaborative Research Database is distributed through PhysioNet. Users need to complete the required credentialing and sign the data use agreement before they can download the dataset.
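
The ingestion step can be sketched with Python's built-in `csv` module. This is a minimal illustration under stated assumptions, not the repository's code: the function names `read_table` and `ingest` are hypothetical, and the files are assumed to be uncompressed, UTF-8 encoded .csv files sitting directly in the input directory.

```python
import csv
from pathlib import Path


def read_table(csv_path: Path) -> list:
    """Load one CSV file into a list of row dictionaries keyed by column name."""
    with csv_path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def ingest(input_dir: str) -> dict:
    """Read every CSV in input_dir, keyed by table name (the file stem)."""
    tables = {}
    for csv_path in sorted(Path(input_dir).glob("*.csv")):
        tables[csv_path.stem] = read_table(csv_path)
    return tables
```

With this sketch, `ingest("dataset")["patient"]` would return the rows of `dataset/patient.csv`. Loading whole tables into memory is only reasonable for small files; large tables would be better processed row by row.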

`Data Preprocessing`

Once the raw data is ingested, it needs to be cleaned and preprocessed. This step involves the following tasks:

Data Cleaning: Handling missing values, removing duplicate records, and converting data types to ensure consistency.
Data Transformation: Converting the raw data into a structured format suitable for graph construction. This includes normalizing and categorizing various medical entities. The provided scripts perform the preprocessing: they process the data and prepare it for the next step.
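
The cleaning tasks listed above can be illustrated with a small, standard-library sketch. The function `clean_rows` and its parameters are hypothetical and do not come from the repository. It assumes rows are dictionaries of strings, such as those produced by `csv.DictReader`, and that any value which cannot be parsed as an integer should be treated as missing.

```python
def clean_rows(rows, key, int_columns=()):
    """Trim whitespace, drop rows with missing or repeated keys, convert ints."""
    seen = set()
    cleaned = []
    for row in rows:
        # Normalize whitespace and treat empty strings as missing (None).
        record = {
            name: (value.strip() or None) if isinstance(value, str) else None
            for name, value in row.items()
        }
        identifier = record.get(key)
        if identifier is None or identifier in seen:
            continue  # skip rows without a key and duplicate records
        seen.add(identifier)
        for column in int_columns:
            raw = record.get(column)
            try:
                record[column] = int(raw) if raw is not None else None
            except ValueError:
                record[column] = None  # non-numeric text, e.g. an age of "> 89"
        cleaned.append(record)
    return cleaned
```

For example, `clean_rows(rows, key="patientunitstayid", int_columns=("age",))` keeps the first row for each stay identifier and converts `age` to an integer where possible. Mapping unparseable values to `None` is a simplification; a real pipeline might prefer to keep the original text in a separate field.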
