Bio-Relationship Extraction

A system for extracting and visualizing relationships between bio-medical entities using RAG pipelines, Neo4j graph database, and embeddings-based search.

Project Overview

Undergraduate thesis at UTS focusing on:

Extracting relationships between organisms, chemicals, and bio-medical entities from PubMed articles
Building a graph database to store and query relationships
Providing a search and visualization interface
Using local LLMs via Ollama for processing

Architecture

Backend: FastAPI server with LlamaIndex for RAG pipelines
Vector Database: PostgreSQL with pgvector for document embeddings
Graph Database: Neo4j for storing extracted relationships
LLM Integration: Ollama for local large language model inference
Frontend: Simple web interface for querying and visualization

Technical Requirements

Python 3.12 for Backend
Python 3.11 for Frontend
Poetry (v2.1.2 or newer)
Docker and Docker Compose
NVIDIA GPU with CUDA support (recommended)
WSL2 for Windows users

MAC

Python 3.12
Poetry (v2.1.2 or newer)
Docker and Docker Compose

Setup Instructions

Install Prerequisites
- Pyenv (Python version management) although you can just have python3.11 installed, don't have to use Pyenv
- Poetry
- Docker
- Docker Compose

Set up Python environment

# Install Poetry if not already installed
curl -sSL https://install.python-poetry.org | python3 -

# Navigate to project directory
cd bio-relationship-extraction

# Install dependencies
poetry install

# Activate the virtual environment
poetry shell

Start database services
```
docker-compose up -d
```
Configure Application
- Update src/backend/services/llm.py for Ollama instance
- Update database configurations if defaults changed

MAKE SURE YOU'RE IN THE VIRTUALENV/POETRY ENV 5. Start backend server

python3 ./main.py

Start frontend (if applicable)
```
python3 ./main.py
```

Using the Application

Search PubMed: Upload PubMed articles to extract relationships
Query Relationships: Ask about relationships between biological entities
Visualize Results: View and explore extracted relationships
API Testing with Postman:
- Import the provided Postman collection from docs/postman/bio-relationship-api.json
- Use Postman to test API endpoints at http://localhost:8000
- Available endpoints include document upload, relationship queries, and entity searches
Alternative LLM Options:
- While Ollama is recommended for local inference, you can use other models
- LlamaIndex supports various LLM providers through connectors
- Simply install the appropriate connector package (e.g., pip install llama-index-llms-openai)
- Modify src/backend/services/llm.py to use your preferred model
- Options include OpenAI, Anthropic, Hugging Face, and many others if you can't run models locally

Example Queries

"What metabolites are produced by Starfish"

Development

Project Structure

Data Flow

Documents retrieved from PubMed or uploaded
Text embedded and stored in PostgreSQL with pgvector
LLM extracts relationships using structured output
Relationships stored in Neo4j as a graph
User queries processed via RAG pipeline

License

MIT License - see the LICENSE file for details.

Authors

Jayden Lee - Graduate @ University of Technology Sydney and Machine Learning Engineer @ AMP

Get in Contact

Add me on LinkedIn - https://www.linkedin.com/in/jayden-l33/ But please read all of the docs before doing so!

Name		Name	Last commit message	Last commit date
Latest commit History 48 Commits
src		src
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
docker-compose.yml		docker-compose.yml
dockerfile		dockerfile
makefile		makefile

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Bio-Relationship Extraction

Project Overview

Architecture

Technical Requirements

MAC

Setup Instructions

Using the Application

Example Queries

Development

Project Structure

Data Flow

License

Authors

Get in Contact

About

Uh oh!

Releases

Packages

Uh oh!

Languages

License

JaydenL33/bio-relationship-extraction

Folders and files

Latest commit

History

Repository files navigation

Bio-Relationship Extraction

Project Overview

Architecture

Technical Requirements

MAC

Setup Instructions

Using the Application

Example Queries

Development

Project Structure

Data Flow

License

Authors

Get in Contact

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages