GuPT is a RAG chatbot system providing accurate and quick answers about Gothenburg University's courses and programs to help students access academic information effortlessly.

faerazo/GuPT

GuPT

GuPT is a project developed by a student group for the course Machine Learning for Natural Language Processing (DIT247). The system draws on information extracted from Gothenburg University’s (GU) bachelor’s and master’s courses (~590) and programs (~90), including relevant details from their web pages and syllabus PDFs. GuPT uses this data in a Retrieval-Augmented Generation (RAG) pipeline to respond to user queries.

GuPT’s RAG pipeline is built with LangChain, OpenAI embeddings, and GPT-4o mini. Using multi-querying and logic routing, GuPT can handle ambiguous questions and give both specific and general answers about GU courses and programs. The goal is a tool that quickly provides information on entry requirements, learning objectives, and assessment methods, reducing confusion and administrative workload.
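To make the idea of logic routing concrete, here is a minimal sketch of deciding which collection a question should be sent to. It is only an illustration: the function name `route_query`, the route labels, and the keyword rules are assumptions made for this example, and the actual project may route with an LLM or a different scheme entirely.

```python
import re

# Hypothetical pattern for course codes such as DIT247 (an assumption,
# not taken from the project's code).
COURSE_CODE = re.compile(r"\b[A-Z]{2,3}\d{3,4}\b")


def route_query(question: str) -> str:
    """Pick which collection to search: 'courses', 'programs', or 'both'."""
    text = question.lower()
    # A course code or the word "course" suggests a course-level question.
    if COURSE_CODE.search(question) or "course" in text:
        return "courses"
    # "program" also matches the spelling "programme".
    if "program" in text:
        return "programs"
    # Ambiguous questions fall back to searching everything.
    return "both"


if __name__ == "__main__":
    for q in [
        "What are the entry requirements for DIT247?",
        "Which master's programmes cover AI?",
        "How do I apply?",
    ]:
        print(q, "->", route_query(q))
```

A rule-based router like this is cheap and predictable, but it misses paraphrases, which is one reason a RAG system might delegate the routing decision to a language model instead.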


🚀 Try It Out

Hugging Face Spaces

Access our interactive demo and start asking questions about GU courses and programs.


Table of Contents

  1. Features
  2. Getting Started
  3. Installation
  4. Usage
  5. Data Collection
  6. Architecture
  7. Evaluation
  8. Technologies Used
  9. Video Presentation

Features

  • Natural Language Querying: Ask questions about GU courses and programs in plain English.
  • Contextual RAG System: Retrieves relevant information from a local database of course and program details.
  • Multi-Querying and Logic Routing: Handles ambiguous questions by generating several query variants and routing them to get precise answers.
  • Scalable: Built to handle a large volume of course and program data.
  • Efficient Retrieval: Reduces time spent searching for course or program information manually.
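The multi-querying feature listed above can be sketched as follows. This is a toy illustration under stated assumptions: the corpus text is invented, `keyword_retriever` is a word-overlap stand-in for the embedding-based retriever, and the query variants are written by hand where the real system would presumably have a language model generate them.

```python
from typing import Callable, Dict, List

# Toy corpus standing in for the course/program database (invented text).
DOCS: Dict[str, str] = {
    "doc1": "entry requirements for the course include programming experience",
    "doc2": "assessment is a written exam and a group project",
    "doc3": "learning objectives cover retrieval and language models",
}


def keyword_retriever(query: str, k: int = 1) -> List[str]:
    """Rank documents by shared words with the query; return the top-k ids."""
    words = set(query.lower().split())
    scored = [(len(words & set(text.split())), doc_id) for doc_id, text in DOCS.items()]
    # Highest overlap first; ties broken by document id for determinism.
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in ranked[:k] if score > 0]


def multi_query_retrieve(
    variants: List[str], retrieve: Callable[[str], List[str]]
) -> List[str]:
    """Run every query variant and merge the results, keeping first-seen order."""
    merged: List[str] = []
    for variant in variants:
        for doc_id in retrieve(variant):
            if doc_id not in merged:
                merged.append(doc_id)
    return merged


# Hand-written rephrasings of one underlying question (hypothetical).
variants = ["how is the course graded", "what is the assessment exam"]
results = multi_query_retrieve(variants, keyword_retriever)

if __name__ == "__main__":
    print(results)
```

Here the first phrasing only surfaces the entry-requirements document, while the second phrasing retrieves the assessment document that actually answers the question. Merging both result sets is what makes the approach more robust to ambiguous wording.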

Getting Started

These instructions will help you set up a local copy of GuPT for development and testing purposes.

Prerequisites

  • Python 3.8+: Ensure you have Python installed.
  • pip: Python package manager.
  • OpenAI API Key: Required for embedding and text generation. Obtain one from OpenAI's website.

Installation

  1. Clone the Repository
git clone https://github.com/faerazo/DIT247-NLP-Final-Project.git
cd DIT247-NLP-Final-Project
  2. Set Up Your .env File

Create a file named .env in the project root and include your OpenAI API Key:

OPENAI_API_KEY=[YOUR_API_KEY]
  3. Install Required Libraries
pip install -r requirements.txt
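
Before starting the chatbot, it can help to confirm that the key from the .env step is actually readable. The following is a small standalone check using only the standard library. The helper names `read_env_file` and `api_key_is_set` are made up for this example, and the project itself may load the file differently (for instance through a dotenv library).

```python
import os
from pathlib import Path


def read_env_file(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def api_key_is_set(path: str = ".env") -> bool:
    """True if OPENAI_API_KEY is non-empty and not the README placeholder."""
    key = os.environ.get("OPENAI_API_KEY") or read_env_file(path).get("OPENAI_API_KEY", "")
    return bool(key) and key != "[YOUR_API_KEY]"


if __name__ == "__main__":
    print("OPENAI_API_KEY found" if api_key_is_set() else "OPENAI_API_KEY missing")
```

Run it from the project root so that the relative path .env resolves to the file created in step 2.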

Usage

Once you have the environment set up and the necessary dependencies installed, you can run GuPT and interact with the RAG Chatbot.

  1. Start the GuPT RAG Chatbot
python rag.py
  2. Ask Your Questions

Simply type your question or query into the chatbot interface or use one of the provided template questions.


Data Collection

Data on GU courses and programs is crawled from the GU website and stored in the data folder. The process is summarized in the following diagram:

[Figure: Data Collection diagram]


Architecture

The architecture of GuPT is shown in the following diagram:

[Figure: Architecture diagram]


Evaluation

To evaluate GuPT’s responses on the test set (or a subset of it), run the following command:

python run_evaluation.py --subset 3

Here, --subset 3 selects the subset of the test data to evaluate. Adjust this value as needed.


Technologies Used


Video Presentation

GuPT Video Presentation
