# FinAI

## Table of Contents

- Introduction
- Problem Statement
- Working (Pipeline Stages)
- Results
- Files & Structure
- Installation
- Tech Stack
- References
- Future Scope
- Contributing
## Introduction

FinAI is a Retrieval-Augmented Generation (RAG) based equity news analyzer that simplifies information retrieval for investors, analysts, and financial researchers. Built with LangChain, OpenAI, Gemini, FAISS, and local LLM backends, it lets users input article URLs and query them in natural language.
## Problem Statement

- Equity research is manual, fragmented, and time-consuming.
- Analysts must browse multiple sources by hand and interpret insights themselves.
- LLMs like ChatGPT alone cannot efficiently handle multi-source inputs, large documents, or real-time querying.
- There is a need for a tool that can ingest articles, process them intelligently, and provide accurate, real-time answers.
## Working (Pipeline Stages)

1. **Data Ingestion**: News article URLs are fetched using `SeleniumURLLoader`.
2. **Text Splitting**: Articles are chunked using LangChain's `RecursiveCharacterTextSplitter` to fit LLM token limits.
3. **Embeddings & Vector Store**: Embeddings are created via OpenAI or HuggingFace models (such as `all-MiniLM-L6-v2`); FAISS stores and retrieves similar content based on queries.
4. **Querying via LLMs**: User queries are answered by OpenAI/Gemini/GPT4All/LLaMA-2 LLMs via `RetrievalQAWithSourcesChain`; local models (`llama-cpp-python`, `gpt4all`) enable offline support.
5. **Answer + Source Display**: Source-linked responses are shown via Streamlit.
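The retrieval core of the pipeline (stages 3 and 4) can be sketched without any external dependencies. Below, a toy bag-of-words count vector stands in for the `all-MiniLM-L6-v2` embeddings and a brute-force cosine search stands in for FAISS; all names and sample data are illustrative, not FinAI's actual code.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': a word-count vector (the real pipeline uses a SentenceTransformer)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class ToyVectorStore:
    """Brute-force similarity search standing in for FAISS."""
    def __init__(self):
        self._rows = []  # (vector, chunk_text, source_url)

    def add(self, chunk: str, source: str) -> None:
        self._rows.append((embed(chunk), chunk, source))

    def search(self, query: str, k: int = 2):
        q = embed(query)
        ranked = sorted(self._rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [(chunk, source) for _, chunk, source in ranked[:k]]

store = ToyVectorStore()
store.add("Tata Motors shares rose 5% after strong quarterly earnings.", "https://example.com/tata")
store.add("RBI kept the repo rate unchanged at 6.5% this quarter.", "https://example.com/rbi")

# The top hit keeps its source URL attached, mirroring RetrievalQAWithSourcesChain's
# source-linked answers.
hits = store.search("What happened to Tata Motors shares this quarter?", k=1)
```

In the actual app, the retrieved chunks and their source URLs are passed to the chosen LLM backend, which composes the final answer shown in Streamlit.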
## Results

- The tool returns exact, accurate answers to straightforward (direct) queries.
- The same pipeline was customized to run on both online APIs (OpenAI, Gemini) and offline models (LLaMA, GPT4All).

### Benchmark Table
| Backend | Avg Latency | QA Relevance | Token Cost | Use-Case Fit |
|---|---|---|---|---|
| OpenAI (gpt-3.5-turbo) | ~2.1s | 96.4% | High (Paid) | Best for fast, high-quality responses |
| Gemini Pro | ~2.8s | 92.1% | Free (limited) | Good fallback; prone to hallucination |
| Local LLaMA (7B) | ~5.3s | 93.2% | None | Reliable offline QA; requires setup |
| GPT4All (q4_0) | ~7.2s | 86.5% | None | Works offline; lower accuracy in deep QA |
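Latency figures like those above can be reproduced with a small timing harness. This is an illustrative sketch, not FinAI's actual benchmarking code: `backend` is assumed to be any callable that answers one query, and the function name is ours.

```python
import time

def average_latency(backend, queries, runs_per_query=3):
    """Mean wall-clock seconds per backend call, averaged over all queries and runs."""
    total, calls = 0.0, 0
    for q in queries:
        for _ in range(runs_per_query):
            start = time.perf_counter()
            backend(q)  # e.g. lambda q: chain({"question": q}) for a LangChain setup
            total += time.perf_counter() - start
            calls += 1
    return total / calls
```

Running the same fixed query set against each backend gives comparable per-backend numbers.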
## Files & Structure

- `app_versions/`: Streamlit app versions for the different LLM backends (OpenAI, Gemini, GPT4All, and LLaMA).
- `data_files/`: Sample article text files and URL lists used during experimentation.
- `notebooks/`: Jupyter notebooks demonstrating individual components of the RAG pipeline (e.g., vector store testing, embeddings, chunking).
- `test/`: Debugging and testing scripts for the Gemini and LLaMA-based app flows.
- `.env`: Stores environment variables such as API keys for OpenAI and Gemini.
- `faiss-store-hf.pkl`: Vector store generated using HuggingFace embeddings.
- `faiss-store-openai.pkl`: Vector store generated using OpenAI embeddings.
- `vector-index.pkl`: Sample vector index created in a notebook for FAISS validation.
- `main.py`: Primary file containing the final UI code after experimentation.
- `requirements.txt`: Python dependencies required for running the project.
- `README.md`: Documentation and usage guide for the project.
- `models/`: 🔐 Not uploaded; should contain the downloaded local LLMs (see Installation).
## Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/your-username/FinAI.git
   cd FinAI
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Set up API keys. Create a `.env` file modeled on the reference file provided and add:

   ```
   OPENAI_API_KEY=your-key-here
   GOOGLE_API_KEY=your-google-studio-api-key
   ```

   - OpenAI key: https://platform.openai.com/account/api-keys
   - Gemini key: https://aistudio.google.com/app/apikey (enable the Gemini API in the Google Cloud Console)
4. For local LLM usage:

   - Download `.gguf` models from LLaMA HF or GPT4All.
   - Create a `models/` directory and place them inside.
   - Update `model_path` in the corresponding app files (e.g., `app_local_llama.py`).

5. Run the tool:

   ```bash
   streamlit run main.py
   ```
## Tech Stack

- LangChain: orchestration of the RAG pipeline
- Streamlit: interactive web interface
- OpenAI & Gemini APIs: cloud-based LLMs
- LLaMA / GPT4All: local LLMs
- HuggingFace Embeddings: SentenceTransformers (`all-MiniLM-L6-v2`)
- FAISS: vector similarity search and storage
- Python + Selenium: document scraping and automation
## References

- OpenAI Platform
- Gemini API Studio
- Google Cloud Console
- HuggingFace Embedding Model
- LLaMA GGUF HF Models
- GPT4All Local Models
- Other HuggingFace Models
- Streamlit Docs
## Future Scope

- Real-time financial API integration (e.g., stock prices, reports)
- LLM-based summarization for multi-source insights
- Domain-tuned custom LLMs for financial jargon
- Globalization support via multi-language ingestion
## Contributing

We welcome contributions! Feel free to:

- Fork the repo
- Create a new branch
- Submit a PR with your changes or improvements
We hope FinAI helps you gain actionable insights with less effort. If you like it, give the repo a ⭐ and feel free to reach out for suggestions or ideas!