Qualitative Data Analysis done by AI (or LLMs). 🖥️ Streamlit & 🔗 Langchain

Gamma-Software/llm_qualitative_data_analysis

Qualitative Data Analysis 📝

This tool helps you carry out your qualitative data analysis. 🧐 For instance, it can take your transcripts and generate codes and themes for you 💡. It also summarizes your data and can help you draw insights from it. 📊

🔬 Tech stack

The Qualitative Data Analysis tool uses Large Language Models (LLMs) to generate the summaries, codes and themes and to classify them. 🤖 The application is developed in Python.

Python Libraries

This tool is powered by the following libraries:

  • Streamlit: for the user interface 🖥️
  • Langchain: for building LLM applications 🔗
  • OpenAI: the LLM provider. For now, this is the only provider integrated.

Getting started 🏁

Requirements

You need to have Python 3 installed on your computer, preferably a recent release. 🐍 The version tested is 3.8.10.

Configuration

Rename the .streamlit/secrets_template.toml file to .streamlit/secrets.toml and edit it to add your own Langchain and Langsmith configuration and your OpenAI API key.

Installation

Clone the repository and install the dependencies:

git clone
cd qualitative-data-analysis
pip install -r requirements.txt

Run the application

streamlit run source/qualitative_analyse_agent.py

Usage 📖

The usage is pretty simple. 🤓

  1. Upload your transcripts: You can upload your transcripts from the sidebar 📂.
    • Generate transcript summaries: In the Raw data section, you can generate a summary of each transcript individually.
  2. Enter your research question: It will be used to generate the codes and themes. ❓
  3. Generate codes and themes: Click on the button to generate codes and themes based on your research question. 💡

Langsmith integration 🔗

You can use Langsmith to monitor your application and get insights into how it is used. 📊

Edit the .streamlit/secrets.toml file and add the following lines:

[langsmith]
tracing = true
api_url = "https://api.smith.langchain.com"
api_key = "your key here"
project = "your project here"
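
One way these settings could reach Langchain is through the environment variables that Langsmith tracing reads. The sketch below is an assumption about how that wiring might look, not the repository's actual code: apply_langsmith_settings is a hypothetical helper, and the mapping it receives stands in for what st.secrets["langsmith"] would return in the Streamlit app.

```python
import os
from typing import Mapping


def apply_langsmith_settings(settings: Mapping[str, object]) -> None:
    """Export Langsmith settings as environment variables.

    `settings` is assumed to mirror the [langsmith] table shown above
    (keys: tracing, api_url, api_key, project). This helper is hypothetical.
    """
    # Tracing is switched on when LANGCHAIN_TRACING_V2 is the string "true".
    os.environ["LANGCHAIN_TRACING_V2"] = "true" if settings.get("tracing") else "false"
    os.environ["LANGCHAIN_ENDPOINT"] = str(settings["api_url"])
    os.environ["LANGCHAIN_API_KEY"] = str(settings["api_key"])
    os.environ["LANGCHAIN_PROJECT"] = str(settings["project"])
```

The helper would need to run before any chain is created, so that the tracing client sees the variables when it starts.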

Diagrams

Libs

TODO Show a diagram with the interaction between libs

LLM chain

TODO show the LLM Chaining

Features ✨

  • Upload your transcripts 📂
  • Generate a summary of all the data or of a specific transcript 📊
  • Based on a research question, generate a summary of the data as well as codes and themes. ❓
  • Update the Qualitative Data Analysis parameters 🔄
  • Generate a Qualitative Data Analysis report and download it 📄

Limitations ⚠️

  • For now, the tool cannot perform a Qualitative Data Analysis on large datasets, as the LLM used is limited to 16,000 tokens. 🚫
  • Neither the data nor the report is cached. If you reset the page, the data has to be uploaded again and the report regenerated. 🔄
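
To get a feel for the token limit before uploading, a rough size check can help. The sketch below is only an approximation under stated assumptions: it uses a heuristic of about 4 characters per token for English text rather than the model's real tokenizer, and the names estimate_tokens, fits_in_context and reserved_for_prompt are hypothetical and not part of this tool.

```python
from typing import Iterable

MAX_CONTEXT_TOKENS = 16_000  # limit stated in the Limitations list
CHARS_PER_TOKEN = 4  # rough heuristic for English text (assumption)


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate based on the character count."""
    # Ceiling division so that a partial token still counts as one.
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(transcripts: Iterable[str], reserved_for_prompt: int = 2_000) -> bool:
    """Check whether all transcripts together stay within the token budget.

    `reserved_for_prompt` leaves room for instructions and the model's answer;
    its default value is an arbitrary assumption.
    """
    total = sum(estimate_tokens(t) for t in transcripts)
    return total + reserved_for_prompt <= MAX_CONTEXT_TOKENS
```

A real tokenizer would give exact counts; the heuristic is only meant to flag transcripts that are clearly too large.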

Improvements 🚀

  • Upload voice recordings, convert them to text and perform a Qualitative Data Analysis on them 🗣️
  • Connect to Qualitative Data Analysis software 🤝
  • (Double check) Run intermediate checks on the results to avoid LLM bias 🤔
  • Perform a map-refine summary of the data
  • Handle large transcripts
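
The last two items could be approached by splitting a transcript into chunks and refining a running summary chunk by chunk. The sketch below illustrates that idea under assumptions and is not the tool's implementation: split_into_chunks and refine_summary are hypothetical names, the prompts are placeholders, and summarize stands for any callable that sends a prompt to an LLM and returns its text answer.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int) -> List[str]:
    """Split text on blank lines into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept whole as its own chunk.
    """
    chunks: List[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


def refine_summary(chunks: List[str], summarize: Callable[[str], str]) -> str:
    """Summarize the first chunk, then refine the summary with each later chunk."""
    summary = ""
    for chunk in chunks:
        if summary:
            prompt = (
                f"Existing summary:\n{summary}\n\n"
                f"Refine it with this new text:\n{chunk}"
            )
        else:
            prompt = f"Summarize this text:\n{chunk}"
        summary = summarize(prompt)
    return summary
```

Because each call only sees one chunk plus the running summary, the prompt size stays bounded regardless of transcript length, at the cost of one LLM call per chunk.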

Background 🧑‍🎓

My name is Valentin Rudloff and I'm an engineer. I make stuff in various fields. 👨‍🔧 For her thesis (mémoire), my wife needed a tool to help her perform a Qualitative Data Analysis on the transcripts she had gathered. 📚 LLMs are really good at understanding human semantics and can therefore perform a Qualitative Data Analysis. 🧠 This application gave her an almost instant result, and I'm pretty sure it can help you as well. 👍

Acknowledgements 🙏

The application and the LLM prompt are greatly inspired by the video Master Qualitative Data Analysis with ChatGPT: An 18-Minute Guide by Dr. Philip Adu, Ph.D.

Made with ❤️ by Valentin Rudloff. If you want to help me create other stuff like this, you can buy me a ☕