This tool helps you carry out qualitative data analysis. For instance, it can take your transcripts and generate codes and themes for you. It also summarizes your data and helps you draw insights from it.
The tool uses Large Language Models (LLMs) to generate and classify the summaries, codes, and themes. The application is developed in Python.
This tool is powered by libraries:
- Streamlit: For the user interface
- Langchain: For building LLM applications
- OpenAI: The LLM provider. For now, this is the only provider integrated.
You need to have Python 3 installed on your computer; the latest version is recommended. The version tested is 3.8.10.
Rename the .streamlit/secrets_template.toml file to .streamlit/secrets.toml and edit it to add your own configuration for LangChain and LangSmith, along with your OpenAI API key.
Clone the repository and install the dependencies:
git clone
cd qualitative-data-analysis
pip install -r requirements.txt
streamlit run source/qualitative_analyse_agent.py
The usage is pretty simple.
- Upload your transcripts: You can upload your transcripts from the sidebar.
- Generate transcript summaries: In the Raw data section, you can generate a summary of each transcript individually.
- Enter your research question: It will be used to generate the codes and themes.
- Generate codes and themes: Click the button to generate codes and themes based on your research question.
You can use LangSmith to monitor your application and get insights into how it is used.
Edit the .streamlit/secrets.toml file and add the following lines:
[langsmith]
tracing = true
api_url = "https://api.smith.langchain.com"
api_key = "your key here"
project = "your project here"
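For context, LangChain tracing with LangSmith is typically switched on through environment variables. The sketch below shows one way the `[langsmith]` section could be mapped onto them. The helper name `langsmith_env` is hypothetical, and reading the values from `st.secrets["langsmith"]` is an assumption, so this is not necessarily how the application wires it internally.

```python
import os
from typing import Mapping


def langsmith_env(section: Mapping[str, object]) -> dict:
    """Translate a [langsmith] secrets section into LangChain tracing variables."""
    return {
        "LANGCHAIN_TRACING_V2": "true" if section.get("tracing") else "false",
        "LANGCHAIN_ENDPOINT": str(section["api_url"]),
        "LANGCHAIN_API_KEY": str(section["api_key"]),
        "LANGCHAIN_PROJECT": str(section["project"]),
    }


# Example values mirroring the TOML section; in a Streamlit app they would
# presumably come from st.secrets["langsmith"] (assumption).
example = {
    "tracing": True,
    "api_url": "https://api.smith.langchain.com",
    "api_key": "your key here",
    "project": "your project here",
}
env = langsmith_env(example)
os.environ.update(env)  # make the settings visible to LangChain
```

Setting the variables before any chain is created keeps the tracing configuration in one place.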
TODO Show a diagram with the interaction between libs
TODO show the LLM Chaining
- Upload your transcripts
- Generate a summary of all the data or of a specific transcript
- Based on a research question, generate a summary of the data, then generate codes and themes
- Update the Qualitative Data Analysis parameters
- Generate a Qualitative Data Analysis report and download it
- For now, the tool cannot perform a Qualitative Data Analysis on large datasets, as the LLM used is limited to 16,000 tokens.
- Neither the data nor the report is cached. If you reload the page, the data has to be uploaded again and the report regenerated.
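To get a feel for the token limit, here is a minimal sketch that estimates whether a transcript is likely to fit. It assumes roughly 4 characters per token, a common rule of thumb for English text; a real tokenizer will give different counts. The function names and the `reserved_for_answer` margin are hypothetical and not part of the application.

```python
TOKEN_LIMIT = 16_000   # context limit quoted in the limitations list
CHARS_PER_TOKEN = 4    # rough rule of thumb for English text (assumption)


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(text: str, reserved_for_answer: int = 2_000) -> bool:
    """Return True if the text likely fits, leaving room for the prompt and reply."""
    return estimate_tokens(text) <= TOKEN_LIMIT - reserved_for_answer
```

A transcript that fails this check would need to be shortened or split before analysis.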
- Upload voice recordings, convert them to text, and perform a Qualitative Data Analysis on them
- Connect to Qualitative Data Analysis software
- (double check) Run intermediate checks on the results to avoid LLM bias
- Perform map-refine summary on the data
- Handle large data transcripts
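As an illustration of the last two roadmap items, the sketch below splits a long transcript into chunks and folds them into a running summary, one chunk at a time (a refine-style pass). It is not the tool's implementation: `split_into_chunks`, `refine_summary`, and the `summarize` callable are hypothetical names, and `summarize` stands in for whatever LLM call would update the previous summary with a new chunk.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int) -> List[str]:
    """Split text on paragraph boundaries into chunks of at most max_chars."""
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single oversized paragraph is cut into fixed-size pieces.
        while len(paragraph) > max_chars:
            chunks.append(paragraph[:max_chars])
            paragraph = paragraph[max_chars:]
        current = paragraph
    if current:
        chunks.append(current)
    return chunks


def refine_summary(chunks: List[str], summarize: Callable[[str, str], str]) -> str:
    """Fold chunks into a running summary: summarize(previous_summary, chunk)."""
    summary = ""
    for chunk in chunks:
        summary = summarize(summary, chunk)
    return summary
```

Because each call only sees the previous summary and one chunk, the size of any single request stays bounded regardless of the transcript length.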
My name is Valentin Rudloff and I'm an engineer. I make stuff in various fields. For her thesis, my wife needed a tool to help her run a Qualitative Data Analysis on the transcripts she had collected. LLMs are really good at understanding human semantics and are therefore well suited to Qualitative Data Analysis. This application gave her almost instant results, and I'm pretty sure it can help you as well.
The application and the LLM prompt are greatly inspired by Dr. Philip Adu's video, Master Qualitative Data Analysis with ChatGPT: An 18-Minute Guide.
Made with ❤️ by Valentin Rudloff. If you want to help me create other stuff like this, you can buy me a coffee.