This project, "Emotion Feedback," aims to recommend conversation topics for video chats based on real-time emotion analysis.
| 이준범 | 이원재 | 권영우 | 심재호 |
| --- | --- | --- | --- |
| Backend, Signaling Server | Model Server | Frontend | Modeling |
| ss7622 | Lee-wonjae | kwonup | JaehoSim98 |
- 2024-03-02 ~ 2024-06-14
"Emotion Feedback" is a system designed to alleviate awkward atmospheres and lack of conversation topics during video chats by recommending topics based on real-time emotion analysis.
- Awkward atmosphere
- Lack of conversation topics
- Emotion-based conversation topic recommendation
- Real-time Emotion Analysis: Analyze and store the emotions of the participants in real-time.
- Emotion-based Topic Recommendation: Recommend conversation topics based on the stored emotions and conversation content.
- Understanding Partner's Favorability: After the conversation ends, analyze the conversation partner's favorability and display it as a graph.
- Video Call: Users start a video call through the system.
- Data Collection and Analysis:
  - Image Analysis: Analyze the user's facial expressions captured from the video.
  - Text Analysis: Convert the conversation into text and analyze the sentiment.
  - Audio Analysis: Analyze the tone and speed of the user's voice to determine the emotional state.
- Data Processing:
  - Preprocessing images to fit CNN models.
  - Converting and preprocessing audio and text data for analysis.
- Favorability Detection:
  - Combining data from text, image, and audio analysis to evaluate the favorability between users.
- Real-time Topic Recommendation:
  - Using stored data and GPT API to recommend conversation topics in real-time.
- Data Storage and Feedback:
  - Storing analyzed text data and favorability scores in a database.
  - Providing feedback to users based on stored data after the conversation ends.
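
The Favorability Detection step combines text, image, and audio results, but the README does not say how. One simple way it could be done is a weighted average of per-modality scores. The sketch below is only an illustration under stated assumptions: the function name `fuse_favorability`, the weights in `WEIGHTS`, and the convention that each model emits a score between 0 and 1 are all hypothetical and not taken from the project.

```python
# Hypothetical weights: the project does not state how modalities are combined.
WEIGHTS = {"image": 0.4, "audio": 0.3, "text": 0.3}


def fuse_favorability(scores, weights=WEIGHTS):
    """Return a weighted average of per-modality scores in [0, 1].

    Modalities missing from `scores` are skipped and the remaining
    weights are renormalized, so a dropped audio frame does not drag
    the result toward zero.
    """
    available = {m: s for m, s in scores.items() if m in weights}
    if not available:
        raise ValueError("no known modality scores supplied")
    for modality, score in available.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"{modality} score {score} is outside [0, 1]")
    total_weight = sum(weights[m] for m in available)
    weighted_sum = sum(weights[m] * s for m, s in available.items())
    return weighted_sum / total_weight
```

Calling `fuse_favorability({"image": 0.8, "audio": 0.6, "text": 0.7})` yields one favorability value per analysis window. Storing those values over time would give the series needed for the post-conversation favorability graph.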
- Frontend: React, Figma, WebRTC, MediaStream API
- Backend: FastAPI, SpringBoot, WebSocket, PostgreSQL, AWS
- Modeling: CNNs for image, audio, and text analysis; LangChain for natural language processing
The emotion detection models reached 78% accuracy for images and 82% for audio/text. Further improvements are planned, including expanding the dataset, refining the models, and exploring additional applications such as enhancing conversational flow and user engagement.
- STT (speech-to-text) Delay: Mitigated by adding a 3-second delay to accommodate processing time.
- Model Accuracy: Improved by modifying preprocessing steps and model parameters, achieving 78% accuracy for image models and 82% for audio/text models.
- Real-time Processing: Achieved using asynchronous processing with FastAPI to handle model server communications.
- Increase the amount of quality data to improve model performance.
- Explore additional applications, such as enhancing conversational flow and user engagement.
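
The Real-time Processing item above mentions asynchronous communication with the model servers. A minimal sketch of that idea, using only Python's standard `asyncio`, is shown below. The helper names (`call_model`, `analyze_segment`, `timed_run`), the simulated latencies, and the `"neutral"` label are hypothetical stand-ins; the project's real model-server calls are not shown in this README.

```python
import asyncio
import time


async def call_model(name, delay):
    """Stand-in for a request to one model server (hypothetical)."""
    await asyncio.sleep(delay)  # simulates network and inference latency
    return name, "neutral"      # placeholder emotion label


async def analyze_segment():
    """Query the three modality models concurrently instead of one by one."""
    results = await asyncio.gather(
        call_model("image", 0.1),
        call_model("audio", 0.1),
        call_model("text", 0.1),
    )
    return dict(results)


def timed_run():
    """Run one analysis pass and report how long it took."""
    start = time.perf_counter()
    result = asyncio.run(analyze_segment())
    return result, time.perf_counter() - start
```

Because the three calls overlap, one pass takes roughly as long as the slowest model rather than the sum of all three. FastAPI route handlers declared with `async def` run on an asyncio event loop, so the same `asyncio.gather` pattern could be awaited directly inside such a handler.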