This is a full-stack application that records users talking about their interests and uses AI to categorize those interests. The app records a short audio clip, transcribes it, and analyzes the content to understand the user's interests. The frontend is built with React + TypeScript + Vite, and the backend is built with Flask.
You have 2.5 hours to improve the application by fixing bugs and adding standard features. The basic recording and mock transcription functionality is implemented but needs work. We'll evaluate your technical implementation, feature prioritization, and problem-solving approach. Please document your thought process and decisions in `notes.md` as you work.
Time is limited, and there is a lot here, so please focus on the tasks that seem most important to you. If there's something you want to do, or would do in production, but don't think is a good use of time in this challenge, write about it in `notes.md`.
We're an AI startup; we love AI and use it all the time. Use it for everything but your `notes.md` file. That document should be entirely in your own words and doesn't need to be formal. If your `notes.md` is full of AI-generated content, we'll know.
## Project Structure

- `frontend/`: React TypeScript application
- `backend/`: Flask Python application
## Frontend Setup

- Navigate to the frontend directory:

  ```bash
  cd frontend
  ```

- Install dependencies:

  ```bash
  npm install
  ```

- Start the development server:

  ```bash
  npm run dev
  ```

The frontend will be available at http://localhost:5174
## Backend Setup

- Navigate to the backend directory:

  ```bash
  cd backend
  ```

- Create and activate a virtual environment (optional but recommended):

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows, use: venv\Scripts\activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Start the Flask server:

  ```bash
  python app.py
  ```

The backend will be available at http://localhost:8000
## Challenges

You have 2.5 hours to improve this application by solving the following challenges:
### 1. Fix the recording timer

The audio recorder component has a bug: the recording timer doesn't increment, and the recording doesn't automatically stop after reaching the maximum time limit of 10 seconds.
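A common culprit for this kind of bug is a `setInterval` callback that reads stale state from its closure, plus a missing effect cleanup. The repo's actual component will differ, but a rough sketch of the usual fix (the hook and prop names here are hypothetical) looks like:

```ts
import { useEffect, useState } from "react";

const MAX_RECORDING_SECONDS = 10; // the limit named in the challenge

// Hypothetical hook; the real component's names will differ.
function useRecordingTimer(isRecording: boolean, onMaxReached: () => void) {
  const [elapsed, setElapsed] = useState(0);

  // Tick once per second while recording.
  useEffect(() => {
    if (!isRecording) return;
    setElapsed(0);
    const id = window.setInterval(() => {
      // Functional update avoids reading a stale `elapsed` from the closure.
      setElapsed((prev) => prev + 1);
    }, 1000);
    return () => window.clearInterval(id); // cleanup prevents orphaned timers
  }, [isRecording]);

  // Auto-stop at the limit, kept outside the updater so it stays pure.
  useEffect(() => {
    if (isRecording && elapsed >= MAX_RECORDING_SECONDS) onMaxReached();
  }, [isRecording, elapsed, onMaxReached]);

  return elapsed;
}
```

The functional `setElapsed((prev) => prev + 1)` is the key part: incrementing a variable captured at mount time would leave the timer stuck at its initial value.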
### 2. Transcription loading indicator

The application currently provides no feedback while audio is being transcribed, which can take a while. Add a loading indicator (a minimal sketch follows this list) that:
- Shows when transcription processing begins
- Displays while waiting for the transcription result
- Gracefully transitions when the transcription completes
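One way to structure this is an explicit status value that the UI renders from. A minimal sketch, assuming a `/api/transcribe` endpoint and a `transcript` field in the response (both assumptions about the repo):

```ts
import { useState } from "react";

type TranscriptionStatus = "idle" | "transcribing" | "done" | "error";

// Hypothetical hook; endpoint and response shape are assumptions.
function useTranscription() {
  const [status, setStatus] = useState<TranscriptionStatus>("idle");
  const [transcript, setTranscript] = useState<string | null>(null);

  async function transcribe(audio: Blob) {
    setStatus("transcribing"); // loading indicator turns on here
    try {
      const body = new FormData();
      body.append("audio", audio);
      const res = await fetch("/api/transcribe", { method: "POST", body });
      if (!res.ok) throw new Error(`HTTP ${res.status}`);
      const data = await res.json();
      setTranscript(data.transcript);
      setStatus("done"); // indicator transitions out
    } catch {
      setStatus("error");
    }
  }

  return { status, transcript, transcribe };
}
```

The component can then show a spinner whenever `status === "transcribing"` and animate it away when the status flips to `done`.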
### 3. API version compatibility

In `frontend/services/APIService`, there is a stubbed-out version compatibility system. Finish implementing it, and make the backend changes needed to ensure compatibility (a sketch of the frontend side follows this list):
- Modify the frontend code so that all communication with the backend is handled by `APIService`
- Add version tracking between the frontend and backend
- Implement a mechanism to detect version mismatches
- Prompt users to refresh when the backend version changes
- Gracefully reject requests to the backend when the frontend version is stale
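The stub in the repo will dictate the details, but a common pattern is to funnel every call through one `request` method that exchanges version headers. In this sketch the header names, the version constant, and the 409-on-stale-client convention are all assumptions, not the repo's actual API:

```ts
// Hypothetical sketch; none of these names come from the repo's stub.
const FRONTEND_VERSION = "1.0.0";

class APIService {
  private lastSeenBackendVersion: string | null = null;

  async request<T>(path: string, init: RequestInit = {}): Promise<T> {
    const headers = new Headers(init.headers);
    headers.set("X-Client-Version", FRONTEND_VERSION); // lets the backend spot stale clients

    const res = await fetch(path, { ...init, headers });

    // Detect a backend deploy: the reported version changed mid-session.
    const backendVersion = res.headers.get("X-Server-Version");
    if (
      this.lastSeenBackendVersion !== null &&
      backendVersion !== null &&
      backendVersion !== this.lastSeenBackendVersion
    ) {
      window.alert("A new version is available. Please refresh the page.");
    }
    this.lastSeenBackendVersion = backendVersion ?? this.lastSeenBackendVersion;

    if (res.status === 409) {
      // Assumed convention: backend rejects requests from stale frontends.
      throw new Error("Client version is out of date; refresh required.");
    }
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    return (await res.json()) as T;
  }
}
```

On the Flask side, a `before_request` hook comparing `X-Client-Version` against the backend's own version and returning 409 on mismatch would complete the loop.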
### 4. Concurrent transcription jobs

Currently, the application can only process one audio transcription at a time. Modify the system to:
- Allow multiple recordings to be processed simultaneously without blocking other API requests
- Show progress/status for each transcription job on the frontend
- Handle errors gracefully
Note: In production, this concurrency would be handled by a more sophisticated system. For this challenge, feel free to use in-memory state storage and simple threading to simulate concurrent processing of jobs. Discuss in your notes what you might do to improve this in a production environment.
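Under those simplifying assumptions, the backend side can be sketched as a job table guarded by a lock (the `transcribe` stub stands in for the app's real transcription call):

```python
import threading
import time
import uuid

# In-memory job store plus simple threading, as the challenge suggests.
# In production you'd reach for a real task queue (Celery/RQ) and a
# shared store like Redis instead.
jobs: dict[str, dict] = {}
jobs_lock = threading.Lock()

def transcribe(audio_bytes: bytes) -> str:
    """Stand-in for the app's real (slow) transcription call."""
    time.sleep(3)
    return "mock transcript"

def start_transcription_job(audio_bytes: bytes) -> str:
    """Kick off a background transcription and return a job id immediately."""
    job_id = str(uuid.uuid4())
    with jobs_lock:
        jobs[job_id] = {"status": "processing", "result": None}

    def worker() -> None:
        try:
            status, payload = "done", transcribe(audio_bytes)
        except Exception as exc:  # surface failures to the frontend
            status, payload = "error", str(exc)
        with jobs_lock:
            jobs[job_id] = {"status": status, "result": payload}

    threading.Thread(target=worker, daemon=True).start()
    return job_id
```

A hypothetical `GET /api/jobs/<job_id>` endpoint that reads from `jobs` then gives the frontend something to poll for per-job status.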
### 5. Simple user identity

We currently have no way to identify users. Add a very simple identity system that:
- Allows users to be identified by a unique id that is generated by the backend
- Stores the user id in local storage on the frontend
- Sends the user id with every request
- Logs the user id with every request
Note: Keep this super simple, and don't worry about things like user authentication or authorization or security.
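A minimal sketch of the backend half; the `X-User-Id` header and `/api/users` route are naming assumptions, not existing code:

```python
import logging
import uuid

from flask import Flask, g, jsonify, request

app = Flask(__name__)
logging.basicConfig(level=logging.INFO)

@app.post("/api/users")
def create_user():
    # Backend-generated id; the frontend keeps it in localStorage
    # and sends it back on every request.
    return jsonify({"user_id": str(uuid.uuid4())})

@app.before_request
def log_user_id():
    # Stash the id for handlers and log it alongside every request.
    g.user_id = request.headers.get("X-User-Id", "anonymous")
    app.logger.info("user=%s %s %s", g.user_id, request.method, request.path)
```

On the frontend, a one-time `localStorage.getItem`/`setItem` around that endpoint covers the storage requirement.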
### 6. AI categorization

It's 2024, and even a simple application like this isn't cool until we sprinkle some AI on it. Add a simple categorization system that:
- Uses the `get_user_model_from_db` function to get the user's preferred LLM provider
- Uses the LLM to categorize the transcription
- Encourages the LLM to return its response in JSON format, and somehow validates that it did so

Note: You can just mock this if you don't have access to either of these APIs right now, but we want to see how you approach the prompt engineering and validation, so make sure you mock it robustly. If you do have access and actually implement it, make sure to include documentation of how the API keys should be stored in the `.env` file.
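A mocked sketch of the prompt-plus-validation loop; the category set, prompt wording, and response shape are all invented for illustration:

```python
import json

CATEGORIES = {"sports", "music", "technology", "travel", "food", "other"}

PROMPT_TEMPLATE = (
    "Categorize this transcript of a person describing their interests.\n"
    "Respond with ONLY a JSON object shaped exactly like:\n"
    '{{"category": "<one of: {categories}>", "confidence": <number 0-1>}}\n\n'
    "Transcript: {transcript}"
)

def mock_llm(prompt: str) -> str:
    """Stand-in for a real provider call; returns a well-formed reply."""
    return json.dumps({"category": "music", "confidence": 0.9})

def parse_categorization(raw: str) -> dict:
    """Validate the model's reply; raise ValueError if it isn't what we asked for."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response was not valid JSON: {exc}") from exc
    if not isinstance(data, dict) or data.get("category") not in CATEGORIES:
        raise ValueError(f"unexpected shape or category: {data!r}")
    return data

def categorize(transcript: str, llm_call=mock_llm) -> dict:
    prompt = PROMPT_TEMPLATE.format(
        categories=", ".join(sorted(CATEGORIES)), transcript=transcript
    )
    return parse_categorization(llm_call(prompt))
```

Swapping `mock_llm` for a real provider call leaves the validation path unchanged, and retrying on `ValueError` (re-prompting with the error message) is a cheap robustness win.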
### 7. Caching

People keep saying the same things; we don't really need to make LLM calls with the same transcripts over and over again. Also, the genius engineers who created the database that takes 8 seconds to return the user's preferred LLM provider are annoyed that we keep calling their expensive function over and over again. Let's cache that too.

Implement a simple in-memory caching system on the backend that saves us money and time.
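For a single-process Flask app, the stdlib's `functools.lru_cache` already provides a bounded, thread-safe in-memory cache. A sketch with stand-in functions, since the real signatures in the repo may differ:

```python
import time
from functools import lru_cache

def get_user_model_from_db(user_id: str) -> str:
    """Stand-in for the slow lookup (the real one takes ~8 seconds)."""
    time.sleep(8)
    return "mock-provider"

def call_llm(model: str, transcript: str) -> str:
    """Stand-in for the provider call from the categorization challenge."""
    return '{"category": "music", "confidence": 0.9}'

@lru_cache(maxsize=128)
def get_user_model_cached(user_id: str) -> str:
    # Only the first call per user pays the 8-second cost.
    return get_user_model_from_db(user_id)

@lru_cache(maxsize=256)
def categorize_cached(transcript: str, model: str) -> str:
    # Identical (transcript, model) pairs never hit the LLM twice.
    return call_llm(model, transcript)
```

In production you'd add TTLs and explicit invalidation, and move to a shared cache like Redis once more than one process serves traffic; that trade-off is worth a line in `notes.md`.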
### 8. UI polish

Make the frontend look slightly less ugly.
## Evaluation

We'll be looking at:
- Code quality and organization
- Problem-solving approach and speed
- Discussion of production considerations, optimizations, and pitfalls
- TypeScript/Python best practices
## Submission

- Clone this repository to your own GitHub account into a new PRIVATE repository (we don't want other candidates seeing your code)
- Complete the challenges
- Submit a PR with your changes and your `notes.md` file
- Invite `jbierfeldt` as a collaborator to your repository and email [email protected] with the link
Good luck!