Note: most of the information in this README is still a work in progress.
Before building the server with Docker, you need to create a private key and a certificate to enable HTTPS for secure communication. A key and certificate are required even for testing purposes.
Use the following command to generate a private key:

```bash
openssl genrsa -out key.pem 2048
```
- `key.pem`: file containing the private key.
- `2048`: length of the key in bits (2048 is a secure standard).
View the generated key with:

```bash
cat key.pem
```
Generate a self-signed certificate using the private key (for testing purposes):

```bash
openssl req -new -x509 -key key.pem -out cert.pem -days 365
```
During execution, you will be asked to enter some information:
- Country Name (2 letter code): e.g., `US`
- State or Province Name: e.g., `California`
- Locality Name: e.g., `San Francisco`
- Organization Name: your organization or project name
- Common Name: use `localhost` for local testing
View the self-signed certificate with:

```bash
cat cert.pem
```
For production use, a valid certificate issued by a trusted Certificate Authority (CA) is required to ensure a secure connection. Certificate verification is disabled in the client for testing purposes only.
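For example, if your client is written in Python, skipping verification of the self-signed certificate during local testing could look like this (a minimal sketch; the host and port are examples, and this must never be done in production):

```python
import socket
import ssl

# Testing only: accept the self-signed certificate without verification.
context = ssl.create_default_context()
context.check_hostname = False
context.verify_mode = ssl.CERT_NONE

# Wrap a plain TCP connection to the server (host/port are examples).
with context.wrap_socket(socket.create_connection(("localhost", 8000))) as conn:
    print(conn.version())  # e.g. "TLSv1.3" once the handshake succeeds
```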
Make sure Docker is installed. Follow the official Docker Installation Guide if needed.
Clone the repository:

```bash
git clone https://github.com/dariopellegrino00/whisper_realtime_server.git
```
- Navigate to the project root directory:

  ```bash
  cd whisper_realtime_server
  ```

- Build the Docker image:

  ```bash
  docker build -t whisper_realtime_server .
  ```

- Run the Docker container with GPU support and port mapping:

  ```bash
  docker run --gpus all -p 8000-8050:8000-8050 --name whisper_server whisper_realtime_server
  ```

- You can change the port range `8000-8050` if needed.
- The server is now running and ready to accept connections. You can access it at port 8000 using the `client.py` script.
- To stop the Docker container:

  ```bash
  docker stop whisper_server
  ```

- To restart the Docker container:

  ```bash
  docker start whisper_server
  ```
For now, avoid modifying the `config.json` file. If you need to experiment, it is advisable to adjust only the model size parameter.
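If you do experiment, a quick way to inspect the configuration from Python before touching anything is sketched below; note that the exact key names (such as a model-size field) depend on the project's `config.json` and are assumptions here:

```python
import json

# Load the server configuration (run from the project root).
with open("config.json") as f:
    config = json.load(f)

# Inspect the available keys before changing anything.
print(json.dumps(config, indent=4))

# Hypothetical edit: the real key name for the model size may differ.
# config["model_size"] = "small"
# with open("config.json", "w") as f:
#     json.dump(config, f, indent=4)
```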
The NVIDIA Developer Kit is required for GPU support. The server has been tested with CUDA 12.x and cuDNN 9, as specified in the Dockerfile. The Whisper Streaming project has been tested with CUDA 11.7 and cuDNN 8.5.0, so it is recommended to use at least CUDA 11.7 and cuDNN 8.5.0.
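To confirm that CUDA is actually visible from Python, you can query CTranslate2, the inference backend used by faster-whisper (this assumes the `ctranslate2` package is installed, as it is in the Docker image):

```python
# Sanity check: count the CUDA devices visible to CTranslate2,
# the inference backend used by faster-whisper.
import ctranslate2

print("CUDA devices visible:", ctranslate2.get_cuda_device_count())
```

A result of `0` means no GPU is usable from the current environment.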
Before setting up your own client, it's important to understand the server architecture. The client first connects to a layer server on the default port (8000). After connecting, the layer server assigns a port number to the client. The client then connects to the same host on the assigned port, streams audio data to this port, and receives real-time transcriptions.
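For illustration, the handshake could look roughly like this from the client side. This is a minimal sketch, not the project's `client.py`: it assumes the assigned port arrives as a plain UTF-8 string, that raw audio bytes are streamed afterwards, and that transcriptions come back on the same socket; check `client.py` for the actual wire format.

```python
import socket
import ssl

HOST = "localhost"
LAYER_PORT = 8000

# Testing-only TLS context (see the HTTPS section above).
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

# 1. Ask the layer server for an assigned port.
with ctx.wrap_socket(socket.create_connection((HOST, LAYER_PORT))) as ctrl:
    assigned_port = int(ctrl.recv(64).decode().strip())

# 2. Reconnect on the assigned port, stream audio, read transcriptions.
with ctx.wrap_socket(socket.create_connection((HOST, assigned_port))) as stream:
    with open("audio.raw", "rb") as audio:  # hypothetical raw PCM file
        while chunk := audio.read(4096):
            stream.sendall(chunk)
            reply = stream.recv(4096)
            if reply:
                print(reply.decode(errors="replace"))
```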
- Navigate to the `src` directory:

  ```bash
  cd src
  ```
- Run the server directly with Python:

  ```bash
  python3 layer_server.py
  ```
- To use a microphone for audio input:

  ```bash
  python3 client.py
  ```
- To simulate audio streaming from a file (see the sketch after this list):

  ```bash
  python3 client.py <filepath>
  ```
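To get a feel for what simulating real-time streaming involves, here is a rough sketch of pacing a WAV file over a socket at playback speed. It is an illustration only: the file name, port, and plain (non-TLS) socket are assumptions, and `client.py` handles the real handshake and audio format.

```python
import socket
import time
import wave

CHUNK_FRAMES = 1024  # audio frames per packet

# Hypothetical input file and port; client.py handles the real handshake.
with wave.open("sample.wav", "rb") as wav:
    chunk_seconds = CHUNK_FRAMES / wav.getframerate()
    with socket.create_connection(("localhost", 8001)) as sock:
        while data := wav.readframes(CHUNK_FRAMES):
            sock.sendall(data)
            time.sleep(chunk_seconds)  # pace the stream at playback speed
```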
- This project uses parts of the Whisper Streaming project; other projects it builds on are credited in its repository: whisper streaming.
- Credits also to: faster whisper.
This project is still in an early stage of development, and there may be significant bugs or issues. All contributions are welcome and greatly appreciated! If you'd like to contribute, here's how you can help:
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Submit a pull request with a clear description of your changes.
For major changes, please open an issue first to discuss what you'd like to change. Thank you for helping improve this project and making it better for everyone!
- Rapidfuzz token confirmation
- Custom environment setup
- gRPC implementation
- Remove unused packages in Dockerfile and requirements
- Fix a rare issue where words from one client can end up in another client's buffer