fix

doppeltilde · Jun 9, 2024 · 517fe84
1 parent 2b2543d
commit 517fe84

Showing 4 changed files with 61 additions and 2 deletions.
diff --git a/README.md b/README.md
@@ -0,0 +1,57 @@
+# Automatic Speech Recognition with Faster Whisper
+
+## Stack:
+- [FastAPI](https://fastapi.tiangolo.com)
+- [Python](https://www.python.org)
+- [Docker](https://docker.com)
+
+## Installation
+
+- For ease of use, the provided [docker-compose.yml](https://github.com/doppeltilde/automatic_speech_recognition/blob/main/docker-compose.yml) is recommended.
+```yml
+services:
+  automatic_speech_recognition:
+    image: ghcr.io/doppeltilde/automatic_speech_recognition:latest
+    ports:
+      - "8000:8000"
+    volumes:
+      - models:/root/.cache/huggingface/hub:rw
+    environment:
+      - DEFAULT_ASR_MODEL_NAME
+      - COMPUTE_TYPE
+      - USE_API_KEYS
+      - API_KEYS
+    restart: unless-stopped
+
+volumes:
+  models:
+```
+
+- Create a `.env` file and set the preferred values.
+```sh
+DEFAULT_ASR_MODEL_NAME=base
+COMPUTE_TYPE=float16
+
+# False == Public Access
+# True == Access Only with API Key
+USE_API_KEYS=False
+
+# Comma-separated API keys
+API_KEYS=abc,123,xyz
+```
+
+## Models
+Any model designed for and compatible with faster-whisper should work.
+
+## Usage
+
+> [!NOTE]
+> The first request may take some time, because the model has to be downloaded before it can be used.
+
+> [!TIP]
+> Interactive API documentation can be found at: http://localhost:8000/docs
+
+---
+
+_Notice:_ _This project was initially created for in-house use, so its
+development is aligned first and foremost with internal requirements._
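
Review note on the `.env` format: `USE_API_KEYS` arrives as the string `False` or `True`, and `API_KEYS` as a comma-separated string. Below is a minimal sketch of how such values could be parsed. The helper names `parse_bool` and `parse_api_keys` are hypothetical and are not taken from this repository; only the environment variable names come from the README.

```python
import os


def parse_bool(value: str) -> bool:
    """Interpret common truthy spellings; everything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def parse_api_keys(value: str) -> list[str]:
    """Split a comma-separated string, dropping blanks and surrounding whitespace."""
    return [key.strip() for key in value.split(",") if key.strip()]


# Variable names match the .env example; the fallback values are assumptions.
use_api_keys = parse_bool(os.getenv("USE_API_KEYS", "False"))
api_keys = parse_api_keys(os.getenv("API_KEYS", ""))
```

An explicit parser matters here because `bool("False")` evaluates to `True` in Python: any non-empty string is truthy.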
diff --git a/requirements.txt b/requirements.txt
@@ -1,6 +1,5 @@
 fastapi[standard]==0.111.0
 Pillow==10.3.0
-transformers==4.41.2
 torch==2.3.1
 python-dotenv==1.0.1
 python-multipart==0.0.9

diff --git a/src/routes/api/asr.py b/src/routes/api/asr.py
@@ -50,3 +50,6 @@ async def asr(
     except Exception as e:
         print("Something went wrong: ", e)
         return {"error": str(traceback.format_exc())}
+
+    finally:
+        audio_file.close()
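
Review note: the added `finally` block runs whether the handler returns from the `try` body or from the `except` branch, so the file is closed on both paths. The sketch below reproduces that control flow in isolation. It uses an in-memory `io.BytesIO` as a stand-in for the upload and a hypothetical `transcribe` callable; neither is the project's actual code.

```python
import io
import traceback


def handle(audio_file, transcribe):
    """Mimic the route's control flow: return a result or an error dict, always close."""
    try:
        return {"text": transcribe(audio_file.read())}
    except Exception:
        return {"error": traceback.format_exc()}
    finally:
        # Runs after either return statement has computed its value.
        audio_file.close()


def failing(_data):
    raise ValueError("decode failed")


ok_file = io.BytesIO(b"fake audio")
bad_file = io.BytesIO(b"fake audio")
ok_result = handle(ok_file, lambda data: f"{len(data)} bytes")
bad_result = handle(bad_file, failing)
```

One caveat: the hunk does not show the type of `audio_file`. If it is a FastAPI `UploadFile`, its `close()` method is a coroutine and would need `await audio_file.close()` to take effect.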
diff --git a/src/shared/shared.py b/src/shared/shared.py
@@ -10,7 +10,7 @@
 default_asr_model_name = os.getenv("DEFAULT_ASR_MODEL_NAME", "base")
 
 device = "cuda" if torch.cuda.is_available() else "cpu"
-compute_type = os.getenv("COMPUTE_TYPE", "float16")
+compute_type = os.getenv("COMPUTE_TYPE", "int8")
 
 # API KEY
 api_keys_str = os.getenv("API_KEYS", "")
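
Review note: the default compute type changes from `float16` to `int8`. The commit message does not give a reason. One plausible motivation, stated here as an assumption, is that `float16` is generally not usable on CPU-only hosts, while `int8` works on both CPU and GPU. An alternative is to pick the default per device, as in this sketch. The helper `resolve_compute_type` is hypothetical and not part of the repository.

```python
import os
from typing import Optional


def resolve_compute_type(device: str, requested: Optional[str] = None) -> str:
    """Return the requested compute type, or a per-device default when unset."""
    if requested:
        return requested
    # Assumed defaults: half precision on GPU, 8-bit quantization on CPU.
    return "float16" if device == "cuda" else "int8"


# In shared.py the device comes from torch.cuda.is_available(); it is
# hard-coded here so the sketch runs without torch installed.
compute_type = resolve_compute_type("cpu", os.getenv("COMPUTE_TYPE"))
```

With this approach an explicit `COMPUTE_TYPE` in `.env` still takes precedence, and the per-device default only applies when the variable is unset or empty.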