Web app for transcribing audio files (.wav format) to text using the Google Cloud Speech API.
Updated Jun 22, 2020 - HTML
I have used the Google Cloud Vision API to transcribe the audio file and extract the text from the image.
AWS Lambda function that creates a transcription job, which reads an MP3 file and writes the resulting text to a JSON file.
Implemented some of the models and techniques learned in NLP to help build systems that help in daily life.
Convert between audio and text with an easy-to-use GUI desktop application built with PaddleSpeech and PySide6.
Extract textual meaning and knowledge from all videos of a YouTube user's playlists
Chrome extension that captures captions of ongoing meetings on web video-conferencing platforms using the webkitSpeechRecognition API (for Google Meet, it extracts the captions directly) and sends them to a Flask API for summarization.
Generate text captions for audio files and YouTube videos using OpenAI Whisper on Google Colab. Supports multiple languages.
Core shell functions: building blocks for advanced AI pipelines.
A SwiftUI App For People Who Need To Take Down Important Information Quickly.
Transcribe audio to text with Node.js using the Whisper model from OpenAI.
Persian ASR dataset
Simple web application that converts audio to subtitles using OpenAI's Whisper model.
AudioTextPro: Convert audio to text accurately in real-time using our advanced AI speech recognition technology.
TranscriptGen is an application for transcribing audio and video files. Transcription output is .txt or .srt. Most audio and video formats supported (with ffmpeg).
Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. In this template, we will import the Whisper model on the Inferless platform.
Whisper Large V3 is a pre-trained model developed by OpenAI and designed for tasks like automatic speech recognition (ASR), speech translation and language identification.