Suggestion: Currently, the timing of the subtitles appearing is not accurate. #2
Comments
Hello nangong, I just tested whisper.cpp again with a few audio clips and reproduced the issue: subtitles appear before the speaker's voice. Thank you for reaching out. I will test the original Python version of Whisper later to see whether the issue exists there as well, and I'll also take a look at the implementation of whisper.cpp. Until now I was using the whisper.cpp executable directly without reading the C++ code in detail. For now, though, this application still relies on the C++ version for speed: on Macs (without CUDA), whisper.cpp's processing speed is at least 5 times faster than OpenAI's native Python version.

Thank you for suggesting ffmpeg's silence detection. I would like to see whether adding an offset can align the subtitles with the audio, and I will also research other methods. Once I figure it out, I will post an update.

You mentioned that you are working on a Node.js implementation and on converting SRT files to editable subtitle-style FCPXML files. In addition to the Swift code in this app, I previously developed a Node.js CLI tool called srt2subtitles (https://github.com/shaishaicookie/srt2subtitles-cli). I hope this tool can be helpful for your app.
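For reference, the fixed-offset idea mentioned above could look roughly like the following. This is only a minimal sketch in Python, not code from this app or from whisper.cpp; the function names `shift_timestamp` and `shift_srt` are hypothetical, and it assumes standard SRT timing lines of the form `HH:MM:SS,mmm --> HH:MM:SS,mmm`.

```python
import re

# Matches one SRT timestamp such as 00:01:02,345
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")

def shift_timestamp(match, offset_ms):
    """Return the matched timestamp moved by offset_ms, clamped at zero."""
    h, m, s, ms = (int(g) for g in match.groups())
    total = ((h * 60 + m) * 60 + s) * 1000 + ms + offset_ms
    total = max(total, 0)  # never produce a negative time
    h, rem = divmod(total, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def shift_srt(srt_text, offset_ms):
    """Shift every timing line in an SRT document by offset_ms milliseconds."""
    out = []
    for line in srt_text.splitlines():
        if "-->" in line:  # only timing lines carry timestamps to shift
            line = TIMESTAMP.sub(lambda m: shift_timestamp(m, offset_ms), line)
        out.append(line)
    return "\n".join(out)
```

A positive `offset_ms` delays every subtitle, which is the direction needed when cues appear too early. A single constant offset will only help if the drift is roughly the same for every cue, which has not been verified here.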
Hi there, thank you very much for releasing this version. I have implemented a Node version based on your code. However, the subtitles appear before the speaker starts talking, which is quite troubling. I noticed that whisper.cpp itself has this problem, and someone has already implemented a Python version that seems to have fixed it.
The solution I can think of at the moment is to use ffmpeg's silence detection and dynamically adjust the timing of the SRT subtitles.
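To make the silence-detection idea concrete, here is a rough sketch in Python (not the Node implementation mentioned above). It assumes the audio was analyzed with ffmpeg's `silencedetect` filter, for example `ffmpeg -i input.wav -af silencedetect=noise=-30dB:d=0.5 -f null -`, and that the log printed on stderr was captured as text. The threshold values, the file name, and the helper names `parse_silences` and `snap_start` are assumptions for illustration only.

```python
import re

# silencedetect logs lines like:
#   [silencedetect @ 0x...] silence_start: 4.5
#   [silencedetect @ 0x...] silence_end: 5.75 | silence_duration: 1.25
SILENCE_START = re.compile(r"silence_start:\s*(-?\d+(?:\.\d+)?)")
SILENCE_END = re.compile(r"silence_end:\s*(-?\d+(?:\.\d+)?)")

def parse_silences(ffmpeg_log):
    """Return (start, end) pairs in seconds parsed from silencedetect log text."""
    silences, start = [], None
    for line in ffmpeg_log.splitlines():
        if (m := SILENCE_START.search(line)):
            start = float(m.group(1))
        elif (m := SILENCE_END.search(line)) and start is not None:
            silences.append((start, float(m.group(1))))
            start = None
    return silences

def snap_start(cue_start, cue_end, silences):
    """If a cue begins inside a silence that ends before the cue does,
    delay the cue start to the end of that silence; otherwise keep it."""
    for s_start, s_end in silences:
        if s_start <= cue_start < s_end < cue_end:
            return s_end
    return cue_start
```

Each subtitle's start time (converted to seconds) would be passed through `snap_start`, so a cue that begins during silence is delayed until sound resumes. Cues that lie entirely inside a silent interval are left unchanged in this sketch, since it is unclear where they should move.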