Merge pull request #51 from Huanshere/dev_1.0

Dev 1.0
Huanshere · Sep 21, 2024 · 16d5733
2 parents a555e32 + e8a7a38
commit 16d5733

Showing 232 changed files with 1,193 additions and 301,456 deletions.
diff --git a/README.en.md b/README.en.md
@@ -15,24 +15,24 @@
 
 ## 🌟 Project Introduction
 
-VideoLingo is an all-in-one video translation and localization tool designed to generate Netflix-quality subtitles, eliminating stiff machine translations and multi-line subtitles, enabling knowledge sharing across language barriers worldwide. Through an intuitive Streamlit web interface, you can complete the entire process from video link to embedded high-quality bilingual subtitles with just a few clicks, easily creating localized videos with Netflix-quality subtitles.
+VideoLingo is an all-in-one video translation and localization dubbing tool, aimed at generating Netflix-quality subtitles, eliminating stiff machine translations and multi-line subtitles, while also adding high-quality dubbing. It enables knowledge sharing across language barriers worldwide. Through an intuitive Streamlit web interface, you can complete the entire process from video link to embedded high-quality bilingual subtitles and even dubbing with just a few clicks, easily creating Netflix-quality localized videos.
 
 Key features and functionalities:
-- Uses yt-dlp to download videos from YouTube links
+- 🎥 Uses yt-dlp to download videos from YouTube links
 
-- Uses WhisperX for word-level timeline subtitle recognition
+- 🎙️ Uses WhisperX for word-level timeline subtitle recognition
 
-- Uses NLP and GPT for subtitle segmentation based on sentence meaning
+- 📝 Uses NLP and GPT for subtitle segmentation based on sentence meaning
 
-- GPT summarizes intelligent terminology knowledge base for context-aware translation
+- 📚 GPT summarizes intelligent terminology knowledge base for context-aware translation
 
-- Three-step direct translation, reflection, and paraphrasing to eliminate awkward machine translations
+- 🔄 Three-step direct translation, reflection, and paraphrasing to eliminate awkward machine translations
 
-- Netflix-standard single-line subtitle length and translation quality checks
+- ✅ Netflix-standard single-line subtitle length and translation quality checks
 
-- One-click integrated package launch, one-click video production in Streamlit
+- 🗣️ Uses GPT-SoVITS for high-quality aligned dubbing
 
-🚧 VideoLingo is also actively developing voice cloning technology, which will soon support video dubbing, further enhancing the localization experience.
+- 🚀 One-click integrated package launch, one-click video production in Streamlit
 
 ## 🎥 Demo
 
@@ -53,46 +53,50 @@ https://github.com/user-attachments/assets/25264b5b-6931-4d39-948c-5a1e4ce42fa7
 
 Currently supported input languages and examples:
 
-| Input Language | Support Level | Example Video |
-|----------------|---------------|----------------|
-| 🇬🇧🇺🇸 English | 🤩 | [English to Chinese demo](https://github.com/user-attachments/assets/127373bb-c152-4b7a-8d9d-e586b2c62b4b) |
-| 🇷🇺 Russian | 😊 | [Russian to Chinese demo](https://github.com/user-attachments/assets/25264b5b-6931-4d39-948c-5a1e4ce42fa7) |
-| 🇫🇷 French | 🤩 | [French to Japanese demo](https://github.com/user-attachments/assets/3ce068c7-9854-4c72-ae77-f2484c7c6630) |
-| 🇩🇪 German | 🤩 | [German to Chinese demo](https://github.com/user-attachments/assets/07cb9d21-069e-4725-871d-c4d9701287a3) |
-| 🇮🇹 Italian | 🤩 | [Italian to Chinese demo](https://github.com/user-attachments/assets/f1f893eb-dad3-4460-aaf6-10cac999195e) |
-| 🇪🇸 Spanish | 🤩 | [Spanish to Chinese demo](https://github.com/user-attachments/assets/c1d28f1c-83d2-4f13-a1a1-859bd6cc3553) |
-| 🇯🇵 Japanese | 😐 | [Japanese to Chinese demo](https://github.com/user-attachments/assets/856c3398-2da3-4e25-9c36-27ca2d1f68c2) |
-| 🇨🇳 Chinese | 😖 | ❌ |
+| Input Language | Support Level | Translation Demo | Dubbing Demo |
+|----------------|---------------|-------------------|--------------|
+| 🇬🇧🇺🇸 English | 🤩 | [English to Chinese](https://github.com/user-attachments/assets/127373bb-c152-4b7a-8d9d-e586b2c62b4b) | TODO |
+| 🇷🇺 Russian | 😊 | [Russian to Chinese](https://github.com/user-attachments/assets/25264b5b-6931-4d39-948c-5a1e4ce42fa7) | TODO |
+| 🇫🇷 French | 🤩 | [French to Japanese](https://github.com/user-attachments/assets/3ce068c7-9854-4c72-ae77-f2484c7c6630) | TODO |
+| 🇩🇪 German | 🤩 | [German to Chinese](https://github.com/user-attachments/assets/07cb9d21-069e-4725-871d-c4d9701287a3) | TODO |
+| 🇮🇹 Italian | 🤩 | [Italian to Chinese](https://github.com/user-attachments/assets/f1f893eb-dad3-4460-aaf6-10cac999195e) | TODO |
+| 🇪🇸 Spanish | 🤩 | [Spanish to Chinese](https://github.com/user-attachments/assets/c1d28f1c-83d2-4f13-a1a1-859bd6cc3553) | TODO |
+| 🇯🇵 Japanese | 😐 | [Japanese to Chinese](https://github.com/user-attachments/assets/856c3398-2da3-4e25-9c36-27ca2d1f68c2) | TODO |
+| 🇨🇳 Chinese | 😖 | ❌ | TODO |
 
-Output language supports all languages that Claude can handle.
+Translation languages support all languages that the large language model can handle, while dubbing languages depend on the chosen TTS method.
 
 ## 🚀 Quick Start
 
 ### One-Click Package Installation
 
-1. Download the `v0.8.2` one-click package (700M): [Direct Link](https://vip.123pan.cn/1817874751/8101255) | [Baidu Backup](https://pan.baidu.com/s/1H_3PthZ3R3NsjS0vrymimg?pwd=ra64)
+1. Download the `v1.0.0` one-click package (750M): [CPU Version Download](https://vip.123pan.cn/1817874751/8117948) | [Baidu Backup](https://pan.baidu.com/s/1H_3PthZ3R3NsjS0vrymimg?pwd=ra64)
 
 2. After extracting, double-click `OneKeyStart.bat` in the folder
 
-3. In the opened browser window, make necessary configurations in the sidebar, then create your video with one click!
+3. In the opened web interface, configure the API in the sidebar, then create your video with one click!
 
-> 💡 Note: This project requires API keys for large language models and Replicate's API for transcription 🌩️ <br> For application and configuration of api_keys, please read the [Local Installation Guide](./docs/install_locally_en.md)
+> 💡 Note: This project requires configuration of large language models, WhisperX, and TTS. Please carefully read the [Local Installation Guide](./docs/install_locally_zh.md)
 
-### Source Code Installation Method
+## 🛠️ Source Code Installation
 
-For detailed installation guide, including source code installation and development environment configuration, please refer to the [Local Installation Guide](./docs/install_locally_en.md).
+For a detailed installation guide, including source code installation and development environment configuration, please refer to the [Local Installation Guide](./docs/install_locally_zh.md).
 
-## 📚 Documentation
+This project uses structured module development. You can run `core\step__.py` files in sequence. Technical documentation: [Chinese](./docs/README_guide_zh.md) | [English](./docs/README_guide_en.md)
 
-- This project uses structured module development, you can run `core\step__.py` in sequence. Technical documentation: [Chinese](./docs/README_guide_zh.md) | [English](./docs/README_guide_en.md)
+## 📄 License
 
-## 🙏 Acknowledgements
+This project is licensed under the MIT License. When using this project, please follow these rules:
 
-- [whisper-timestamped](https://github.com/linto-ai/whisper-timestamped), [whisperX](https://github.com/m-bain/whisperX), [yt-dlp](https://github.com/yt-dlp/yt-dlp), [json_repair](https://github.com/mangiucugna/json_repair)
+1. Credit VideoLingo for subtitle generation when publishing works.
+2. Follow the terms of the large language models and TTS used for proper attribution.
 
-## 📄 License
+We sincerely thank the following open-source projects for their contributions, which provided important support for the development of VideoLingo:
 
-This project is licensed under the MIT License. Please credit VideoLingo for subtitle generation when publishing works.
+- [whisperX](https://github.com/m-bain/whisperX)
+- [yt-dlp](https://github.com/yt-dlp/yt-dlp)
+- [json_repair](https://github.com/mangiucugna/json_repair)
+- [GPT-SoVITS](https://github.com/RVC-Boss/GPT-SoVITS)
 
 ## 📬 Contact Us
 

diff --git a/README.md b/README.md
@@ -15,24 +15,24 @@
 
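 ## 🌟 Project Introduction
 
-VideoLingo is an all-in-one video translation and localization tool that aims to generate Netflix-quality subtitles, with no stiff machine translation and no multi-line subtitles, so that knowledge from around the world can be shared across language barriers. Through an intuitive Streamlit web interface, two clicks take you through the whole process from a video link to embedded high-quality bilingual subtitles, making it easy to create localized videos with Netflix-quality subtitles.
+VideoLingo is an all-in-one video translation, localization, and dubbing tool that aims to generate Netflix-quality subtitles, with no stiff machine translation and no multi-line subtitles, and can also add high-quality dubbing, so that knowledge from around the world can be shared across language barriers. Through an intuitive Streamlit web interface, two clicks take you through the whole process from a video link to embedded high-quality bilingual subtitles, and even dubbing, making it easy to create Netflix-quality localized videos.
 
 Key features and functionalities:
-- Uses yt-dlp to download videos from YouTube links
+- 🎥 Uses yt-dlp to download videos from YouTube links
 
-- Uses WhisperX for word-level timeline subtitle recognition
+- 🎙️ Uses WhisperX for word-level timeline subtitle recognition
 
-- Uses NLP and GPT to segment subtitles by sentence meaning
+- 📝 Uses NLP and GPT to segment subtitles by sentence meaning
 
-- GPT summarizes an intelligent terminology knowledge base to enable context-aware translation
+- 📚 GPT summarizes an intelligent terminology knowledge base for context-aware translation
 
-- Three-step direct translation, reflection, and paraphrasing to eliminate awkward machine translation
+- 🔄 Three-step direct translation, reflection, and paraphrasing to eliminate awkward machine translation
 
-- Checks single-line subtitle length and translation quality against Netflix standards
+- ✅ Checks single-line subtitle length and translation quality against Netflix standards
 
-- One-click launch from the integrated package, one-click video production in Streamlit
+- 🗣️ Uses GPT-SoVITS for high-quality aligned dubbing
 
-🚧 VideoLingo is also actively developing voice cloning technology and will soon support video dubbing, further improving the video localization experience.
+- 🚀 One-click launch of the integrated package, one-click video production in Streamlit
 
 ## 🎥 Demo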
 
@@ -53,46 +53,50 @@ https://github.com/user-attachments/assets/25264b5b-6931-4d39-948c-5a1e4ce42fa7
 
 Currently supported input languages and examples:
 
-| Input Language | Support Level | Example Video |
-|---------|---------|---------|
-| 🇬🇧🇺🇸 English | 🤩 | [English to Chinese](https://github.com/user-attachments/assets/127373bb-c152-4b7a-8d9d-e586b2c62b4b) |
-| 🇷🇺 Russian | 😊 | [Russian to Chinese](https://github.com/user-attachments/assets/25264b5b-6931-4d39-948c-5a1e4ce42fa7) |
-| 🇫🇷 French | 🤩 | [French to Japanese](https://github.com/user-attachments/assets/3ce068c7-9854-4c72-ae77-f2484c7c6630) |
-| 🇩🇪 German | 🤩 | [German to Chinese](https://github.com/user-attachments/assets/07cb9d21-069e-4725-871d-c4d9701287a3) |
-| 🇮🇹 Italian | 🤩 | [Italian to Chinese](https://github.com/user-attachments/assets/f1f893eb-dad3-4460-aaf6-10cac999195e) |
-| 🇪🇸 Spanish | 🤩 | [Spanish to Chinese](https://github.com/user-attachments/assets/c1d28f1c-83d2-4f13-a1a1-859bd6cc3553) |
-| 🇯🇵 Japanese | 😐 | [Japanese to Chinese](https://github.com/user-attachments/assets/856c3398-2da3-4e25-9c36-27ca2d1f68c2) |
-| 🇨🇳 Chinese | 😖 | ❌ |
+| Input Language | Support Level | Translation Demo | Dubbing Demo |
+|---------|---------|---------|----------|
+| 🇬🇧🇺🇸 English | 🤩 | [English to Chinese](https://github.com/user-attachments/assets/127373bb-c152-4b7a-8d9d-e586b2c62b4b) | TODO |
+| 🇷🇺 Russian | 😊 | [Russian to Chinese](https://github.com/user-attachments/assets/25264b5b-6931-4d39-948c-5a1e4ce42fa7) | TODO |
+| 🇫🇷 French | 🤩 | [French to Japanese](https://github.com/user-attachments/assets/3ce068c7-9854-4c72-ae77-f2484c7c6630) | TODO |
+| 🇩🇪 German | 🤩 | [German to Chinese](https://github.com/user-attachments/assets/07cb9d21-069e-4725-871d-c4d9701287a3) | TODO |
+| 🇮🇹 Italian | 🤩 | [Italian to Chinese](https://github.com/user-attachments/assets/f1f893eb-dad3-4460-aaf6-10cac999195e) | TODO |
+| 🇪🇸 Spanish | 🤩 | [Spanish to Chinese](https://github.com/user-attachments/assets/c1d28f1c-83d2-4f13-a1a1-859bd6cc3553) | TODO |
+| 🇯🇵 Japanese | 😐 | [Japanese to Chinese](https://github.com/user-attachments/assets/856c3398-2da3-4e25-9c36-27ca2d1f68c2) | TODO |
+| 🇨🇳 Chinese | 😖 | ❌ | TODO |
 
-Output language supports all languages that Claude can handle.
+Translation supports every language the large language model can handle, while dubbing languages depend on the chosen TTS method.
 
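 ## 🚀 Quick Start
 
-### One-Click Package Installation
+### One-Click Package
 
-1. Download the `v0.8.2` one-click package (650M): [Direct Link](https://vip.123pan.cn/1817874751/8101255) | [Baidu Backup](https://pan.baidu.com/s/1H_3PthZ3R3NsjS0vrymimg?pwd=ra64)
+1. Download the `v1.0.0` one-click package (750M): [CPU Version Download](https://vip.123pan.cn/1817874751/8117948) | [Baidu Backup](https://pan.baidu.com/s/1H_3PthZ3R3NsjS0vrymimg?pwd=ra64)
 
 2. After extracting, double-click `一键启动.bat` in the folder
 
-3. In the browser window that opens, make the necessary configurations in the sidebar, then create your video with one click!
+3. In the web page that pops up, configure the API in the sidebar, then create your video with one click!
 
-> 💡 Note: This project requires an API for large language models and the API for Replicate transcription 🌩️ <br> To apply for and configure the api_key, please read the [Local Installation Guide](./docs/install_locally_zh.md)
+> 💡 Note: This project requires configuring a large language model, WhisperX, and TTS. Please read the [Local Installation Guide](./docs/install_locally_zh.md) carefully
 
-### Source Code Installation Method
+## 🛠️ Install from Source
 
 For a detailed installation guide, including source code installation and development environment configuration, please refer to the [Local Installation Guide](./docs/install_locally_zh.md).
 
-## 📚 Documentation
+This project uses structured module development; you can run the `core\step__.py` files one by one in sequence. Technical documentation: [Chinese](./docs/README_guide_zh.md) | [English](./docs/README_guide_en.md)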
 
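-- This project uses structured module development; you can run the `core\step__.py` files one by one in sequence. Technical documentation: [Chinese](./docs/README_guide_zh.md) | [English](./docs/README_guide_en.md)
+## 📄 License
 
-## 🙏 Acknowledgements
+This project is licensed under the MIT License. When using this project, please follow these rules:
 
-- [whisper-timestamped](https://github.com/linto-ai/whisper-timestamped), [whisperX](https://github.com/m-bain/whisperX), [yt-dlp](https://github.com/yt-dlp/yt-dlp), [json_repair](https://github.com/mangiucugna/json_repair)
+1. When publishing works, please note that the subtitles were generated by VideoLingo.
+2. Add attribution in accordance with the terms of the large language model and TTS you use.
 
-## 📄 License
+We sincerely thank the following open-source projects for their contributions, which provided important support for the development of VideoLingo:
 
-This project is licensed under the MIT License. When publishing works, please note that the subtitles were generated by VideoLingo.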
+- [whisperX](https://github.com/m-bain/whisperX)
+- [yt-dlp](https://github.com/yt-dlp/yt-dlp)
+- [json_repair](https://github.com/mangiucugna/json_repair)
+- [GPT-SoVITS](https://github.com/RVC-Boss/GPT-SoVITS)
 
 ## 📬 Contact Us