📃 vtt2txt2docx (now with added .srt)

This Python script generates cleaned-up .txt and .docx files from an MS Stream .vtt caption file. It strips out lines beginning with WEBVTT, NOTE, a timestamp, or a reference such as 3dc72631-b191, leaving only the text generated from the speaker's voice. As of September 2023, the script also handles .srt files.
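The filtering described above can be sketched roughly as follows. This is a minimal illustration and not the code in vtt2txt2docx.py: the function name `clean_caption_lines` and both regular expressions are assumptions made for the example, and dropping digit-only lines (to remove .srt cue numbers) is a guess at how .srt input might be handled.

```python
import re

# Hypothetical patterns for illustration; the real script may match differently.
# Reference lines such as "3dc72631-b191" (hex groups joined by dashes).
CUE_REFERENCE = re.compile(r"^[0-9a-f]{8}-[0-9a-f]{4}[0-9a-f-]*$", re.IGNORECASE)
# .srt cue numbers are lines containing only digits (assumption for .srt input).
SRT_INDEX = re.compile(r"^\d+$")


def clean_caption_lines(lines):
    """Return only the spoken-text lines from a .vtt or .srt caption file."""
    kept = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank separator lines
        if line.startswith(("WEBVTT", "NOTE")):
            continue  # skip the file header and NOTE lines
        if "-->" in line:
            continue  # skip timestamp lines (start --> end)
        if CUE_REFERENCE.match(line) or SRT_INDEX.match(line):
            continue  # skip cue references and .srt cue numbers
        kept.append(line)
    return kept
```

One caveat with this sketch: a caption whose entire text is a number would also be dropped by the digit-only rule, so a real implementation may need to track where it is within a cue.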

🤔 Rationale

Last year (during COVID-19 times), I recorded some of my lectures without a written script to guide me. For consistency, I would like to use a script for all recorded lectures this year. To help me write them, I developed this tool to take the .vtt caption files from my old lectures and convert them into other formats that I can use as a starting point for new scripts.

⚙️ Requirements

To run the script, the following packages are required:

python-docx - Allows Python scripts to generate Word .docx files
cowsay - Generates ASCII art pictures of a cow with a message (optional)

Install these via pip:

pip install python-docx cowsay
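As a rough idea of how these two packages might be used, here is a short sketch. The helper names `write_txt`, `write_docx` and `announce` are hypothetical and are not taken from vtt2txt2docx.py; the python-docx calls (`Document`, `add_paragraph`, `save`) and `cowsay.cow` are the packages' standard entry points. The imports sit inside the functions only so that the example still loads when a package is missing.

```python
from pathlib import Path


def write_txt(lines, txt_path):
    """Write the cleaned caption lines to a plain-text file, one per line."""
    Path(txt_path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def write_docx(lines, docx_path):
    """Write the cleaned caption lines to a Word document, one paragraph each."""
    from docx import Document  # provided by the python-docx package

    doc = Document()
    for line in lines:
        doc.add_paragraph(line)
    doc.save(str(docx_path))


def announce(message):
    """Print a message with cowsay if it is installed, otherwise plainly."""
    try:
        import cowsay
    except ImportError:
        print(message)  # cowsay is optional, so fall back to plain output
    else:
        cowsay.cow(message)
```

For example, `write_docx(["Hello and welcome."], "lecture1.docx")` would produce a one-paragraph Word file.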

⌨️ Usage

python vtt2txt2docx.py fileYouWantToConvert.vtt
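The sketch below shows one way a script could turn the command-line argument into output file names. It assumes, without confirmation from the script itself, that the .txt and .docx files are written next to the input file with the same base name; `output_paths` and `SUPPORTED` are hypothetical names used only for this example.

```python
from pathlib import Path

# Caption formats mentioned in this README.
SUPPORTED = {".vtt", ".srt"}


def output_paths(input_name):
    """Return the (.txt, .docx) paths that sit alongside the given caption file."""
    source = Path(input_name)
    if source.suffix.lower() not in SUPPORTED:
        raise ValueError(f"Expected a .vtt or .srt file, got: {input_name}")
    # Assumption: outputs reuse the input's base name and folder.
    return source.with_suffix(".txt"), source.with_suffix(".docx")
```

Under that assumption, `output_paths("fileYouWantToConvert.vtt")` would give `fileYouWantToConvert.txt` and `fileYouWantToConvert.docx`.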

🔨 Testing Notes

The script has been tested on macOS Catalina version 10.15.7 with Python 3.6.10. Your mileage may vary.

Lynsay/vtt2txt2docx
