Merge pull request #30 from ieasybooks/remove_whisper_jax
Completely remove the whisper_jax dependency from Tafrigh so that Tafrigh can be published on PyPI
AliOsm authored Aug 22, 2023
2 parents ce6dcb4 + 88a9140 commit fed1d03
Showing 9 changed files with 4 additions and 72 deletions.
26 changes: 3 additions & 23 deletions README.md
@@ -95,7 +95,6 @@
</li>
<li>Language: you can specify the audio language through the <code dir="ltr">--language</code> option. For example, pass <code dir="ltr">ar</code> to select Arabic. If no language is specified, it will be detected automatically</li>
<li>Using a faster version of the Whisper models: passing the <code dir="ltr">--use_faster_whisper</code> option will use the faster version of the Whisper models</li>
<li>JAX framework: you can use the model written with the JAX framework by passing the <code dir="ltr">--use_whisper_jax</code> option, but you will need to install the JAX framework manually on your machine by following the steps <a href="https://github.com/google/jax#installation">here</a></li>
<li>Beam size: you can improve the results with the <code dir="ltr">--beam_size</code> option, which lets you force the model to search over a wider range of words while generating the text. The default value is <code>5</code></li>
<li>
Model compression method: you can specify the method used to compress the model when it was converted with the <a href="https://opennmt.net/CTranslate2/guides/transformers.html"><code>ct2-transformers-converter</code></a> tool by passing the <code dir="ltr">--ct2_compute_type</code> option. The available methods:
@@ -144,10 +143,9 @@
➜ tafrigh --help
usage: tafrigh [-h] [--skip_if_output_exist | --no-skip_if_output_exist] [--playlist_items PLAYLIST_ITEMS] [--verbose | --no-verbose] [-m MODEL_NAME_OR_PATH] [-t {transcribe,translate}]
[-l {af,am,ar,as,az,ba,be,bg,bn,bo,br,bs,ca,cs,cy,da,de,el,en,es,et,eu,fa,fi,fo,fr,gl,gu,ha,haw,he,hi,hr,ht,hu,hy,id,is,it,ja,jw,ka,kk,km,kn,ko,la,lb,ln,lo,lt,lv,mg,mi,mk,ml,mn,mr,ms,mt,my,ne,nl,nn,no,oc,pa,pl,ps,pt,ro,ru,sa,sd,si,sk,sl,sn,so,sq,sr,su,sv,sw,ta,te,tg,th,tk,tl,tr,tt,uk,ur,uz,vi,yi,yo,zh}]
[--use_faster_whisper | --no-use_faster_whisper] [--use_whisper_jax | --no-use_whisper_jax] [--beam_size BEAM_SIZE] [--ct2_compute_type {default,int8,int8_float16,int16,float16}]
[-w WIT_CLIENT_ACCESS_TOKENS [WIT_CLIENT_ACCESS_TOKENS ...]] [--max_cutting_duration [1-17]] [--min_words_per_segment MIN_WORDS_PER_SEGMENT]
[--save_files_before_compact | --no-save_files_before_compact] [--save_yt_dlp_responses | --no-save_yt_dlp_responses] [--output_sample OUTPUT_SAMPLE]
[-f {all,txt,srt,vtt,none} [{all,txt,srt,vtt,none} ...]] [-o OUTPUT_DIR]
[--use_faster_whisper | --no-use_faster_whisper] [--beam_size BEAM_SIZE] [--ct2_compute_type {default,int8,int8_float16,int16,float16}] [-w WIT_CLIENT_ACCESS_TOKENS [WIT_CLIENT_ACCESS_TOKENS ...]]
[--max_cutting_duration [1-17]] [--min_words_per_segment MIN_WORDS_PER_SEGMENT] [--save_files_before_compact | --no-save_files_before_compact] [--save_yt_dlp_responses | --no-save_yt_dlp_responses]
[--output_sample OUTPUT_SAMPLE] [-f {all,txt,srt,vtt,none} [{all,txt,srt,vtt,none} ...]] [-o OUTPUT_DIR]
urls_or_paths [urls_or_paths ...]
options:
@@ -171,8 +169,6 @@ Whisper:
Language spoken in the audio, skip to perform language detection.
--use_faster_whisper, --no-use_faster_whisper
Whether to use Faster Whisper implementation. (default: False)
--use_whisper_jax, --no-use_whisper_jax
Whether to use Whisper JAX implementation. Make sure to have JAX installed before using this option. (default: False)
--beam_size BEAM_SIZE
Number of beams in beam search, only applicable when temperature is zero.
--ct2_compute_type {default,int8,int8_float16,int16,float16}
@@ -248,22 +244,6 @@ tafrigh "https://youtu.be/3K5Jh_-UYeA" \
--output_formats txt srt
```

<h4 dir="rtl">Speeding up transcription even further (not well tested)</h4>

<p dir="rtl">You can use the <code><a href="https://github.com/sanchit-gandhi/whisper-jax">whisper-jax</a></code> library to transcribe material up to 70 times faster than OpenAI's original Whisper models, but the JAX framework must be installed on your machine as explained <a href="https://github.com/google/jax#installation">here</a> before you can use this library.</p>

<p dir="rtl">To use the library, you only need to pass the <code dir="ltr">--use_whisper_jax</code> option to the transcription command as follows:</p>

```
tafrigh "https://youtu.be/Di0vcmnxULs" \
--model_name_or_path small \
--task transcribe \
--language ar \
--use_whisper_jax \
--output_dir . \
--output_formats txt srt
```

<h3 dir="rtl">Transcribing with wit.ai</h3>

<h4 dir="rtl">Transcribing a single clip</h4>
1 change: 0 additions & 1 deletion colab_notebook.ipynb
@@ -135,7 +135,6 @@
" model_name_or_path=model,\n",
" task='transcribe',\n",
" language=language,\n",
" use_whisper_jax=False,\n",
" use_faster_whisper=True,\n",
" beam_size=5,\n",
" ct2_compute_type='default',\n",
3 changes: 0 additions & 3 deletions pyproject.toml
@@ -29,12 +29,9 @@ wit = [
"scipy==1.11.1",
]
whisper = [
"cached-property==1.5.2",
"faster-whisper==0.7.1",
"openai-whisper==20230314",
"stable-ts==2.8.1",
"transformers==4.31.0",
"whisper-jax@git+https://github.com/sanchit-gandhi/whisper-jax.git",
]

[project.urls]
1 change: 0 additions & 1 deletion tafrigh/cli.py
@@ -46,7 +46,6 @@ def main():
task=args.task,
language=args.language,
use_faster_whisper=args.use_faster_whisper,
use_whisper_jax=args.use_whisper_jax,
beam_size=args.beam_size,
ct2_compute_type=args.ct2_compute_type,
#
4 changes: 0 additions & 4 deletions tafrigh/config.py
@@ -14,7 +14,6 @@ def __init__(
task: str,
language: str,
use_faster_whisper: bool,
use_whisper_jax: bool,
beam_size: int,
ct2_compute_type: str,
wit_client_access_tokens: list[str],
@@ -33,7 +32,6 @@ def __init__(
task,
language,
use_faster_whisper,
use_whisper_jax,
beam_size,
ct2_compute_type,
)
@@ -66,7 +64,6 @@ def __init__(
task: str,
language: str,
use_faster_whisper: bool,
use_whisper_jax: bool,
beam_size: int,
ct2_compute_type: str,
):
@@ -78,7 +75,6 @@
self.task = task
self.language = language
self.use_faster_whisper = use_faster_whisper
self.use_whisper_jax = use_whisper_jax
self.beam_size = beam_size
self.ct2_compute_type = ct2_compute_type

27 changes: 0 additions & 27 deletions tafrigh/recognizers/whisper_recognizer.py
@@ -4,7 +4,6 @@

import faster_whisper
import whisper
import whisper_jax

from tqdm import tqdm

@@ -29,8 +28,6 @@ def recognize(
whisper_generator = self._recognize_stable_whisper(file_path, model, whisper_config)
elif isinstance(model, faster_whisper.WhisperModel):
whisper_generator = self._recognize_faster_whisper(file_path, model, whisper_config)
elif isinstance(model, whisper_jax.FlaxWhisperPipline):
whisper_generator = self._recognize_jax_whisper(file_path, model, whisper_config)

while True:
try:
@@ -105,27 +102,3 @@ def _recognize_faster_whisper(
}

return converted_segments

def _recognize_jax_whisper(
self,
audio_file_path: str,
model: whisper_jax.FlaxWhisperPipline,
whisper_config: Config.Whisper,
) -> Generator[dict[str, float], None, list[dict[str, Union[str, float]]]]:
yield {'progress': 0.0, 'remaining_time': None}

segments = model(
audio_file_path,
task=whisper_config.task,
language=whisper_config.language,
return_timestamps=True,
)['chunks']

return [
{
'start': segment['timestamp'][0],
'end': segment['timestamp'][1],
'text': segment['text'].strip(),
}
for segment in segments
]
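The removed `_recognize_jax_whisper` follows the same pattern as the surviving recognizer methods: a generator that yields progress dicts while working and returns the final segment list, which the caller's `while True: try:` loop in `recognize` retrieves from `StopIteration.value`. A minimal sketch of that pattern, with a hypothetical stub in place of a real model call:

```python
from typing import Generator, Union

def recognize_stub() -> Generator[dict, None, list[dict[str, Union[str, float]]]]:
    # Stand-in for a recognizer method: yield progress updates, then
    # return the transcribed segments as the generator's return value.
    yield {'progress': 0.0, 'remaining_time': None}
    yield {'progress': 50.0, 'remaining_time': 1.0}
    return [{'start': 0.0, 'end': 2.5, 'text': 'hello'}]

def drain(generator):
    # Mirrors the `while True: try:` loop in `recognize`: consume progress
    # updates until StopIteration carries the final result in its .value.
    while True:
        try:
            progress = next(generator)
            print(progress['progress'])
        except StopIteration as stop:
            return stop.value

segments = drain(recognize_stub())
```

This is why removing one branch (`_recognize_jax_whisper`) needs no change to the consuming loop: every recognizer exposes the same generator contract.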
2 changes: 0 additions & 2 deletions tafrigh/types/whisper/type_hints.py
@@ -2,12 +2,10 @@

import faster_whisper
import whisper
import whisper_jax


WhisperModel = TypeVar(
'WhisperModel',
whisper.Whisper,
faster_whisper.WhisperModel,
whisper_jax.FlaxWhisperPipline,
)
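The `WhisperModel` definition above is a value-constrained `TypeVar`: dropping `whisper_jax.FlaxWhisperPipline` narrows the set of types a `WhisperModel`-annotated value may take. A small illustrative sketch of the mechanism, using `int` and `str` as stand-ins since the real model classes are not assumed installed here:

```python
from typing import TypeVar

# Stand-in for WhisperModel: a TypeVar constrained to a fixed set of types,
# analogous to TypeVar('WhisperModel', whisper.Whisper, faster_whisper.WhisperModel).
Model = TypeVar('Model', int, str)

def identity(model: Model) -> Model:
    # A constrained TypeVar tells the type checker that the argument and the
    # return value are the same one of the listed types.
    return model
```

At runtime the constraint has no effect; it only lets a type checker reject calls with types outside the allowed set.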
7 changes: 0 additions & 7 deletions tafrigh/utils/cli_utils.py
@@ -85,13 +85,6 @@ def parse_args(argv: list[str]) -> argparse.Namespace:
help='Whether to use Faster Whisper implementation.',
)

whisper_group.add_argument(
'--use_whisper_jax',
action=argparse.BooleanOptionalAction,
default=False,
help='Whether to use Whisper JAX implementation. Make sure to have JAX installed before using this option.',
)

whisper_group.add_argument(
'--beam_size',
type=int,
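The removed block uses `argparse.BooleanOptionalAction` (Python 3.9+), which is why each flag in the help output appears as a `--flag | --no-flag` pair. A minimal self-contained sketch of that behavior, reusing the surviving `--use_faster_whisper` flag:

```python
import argparse

# One add_argument call with BooleanOptionalAction registers both the
# positive flag (--use_faster_whisper) and its negation (--no-use_faster_whisper).
parser = argparse.ArgumentParser()
parser.add_argument(
    '--use_faster_whisper',
    action=argparse.BooleanOptionalAction,
    default=False,
    help='Whether to use Faster Whisper implementation.',
)

print(parser.parse_args([]).use_faster_whisper)                           # False
print(parser.parse_args(['--use_faster_whisper']).use_faster_whisper)     # True
print(parser.parse_args(['--no-use_faster_whisper']).use_faster_whisper)  # False
```

Deleting the `--use_whisper_jax` argument therefore removes both spellings of that flag at once.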
5 changes: 1 addition & 4 deletions tafrigh/utils/whisper/whisper_utils.py
@@ -1,15 +1,12 @@
import faster_whisper
import stable_whisper
import whisper_jax

from tafrigh.config import Config
from tafrigh.types.whisper.type_hints import WhisperModel


def load_model(whisper_config: Config.Whisper) -> WhisperModel:
if whisper_config.use_whisper_jax:
return whisper_jax.FlaxWhisperPipline(f'openai/whisper-{whisper_config.model_name_or_path}')
elif whisper_config.use_faster_whisper:
if whisper_config.use_faster_whisper:
return faster_whisper.WhisperModel(
whisper_config.model_name_or_path,
compute_type=whisper_config.ct2_compute_type,
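After this change, `load_model` dispatches between only two backends instead of three. A sketch of the resulting control flow, with hypothetical stand-in loader functions (the real code calls `faster_whisper.WhisperModel(...)` and a stable-ts loader, neither of which is assumed installed here):

```python
class WhisperConfig:
    # Simplified stand-in for Config.Whisper with only the fields used below.
    def __init__(self, use_faster_whisper: bool, model_name_or_path: str):
        self.use_faster_whisper = use_faster_whisper
        self.model_name_or_path = model_name_or_path

def load_faster_whisper(name: str) -> str:
    return f'faster-whisper:{name}'  # stand-in for faster_whisper.WhisperModel(...)

def load_stable_whisper(name: str) -> str:
    return f'stable-ts:{name}'  # stand-in for the stable-ts loading path

def load_model(config: WhisperConfig) -> str:
    # With the use_whisper_jax branch removed, the elif collapses to a plain if.
    if config.use_faster_whisper:
        return load_faster_whisper(config.model_name_or_path)
    return load_stable_whisper(config.model_name_or_path)
```

The JAX branch was the only caller of `whisper_jax`, so removing it lets the import at the top of the file go as well.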
