Windows process crashes when the GPU model is unloaded #71

satisl · 2023-03-23T14:10:12Z

First of all, thanks for your work. It's useful.
However, there is still a problem. Python exits with -1073740791 (0xC0000409) when processing an audio file in Chinese. I defined a function in which the variable 'result' holds the 'segments' returned by faster-whisper. Everything is normal inside this function, but the failure occurs after the function returns.

The line 'print(result)' works.

But after the result is returned, Python exits with -1073740791 (0xC0000409) and terminates.
When I change the model or the language, it works properly.
Confused.
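
For readers unfamiliar with the number: Windows documents these exit codes as unsigned 32-bit NTSTATUS values, while some tools display them as signed integers. A minimal sketch of the conversion follows. The helper name and the small lookup table are illustrative and cover only the two codes relevant to this thread.

```python
# Map a signed process exit code (as shown by some IDEs and shells) to the
# unsigned 32-bit NTSTATUS value that Windows documents.
KNOWN_STATUS = {
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}


def describe_exit_code(code: int) -> str:
    unsigned = code & 0xFFFFFFFF  # reinterpret the value as unsigned 32-bit
    name = KNOWN_STATUS.get(unsigned, "unknown status")
    return f"0x{unsigned:08X} ({name})"


print(describe_exit_code(-1073740791))
```

Such a status typically means the process was terminated by native code rather than by a Python exception, which would explain why no traceback is printed.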

guillaumekln · 2023-03-23T14:15:26Z

The same error is reported in #64 and is related to the cuDNN installation. Can you check that?

satisl · 2023-03-23T14:26:08Z

Really? But I have already installed cuDNN according to the NVIDIA documentation, and if I use medium-ct2 instead of large-v2-ct2, switch languages, or process other files, this problem does not occur.

guillaumekln · 2023-03-23T14:29:29Z

How much VRAM does your GPU have?

satisl · 2023-03-23T14:30:41Z

About 6000 MB in total, with 4000-5000 MB in use when running.

guillaumekln · 2023-03-23T14:37:33Z

Possibly you are running out of memory for this specific file. Can you try using compute_type="int8_float16"?

satisl · 2023-03-23T14:43:02Z

It still didn't work.
When I tried compute_type="float32", it failed and returned "cuda out of memory".
However, this time it returned nothing but -1073740791 (0xC0000409).
So it seems that memory is not the reason.

I converted the audio file from AAC to MP3, but it still didn't work.
So it seems that the format is not the problem either.
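
As a rough sanity check on the memory question, the weights alone can be estimated from the parameter count. The sketch below assumes about 1.55 billion parameters for the large-v2 model and ignores activations, beam-search buffers, and the CUDA workspace, so real usage is higher. The helper name is illustrative.

```python
# Weight-only memory estimate. Assumption: Whisper large-v2 has roughly
# 1.55 billion parameters. Activations and workspace memory are NOT included.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}
LARGE_V2_PARAMS = 1.55e9


def weight_memory_gb(num_params: float, compute_type: str) -> float:
    return num_params * BYTES_PER_PARAM[compute_type] / 1e9


for compute_type in ("float32", "float16", "int8"):
    size = weight_memory_gb(LARGE_V2_PARAMS, compute_type)
    print(f"{compute_type}: about {size:.2f} GB of weights")
```

Under these assumptions the float32 weights alone are on the order of a 6 GB card's capacity, which is consistent with the explicit out-of-memory error, while the float16 run fits and still ends with 0xC0000409.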

guillaumekln · 2023-03-23T19:47:16Z

Is it possible for you to share this audio file?

satisl · 2023-03-24T01:45:04Z

Here is the video file. I extracted the audio from the video with these commands:
ffmpeg -i "video file" -vn -ar 16000 "aac file"
ffmpeg -i "video file" -vn -c:a libmp3lame -ar 16000 "mp3 file"
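
The same two ffmpeg invocations can be driven from Python. This is a sketch only: the function names are hypothetical, and running extract_audio requires ffmpeg on PATH.

```python
import subprocess


def build_extract_cmd(video_path, audio_path, codec=None, sample_rate=16000):
    """Return the ffmpeg argument list that drops video and resamples audio."""
    cmd = ["ffmpeg", "-i", video_path, "-vn"]
    if codec is not None:
        cmd += ["-c:a", codec]  # e.g. "libmp3lame" for MP3 output
    cmd += ["-ar", str(sample_rate), audio_path]
    return cmd


def extract_audio(video_path, audio_path, codec=None):
    # check=True raises CalledProcessError if ffmpeg reports a failure.
    subprocess.run(build_extract_cmd(video_path, audio_path, codec), check=True)


print(build_extract_cmd("input.mp4", "output.mp3", codec="libmp3lame"))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.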

guillaumekln · 2023-03-25T18:17:10Z

I can't reproduce the error on Windows 11 with CUDA 11.8.0 and cuDNN 8.8.1.

Are you using the latest version of faster-whisper?

satisl · 2023-03-26T01:11:10Z

The first thing I did when this error occurred was to update faster-whisper.

satisl · 2023-03-26T02:12:13Z

It seems difficult to reproduce the error, since this problem only occurs with this one file under certain conditions (I have processed various other files since then, and they all worked normally). Perhaps this issue can be put on hold?
I will reopen this issue if the same error occurs with other files. Thank you for your help over so many days.

satisl · 2023-03-27T03:05:38Z

By coincidence, I discovered the cause of the error. The program crashes when the model is unloaded.
Previously, model was a local variable, so the model was automatically unloaded when the function ended, and the program crashed at that point.
Now, model is a global variable, so the model is automatically unloaded only when the program ends.

The program still crashes when unloading the model, but now that happens after all the work is finished.

However, I have no idea why the program crashes when unloading the model.
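
The difference between the two lifetimes can be illustrated without any GPU. In the sketch below, FakeModel is a stand-in, not faster-whisper code, and the ordering relies on CPython's reference counting.

```python
class FakeModel:
    """Stand-in for a model object; records the moment it is finalized."""

    def __init__(self, log):
        self.log = log

    def __del__(self):
        self.log.append("model released")


def transcribe_with_local_model(log):
    model = FakeModel(log)  # local: the last reference dies when we return
    log.append("work done")
    return "result"


events = []
transcribe_with_local_model(events)
events.append("after return")
print(events)

GLOBAL_LOG = []
GLOBAL_MODEL = FakeModel(GLOBAL_LOG)  # module-level: lives until exit or `del`
print(GLOBAL_LOG)
```

With a local variable the release happens between "work done" and "after return", which matches the observation that the crash appears right after the function returns. A module-level reference only postpones the release to interpreter shutdown.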

guillaumekln · 2023-03-27T12:11:25Z

Does it also crash when you manually unload the model with del model?

satisl · 2023-03-27T12:24:44Z

No, it doesn't.

It does when processing certain files (five out of approximately ninety). If the error occurs for a file, it occurs every time that file is processed.
My environment: RTX 3060 laptop, Windows 10, CUDA 11.7, cuDNN 8.8.0, Python 3.10.10, model large-v2, language 'zh'.
I tried reinstalling the Python environment and changing the Python version to 3.9.16, but the issue persists.
I also redownloaded the model, but the issue still exists.

satisl · 2023-03-27T12:35:42Z

If the model is manually unloaded after processing the file, the program crashes.

ProjectEGU · 2023-04-10T08:06:28Z

I found a way to run the transcription in a separate process so that even though it exits that child process, it doesn't exit your main script. Here is a working example:

from multiprocessing import Process, Manager
from faster_whisper import WhisperModel

def work_log(argsList, returnList):
    model_size = "large-v2"
    model = WhisperModel(model_size, device="cuda", compute_type="float16")
    segments, info = model.transcribe(*argsList, beam_size=5)
    returnList[0] = [list(segments), info]

# workaround to deal with a termination issue: https://github.com/guillaumekln/faster-whisper/issues/71
def runWhisperSeperateProc(*args):
    with Manager() as manager:
        returnList = manager.list([None])
        p = Process(target=work_log, args=[args, returnList])  # add return target to end of args list
        p.start()
        p.join()
        p.close()
        return returnList[0]

if __name__ == '__main__':
    segments, info = runWhisperSeperateProc("audio.mp3")
    print(segments, info)
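
A variation on the same isolation idea, sketched with the standard library only: run the work in a child interpreter, read its result from stdout, and treat a non-zero exit code as tolerable once a result has arrived. The child code here is a stand-in that prints a JSON result and then exits abruptly to imitate a crash during teardown. It is not faster-whisper code, and run_isolated is a hypothetical helper name.

```python
import json
import subprocess
import sys

# Stand-in child program: emits a result, then dies without normal cleanup.
CHILD_CODE = r"""
import json, os, sys
print(json.dumps({"text": "hello"}))
sys.stdout.flush()
os._exit(3)  # imitate a crash that happens after the work is finished
"""


def run_isolated(code):
    """Run code in a separate interpreter; return (result, exit code)."""
    proc = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True
    )
    lines = proc.stdout.strip().splitlines()
    result = json.loads(lines[-1]) if lines else None
    return result, proc.returncode


result, returncode = run_isolated(CHILD_CODE)
print(result, returncode)
```

Checking the exit code separately makes it possible to log the crash while still using the transcription result.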

yslion · 2023-04-23T16:16:50Z

same issues

DoodleBears · 2023-04-27T18:30:58Z

Same issue here. Running the transcription in a separate process, as in ProjectEGU's example, works for me.

guillaumekln · 2023-04-27T18:39:35Z

@ProjectEGU @yslion @DoodleBears Are you all using the library on Windows?

guillaumekln · 2023-04-27T19:36:13Z

I can now reproduce the issue on Windows.

It is somehow related to the temperature fallback. Can you try setting temperature=0?
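
A minimal sketch of the suggested setting, assuming the temperature keyword of WhisperModel.transcribe: passing a single value leaves no further temperatures to fall back to. The wrapper name and the stub class are illustrative. The stub only makes the snippet runnable without a GPU or a model download.

```python
def transcribe_without_fallback(model, audio_path, **options):
    """Call model.transcribe with a single temperature (no fallback steps)."""
    segments, info = model.transcribe(audio_path, temperature=0, **options)
    return list(segments), info  # list() forces the lazy generator to run


class StubModel:
    """Stand-in with the same call shape as a real model object."""

    def transcribe(self, audio_path, **kwargs):
        self.kwargs = kwargs
        return iter(["segment"]), "info"


# With the real library this would be something like:
#   from faster_whisper import WhisperModel
#   model = WhisperModel("large-v2", device="cuda", compute_type="float16")
model = StubModel()
segments, info = transcribe_without_fallback(model, "audio.mp3", beam_size=5)
print(segments, info, model.kwargs)
```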

satisl · 2023-04-28T04:25:36Z

Glad to know that the cause has been found. With this setting, the program runs normally.
However, maybe it will produce slightly worse results? I don't know.

DoodleBears · 2023-04-28T04:41:02Z

Yes, I am using the library on Windows 10. I will try temperature=0 this evening.

guillaumekln · 2023-04-28T12:49:23Z

I have a possible fix for this issue in OpenNMT/CTranslate2#1201, but I can't test on my Windows machine today. Can you help testing?

Go to the build page
Download the artifact "python-wheels"
Extract the archive
Install the Windows wheel matching your Python version with pip install --force-reinstall <wheel file>
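
For the last step, the wheel matching the running interpreter can be picked by its CPython tag. The sketch below is illustrative: the file names are hypothetical placeholders, and the real artifact contents may differ.

```python
import sys


def python_wheel_tag(version_info=sys.version_info):
    """CPython tag used in wheel file names, e.g. 'cp310' for Python 3.10."""
    return f"cp{version_info[0]}{version_info[1]}"


def pick_wheel(filenames, tag, platform="win_amd64"):
    """Return the first wheel carrying both the Python tag and the platform."""
    for name in filenames:
        if f"-{tag}-" in name and name.endswith(f"{platform}.whl"):
            return name
    return None


# Hypothetical artifact contents, for illustration only.
wheels = [
    "ctranslate2-0.0.0-cp39-cp39-win_amd64.whl",
    "ctranslate2-0.0.0-cp310-cp310-win_amd64.whl",
    "ctranslate2-0.0.0-cp310-cp310-manylinux_2_17_x86_64.whl",
]
print(pick_wheel(wheels, python_wheel_tag()))
```

The selected file name is what would be passed to pip install --force-reinstall.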

DoodleBears · 2023-04-28T14:19:03Z

Yes, I will try it now. By the way, I tried temperature=0 and it works (the process did not exit).

DoodleBears · 2023-04-28T15:17:38Z

I installed the wheel.

Running without temperature=0: same issue (the process still exits).
Running with temperature=0: works.

DoodleBears · 2023-04-28T15:36:14Z

When using temperature=0:
Before installing the wheel, I sometimes hit a segmentation fault when calling the function below many times. I don't know why.

Sorry that I did not keep the log. I remember it mentioned cuBLAS and CUDA ...... segmentation fault. Once I reproduce it, I will share the error log.

def transcribe_speeches(self):
    log.init_logging(debug=True)
    # NOTE: load the audio files
    logger.info("Starting speech-to-text")
    whisper = WhisperModel(WHISPER_MODEL, device="cuda", compute_type="float16")
    speeches_num = len(self.speeches)
    for index, speech in enumerate(self.speeches):
        logger.debug(f"Start recognizing {speech.audio_path}")
        speech_text = ''
        # NOTE: transcribe the audio file
        segments, _ = whisper.transcribe(
            audio=speech.audio_path,
            language='zh',
            vad_filter=False,
            temperature=0,
            # Prompt text means "The following are sentences in Mandarin."
            initial_prompt='以下是普通话的句子。'
        )
        segments = list(segments)
        if len(segments) == 0:
            logger.warning(f"Recognition result is empty: {speech.audio_path}")
        else:
            # Segments are joined with a Chinese full-width comma
            speech_text = '，'.join([segment.text for segment in segments])
            logger.info(f"Recognition result ({index+1}/{speeches_num}): {speech_text}")
        self.speeches[index].text = speech_text

    logger.info(f"Finished speech-to-text: {self.speeches}")
    # queue.put(self.speeches)
    # FIXME: unloading the model causes the program to terminate
    del whisper

fquirin · 2023-05-06T12:54:54Z

I'm not sure if this is directly related, but I get a segmentation fault from time to time when I start to iterate over the transcription segments via for segment in segments: .... I can't pin down the precise location, but it must be somewhere in WhisperModel.generate_segments, and it happens only when my program tries to handle some remaining chunks at the end of a stream that are basically background noise.
Since I set temperature=0 it hasn't happened again.

guillaumekln · 2023-05-10T06:53:30Z

@fquirin Are you also running on Windows with a GPU? If not, I’m not sure your issue is related. You can open another issue if you can share the audio and options triggering the crash.

fquirin · 2023-05-10T11:29:10Z

@guillaumekln I'm running it on Windows + CPU.
The problem is that I can't reproduce it with audio files so far, only with my live-streaming server, but a pretty reliable way to get the segmentation fault is coughing 🤔. At first I thought it was a problem with my code, but it never happens with temperature=0, and so far it has never happened on Linux Aarch64 either (with or without temp=0). Another notable difference from my Linux Aarch64 system is that my x86 CPU is much faster (maybe a race condition?).

Btw, when I run my "coughing" test WAV files I noticed that Whisper can start to hallucinate pretty extensively with temperature != 0.

I'll try to pin down the segmentation fault by adding some debug info to WhisperModel.generate_segments

Keith-Hon · 2023-05-16T05:50:04Z

I have the same error when running the script in WSL (Ubuntu) on Windows 10.

Edit: I installed all the dependencies and tried again, and it works now.

hoonlight · 2023-06-01T00:42:45Z

Same issue on Windows 11.

hoonlight · 2023-06-07T12:25:06Z

I was able to avoid that error with the temperature=0 setting. Will this setting adversely affect the transcription results? I searched the whisper repo, but couldn't find a satisfactory answer.

guillaumekln · 2023-07-07T15:55:00Z

Yes, disabling the temperature fallback can affect the results. The fallback is mostly useful to recover from cases where the model generates the same token in a loop.
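
To make the trade-off concrete, here is a simplified sketch of a temperature fallback loop. It is not the library's implementation. It assumes the Whisper-style heuristic of rejecting an output whose zlib compression ratio exceeds a threshold (2.4 is used here), it omits the average log-probability check of the real heuristic, and fake_decode is a stand-in decoder.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Highly repetitive text compresses well, so its ratio is large."""
    data = text.encode("utf-8")
    return len(data) / len(zlib.compress(data))


def decode_with_fallback(decode, temperatures=(0.0, 0.2, 0.4), max_ratio=2.4):
    """Try each temperature in order; stop at the first non-repetitive output."""
    result = ("", None)
    for temperature in temperatures:
        text = decode(temperature)
        result = (text, temperature)
        if compression_ratio(text) <= max_ratio:
            break
    return result


def fake_decode(temperature):
    # Stand-in decoder: loops at temperature 0, recovers once sampling is on.
    if temperature == 0.0:
        return "thank you " * 50
    return "Thank you for watching, and see you next time."


text, used_temperature = decode_with_fallback(fake_decode)
print(used_temperature, text)
```

With a single temperature there is no second attempt, so a looping output like the one simulated above would be returned as-is.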

hoonlight · 2023-07-08T01:10:32Z

Thank you. My test results were the same as you said.

JamePeng · 2023-07-14T17:12:51Z

My runtime environment is Python 3.11.4, CUDA 11.8.0, graphics card driver 522.06, and cudnn-windows-x86_64-8.9.3.28. I am using the faster-whisper project, and when I try to load the model using GPU, Python returns -1073740791 (0xC0000409) error. However, when I use CPU, the error does not occur.

I have tried various solutions, including the ones you mentioned above, such as installing CUDA environment, adding system variables, and modifying the temperature to 0. None of them have worked.

Whenever I iterate over the segments, CUDA crashes, and the program terminates.

Finally, when I add print(torch.cuda.is_available()) to check whether the CUDA device is recognized, it prints True and the program runs without any issues.

My personal estimation is that there might be an issue with the initialization and release of CUDA in CT2.

zh-plus · 2023-07-14T19:25:10Z

@JamePeng Check whether you have installed zlib. See https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#install-zlib-windows

guillaumekln · 2023-07-15T10:31:59Z

@JamePeng This is a different issue. The issue described in this thread is a crash when unloading the model.

The error you get generally means that the program cannot locate the cuDNN and/or Zlib libraries. There are already several discussions about this.
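
A quick first diagnostic for this class of error is to check whether the DLLs are present in any PATH directory. The sketch below is generic. The DLL names are assumptions for cuDNN 8 and zlib on Windows and should be adjusted to the installed versions, and a PATH check does not cover every way Windows can resolve a library.

```python
import os


def find_on_path(file_names, search_dirs=None):
    """Map each file name to the first search directory that contains it."""
    if search_dirs is None:
        search_dirs = os.environ.get("PATH", "").split(os.pathsep)
    found = {}
    for name in file_names:
        found[name] = next(
            (d for d in search_dirs if d and os.path.isfile(os.path.join(d, name))),
            None,
        )
    return found


# Assumed DLL names; adjust them to your cuDNN and zlib versions.
for name, location in find_on_path(["cudnn64_8.dll", "zlibwapi.dll"]).items():
    print(name, "->", location or "NOT FOUND")
```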

JamePeng · 2023-07-16T08:56:23Z

@guillaumekln ok, it worked now, thanks for your help

zh-plus · 2023-08-11T07:59:55Z

Do we have any updates on resolving this issue? Currently, using the workaround of setting temperature=0 is an option, but it could potentially impact the model's performance.

CheshireCC · 2023-08-14T07:06:24Z

Regarding the earlier question of whether it also crashes when the model is manually unloaded with del model: I compiled my app with Nuitka and ran it as an Administrator user, and it does not crash when the model is unloaded.

sanek11591 · 2023-11-07T19:51:29Z

I have the same problem. My configuration: Python 3.10.7, CUDA Toolkit 11.8, and cuDNN 8.9.6, added to PATH. If I set temperature=0, the output gets stuck in a loop.

Dadangdut33 · 2023-12-10T18:51:59Z

I had the same problem, and I think I fixed it in my case by moving the faster-whisper import inside the function that uses it.

Keep in mind, though, that I am using faster-whisper through stable whisper, and I need to import some things from the faster-whisper library. I previously imported it globally at the top and found that my app would sometimes crash after loading and reloading different models. After moving the import into the function that uses it, the crash somehow went away.
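
The pattern described here is a deferred import. A minimal sketch follows, with hypothetical function names. Why it would avoid the crash is not established in this thread, so it should be read as an experiment to try rather than a fix.

```python
import importlib


def load_backend(module_name="faster_whisper"):
    """Import the module only when it is first needed, then return it."""
    return importlib.import_module(module_name)


def transcribe(audio_path, model_size="tiny"):
    # The heavy import happens here, on first use, not at program start.
    backend = load_backend()
    model = backend.WhisperModel(model_size)
    segments, info = model.transcribe(audio_path)
    return list(segments), info


# Demonstration with a standard-library module so the snippet runs anywhere.
print(load_backend("json").dumps({"loaded": True}))
```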

1Wayne1 · 2024-05-09T14:33:55Z

I had the same issue. I reinstalled PyTorch with the command conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.7 -c pytorch -c nvidia and that solved the problem.

nebehr · 2024-07-18T10:32:01Z

I can consistently reproduce it with the latest master, Python 3.11.1 and Cuda 12.5 on Windows 10, 3 minutes of audio and tiny model, with the following simple code:

from faster_whisper import WhisperModel
model = WhisperModel("tiny", device="cuda", compute_type="auto")
segments, info = model.transcribe("js.wav")
for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))

I only installed Cuda Toolkit 12.5, not cuDNN. At no point does the system max out on CPU, GPU or memory.

If device="cpu" is forced, the issue does not occur, nor does it with temperature=0.0 as stated above. Curiously, it also does not occur if I don't iterate to the end of segments generator: in my case, if the iteration is stopped roughly half way through, there is no crash.

If I put del model at the end, on some occasions the crash comes on that instruction, but sometimes after it.
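
The observation about stopping halfway fits the fact that segments is a lazy generator: transcription work only happens as the generator is consumed. The sketch below uses a stand-in generator, not faster-whisper code, to show that breaking out early means the remaining items are never processed.

```python
def fake_segments(chunks, log):
    """Stand-in for the lazy `segments` generator: work happens per item."""
    for chunk in chunks:
        log.append(f"decoded {chunk}")
        yield chunk


log = []
segments = fake_segments(["a", "b", "c", "d"], log)
print(log)  # still empty: creating the generator does no work

for index, segment in enumerate(segments):
    if index == 1:
        break  # stop early; "c" and "d" are never decoded

print(log)
```

If the crash depends on something that only happens while decoding a later part of the audio, stopping early would skip it, though the thread does not confirm this explanation.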

jianchang512 · 2024-09-10T08:47:07Z

Windows 10, CPU only, compute type int8 or float32: both may crash. Same symptoms as reported above.

Possible way to reproduce: run recognition in a for loop over multiple larger audio files (10+ minutes, 16 kHz, 1 audio channel, WAV). The crash may occur during the last few files, at the end of the segments iteration. It may happen regardless of whether del model is called, and regardless of whether the tasks share a single model or each task creates its own.

temperature is set to 0, condition_on_previous_text is set to False, and beam_size and best_of are set to 1.

A single task, even with large audio, rarely crashes. Crashes mostly occur when multiple tasks run back to back.

I tried creating a process for each task, starting the next process only after the previous one finished, and it still crashes!

The Windows crash dump reports:

The thread tried to read from or write to a virtual address for which it does not have the appropriate access.

Running multiple tasks in a row does not always crash; there is only a certain probability. Sometimes more than a dozen tasks run in a row without error, and sometimes a crash occurs after three to five tasks.

TechInterMezzo · 2024-09-16T17:35:13Z

Did anyone find out yet if this is a bug in faster-whisper or in ctranslate2?

usernotnull · 2024-12-06T09:45:59Z

#71 (comment)

This workaround worked for me as well.

Windows 11, Python 3.13.0, CUDA 12.4

satisl changed the title from "Maybe just a little bug" to "A file can't be dealt with on large-v2-ct model in Chinese." Mar 25, 2023

satisl closed this as completed Mar 26, 2023

satisl reopened this Mar 27, 2023

guillaumekln mentioned this issue Apr 27, 2023: No outputs Softcatala/whisper-ctranslate2#11 (Closed)

fquirin mentioned this issue May 10, 2023: segmentation fault in 'generate_with_fallback' when temperature != 0 #223 (Open)

seriousm4x added a commit to seriousm4x/wubbl0rz-archiv-transcribe that referenced this issue May 28, 2023: use temperature=0 as workaround for model crash (2bd1932)

sngazm mentioned this issue Dec 16, 2023: Faster Whisper sometimes stops running suddenly #620 (Open)

1Wayne1 mentioned this issue May 9, 2024: Program exits abnormally (程序异常退出) ultrasev/stream-whisper#25 (Closed)

thewh1teagle mentioned this issue May 25, 2024: Bug: Crash on loading model thewh1teagle/vibe#79 (Closed)

Ladbaby mentioned this issue Jun 5, 2024: [Fix] How to handle crashes caused by faster-whisper GPU inference on Windows (【修复】关于Windows下faster-whisper使用GPU推理导致崩溃的处理方法) Chenyme/Chenyme-AAVT#31 (Closed)

jianchang512 mentioned this issue Jun 9, 2024: Crash after subtitle recognition finishes (识别完字幕后闪退) jianchang512/pyvideotrans#430 (Closed)

nebehr mentioned this issue Jul 17, 2024: Crash in subtitle generation - IndexError: list index out of range Purfview/whisper-standalone-win#282 (Closed)

TechInterMezzo mentioned this issue Sep 17, 2024: Python process crashes on exit under Windows with CUDA OpenNMT/CTranslate2#1782 (Open)
