
MemoryError #57

Open
joingreat opened this issue Dec 28, 2021 · 14 comments

Comments

@joingreat

I'm on Windows 10 in a Jupyter environment. The audio file is 30 minutes long, so I cut it into 10-second chunks and then continued, but on the first chunk (chunk0) I ran into a MemoryError.

Is the file still too large for this situation? I followed this notebook:
https://colab.research.google.com/github/timsainb/noisereduce/blob/master/notebooks/1.0-test-noise-reduction.ipynb#scrollTo=E5UkLtmT3xy3, where the sample only lasts four seconds.

Or would parameter tuning help here?

from pydub import AudioSegment
from pydub.utils import make_chunks
import soundfile as sf
import noisereduce as nr

myaudio = AudioSegment.from_file("myaudio.wav", format="wav")
chunk_length_ms = 10000  # pydub works in milliseconds, so 10-second chunks
chunks = make_chunks(myaudio, chunk_length_ms)

# Export all of the individual chunks as wav files
for i, chunk in enumerate(chunks):
    chunk_name = "chunk{0}.wav".format(i)
    print("exporting", chunk_name)
    chunk.export(chunk_name, format="wav")

data, rate = sf.read('chunk0.wav')

reduced_noise = nr.reduce_noise(y=data, sr=rate, n_std_thresh_stationary=1.5, stationary=True)

MemoryError: Unable to allocate 197. GiB for an array with shape (441000, 60002) and data type float64

@vaastav

vaastav commented Dec 31, 2021

I am getting the same error with an 800 KB file that is 5 seconds long.
This is the error I get:

MemoryError: Unable to allocate array with shape (199898, 60002) and data type float64

@timsainb
Owner

What is the shape of the input data?

@vaastav

vaastav commented Jan 1, 2022

The shape of my input data is (199898, 2) and the sample rate is 44100.

@timsainb
Owner

timsainb commented Jan 1, 2022

There should be no problem with data size; I've run the algorithm on 30+ minute files (there are parameters to do the chunking for you).

Can you try transposing the array? It might be that the expected input is channels x samples rather than samples x channels. If so, I should make a PR to add a warning when there are more channels than samples.
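
For example, something like this (a minimal sketch, assuming the data came from soundfile.read, which returns samples x channels):

import soundfile as sf
import noisereduce as nr

data, rate = sf.read('chunk0.wav')   # soundfile returns (samples, channels)
# pass (channels, samples) to reduce_noise by transposing
reduced_noise = nr.reduce_noise(y=data.T, sr=rate)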

@joingreat
Author

@timsainb, thanks for the reply.

What is the shape of the input data?

data[:3]

array([[ 0.00000000e+00,  0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00],
       [-3.05175781e-05,  0.00000000e+00]])

data.shape
(441000, 2)

rate
44100

There should be no problem with data size; I've run the algorithm on 30+ minute files (there are parameters to do the chunking for you).

Can you try transposing the array? It might be that the expected input is channels x samples rather than samples x channels. If so, I should make a PR to add a warning when there are more channels than samples.

reduced_noise = nr.reduce_noise(y=data.T, sr=rate, n_std_thresh_stationary=1, stationary=True)
IPython.display.Audio(data=reduced_noise, rate=rate)

After transposing the data with .T for the input as above, the error is fixed; however, the result sounds messy and even noisier.

@vinodhian

@joingreat
Your audio is not mono. Please try again after converting your audio to a mono channel.
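
For example (a minimal sketch, assuming data is a (samples, 2) array from soundfile.read; averaging the two channels is one common way to downmix):

import soundfile as sf
import noisereduce as nr

data, rate = sf.read('chunk0.wav')   # stereo: shape (samples, 2)
mono = data.mean(axis=1)             # downmix to mono by averaging the channels
reduced_noise = nr.reduce_noise(y=mono, sr=rate, stationary=True)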

@timsainb
Owner

timsainb commented Jan 4, 2022

The package should be able to handle long and multi-channel data fine. Examples:

Long:
https://colab.research.google.com/github/timsainb/noisereduce/blob/master/notebooks/1.0-test-noise-reduction.ipynb#scrollTo=a2stIgrUlX2h

Multichannel:
https://colab.research.google.com/github/timsainb/noisereduce/blob/master/notebooks/1.0-test-noise-reduction.ipynb#scrollTo=9G0YvxDplX2k

Can either of you try to reproduce your error in a Colab notebook so I can take a look?

@jd907

jd907 commented Jan 5, 2022

I had the same issue and fixed it by using .T.

I found the fix in these two Stack Overflow posts:

https://stackoverflow.com/questions/57137050/error-passing-wav-file-to-ipython-display/57137391
https://stackoverflow.com/questions/40822877/scipy-io-cant-write-wavfile

@joingreat
Author

import IPython
from scipy.io import wavfile
import noisereduce as nr
from pydub import AudioSegment
from tinytag import TinyTag
import soundfile as sf
from noisereduce.generate_noise import band_limited_noise
import matplotlib.pyplot as plt
import urllib.request
import numpy as np
import io
%matplotlib inline

chunk0.zip

data, rate = sf.read('chunk0.wav')
data.shape

reduced_noise = nr.reduce_noise(y=data.T, sr=rate, n_std_thresh_stationary=1, stationary=True)
IPython.display.Audio(data=reduced_noise, rate=rate)

The chunk0.wav was uploaded above as chunk0.zip. reduced_noise works after the .T transpose, but the result sounds a little harsh, especially at the beginning.

Maybe the splitting hurt the file?

@hananell

hananell commented Feb 23, 2022

I am not sure that it's related, but I had the same problem. After transposing the data, reduce_noise() seemed to work well, but afterwards I got this error in wavfile.write():
struct.error: ushort format requires 0 <= number <= (0x7fff * 2 + 1)

The code: (example.wav is inside zipped.zip)
zipped.zip

from scipy.io import wavfile
import noisereduce as nr
# load data
rate, data = wavfile.read("example.wav")
# perform noise reduction
reduced_noise = nr.reduce_noise(y=data.T, sr=rate)
wavfile.write("reduced.wav", rate, reduced_noise)
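
This is likely because reduce_noise returns the array in the same (channels, samples) orientation it was given, while wavfile.write expects (samples, channels); with a (2, N) array it tries to write N as the channel count in the WAV header, which overflows the 16-bit field. A minimal sketch of the likely fix (transposing back before writing):

from scipy.io import wavfile
import noisereduce as nr

# load data: wavfile.read returns (samples, channels)
rate, data = wavfile.read("example.wav")
# perform noise reduction on (channels, samples)
reduced_noise = nr.reduce_noise(y=data.T, sr=rate)
# transpose back to (samples, channels) before writing
wavfile.write("reduced.wav", rate, reduced_noise.T)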

@hananell

hananell commented Feb 23, 2022

Update:
My WAV file was stereo. I noticed that everything works fine with mono, so I separated the data into two mono channels, ran the algorithm on both, then stitched them back together. Worked perfectly.

import numpy as np
from scipy.io import wavfile
import noisereduce as nr

# load data
rate, data = wavfile.read("example.wav")
data1 = data[:, 0]
data2 = data[:, 1]
# perform noise reduction on each channel separately
reduced_noise1 = nr.reduce_noise(y=data1, sr=rate)
reduced_noise2 = nr.reduce_noise(y=data2, sr=rate)
# stitch the channels back together as (samples, channels)
reduced_noise = np.stack((reduced_noise1, reduced_noise2), axis=1)
wavfile.write("reduced.wav", rate, reduced_noise)

@netotz

netotz commented Apr 27, 2023

@hananell why does it work with mono? And does splitting the audio into two channels and then merging them back affect the result?

@lixinghe1999

@hananell why does it work with mono? And does splitting the audio into two channels and then merging them back affect the result?

It seems stereo input needs the format (n_channels, frames), which is different from the output of soundfile (frames, n_channels), so you need to transpose it (e.g. data.T) before calling reduce_noise.

@avelican

avelican commented Jun 26, 2023

Had the same issue; converting the input WAV to mono fixed it.
EDIT: Found a one-liner to just take the left channel:

rate, data = wavfile.read(fname)
data = data[:, 0]

data.shape is (1757089, 2): ~40 seconds, stereo, 44100 Hz.

Traceback (most recent call last):
  File "C:\Users\MSI\Desktop\loo_test\fix.py", line 7, in <module>
    reduced_noise = nr.reduce_noise(y=data, sr=rate)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\MSI\AppData\Local\Programs\Python\Python311\Lib\site-packages\noisereduce\noisereduce.py", line 597, in reduce_noise
    return sg.get_traces()
           ^^^^^^^^^^^^^^^
  File "C:\Users\MSI\AppData\Local\Programs\Python\Python311\Lib\site-packages\noisereduce\noisereduce.py", line 235, in get_traces
    filtered_chunk = self.filter_chunk(start_frame=0, end_frame=end_frame)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\MSI\AppData\Local\Programs\Python\Python311\Lib\site-packages\noisereduce\noisereduce.py", line 165, in filter_chunk
    padded_chunk = self._read_chunk(i1, i2)
                   ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\MSI\AppData\Local\Programs\Python\Python311\Lib\site-packages\noisereduce\noisereduce.py", line 157, in _read_chunk
    chunk = np.zeros((self.n_channels, i2 - i1))
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 786. GiB for an array with shape (1757089, 60002) and data type float64

Cool project by the way! I used an AI-powered online tool and it worked significantly better, but it introduced weird hallucinations (and they want to charge $500 to process all the audio xD)
