Home

Welcome to the SoundNet-tensorflow wiki!

When a model is read with torchfile, use instance._obj to fetch the underlying dict

mydict = o['modules'][0]._obj
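
As a rough illustration of what `_obj` exposes, the sketch below walks a loaded structure and replaces every wrapper that carries an `_obj` attribute with the plain container inside it. `unwrap` and `FakeTorchObject` are hypothetical names used only for this example. The stand-in class takes the place of the objects torchfile returns, so the snippet runs without a model file. With a real model, `o` would come from `torchfile.load(path)`, and under Python 3 the keys may be bytes (for example b'modules') rather than str.

```python
class FakeTorchObject:
    """Hypothetical stand-in for the wrapper objects torchfile returns."""

    def __init__(self, typename, obj):
        self._typename = typename
        self._obj = obj


def unwrap(node):
    """Recursively replace wrappers exposing `_obj` with plain containers."""
    if hasattr(node, "_obj"):
        return unwrap(node._obj)
    if isinstance(node, dict):
        return {key: unwrap(value) for key, value in node.items()}
    if isinstance(node, (list, tuple)):
        return [unwrap(item) for item in node]
    return node  # leaf value (number, string, array, ...)


# Mimic the shape of o['modules'][0]._obj with made-up layer data.
conv = FakeTorchObject("nn.SpatialConvolution", {"weight": [0.1, 0.2], "bias": [0.0]})
o = {"modules": [conv]}

mydict = o["modules"][0]._obj   # direct access to one wrapper's dict
plain = unwrap(o)               # same data with every wrapper removed
print(sorted(mydict))
print(plain["modules"][0]["weight"])
```

Unwrapping everything up front is a convenience choice; reading `_obj` one level at a time works just as well when only a few layers are needed.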

Set value to variable and make it trainable

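def get_var(self, initial_value, name, idx, var_name):
    # Use the pretrained value when the loaded weights contain this layer,
    # otherwise fall back to the supplied initial value.
    if self.data_dict is not None and name in self.data_dict:
        value = self.data_dict[name][idx]
    else:
        value = initial_value

    # A tf.Variable can be updated during training; a tf.constant stays fixed.
    if self.trainable:
        var = tf.Variable(value, name=var_name)
    else:
        var = tf.constant(value, dtype=tf.float32, name=var_name)

    # Record the tensor under its (layer name, index) key.
    self.var_dict[(name, idx)] = var

    # print var_name, var.get_shape().as_list()
    # initial_value must be a TensorFlow tensor, since get_shape() is called on it.
    assert var.get_shape() == initial_value.get_shape()

    return var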
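
The lookup half of get_var does not depend on TensorFlow and can be checked on its own. The following is a minimal sketch, assuming data_dict maps a layer name to an indexable pair such as (weights, biases). `select_initial_value` is a hypothetical helper, not part of the original code, and plain lists stand in for arrays.

```python
def select_initial_value(data_dict, name, idx, initial_value):
    """Return (value, from_pretrained) using the same rule as get_var."""
    if data_dict is not None and name in data_dict:
        return data_dict[name][idx], True
    return initial_value, False


# Toy pretrained weights: layer name -> (weights, biases). Values are made up.
pretrained = {"conv1": ([[1.0, 2.0]], [0.5])}

# "conv1" exists, so index 1 (the biases) comes from the pretrained dict.
bias, bias_from_file = select_initial_value(pretrained, "conv1", 1, [0.0])

# "conv2" is missing, so the supplied initial value is used instead.
weights, weights_from_file = select_initial_value(pretrained, "conv2", 0, [[0.0, 0.0]])

print(bias, bias_from_file)
print(weights, weights_from_file)
```

Returning the flag alongside the value is an addition for illustration; it makes it easy to log which layers were actually initialized from the file.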

Load audio using librosa and torch audio

Librosa

librosa.core.load(path, sr=22050, mono=True, offset=0.0, duration=None, dtype=<class 'numpy.float32'>, res_type='kaiser_best')

# By default, librosa resamples the signal to 22050 Hz and returns values in the range (-1., 1.).
# To load the signal at its raw sample rate, set sr=None.
# To keep stereo (two channels), set mono=False.
sound_sample, sr = librosa.load(audio_path, sr=None, mono=False)

audio

loads an audio file into a torch.Tensor
usage:
audio.load(
           string                              -- path to file
            )

returns torch.Tensor of size NSamples x NChannels, sample_rate

-- By default, audio loads the signal at its raw sample rate, in stereo (two channels), with values in the range (-2^31, 2^31)
sound_sample, sample_rate = audio.load(audio_path)
if sound_sample:size(2) > 1 then sound_sample = sound_sample:select(2,1):clone() end -- select first channel
sound_sample:mul(2^-31)                                                              -- make range [-1, 1]
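
For comparison with the librosa output, the same two steps (keep the first channel, scale by 2^-31) can be written in plain Python. This is only a sketch: `first_channel_unit_range` is a hypothetical helper, and nested tuples with made-up values stand in for the NSamples x NChannels tensor that audio.load returns.

```python
INT32_SCALE = 2.0 ** -31  # same factor as the 2^-31 multiplication in the Lua snippet


def first_channel_unit_range(samples):
    """Keep channel 0 of each frame and scale int32-range values to [-1, 1]."""
    return [frame[0] * INT32_SCALE for frame in samples]


# Two-channel toy signal, NSamples x NChannels, in the (-2^31, 2^31) range.
stereo = [(2 ** 30, 0), (-(2 ** 31), 5), (0, -7)]
mono = first_channel_unit_range(stereo)
print(mono)
```

Note that librosa with mono=True mixes channels by averaging, whereas this selects the first channel, so stereo files can still differ slightly between the two loaders even after scaling.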

NOTE: To keep the difference between the values from the two loaders small, re-encode every mp3 with sox input.mp3 output.mp3 trim 0. The difference is mainly caused by the two libraries reading mp3 files differently, so a better solution is to convert the files to wav. For more comparisons, please refer to info.md
