
Separating mixture file ? #7

Open
bubblegg opened this issue Nov 9, 2023 · 1 comment


bubblegg commented Nov 9, 2023

Is it possible to separate a mixture file that I provide as input, and if so, how?
Thank you in advance!


PreFKim commented Jun 25, 2024

I separated my custom mixture file, but the result did not meet my expectations.

  1. Setting Params
import sys
from pathlib import Path
from typing import *
import torch

DEVICE = torch.device("cuda:0")
SAMPLE_RATE = 22050 # < IMPORTANT: do not change
STEMS = ["bass","drums","guitar","piano"] # < IMPORTANT: do not change
ROOT_PATH = Path("..").resolve().absolute()
CKPT_PATH = ROOT_PATH / "ckpts"
DATA_PATH = ROOT_PATH / "data"

sys.path.append(str(ROOT_PATH))
%load_ext autoreload
%autoreload 2
  2. Load Model
from main.module_base import Model

# Load model
model = Model.load_from_checkpoint(CKPT_PATH / "glorious-star-335/epoch=729-valid_loss=0.014.ckpt").to(DEVICE)
denoise_fn = model.model.diffusion.denoise_fn
  3. Load mixture file
import soundfile as sf
import torch

audio, sr = sf.read('/.../music.wav')
audio = torch.from_numpy(audio.transpose(1,0).reshape(1,2,-1)) # (seq_len, 2) -> (2, seq_len) -> (1, 2, seq_len), i.e. (batch, stereo, seq_len)
print(audio.shape, sr) # If the audio's sampling rate is not 22050, you should adjust your audio file to match the target sampling rate.
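As the comment above notes, the model expects 22050 Hz audio. One way to adjust a file recorded at a different rate is polyphase resampling via SciPy; this helper (the name `resample_audio` is mine, not part of the repo) is a minimal sketch to apply to the NumPy array before converting it to a tensor:

```python
import numpy as np
from math import gcd
from scipy.signal import resample_poly

def resample_audio(audio: np.ndarray, orig_sr: int, target_sr: int = 22050) -> np.ndarray:
    """Resample along the last axis (seq_len) with a polyphase filter."""
    if orig_sr == target_sr:
        return audio
    g = gcd(orig_sr, target_sr)
    return resample_poly(audio, target_sr // g, orig_sr // g, axis=-1)
```

For example, a 44100 Hz stereo array of shape `(2, seq_len)` comes back at half the length; run it on the `(channels, seq_len)` array before `torch.from_numpy`.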
  4. Separate
from main.separation import separate_mixture
from audio_diffusion_pytorch import KarrasSchedule
# Generation hyper-parameters
s_churn = 20.0
num_steps = 150
num_resamples = 2

# Define timestep schedule
schedule = KarrasSchedule(sigma_min=1e-4, sigma_max=20.0, rho=7)(num_steps, DEVICE)

start_idx = 0
sources = audio[:,:, start_idx:start_idx + 262144].to(DEVICE)
sources = ((sources[:,0:1] + sources[:,1:2])/2).float() # Stereo to mono

separated = separate_mixture(
    mixture=sources,
    denoise_fn=denoise_fn,
    sigmas=schedule,
    noises=torch.randn(1, 4, 262144).to(DEVICE),
    s_churn=s_churn,  # > 0 to add randomness
    num_resamples=num_resamples,
)
separated.shape
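Note that the call above only separates the first 262144 samples (about 11.9 s at 22050 Hz). To cover a full track, one option is to run the separator window by window; `separate_full_track` below is my own sketch (not part of the repo), it zero-pads the final chunk and trims the padding afterwards, and it does no overlap-add smoothing at chunk boundaries:

```python
import torch
import torch.nn.functional as F

CHUNK_LEN = 262144  # window length used by the separation call above

def separate_full_track(mixture: torch.Tensor, separate_chunk, chunk_len: int = CHUNK_LEN) -> torch.Tensor:
    """Run a chunk-wise separator over a full-length mono mixture.

    mixture: (1, 1, seq_len); separate_chunk: callable mapping a
    (1, 1, chunk_len) chunk to (1, num_stems, chunk_len) tensor of stems.
    """
    seq_len = mixture.shape[-1]
    outputs = []
    for start in range(0, seq_len, chunk_len):
        chunk = mixture[:, :, start:start + chunk_len]
        pad = chunk_len - chunk.shape[-1]
        if pad > 0:  # zero-pad the last, shorter chunk
            chunk = F.pad(chunk, (0, pad))
        outputs.append(separate_chunk(chunk))
    # Concatenate along time and trim the padding from the last chunk
    return torch.cat(outputs, dim=-1)[:, :, :seq_len]
```

Here `separate_chunk` would wrap the `separate_mixture(...)` call above (fresh noise per chunk); independently sampled chunks can produce audible seams, so treat this as a starting point.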
  5. Audio to file
import numpy as np
import soundfile as sf
separated = separated.detach().cpu().numpy().squeeze(0)

for i, stem in enumerate(STEMS):
    sf.write(
        f"./{stem}.wav",
        separated[i],
        22050,
        format="WAV"
    )
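Since the separated stems should in principle add back up to the mono mixture, one quick sanity check on a disappointing result is the relative reconstruction error; this helper is my own, not from the repo:

```python
import numpy as np

def reconstruction_error(stems: np.ndarray, mixture: np.ndarray) -> float:
    """Relative L2 error between the sum of stems and the original mono mixture.

    stems: (num_stems, seq_len); mixture: (seq_len,)
    """
    recon = stems.sum(axis=0)
    return float(np.linalg.norm(recon - mixture) / (np.linalg.norm(mixture) + 1e-12))
```

A large error here suggests the chunk/noise settings (or a sample-rate mismatch) are the problem rather than the stems themselves.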
