
Separating mixture file ? #7

Open
bubblegg opened this issue Nov 9, 2023 · 1 comment


bubblegg commented Nov 9, 2023

Is it possible to separate a mixture file that I provide as input, and if so, how?
Thank you in advance!


PreFKim commented Jun 25, 2024

I separated my custom mixture file, but the result did not meet my expectations.

  1. Setting Params
import sys
from pathlib import Path
from typing import *
import torch

DEVICE = torch.device("cuda:0")
SAMPLE_RATE = 22050 # < IMPORTANT: do not change
STEMS = ["bass","drums","guitar","piano"] # < IMPORTANT: do not change
ROOT_PATH = Path("..").resolve().absolute()
CKPT_PATH = ROOT_PATH / "ckpts"
DATA_PATH = ROOT_PATH / "data"

sys.path.append(str(ROOT_PATH))
%load_ext autoreload
%autoreload 2
  2. Load Model
from main.module_base import Model

# Load model
model = Model.load_from_checkpoint(CKPT_PATH / "glorious-star-335/epoch=729-valid_loss=0.014.ckpt").to(DEVICE)
denoise_fn = model.model.diffusion.denoise_fn
  3. Load mixture file
import soundfile as sf
import torch

audio, sr = sf.read('/.../music.wav')
audio = torch.from_numpy(audio.transpose(1,0).reshape(1,2,-1)) # (seq_len, 2) -> (2, seq_len) -> (1, 2, seq_len), i.e. (batch, stereo, seq_len)
print(audio.shape, sr) # If the audio's sampling rate is not 22050, you should adjust your audio file to match the target sampling rate.
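As the comment above notes, the model expects 22050 Hz audio. One way to adjust a file recorded at a different rate is polyphase resampling via SciPy; this helper (the name `resample_audio` is mine, not part of the repo) is a minimal sketch to apply to the NumPy array before converting it to a tensor:

```python
import numpy as np
from math import gcd
from scipy.signal import resample_poly

def resample_audio(audio: np.ndarray, orig_sr: int, target_sr: int = 22050) -> np.ndarray:
    """Resample along the last axis (seq_len) with a polyphase filter."""
    if orig_sr == target_sr:
        return audio
    g = gcd(orig_sr, target_sr)
    return resample_poly(audio, target_sr // g, orig_sr // g, axis=-1)
```

For example, a 44100 Hz stereo array of shape `(2, seq_len)` comes back at half the length; run it on the `(channels, seq_len)` array before `torch.from_numpy`.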
  4. Separate
from main.separation import separate_mixture
from audio_diffusion_pytorch import KarrasSchedule
# Generation hyper-parameters
s_churn = 20.0
num_steps = 150
num_resamples = 2

# Define timestep schedule
schedule = KarrasSchedule(sigma_min=1e-4, sigma_max=20.0, rho=7)(num_steps, DEVICE)

start_idx = 0
sources = audio[:,:, start_idx:start_idx + 262144].to(DEVICE)
sources = ((sources[:,0:1] + sources[:,1:2])/2).float() # Stereo to mono

separated = separate_mixture(
    mixture=sources,
    denoise_fn=denoise_fn,
    sigmas=schedule,
    noises=torch.randn(1, 4, 262144).to(DEVICE),
    s_churn=s_churn,  # > 0 to add randomness
    num_resamples=num_resamples,
)
separated.shape
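Note that the call above only separates the first 262144 samples (about 11.9 s at 22050 Hz). To cover a full track, one option is to run the separator window by window; `separate_full_track` below is my own sketch (not part of the repo), it zero-pads the final chunk and trims the padding afterwards, and it does no overlap-add smoothing at chunk boundaries:

```python
import torch
import torch.nn.functional as F

CHUNK_LEN = 262144  # window length used by the separation call above

def separate_full_track(mixture: torch.Tensor, separate_chunk, chunk_len: int = CHUNK_LEN) -> torch.Tensor:
    """Run a chunk-wise separator over a full-length mono mixture.

    mixture: (1, 1, seq_len); separate_chunk: callable mapping a
    (1, 1, chunk_len) chunk to (1, num_stems, chunk_len) tensor of stems.
    """
    seq_len = mixture.shape[-1]
    outputs = []
    for start in range(0, seq_len, chunk_len):
        chunk = mixture[:, :, start:start + chunk_len]
        pad = chunk_len - chunk.shape[-1]
        if pad > 0:  # zero-pad the last, shorter chunk
            chunk = F.pad(chunk, (0, pad))
        outputs.append(separate_chunk(chunk))
    # Concatenate along time and trim the padding from the last chunk
    return torch.cat(outputs, dim=-1)[:, :, :seq_len]
```

Here `separate_chunk` would wrap the `separate_mixture(...)` call above (fresh noise per chunk); independently sampled chunks can produce audible seams, so treat this as a starting point.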
  5. Audio to file
import numpy as np
import soundfile as sf
separated = separated.detach().cpu().numpy().squeeze(0)

for i, stem in enumerate(STEMS):
    sf.write(
        f"./{stem}.wav",
        separated[i],
        22050,
        format="WAV"
    )
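Since the separated stems should in principle add back up to the mono mixture, one quick sanity check on a disappointing result is the relative reconstruction error; this helper is my own, not from the repo:

```python
import numpy as np

def reconstruction_error(stems: np.ndarray, mixture: np.ndarray) -> float:
    """Relative L2 error between the sum of stems and the original mono mixture.

    stems: (num_stems, seq_len); mixture: (seq_len,)
    """
    recon = stems.sum(axis=0)
    return float(np.linalg.norm(recon - mixture) / (np.linalg.norm(mixture) + 1e-12))
```

A large error here suggests the chunk/noise settings (or a sample-rate mismatch) are the problem rather than the stems themselves.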
