Separation in the browser #39

incidentist · 2024-07-04T22:58:57Z

Okay, we've got video creation in the browser. Can we do separation by loading models using onnxruntime-wasm? This will allow us to remove the costliest part of the server and make the project more sustainable, at the cost of (probably) increasing the time needed for video creation.

One option for this is to get the https://github.com/karaokenerds/python-audio-separator project to work in the browser using pyodide. However, currently it seems that pyodide doesn't work great with onnxruntime: pyodide/pyodide#4220 (not surprising given how complicated ML libraries usually are).

beveradb · 2024-07-04T23:09:28Z

FYI there are better models now which use fewer resources and don't require ONNX Runtime!
E.g. the bs-roformer one or the MDXC models. Check out this post where I suggest some of the best ones I currently use:
nomadkaraoke/python-audio-separator#82 (comment)

incidentist · 2024-07-05T04:07:06Z

@beveradb I was going to pick your brain about this, actually. I'm less worried about resource usage right now, and more worried about which model is the easiest to get up and running in the browser. It seems like ONNX models are the most portable and would be the easiest to run in the browser, so ONNX is actually an advantage. ONNX Runtime code looks very similar in JS and Python, so I would crib a lot from python-audio-separator. It seems like most of the complication in running these models is in prepping the audio for input and turning the model output back into normal files. But I am real new to this.
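To make the "audio prep" and "output to normal files" steps concrete, here is a minimal TypeScript sketch of the framing around a model call. It is not taken from python-audio-separator. The helper names (`splitIntoChunks`, `joinChunks`, `encodeWavMono16`) are hypothetical, and it assumes mono audio and a model that takes fixed-length chunks of raw samples. A real separation model will likely also need resampling, stereo handling, overlap between chunks, and whatever transform (e.g. a spectrogram) the model expects; none of that is shown.

```typescript
// Hypothetical helper: split mono samples into fixed-length chunks,
// zero-padding the last one so every chunk has the same length.
function splitIntoChunks(samples: Float32Array, chunkSize: number): Float32Array[] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const chunks: Float32Array[] = [];
  for (let start = 0; start < samples.length; start += chunkSize) {
    const chunk = new Float32Array(chunkSize); // zero-filled by default
    chunk.set(samples.subarray(start, Math.min(start + chunkSize, samples.length)));
    chunks.push(chunk);
  }
  return chunks;
}

// Hypothetical helper: concatenate processed chunks and drop the padding.
function joinChunks(chunks: Float32Array[], originalLength: number): Float32Array {
  const out = new Float32Array(originalLength);
  let offset = 0;
  for (const chunk of chunks) {
    if (offset >= originalLength) break;
    const n = Math.min(chunk.length, originalLength - offset);
    out.set(chunk.subarray(0, n), offset);
    offset += n;
  }
  return out;
}

// Hypothetical helper: encode mono float samples in [-1, 1] as a
// 16-bit PCM WAV file (44-byte header followed by little-endian samples).
function encodeWavMono16(samples: Float32Array, sampleRate: number): Uint8Array {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) view.setUint8(offset + i, text.charCodeAt(i));
  };
  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk
  view.setUint16(20, 1, true); // audio format 1 = PCM
  view.setUint16(22, 1, true); // channel count (mono)
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);
  for (let i = 0; i < samples.length; i++) {
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * 2, Math.round(clamped * 32767), true);
  }
  return new Uint8Array(buffer);
}
```

In the browser, the model call would sit between `splitIntoChunks` and `joinChunks`, and the bytes from `encodeWavMono16` could be wrapped in a `Blob` with type `audio/wav` to offer as a download.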
