Stable audio model example #2188

Merged: 12 commits, Jul 18, 2024
9 changes: 9 additions & 0 deletions .ci/skipped_notebooks.yml
@@ -541,6 +541,15 @@
        - ubuntu-20.04
        - ubuntu-22.04
        - windows-2019
- notebook: notebooks/stable-audio/stable-audio.ipynb
  skips:
    - python:
        - '3.8'
    - os:
        - macos-12
        - ubuntu-20.04
        - ubuntu-22.04
        - windows-2019
- notebook: notebooks/triposr-3d-reconstruction/triposr-3d-reconstruction.ipynb
  skips:
    - python:
2 changes: 2 additions & 0 deletions .ci/spellcheck/.pyspelling.wordlist.txt
@@ -252,11 +252,13 @@ finetuned
finetuning
FLAC
floyd
foley
Formatter
formatter
fp
FP
FPN
Freesound
FreeVC
freevc
frisbee
26 changes: 26 additions & 0 deletions notebooks/stable-audio/README.md
@@ -0,0 +1,26 @@
# Sound Generation with Stable Audio Open and OpenVINO™

[Stable Audio Open](https://huggingface.co/stabilityai/stable-audio-open-1.0) is an open-source model optimized for generating short audio samples, sound effects, and production elements from text prompts. The model was trained on data from Freesound and the Free Music Archive, respecting creator rights.

<img src="https://github.com/openvinotoolkit/openvino_notebooks/assets/76171391/ed4aa0f2-0501-4519-8b24-c1c3072b4ef2" />

#### Key Takeaways:

- Stable Audio Open is an open-source text-to-audio model that generates samples and sound effects up to 47 seconds long.
- Users can create drum beats, instrument riffs, ambient sounds, foley and production elements.
- The model enables audio variations and style transfer of audio samples.

The model is intended to be used with the [stable-audio-tools](https://github.com/Stability-AI/stable-audio-tools) library for inference.
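
As a rough illustration of what inference with `stable-audio-tools` can look like, the sketch below follows the usage pattern published on the model card. The helper names `build_conditioning` and `generate`, the default of 30 seconds, and the sampler settings are assumptions chosen for illustration and are not taken from the notebook. Running `generate` requires `stable-audio-tools` to be installed and downloads the model weights from Hugging Face.

```python
MAX_SECONDS = 47  # upper bound on clip length quoted in this README


def build_conditioning(prompt, seconds_total=30, seconds_start=0):
    """Build the text and timing conditioning for one clip (hypothetical helper)."""
    if not 0 < seconds_total <= MAX_SECONDS:
        raise ValueError(f"seconds_total must be in (0, {MAX_SECONDS}]")
    return [
        {
            "prompt": prompt,
            "seconds_start": seconds_start,
            "seconds_total": seconds_total,
        }
    ]


def generate(prompt, seconds_total=30, steps=100, device="cpu"):
    """Generate audio from a text prompt; returns (audio tensor, sample rate)."""
    # Third-party imports stay local so build_conditioning works without them.
    from stable_audio_tools import get_pretrained_model
    from stable_audio_tools.inference.generation import generate_diffusion_cond

    model, model_config = get_pretrained_model("stabilityai/stable-audio-open-1.0")
    model = model.to(device)

    # Sampler settings mirror the model card example; treat them as a starting point.
    output = generate_diffusion_cond(
        model,
        steps=steps,
        cfg_scale=7,
        conditioning=build_conditioning(prompt, seconds_total),
        sample_size=model_config["sample_size"],
        sigma_min=0.3,
        sigma_max=500,
        sampler_type="dpmpp-3m-sde",
        device=device,
    )
    return output, model_config["sample_rate"]
```

Calling `generate("128 BPM tech house drum loop")` would return the generated audio tensor together with the sample rate needed to save or play it.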

## Notebook contents
This tutorial consists of the following steps:
- Prerequisites
- Load the original model and run inference
- Convert the model to OpenVINO IR
- Compile the models and run inference
- Interactive inference
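
The conversion and compilation steps listed above can be sketched roughly as follows. This is a minimal sketch and not the notebook's code: the helper names, the `model` output directory, the component name used in the usage note, and the skip-if-already-converted check are assumptions. Only `ov.convert_model`, `ov.save_model`, and `ov.Core().compile_model` are OpenVINO API calls.

```python
from pathlib import Path


def ir_path_for(name, model_dir="model"):
    """Return the path of the IR .xml file for a component (hypothetical layout)."""
    return Path(model_dir) / f"{name}.xml"


def convert_to_ir(torch_module, example_input, name, model_dir="model"):
    """Convert a PyTorch module to OpenVINO IR unless the IR file already exists."""
    import openvino as ov  # local import: only needed when converting

    ir_path = ir_path_for(name, model_dir)
    if not ir_path.exists():
        ir_path.parent.mkdir(parents=True, exist_ok=True)
        # example_input lets OpenVINO trace the module to build the graph
        ov_model = ov.convert_model(torch_module, example_input=example_input)
        ov.save_model(ov_model, str(ir_path))
    return ir_path


def compile_ir(ir_path, device="CPU"):
    """Compile a saved IR file for the target device."""
    import openvino as ov

    core = ov.Core()
    return core.compile_model(str(ir_path), device)
```

For example, `compile_ir(convert_to_ir(module, example_input, "transformer"))` would convert a component once and reuse the saved IR on later runs; `"transformer"` is only a placeholder name here.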

## Installation instructions
This is a self-contained example that relies solely on its own code.<br/>
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
For details, please refer to the [Installation Guide](../../README.md).
652 changes: 652 additions & 0 deletions notebooks/stable-audio/stable-audio.ipynb