Merge branch 'latest' into ea/phi-3-vision

openvinotoolkit · Jul 19, 2024 · 0bca861
2 parents 9feb3ef + 1f4e0fc
Showing 10 changed files with 704 additions and 12 deletions.
diff --git a/.ci/skipped_notebooks.yml b/.ci/skipped_notebooks.yml
@@ -541,8 +541,10 @@
         - ubuntu-20.04
         - ubuntu-22.04
         - windows-2019
-- notebook: notebooks/phi-3-vision/phi-3-vision.ipynb
+- notebook: notebooks/stable-audio/stable-audio.ipynb
   skips:
+    - python:
+        - '3.8'
     - os:
         - macos-12
         - ubuntu-20.04
@@ -556,3 +558,10 @@
   skips:
     - python:
         - '3.8'
+- notebook: notebooks/phi-3-vision/phi-3-vision.ipynb
+  skips:
+      - os:
+        - macos-12
+        - ubuntu-20.04
+        - ubuntu-22.04
+        - windows-2019
diff --git a/.ci/spellcheck/.pyspelling.wordlist.txt b/.ci/spellcheck/.pyspelling.wordlist.txt
@@ -252,11 +252,13 @@ finetuned
 finetuning
 FLAC
 floyd
+foley
 Formatter
 formatter
 fp
 FP
 FPN
+Freesound
 FreeVC
 freevc
 frisbee

diff --git a/.ci/validate_notebooks.py b/.ci/validate_notebooks.py
@@ -188,6 +188,7 @@ def get_openvino_version() -> str:
 
 def run_test(notebook_path: Path, root, timeout=7200, keep_artifacts=False, report_dir=".") -> Optional[Tuple[str, int, float, str, str]]:
     os.environ["HUGGINGFACE_HUB_CACHE"] = str(notebook_path.parent)
+    os.environ["HF_HUB_CACHE"] = str(notebook_path.parent)
     print(f"RUN {notebook_path.relative_to(root)}", flush=True)
     result = None
 

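Note on the `validate_notebooks.py` hunk: as far as I know, `HF_HUB_CACHE` is the name that current `huggingface_hub` releases read for the cache directory, and `HUGGINGFACE_HUB_CACHE` is the older spelling. Setting both should keep the per-notebook cache location in effect across library versions. Below is a minimal sketch of the same idea as a standalone helper. The helper name `point_hf_cache_at` is hypothetical and not part of this commit, which sets the variables inline in `run_test()`.

```python
import os
from pathlib import Path


def point_hf_cache_at(directory: Path) -> dict:
    """Set both Hugging Face Hub cache variables to the same directory.

    Hypothetical helper for illustration only; the commit itself assigns
    the variables inline inside run_test().
    """
    cache_dir = str(directory)
    # Older spelling, already set before this commit.
    os.environ["HUGGINGFACE_HUB_CACHE"] = cache_dir
    # Spelling added by this commit.
    os.environ["HF_HUB_CACHE"] = cache_dir
    names = ("HUGGINGFACE_HUB_CACHE", "HF_HUB_CACHE")
    return {name: os.environ[name] for name in names}


if __name__ == "__main__":
    # Example path taken from a notebook touched in this commit.
    notebook_path = Path("notebooks/stable-audio/stable-audio.ipynb")
    print(point_hf_cache_at(notebook_path.parent))
```

Because both variables point at the notebook's parent directory, any downloaded model files stay next to the notebook under test, regardless of which name the installed library consults.
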
diff --git a/.docker/Pipfile.lock b/.docker/Pipfile.lock
diff --git a/notebooks/README.md b/notebooks/README.md
@@ -28,6 +28,7 @@
 - [Stable Diffusion with KerasCV and OpenVINO](./stable-diffusion-keras-cv/stable-diffusion-keras-cv.ipynb)
 - [Image Generation with Stable Diffusion and IP-Adapter](./stable-diffusion-ip-adapter/stable-diffusion-ip-adapter.ipynb)
 - [Image generation with Stable Cascade and OpenVINO](./stable-cascade-image-generation/stable-cascade-image-generation.ipynb)
+- [Sound Generation with Stable Audio Open and OpenVINO™](./stable-audio/stable-audio.ipynb)
 - [Sound Generation with AudioLDM2 and OpenVINO™](./sound-generation-audioldm2/sound-generation-audioldm2.ipynb)
 - [SoftVC VITS Singing Voice Conversion and OpenVINO™](./softvc-voice-conversion/softvc-voice-conversion.ipynb)
 - [Object masks from prompts with SAM and OpenVINO](./segment-anything/segment-anything.ipynb)
@@ -176,6 +177,7 @@
 - [Stable Diffusion with KerasCV and OpenVINO](./stable-diffusion-keras-cv/stable-diffusion-keras-cv.ipynb)
 - [Image Generation with Stable Diffusion and IP-Adapter](./stable-diffusion-ip-adapter/stable-diffusion-ip-adapter.ipynb)
 - [Image generation with Stable Cascade and OpenVINO](./stable-cascade-image-generation/stable-cascade-image-generation.ipynb)
+- [Sound Generation with Stable Audio Open and OpenVINO™](./stable-audio/stable-audio.ipynb)
 - [Text Generation via Speculative Sampling, KV Caching, and OpenVINO™](./speculative-sampling/speculative-sampling.ipynb)
 - [Sound Generation with AudioLDM2 and OpenVINO™](./sound-generation-audioldm2/sound-generation-audioldm2.ipynb)
 - [SoftVC VITS Singing Voice Conversion and OpenVINO™](./softvc-voice-conversion/softvc-voice-conversion.ipynb)

diff --git a/notebooks/llm-rag-llamaindex/llm-rag-llamaindex.ipynb b/notebooks/llm-rag-llamaindex/llm-rag-llamaindex.ipynb
@@ -186,6 +186,7 @@
     "- [**bge-reranker-v2-m3**](https://huggingface.co/BAAI/bge-reranker-v2-m3)\n",
     "- [**bge-reranker-large**](https://huggingface.co/BAAI/bge-reranker-large)\n",
     "- [**bge-reranker-base**](https://huggingface.co/BAAI/bge-reranker-base)\n",
+    "\n",
     "Reranker model with cross-encoder will perform full-attention over the input pair, which is more accurate than embedding model (i.e., bi-encoder) but more time-consuming than embedding model. Therefore, it can be used to re-rank the top-k documents returned by embedding model.\n",
     "\n",
     "You can also find available LLM model options in [llm-chatbot](../llm-chatbot/README.md) notebook.\n"

diff --git a/notebooks/stable-audio/README.md b/notebooks/stable-audio/README.md
@@ -0,0 +1,26 @@
+# Sound Generation with Stable Audio Open and OpenVINO™
+
+[Stable Audio Open](https://huggingface.co/stabilityai/stable-audio-open-1.0) is an open-source model optimized for generating short audio samples, sound effects, and production elements using text prompts. The model was trained on data from Freesound and the Free Music Archive, respecting creator rights.
+
+<img src="https://github.com/openvinotoolkit/openvino_notebooks/assets/76171391/ed4aa0f2-0501-4519-8b24-c1c3072b4ef2" />
+
+#### Key Takeaways:
+
+ - Stable Audio Open is an open source text-to-audio model for generating up to 47 seconds of samples and sound effects.
+ - Users can create drum beats, instrument riffs, ambient sounds, foley and production elements.
+ - The model enables audio variations and style transfer of audio samples.
+
+This model is made to be used with the [stable-audio-tools](https://github.com/Stability-AI/stable-audio-tools) library for inference.
+
+## Notebook contents
+This tutorial consists of the following steps:
+- Prerequisites
+- Load the original model and inference
+- Convert the model to OpenVINO IR
+- Compiling models and inference
+- Interactive inference
+
+## Installation instructions
+This is a self-contained example that relies solely on its own code.</br>
+We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
+For details, please refer to [Installation Guide](../../README.md).
diff --git a/notebooks/stable-audio/stable-audio.ipynb b/notebooks/stable-audio/stable-audio.ipynb
diff --git a/notebooks/stable-video-diffusion/stable-video-diffusion.ipynb b/notebooks/stable-video-diffusion/stable-video-diffusion.ipynb
@@ -1579,7 +1579,7 @@
    "source": [
     "int8_out_path = Path(\"generated_int8.mp4\")\n",
     "\n",
-    "export_to_video(frames, str(out_path), fps=7)\n",
+    "export_to_video(int8_frames, str(int8_out_path), fps=7)\n",
     "int8_frames[0].save(\n",
     "    \"generated_int8.gif\",\n",
     "    save_all=True,\n",

diff --git a/requirements.txt b/requirements.txt
@@ -2,5 +2,4 @@
 jupyterlab
 ipywidgets
 ipykernel>=5.0
-ipython>=7.16.3
-setuptools<70
+ipython>=7.16.3