Add LLM inference series to community examples #315

Open · wants to merge 1 commit into base: main
6 changes: 5 additions & 1 deletion community/README.md
@@ -86,4 +86,8 @@ Community examples are sample code and deployments for RAG pipelines that are no

* [Chat with LLM Llama 3.1 Nemotron Nano 4B](./chat-llama-nemotron/)

This is a React-based conversational UI designed for interacting with a powerful local LLM. It incorporates RAG to enhance contextual understanding and is backed by an NVIDIA Dynamo inference server running the NVIDIA Llama-3.1-Nemotron-Nano-4B-v1.1 model. The setup enables low-latency, cloud-free AI assistant capabilities, with live document search and reasoning, all deployable on local or edge infrastructure.

* [LLM Inference Series: Performance, Optimization & Deployment with LLMs](./llm-inference-series/)

This repository supports a video and notebook series exploring how to run, optimize, and serve large language models (LLMs), with a focus on latency, throughput, user experience (UX), and NVIDIA GPU acceleration.
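The two headline metrics the series focuses on, latency and throughput, can be sketched with a minimal benchmark harness. This is an illustrative example only, not code from the series: `generate` is a hypothetical stub standing in for a real LLM call, and the timing pattern mirrors what a notebook like the series' batch benchmark might record.

```python
import time

def generate(prompt, n_tokens=32):
    """Hypothetical stand-in for an LLM generation call.

    A real benchmark would invoke an inference server here;
    sleep simulates per-token decode work.
    """
    time.sleep(0.001 * n_tokens)
    return ["tok"] * n_tokens

def benchmark(prompt, n_tokens=32, runs=5):
    """Average end-to-end latency and derived token throughput."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        generate(prompt, n_tokens)
        latencies.append(time.perf_counter() - start)
    avg_latency = sum(latencies) / len(latencies)
    throughput = n_tokens / avg_latency  # tokens per second
    return avg_latency, throughput

latency, tps = benchmark("Hello, world")
print(f"avg latency: {latency:.4f}s, throughput: {tps:.1f} tok/s")
```

Batching raises throughput (more tokens processed per unit time) while typically increasing per-request latency; sweeping batch size and recording both numbers, as in a CSV like `batch_benchmark.csv`, makes that trade-off visible.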
35 changes: 35 additions & 0 deletions community/llm-inference-series/.gitignore
@@ -0,0 +1,35 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
venv/
.venv/

# Jupyter
.ipynb_checkpoints/

# Data files
*.csv
*.json
*.pkl
*.h5
*.hdf5

# OS
.DS_Store
Thumbs.db

# IDE
.vscode/
.idea/
*.swp
*.swo

# Logs
*.log

# Project
01_inference_101/batch_benchmark.csv