
Commit 19830f4

Docs update (#183)
* Add docs for include_outputs_from
* Fix wrong params
* Align correctly with default (localhost)
* remove blank spaces
* Lint
* Remove mentions to dc-query-api
1 parent 42dc75e commit 19830f4

6 files changed: +113 -10 lines changed


docs/concepts/pipeline-wrapper.md

Lines changed: 86 additions & 0 deletions
@@ -502,6 +502,92 @@ YAML configuration follows the same priority rules: YAML setting > environment v

See the [Multi-LLM Streaming Example](https://github.com/deepset-ai/hayhooks/tree/main/examples/pipeline_wrappers/multi_llm_streaming) for a complete working implementation.

## Accessing Intermediate Outputs with `include_outputs_from`

!!! info "Understanding Pipeline Outputs"

    By default, Haystack pipelines only return outputs from **leaf components** (final components with no downstream connections). Use `include_outputs_from` to also get outputs from intermediate components like retrievers, preprocessors, or parallel branches.
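To see the difference concretely, here is a minimal sketch, assuming `pipeline` is a connected two-component `retriever -> llm` Haystack `Pipeline`:

```python
# Default: only the leaf component ("llm") appears in the result
result = pipeline.run(data={"retriever": {"query": "What is Hayhooks?"}})
# result.keys() == {"llm"}

# With include_outputs_from, the intermediate "retriever" is returned as well
result = pipeline.run(
    data={"retriever": {"query": "What is Hayhooks?"}},
    include_outputs_from={"retriever"},
)
# result.keys() == {"llm", "retriever"}
```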
### Streaming with `on_pipeline_end` Callback

For streaming responses, pass `include_outputs_from` to `streaming_generator()` or `async_streaming_generator()`, and use the `on_pipeline_end` callback to access intermediate outputs. For example:

```python
def run_chat_completion(self, model: str, messages: List[dict], body: dict) -> Generator:
    question = get_last_user_message(messages)

    # Store retrieved documents for citations
    self.retrieved_docs = []

    def on_pipeline_end(result: dict[str, Any]) -> None:
        # Access intermediate outputs here
        if "retriever" in result:
            self.retrieved_docs = result["retriever"]["documents"]
            # Use for citations, logging, analytics, etc.

    return streaming_generator(
        pipeline=self.pipeline,
        pipeline_run_args={
            "retriever": {"query": question},
            "prompt_builder": {"query": question}
        },
        include_outputs_from={"retriever"},  # Make retriever outputs available
        on_pipeline_end=on_pipeline_end
    )
```

**What happens:** The `on_pipeline_end` callback receives both `llm` and `retriever` outputs in the `result` dict, allowing you to access retrieved documents alongside the generated response.
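For instance, the captured documents can be turned into a simple citation list. A minimal sketch (the `url` meta key is an assumption; use whatever metadata your documents actually carry):

```python
from typing import Any


def on_pipeline_end(result: dict[str, Any]) -> None:
    # "retriever" appears in `result` only because it was listed in include_outputs_from
    docs = result.get("retriever", {}).get("documents", [])
    # Collect one citation per retrieved document (the "url" meta key is hypothetical)
    citations = [doc.meta.get("url", "unknown source") for doc in docs]
    print(citations)
```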
The same pattern works with async streaming:

```python
async def run_chat_completion_async(self, model: str, messages: List[dict], body: dict) -> AsyncGenerator:
    question = get_last_user_message(messages)

    def on_pipeline_end(result: dict[str, Any]) -> None:
        if "retriever" in result:
            self.retrieved_docs = result["retriever"]["documents"]

    return async_streaming_generator(
        pipeline=self.async_pipeline,
        pipeline_run_args={
            "retriever": {"query": question},
            "prompt_builder": {"query": question}
        },
        include_outputs_from={"retriever"},
        on_pipeline_end=on_pipeline_end
    )
```

### Non-Streaming API

For non-streaming `run_api` or `run_api_async` endpoints, pass `include_outputs_from` directly to `pipeline.run()` or `pipeline.run_async()`. For example:

```python
def run_api(self, query: str) -> dict:
    result = self.pipeline.run(
        data={"retriever": {"query": query}},
        include_outputs_from={"retriever"}
    )
    # Build custom response with both answer and sources
    return {"answer": result["llm"]["replies"][0], "sources": result["retriever"]["documents"]}
```
Same pattern for async:

```python
async def run_api_async(self, query: str) -> dict:
    result = await self.async_pipeline.run_async(
        data={"retriever": {"query": query}},
        include_outputs_from={"retriever"}
    )
    return {"answer": result["llm"]["replies"][0], "sources": result["retriever"]["documents"]}
```
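Deployed this way, a single request returns the answer together with its sources. A hypothetical call (the pipeline name and host are assumptions, and this assumes the default response envelope, where Hayhooks wraps the return value of `run_api` under a `result` key):

```python
import requests

# Assumes a PipelineWrapper deployed under the name "my_rag_pipeline"
resp = requests.post(
    "http://localhost:1416/my_rag_pipeline/run",
    json={"query": "What is Hayhooks?"},
)
payload = resp.json()
print(payload["result"]["answer"])
print(len(payload["result"]["sources"]), "source documents")
```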
!!! tip "When to Use `include_outputs_from`"

    - **Streaming**: Pass `include_outputs_from` to `streaming_generator()` or `async_streaming_generator()` and use the `on_pipeline_end` callback to access the outputs
    - **Non-streaming**: Pass `include_outputs_from` directly to `pipeline.run()` or `pipeline.run_async()`
    - **YAML Pipelines**: Automatically handled - see [YAML Pipeline Deployment](yaml-pipeline-deployment.md#output-mapping)

## File Upload Support

Hayhooks can handle file uploads by adding a `files` parameter:

docs/concepts/yaml-pipeline-deployment.md

Lines changed: 19 additions & 2 deletions
@@ -79,7 +79,7 @@ outputs:
  -d '{
    "name": "my_chat_pipeline",
    "description": "Chat pipeline for Q&A",
-   "yaml_content": "...",
+   "source_code": "...",
    "overwrite": false
  }'
```
@@ -94,7 +94,7 @@ outputs:
    json={
        "name": "my_chat_pipeline",
        "description": "Chat pipeline for Q&A",
-       "yaml_content": "...",  # Your YAML content as string
+       "source_code": "...",  # Your YAML content as string
        "overwrite": False
    }
)
@@ -151,6 +151,23 @@ outputs:
- Response fields are serialized to JSON
- Complex objects are automatically serialized

!!! success "Automatic `include_outputs_from` Derivation"

    Hayhooks **automatically** derives the `include_outputs_from` parameter from your `outputs` section. This ensures that all components referenced in the outputs are included in the pipeline results, even if they're not leaf components.

    **Example:** If your outputs reference `retriever.documents` and `llm.replies`, Hayhooks automatically sets `include_outputs_from={"retriever", "llm"}` when running the pipeline, as sketched below.

    **What this means:** You don't need to configure anything extra - just declare your outputs in the YAML, and Hayhooks ensures those component outputs are available in the results!
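The derivation is straightforward: the component name is whatever precedes the first `.` in each output path. A minimal sketch of the idea, using a hypothetical `outputs` mapping (the actual implementation lives in `deploy_utils.py`):

```python
# Hypothetical outputs mapping, as declared in a YAML `outputs` section
outputs = {
    "answer": "llm.replies",
    "sources": "retriever.documents",
}

# Keep the component name before the first "." of each output path
include_outputs_from = {path.split(".", 1)[0] for path in outputs.values()}

assert include_outputs_from == {"llm", "retriever"}
```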
!!! note "Comparison with PipelineWrapper"

    **YAML Pipelines** (this page): `include_outputs_from` is **automatic** - derived from your `outputs` section

    **PipelineWrapper**: `include_outputs_from` must be **manually passed**:

    - For streaming: Pass to `streaming_generator()` / `async_streaming_generator()`
    - For non-streaming: Pass to `pipeline.run()` / `pipeline.run_async()`

    See [PipelineWrapper: include_outputs_from](pipeline-wrapper.md#accessing-intermediate-outputs-with-include_outputs_from) for examples.
## API Usage

### After Deployment

docs/features/cli-commands.md

Lines changed: 2 additions & 2 deletions
@@ -73,7 +73,7 @@ hayhooks run --reload
| Option | Short | Description | Default |
|--------|-------|-------------|---------|
-| `--host` | | Host to bind to | `127.0.0.1` |
+| `--host` | | Host to bind to | `localhost` |
| `--port` | | Port to listen on | `1416` |
| `--workers` | | Number of worker processes | `1` |
| `--pipelines-dir` | | Directory for pipeline definitions | `./pipelines` |
@@ -97,7 +97,7 @@ hayhooks mcp run --host 0.0.0.0 --port 1417
| Option | Short | Description | Default |
|--------|-------|-------------|---------|
-| `--host` | | MCP server host | `127.0.0.1` |
+| `--host` | | MCP server host | `localhost` |
| `--port` | | MCP server port | `1417` |
| `--pipelines-dir` | | Directory for pipeline definitions | `./pipelines` |
| `--additional-python-path` | | Additional Python path | `None` |

docs/features/mcp-support.md

Lines changed: 3 additions & 3 deletions
@@ -30,15 +30,15 @@ pip install hayhooks[mcp]
hayhooks mcp run
```

-This starts the MCP server on `HAYHOOKS_MCP_HOST:HAYHOOKS_MCP_PORT` (default: `127.0.0.1:1417`).
+This starts the MCP server on `HAYHOOKS_MCP_HOST:HAYHOOKS_MCP_PORT` (default: `localhost:1417`).

### Configuration

Environment variables for MCP server:

```bash
-HAYHOOKS_MCP_HOST=127.0.0.1 # MCP server host
-HAYHOOKS_MCP_PORT=1417 # MCP server port
+HAYHOOKS_MCP_HOST=localhost # MCP server host
+HAYHOOKS_MCP_PORT=1417 # MCP server port
```

## Transports

docs/getting-started/configuration.md

Lines changed: 2 additions & 2 deletions
@@ -38,7 +38,7 @@ hayhooks run --host 0.0.0.0 --port 1416 --pipelines-dir ./pipelines
The most frequently used options:

-- `HAYHOOKS_HOST` - Host to bind to (default: `127.0.0.1`)
+- `HAYHOOKS_HOST` - Host to bind to (default: `localhost`)
- `HAYHOOKS_PORT` - Port to listen on (default: `1416`)
- `HAYHOOKS_PIPELINES_DIR` - Pipeline directory for auto-deployment (default: `./pipelines`)
- `LOG` - Log level: `DEBUG`, `INFO`, `WARNING`, `ERROR` (default: `INFO`)
@@ -51,7 +51,7 @@ For the complete list of all environment variables and detailed descriptions, se
```bash
# .env.development
-HAYHOOKS_HOST=127.0.0.1
+HAYHOOKS_HOST=localhost
HAYHOOKS_PORT=1416
LOG=DEBUG
HAYHOOKS_SHOW_TRACEBACKS=true

src/hayhooks/server/utils/deploy_utils.py

Lines changed: 1 addition & 1 deletion
@@ -548,7 +548,7 @@ def add_yaml_pipeline_to_registry(
    if streaming_components:
        clog.debug(f"Found streaming_components in YAML: {streaming_components}")

-    # Automatically derive include_outputs_from from the outputs mapping (matches dc-query-api behavior)
+    # Automatically derive include_outputs_from from the outputs mapping.
    # This ensures we get outputs from all components referenced in the outputs declaration,
    # not just leaf components. Useful for debugging and getting intermediate results.
    # Extract component names from paths like "llm.replies" -> "llm"
