docs/concepts/pipeline-wrapper.md (86 additions, 0 deletions)
@@ -502,6 +502,92 @@ YAML configuration follows the same priority rules: YAML setting > environment v
See the [Multi-LLM Streaming Example](https://github.com/deepset-ai/hayhooks/tree/main/examples/pipeline_wrappers/multi_llm_streaming) for a complete working implementation.
## Accessing Intermediate Outputs with `include_outputs_from`
!!! info "Understanding Pipeline Outputs"
    By default, Haystack pipelines only return outputs from **leaf components** (final components with no downstream connections). Use `include_outputs_from` to also get outputs from intermediate components like retrievers, preprocessors, or parallel branches.
### Streaming with `on_pipeline_end` Callback
For streaming responses, pass `include_outputs_from` to `streaming_generator()` or `async_streaming_generator()`, and use the `on_pipeline_end` callback to access intermediate outputs. For example:
```python
    # ...
    include_outputs_from={"retriever"},  # Make retriever outputs available
    on_pipeline_end=on_pipeline_end
)
```
**What happens:** The `on_pipeline_end` callback receives both `llm` and `retriever` outputs in the `result` dict, allowing you to access retrieved documents alongside the generated response.
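Pieced together, a minimal sketch of the whole pattern could look like the following (the wrapper class, the `pipeline_run_args` keys, and the logging are illustrative assumptions, not part of the snippet above):

```python
import logging

from hayhooks import BasePipelineWrapper, get_last_user_message, streaming_generator

logger = logging.getLogger(__name__)


class PipelineWrapper(BasePipelineWrapper):
    # self.pipeline is assumed to be built in setup()

    def run_chat_completion(self, model, messages, body):
        question = get_last_user_message(messages)

        def on_pipeline_end(result: dict) -> None:
            # Runs after the pipeline finishes; `result` also contains the
            # retriever outputs thanks to include_outputs_from below.
            docs = result["retriever"]["documents"]
            logger.info("Retrieved %d documents for %r", len(docs), question)

        return streaming_generator(
            pipeline=self.pipeline,
            pipeline_run_args={"retriever": {"query": question}},  # assumed input mapping
            include_outputs_from={"retriever"},  # make retriever outputs available
            on_pipeline_end=on_pipeline_end,
        )
```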
For non-streaming `run_api` or `run_api_async` endpoints, pass `include_outputs_from` directly to `pipeline.run()` or `pipeline.run_async()`. For example:
```python
def run_api(self, query: str) -> dict:
    result = self.pipeline.run(
        data={"retriever": {"query": query}},
        include_outputs_from={"retriever"}
    )

    # Build custom response with both answer and sources
    # (assumes the pipeline's generator component is named "llm")
    return {
        "answer": result["llm"]["replies"][0],
        "sources": result["retriever"]["documents"],
    }
```
- **Streaming**: Pass `include_outputs_from` to `streaming_generator()` or `async_streaming_generator()` and use `on_pipeline_end` callback to access the outputs
- **Non-streaming**: Pass `include_outputs_from` directly to `pipeline.run()` or `pipeline.run_async()`
- **YAML Pipelines**: Automatically handled - see [YAML Pipeline Deployment](yaml-pipeline-deployment.md#output-mapping)
## File Upload Support
Hayhooks can handle file uploads by adding a `files` parameter:
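A minimal sketch of such a method might look like this (illustrative only; the `UploadFile` typing assumes Hayhooks' FastAPI-based server):

```python
from typing import List, Optional

from fastapi import UploadFile


def run_api(self, files: Optional[List[UploadFile]] = None) -> str:
    # Hayhooks passes uploaded files to the `files` parameter when the
    # request is sent as multipart/form-data.
    if files:
        names = [f.filename for f in files if f.filename]
        return f"Received {len(files)} file(s): {', '.join(names)}"
    return "No files received"
```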
docs/concepts/yaml-pipeline-deployment.md

Hayhooks **automatically** derives the `include_outputs_from` parameter from your `outputs` section. This ensures that all components referenced in the outputs are included in the pipeline results, even if they're not leaf components.
**Example:** If your outputs reference `retriever.documents` and `llm.replies`, Hayhooks automatically sets `include_outputs_from={"retriever", "llm"}` when running the pipeline.
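Conceptually, the derivation just takes the component name before the dot in each output reference, roughly like this sketch (illustrative, not Hayhooks' actual internals):

```python
# Output references as they might appear in a YAML `outputs` section
output_refs = ["retriever.documents", "llm.replies"]

# Derive include_outputs_from from the component part of each reference
include_outputs_from = {ref.split(".", 1)[0] for ref in output_refs}

assert include_outputs_from == {"retriever", "llm"}
```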
**What this means:** You don't need to configure anything extra - just declare your outputs in the YAML, and Hayhooks ensures those component outputs are available in the results!
!!! note "Comparison with PipelineWrapper"
    **YAML Pipelines** (this page): `include_outputs_from` is **automatic** - derived from your `outputs` section
    **PipelineWrapper**: `include_outputs_from` must be **manually passed**:
    - For streaming: Pass to `streaming_generator()` / `async_streaming_generator()`
    - For non-streaming: Pass to `pipeline.run()` / `pipeline.run_async()`
    See [PipelineWrapper: include_outputs_from](pipeline-wrapper.md#accessing-intermediate-outputs-with-include_outputs_from) for examples.