
Add first-pass at stability tritonserver-based imagegen comp #227

Draft · wants to merge 21 commits into main
Conversation

@acwrenn commented Jun 20, 2024

Description

This PR adds a new component ("stability") for image generation with Stable Diffusion served via Triton Inference Server. The API is currently a WIP - lots of feedback appreciated!

Issues

N/A

Type of change

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds new functionality)
  • [ ] Breaking change (fix or feature that would break existing design and interface)

Dependencies

Tritonserver
Stable Diffusion = "Habana/stable-diffusion-2"

Tests

WIP

@acwrenn (Author) commented Jun 20, 2024

Hey maintainers - this is clearly not in a high-quality state yet, but I want to get feedback on the architecture, code location, and general approach before I spend the time to polish it. Any notes would be greatly appreciated!

This Stable Diffusion model powers a card in the Intel AI Explorer - so upstreaming it to OPEA seemed like a reasonable next step.

https://or-dev.dcs-tools-experiments.infra-host.com/explore


#!/bin/bash -ex

# Copyright (C) 2024 Intel Corporation
Collaborator: The license header is duplicated. Can you please clean it up?

@mkbhanda (Collaborator) commented:

Please create distinct microservices for the pipeline components; for instance, the Triton server should be in its own microservice (data cleaning, embedding, model server, model, etc.). The genAIexample contains a pipeline composed of microservices from the GenAImicrocomps, and we need e2e tests for everything.

@dbkinder (Contributor) left a comment:

Some documentation would be a good idea.

@ashahba (Collaborator) left a comment:

Hi @acwrenn,
Thanks for this PR. If you have any questions about any of my comments, please let me know.

def generate_image(*, text, triton_endpoint):
    start = time.time()

    with httpclient.InferenceServerClient(triton_endpoint, network_timeout=1000 * 300) as client:
Collaborator: What's the reasoning for the 300s timeout? Isn't that a bit too long?

@acwrenn (Author): The initial ImageGen request can take a while. Having the non-model ImageGen comp "warm up" the model service would let us reduce this - is that a pattern used in any other comp that I can lift?
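To make the warm-up idea concrete, here is a rough sketch of what a startup warm-up request could look like, using the tritonclient HTTP API already used elsewhere in this PR; the warm_up_model helper, the "INPUT_TEXT" tensor name, and the "stability" model name are placeholders for illustration, not definitions from this PR:

import numpy as np
import tritonclient.http as httpclient

def warm_up_model(triton_endpoint, model_name="stability"):
    # Issue one throwaway inference at startup so the first real request does not
    # pay the model load/compile cost; later requests could then use a shorter timeout.
    with httpclient.InferenceServerClient(triton_endpoint, network_timeout=1000 * 300) as client:
        prompt = np.array(["warm-up"], dtype=object)
        text_input = httpclient.InferInput("INPUT_TEXT", [1], "BYTES")  # placeholder tensor name
        text_input.set_data_from_numpy(prompt)
        client.infer(model_name, inputs=[text_input], request_id="warmup", timeout=1000 * 300)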

inputs,
request_id=str(1),
outputs=outputs,
timeout=1000 * 300,
@ashahba (Collaborator) commented Jun 27, 2024:

Same here. Would it make more sense to set this to timeout=network_timeout?
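As a point of reference, a minimal sketch of threading one configurable timeout through both the client constructor and the infer call, along the lines of the suggestion above; the IMAGEGEN_TRITON_TIMEOUT_MS environment variable and the build_request helper are assumptions for illustration, not code from this PR:

import os
import tritonclient.http as httpclient

# Hypothetical env var; falls back to the 300 s value currently hard-coded in the PR.
NETWORK_TIMEOUT = int(os.getenv("IMAGEGEN_TRITON_TIMEOUT_MS", str(1000 * 300)))

def generate_image(*, text, triton_endpoint):
    with httpclient.InferenceServerClient(triton_endpoint, network_timeout=NETWORK_TIMEOUT) as client:
        inputs, outputs = build_request(text)  # hypothetical helper building InferInput/InferRequestedOutput lists
        return client.infer(
            "stability",  # model name assumed from the Dockerfile's MODEL_NAME arg; adjust as needed
            inputs,
            request_id=str(1),
            outputs=outputs,
            timeout=NETWORK_TIMEOUT,  # reuse the same value instead of a second literal
        )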

name="opea_service@imagegen",
service_type=ServiceType.IMAGEGEN,
endpoint="/v1/images/generate",
host="0.0.0.0",
@ashahba (Collaborator) commented Jun 27, 2024:

Usually I'd set this to localhost or 127.0.0.1 and then provide a mechanism for the user to decide whether they want to run locally or listen to the world.

@acwrenn (Author) commented Jun 27, 2024:

I lifted this from the TTS comp - I guess it's the common question of "do it the same way as the rest of the codebase, or do what makes sense locally."

Should I stick with "do what's locally correct" in this case?
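For the record, a minimal sketch of the kind of mechanism described above, assuming an environment variable with a loopback default; the IMAGEGEN_HOST name is made up for illustration:

import os

# Hypothetical env var: default to loopback; deployments can opt in to listening on all interfaces.
IMAGEGEN_HOST = os.getenv("IMAGEGEN_HOST", "127.0.0.1")

# ...and in the registration shown in the excerpt above:
#     host=IMAGEGEN_HOST,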


ARG TRITON_VERSION=24.04

FROM nvcr.io/nvidia/tritonserver:${TRITON_VERSION}-py3 AS triton
Collaborator: I need to check on this container.


ARG MODEL_NAME=stability

RUN wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb && \
Collaborator: I need to check this one too.

@acwrenn (Author): I can take a stab at building the Triton containers from source, but I am not sure where that code would live - probably not in this repo. And who would host the images?

The NVIDIA-distributed Triton server container does contain a bunch of extra stuff we don't need.


DEVICES?=3
run:
	docker run \
Collaborator: Same here for proxies.

	-ti test

test:
	docker run --runtime runc -ti --net host nvcr.io/nvidia/tritonserver:24.04-py3-sdk \
Collaborator: And here.

inputs,
request_id=str(1),
outputs=outputs,
timeout=1000 * 300,
Collaborator (suggested change): timeout=network_timeout

Comment on lines +1 to +3:

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

Collaborator: Please remove these lines.

Comment on lines 1 to 3:

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

Collaborator: Please remove these lines.

@acwrenn (Author) commented Jun 27, 2024:

> Please create distinct microservices for the pipeline components; for instance, the Triton server should be in its own microservice (data cleaning, embedding, model server, model, etc.). The genAIexample contains a pipeline composed of microservices from the GenAImicrocomps, and we need e2e tests for everything.

This is an interesting question, because the tritonserver portion acts as a replacement for the TGI services in the LLM examples, which are not distinct comps contained in this repo.

So there seems to be a clear distinction between a service that performs inference and a service that glues other parts together. I don't think that forcing the business-logic service onto an accelerator-equipped host is a good idea.

So I guess my question is one looking for clarity: should there be a comp that is JUST the model server, with the ImageGen API container kept as a different comp?

lkk12014402 pushed a commit that referenced this pull request Aug 8, 2024
* refine chatqna test script

Signed-off-by: letonghan <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* delete comments

Signed-off-by: letonghan <[email protected]>

* modify expected result of embedding

Signed-off-by: letonghan <[email protected]>

* update rerank expected result

Signed-off-by: letonghan <[email protected]>

* update llm expected result

Signed-off-by: letonghan <[email protected]>

* update docker compose yaml

Signed-off-by: letonghan <[email protected]>

* fix conda issue

Signed-off-by: letonghan <[email protected]>

* add log_path for log collection

Signed-off-by: letonghan <[email protected]>

---------

Signed-off-by: letonghan <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@dcmiddle (Contributor) commented Oct 4, 2024

@acwrenn are you still actively working on this PR, or should it be closed?
