
v0.1.14 breaks streaming response rendering from Claude 3.5 Sonnet #134

Closed

radu-malliu opened this issue Aug 2, 2024 · 7 comments

radu-malliu commented Aug 2, 2024

We picked up v0.1.14 in our project today and noticed that streaming responses from Claude via Bedrock are no longer rendered as chunks are delivered. The model's message stays empty until a new reply is entered; only then is the previous response rendered.
Reverting to v0.1.13 fixes the issue.

After entering a reply:
[screenshot: step1]

After entering one more reply:
[screenshot: step2]

The message that is not being rendered would be a response to:

ConversationChain(
    llm=ss.llm,
    verbose=True,
    memory=ConversationBufferWindowMemory(
        k=ss.configs['models']['llm'][model]['memory_window'],
        ai_prefix="Assistant",
        chat_memory=StreamlitChatMessageHistory(),
    ),
    prompt=llm_prompt,
).predict(
    input=input_text, callbacks=[StreamHandler(st.empty())]
)
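
For context, StreamHandler is a custom callback; a minimal sketch of the usual pattern, assuming it simply appends each token to an st.empty() placeholder (not the exact implementation), would be:

# Hypothetical sketch of the custom StreamHandler callback (assumed pattern):
# append each streamed token to a Streamlit placeholder as it arrives.
from langchain_core.callbacks import BaseCallbackHandler


class StreamHandler(BaseCallbackHandler):
    def __init__(self, container):
        self.container = container  # e.g. st.empty()
        self.text = ""

    def on_llm_new_token(self, token: str, **kwargs) -> None:
        self.text += token
        self.container.markdown(self.text)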
efriis (Member) commented Aug 2, 2024

Hey @radu-malliu! Any chance you have a piece of code that reproduces this? I'm running the following, which streams successfully:

%pip install -qU langchain-aws==0.1.14

from langchain_aws import ChatBedrock

bedrock = ChatBedrock(model_id="anthropic.claude-3-5-sonnet-20240620-v1:0")
for chunk in bedrock.stream("hi"):
    print(chunk.content, end="|", flush=True)

radu-malliu (Author) commented:

Hey @efriis, thanks for taking a look! I suspect it's related to the star rating we add to replies. Trying to whip something up to repro.

OSS-GR commented Aug 6, 2024

Hi there @efriis, I am on 0.1.15 and facing a similar issue. I can reproduce it just by taking this sample code from the documentation. The chunks do not stream; from the user's perspective, it is as if I had used .invoke instead of .stream.

from typing_extensions import Annotated, TypedDict
from langchain_aws.chat_models.bedrock import ChatBedrock
import logging

logging.basicConfig(level=logging.INFO)


class AnswerWithJustification(TypedDict):
    '''An answer to the user question along with justification for the answer.'''
    answer: Annotated[str, ...]
    justification: Annotated[str, ...]


llm = ChatBedrock(
    model_id="anthropic.claude-3-5-sonnet-20240620-v1:0",
    model_kwargs={"temperature": 0.001},
    region_name="us-east-1",
    streaming=True,
)  # type: ignore[call-arg]
structured_llm = llm.with_structured_output(AnswerWithJustification)

for chunk in structured_llm.stream("What weighs more a pound of bricks or a pound of feathers"):
    print(chunk)

RisaKirisu commented:

Similar issue here. The streaming callback handler is not working properly with ChatBedrock. This code should reproduce the problem:

from langchain_aws import ChatBedrock
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

chat_model = ChatBedrock(
    model_id="anthropic.claude-3-5-sonnet-20240620-v1:0",
    streaming=True,
    region_name='us-east-1',
)
system_message = ""
chat_model.model_kwargs = {
    "system": system_message,
    "max_tokens": 10,
    "top_k": 50,
    "top_p": 1,
    "temperature": 0.1,
}

chat_model.callbacks = [StreamingStdOutCallbackHandler()]
response = chat_model.invoke([("human", "Hello")])
# The streaming response should be written to stdout token by token, but it is not.

hourliert commented:

Hi all, I was facing the same issue and I shared a mitigation here: #144 (comment)
If you are using the old LLMChain.run method, you can work around it like this:

chain = LLMChain(prompt=prompt, llm=sonnet_llm, llm_kwargs={"stream": True})  # <---- THIS
chain.run(name="Jane")

OSS-GR commented Aug 12, 2024

Thanks @hourliert, but we've been using LCEL in our codebase, so LLMChain is not a solution that solves the streaming issue for us.
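
For concreteness, the kind of LCEL chain we mean is roughly this (illustrative only, not our actual code):

from langchain_core.prompts import ChatPromptTemplate
from langchain_aws import ChatBedrock

# Illustrative LCEL pipeline; the expectation is that .stream() yields chunks incrementally.
prompt = ChatPromptTemplate.from_messages([("human", "{question}")])
llm = ChatBedrock(
    model_id="anthropic.claude-3-5-sonnet-20240620-v1:0",
    region_name="us-east-1",
    streaming=True,
)
chain = prompt | llm

for chunk in chain.stream({"question": "What weighs more, a pound of bricks or a pound of feathers?"}):
    print(chunk.content, end="", flush=True)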

Hoping that someone can get a chance to look at this soon.

ccurme added a commit that referenced this issue Aug 16, 2024
…154)

As pointed out in
#134 (comment),
some Bedrock models that support streaming tool calls do not properly
stream structured output. This is due to our implementation of
`with_structured_output`. Here we update the output parsing for models
that support streaming tool calls.
3coins (Collaborator) commented Oct 3, 2024

Answered in #217. Also, streaming support works best with LCEL chains; see this page for an LCEL conversation chain with memory support:
https://python.langchain.com/v0.3/docs/how_to/message_history
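
A minimal sketch of that pattern, adapted from the linked guide (illustrative only):

from langchain_aws import ChatBedrock
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_community.chat_message_histories import ChatMessageHistory

llm = ChatBedrock(model_id="anthropic.claude-3-5-sonnet-20240620-v1:0", region_name="us-east-1")
prompt = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}"),
])

# Keep one ChatMessageHistory per session id (in-memory store for the example).
store = {}

def get_history(session_id: str) -> ChatMessageHistory:
    return store.setdefault(session_id, ChatMessageHistory())

chain = RunnableWithMessageHistory(
    prompt | llm,
    get_history,
    input_messages_key="input",
    history_messages_key="history",
)

# .stream() on the wrapped chain yields message chunks from the underlying model.
for chunk in chain.stream(
    {"input": "hi"},
    config={"configurable": {"session_id": "demo"}},
):
    print(chunk.content, end="", flush=True)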

Closing as duplicate.

3coins closed this as completed Oct 3, 2024