
v0.1.14 breaks streaming response rendering from Claude 3.5 Sonnet #134

Closed

radu-malliu opened this issue Aug 2, 2024 · 7 comments

radu-malliu commented Aug 2, 2024

We picked up v0.1.14 in our project today and noticed that streaming responses from Claude via Bedrock are no longer rendered as chunks are delivered. The model's message stays empty until a new reply is entered; only then is the previous response rendered.
Reverting to v0.1.13 fixes the issue.

After entering a reply:
[screenshot: step1]

After entering one more reply:
[screenshot: step2]

The message that is not being rendered would be a response to:

ConversationChain(
    llm=ss.llm,
    verbose=True,
    memory=ConversationBufferWindowMemory(
        k=ss.configs['models']['llm'][model]['memory_window'],
        ai_prefix="Assistant",
        chat_memory=StreamlitChatMessageHistory(),
    ),
    prompt=llm_prompt,
).predict(
    input=input_text, callbacks=[StreamHandler(st.empty())]
)
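
For context, StreamHandler is a custom callback; a minimal sketch of the usual pattern, assuming it simply appends each token to an st.empty() placeholder (not the exact implementation), would be:

# Hypothetical sketch of the custom StreamHandler callback (assumed pattern):
# append each streamed token to a Streamlit placeholder as it arrives.
from langchain_core.callbacks import BaseCallbackHandler


class StreamHandler(BaseCallbackHandler):
    def __init__(self, container):
        self.container = container  # e.g. st.empty()
        self.text = ""

    def on_llm_new_token(self, token: str, **kwargs) -> None:
        self.text += token
        self.container.markdown(self.text)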
efriis (Member) commented Aug 2, 2024

Hey @radu-malliu! Any chance you have a piece of code that reproduces this? I'm running the following, which streams successfully:

%pip install -qU langchain-aws==0.1.14

from langchain_aws import ChatBedrock

bedrock = ChatBedrock(model_id="anthropic.claude-3-5-sonnet-20240620-v1:0")
for chunk in bedrock.stream("hi"):
    print(chunk.content, end="|", flush=True)

radu-malliu (Author) commented:

Hey @efriis, thanks for taking a look! I suspect it's related to the star rating we add to replies. Trying to whip something up to repro.

OSS-GR commented Aug 6, 2024

Hi there @efriis, I am on 0.1.15 and facing a similar issue. I can reproduce it just by taking this sample code from the documentation. The chunks do not stream; from the user's perspective, it is as if I had used .invoke instead of .stream.

from typing_extensions import Annotated, TypedDict
from langchain_aws.chat_models.bedrock import ChatBedrock
import logging

logging.basicConfig(level=logging.INFO)


class AnswerWithJustification(TypedDict):
    '''An answer to the user question along with justification for the answer.'''
    answer: Annotated[str, ...]
    justification: Annotated[str, ...]


llm = ChatBedrock(
    model_id="anthropic.claude-3-5-sonnet-20240620-v1:0",
    model_kwargs={"temperature": 0.001},
    region_name="us-east-1",
    streaming=True,
)  # type: ignore[call-arg]
structured_llm = llm.with_structured_output(AnswerWithJustification)

for chunk in structured_llm.stream("What weighs more a pound of bricks or a pound of feathers"):
    print(chunk)

RisaKirisu commented:

Similar issue here. The streaming callback handler is not working properly with ChatBedrock. This code should reproduce the problem:

from langchain_aws import ChatBedrock
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

chat_model = ChatBedrock(
    model_id="anthropic.claude-3-5-sonnet-20240620-v1:0",
    streaming=True,
    region_name='us-east-1',
)
system_message = ""
chat_model.model_kwargs = {
    "system": system_message,
    "max_tokens": 10,
    "top_k": 50,
    "top_p": 1,
    "temperature": 0.1,
}

chat_model.callbacks = [StreamingStdOutCallbackHandler()]
response = chat_model.invoke([("human", "Hello")])
# The streaming response should be written to stdout token by token, but it is not.

hourliert commented:

Hi all, I was facing the same issue and I shared a mitigation here: #144 (comment)
If you are using the old LLMChain.run method, you can work around it like this:

chain = LLMChain(prompt=prompt, llm=sonnet_llm, llm_kwargs={"stream": True})  # <---- THIS
chain.run(name="Jane")

OSS-GR commented Aug 12, 2024

Thanks @hourliert, but we've been using LCEL in our codebase, so LLMChain is not a solution that solves the streaming issue for us.
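
For concreteness, the kind of LCEL chain we mean is roughly this (illustrative only, not our actual code):

from langchain_core.prompts import ChatPromptTemplate
from langchain_aws import ChatBedrock

# Illustrative LCEL pipeline; the expectation is that .stream() yields chunks incrementally.
prompt = ChatPromptTemplate.from_messages([("human", "{question}")])
llm = ChatBedrock(
    model_id="anthropic.claude-3-5-sonnet-20240620-v1:0",
    region_name="us-east-1",
    streaming=True,
)
chain = prompt | llm

for chunk in chain.stream({"question": "What weighs more, a pound of bricks or a pound of feathers?"}):
    print(chunk.content, end="", flush=True)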

Hoping that someone can get a chance to look at this soon.

ccurme added a commit that referenced this issue Aug 16, 2024
…154)

As pointed out in
#134 (comment),
some Bedrock models that support streaming tool calls do not properly
stream structured output. This is due to our implementation of
`with_structured_output`. Here we update the output parsing for models
that support streaming tool calls.
3coins (Collaborator) commented Oct 3, 2024

Answered in #217. Also, streaming support works best with LCEL chains; see this page for an LCEL conversation chain with memory support:
https://python.langchain.com/v0.3/docs/how_to/message_history
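
A minimal sketch of that pattern, adapted from the linked guide (illustrative only):

from langchain_aws import ChatBedrock
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_community.chat_message_histories import ChatMessageHistory

llm = ChatBedrock(model_id="anthropic.claude-3-5-sonnet-20240620-v1:0", region_name="us-east-1")
prompt = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}"),
])

# Keep one ChatMessageHistory per session id (in-memory store for the example).
store = {}

def get_history(session_id: str) -> ChatMessageHistory:
    return store.setdefault(session_id, ChatMessageHistory())

chain = RunnableWithMessageHistory(
    prompt | llm,
    get_history,
    input_messages_key="input",
    history_messages_key="history",
)

# .stream() on the wrapped chain yields message chunks from the underlying model.
for chunk in chain.stream(
    {"input": "hi"},
    config={"configurable": {"session_id": "demo"}},
):
    print(chunk.content, end="", flush=True)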

Closing as duplicate.

3coins closed this as completed Oct 3, 2024