How to handle Anthropic 'Overloaded' error when streaming in langchain #688

Open
Kevin-McIsaac opened this issue Oct 11, 2024 · 1 comment


@Kevin-McIsaac

I'm building a RAG app that creates the following LangChain chain:

prompt = ChatPromptTemplate.from_template(template)
model = ChatAnthropic(model=model_version, temperature=0, max_tokens=1024, timeout=None, max_retries=3)
chain = prompt | model | StrOutputParser()

which I then stream to a Streamlit UI using:

LLM_stream = chain.stream(inputs)  # inputs: dict with the prompt's input variables
response: str = ""
for chunk in LLM_stream:
    ...

Mostly this works great; however, sometimes, perhaps for 1 in 100 questions to the RAG, I get the error:

anthropic.APIStatusError: {'type': 'error', 'error': {'details': None, 'type': 'overloaded_error', 'message': 'Overloaded'}}

Any thoughts on dealing with this? E.g., should I increase max_retries or add a timeout?
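
For context, here's the kind of wrapper I'm experimenting with. A rough sketch only: as far as I can tell, max_retries covers the initial request but not an error raised mid-stream, so this restarts the whole stream on overloaded_error (chunks already yielded will repeat unless the caller buffers and replaces the output):

import time
import anthropic

def stream_with_retry(chain, inputs, max_attempts=3, base_delay=1.0):
    """Restart the stream if Anthropic reports overloaded_error."""
    for attempt in range(max_attempts):
        try:
            yield from chain.stream(inputs)
            return
        except anthropic.APIStatusError as e:
            body = e.body if isinstance(e.body, dict) else {}
            if body.get("error", {}).get("type") != "overloaded_error":
                raise  # some other API error: don't mask it
            if attempt == max_attempts - 1:
                raise  # still overloaded after all attempts
            time.sleep(base_delay * 2 ** attempt)  # simple exponential backoff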

Logs

File "/home/kmcisaac/Projects/policy_pal/chat/chat_app.py", line 146, in generate_answer
for chunk in LLM_stream:
│ └ <generator object RunnableSequence.stream at 0x7f4386fc2c50>
└ 'd for a non-'

File "/home/kmcisaac/Projects/policy_pal/.venv/lib/python3.11/site-packages/langchain_core/runnables/base.py", line 3407, in stream
yield from self.transform(iter([input]), config, **kwargs)
│ │ │ │ └ {}
│ │ │ └ None
│ │ └ {'question': 'What is the minimum casual employment required for a non-LMI loan?', 'level_of_detail': 'Be concise. Proved a o...
│ └ <function RunnableSequence.transform at 0x7f43a2b89bc0>
└ ChatPromptTemplate(input_variables=['context', 'level_of_detail', 'question'], input_types={}, partial_variables={}, messages...
File "/home/kmcisaac/Projects/policy_pal/.venv/lib/python3.11/site-packages/langchain_core/runnables/base.py", line 3394, in transform
yield from self._transform_stream_with_config(
│ └ <function Runnable._transform_stream_with_config at 0x7f43a2b88400>
└ ChatPromptTemplate(input_variables=['context', 'level_of_detail', 'question'], input_types={}, partial_variables={}, messages...
File "/home/kmcisaac/Projects/policy_pal/.venv/lib/python3.11/site-packages/langchain_core/runnables/base.py", line 2197, in _transform_stream_with_config
chunk: Output = context.run(next, iterator) # type: ignore
│ │ │ └ <generator object RunnableSequence._transform at 0x7f43873579c0>
│ │ └ <method 'run' of '_contextvars.Context' objects>
│ └ <_contextvars.Context object at 0x7f43767e0ec0>
└ 'd for a non-'
File "/home/kmcisaac/Projects/policy_pal/.venv/lib/python3.11/site-packages/langchain_core/runnables/base.py", line 3357, in _transform
yield from final_pipeline
└ <generator object BaseTransformOutputParser.transform at 0x7f43a34beb40>
File "/home/kmcisaac/Projects/policy_pal/.venv/lib/python3.11/site-packages/langchain_core/output_parsers/transform.py", line 64, in transform
yield from self._transform_stream_with_config(
│ └ <function Runnable._transform_stream_with_config at 0x7f43a2b88400>
└ StrOutputParser()
File "/home/kmcisaac/Projects/policy_pal/.venv/lib/python3.11/site-packages/langchain_core/runnables/base.py", line 2197, in _transform_stream_with_config
chunk: Output = context.run(next, iterator) # type: ignore
│ │ │ └ <generator object BaseTransformOutputParser._transform at 0x7f43863f8220>
│ │ └ <method 'run' of '_contextvars.Context' objects>
│ └ <_contextvars.Context object at 0x7f43862b5dc0>
└ 'd for a non-'
File "/home/kmcisaac/Projects/policy_pal/.venv/lib/python3.11/site-packages/langchain_core/output_parsers/transform.py", line 29, in _transform
for chunk in input:
│ └ <itertools._tee object at 0x7f4386fa2600>
└ AIMessageChunk(content='d for a non-', additional_kwargs={}, response_metadata={}, id='run-d6320e6f-f285-42e5-9c0e-6ad6b44204...
File "/home/kmcisaac/Projects/policy_pal/.venv/lib/python3.11/site-packages/langchain_core/runnables/base.py", line 1431, in transform
yield from self.stream(final, config, **kwargs)
│ │ │ │ └ {}
│ │ │ └ {'tags': [], 'metadata': {}, 'callbacks': <langchain_core.callbacks.manager.CallbackManager object at 0x7f4384540c10>, 'recur...
│ │ └ ChatPromptValue(messages=[HumanMessage(content="You are a mortgage broker research assistant. Read the following policy conte...
│ └ <function BaseChatModel.stream at 0x7f4387b51e40>
└ ChatAnthropic(model='claude-3-haiku-20240307', temperature=0.0, max_retries=3, anthropic_api_url='https://api.anthropic.com',...
File "/home/kmcisaac/Projects/policy_pal/.venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 420, in stream
raise e
File "/home/kmcisaac/Projects/policy_pal/.venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 400, in stream
for chunk in self._stream(messages, stop=stop, **kwargs):
│ │ │ │ │ └ {}
│ │ │ │ └ None
│ │ │ └ [HumanMessage(content="You are a mortgage broker research assistant. Read the following policy context very carefully and use...
│ │ └ <function ChatAnthropic._stream at 0x7f43877f8a40>
│ └ ChatAnthropic(model='claude-3-haiku-20240307', temperature=0.0, max_retries=3, anthropic_api_url='https://api.anthropic.com',...
└ ChatGenerationChunk(text='d for a non-', message=AIMessageChunk(content='d for a non-', additional_kwargs={}, response_metada...
File "/home/kmcisaac/Projects/policy_pal/.venv/lib/python3.11/site-packages/langchain_anthropic/chat_models.py", line 715, in _stream
for event in stream:
│ └ <anthropic.Stream object at 0x7f438453b3d0>
└ RawContentBlockDeltaEvent(delta=TextDelta(text='d for a non-', type='text_delta'), index=0, type='content_block_delta')
File "/home/kmcisaac/Projects/policy_pal/.venv/lib/python3.11/site-packages/anthropic/_streaming.py", line 68, in iter
for item in self._iterator:
│ │ └ <generator object Stream.stream at 0x7f438451c040>
│ └ <anthropic.Stream object at 0x7f438453b3d0>
└ RawContentBlockDeltaEvent(delta=TextDelta(text='d for a non-', type='text_delta'), index=0, type='content_block_delta')
File "/home/kmcisaac/Projects/policy_pal/.venv/lib/python3.11/site-packages/anthropic/_streaming.py", line 110, in stream
raise self._client._make_status_error(
│ │ └ <function Anthropic._make_status_error at 0x7f438776e2a0>
│ └ <anthropic.Anthropic object at 0x7f4386080b90>
└ <anthropic.Stream object at 0x7f438453b3d0>

anthropic.APIStatusError: {'type': 'error', 'error': {'details': None, 'type': 'overloaded_error', 'message': 'Overloaded'}}
2024-10-11 09:02:37.734 | INFO | page:main:361 - Sync answers in 2.0s costing 0.00¢


onel commented Oct 24, 2024

I encountered the same problem.

I think the issue here is that the API is a bit inconsistent for streaming messages.
For every message that comes in, you can read its type:

for message in response:
    if message.type == 'message_start':
        ...
    elif message.type == 'content_block_delta':
        ...
    elif message.type == 'error':
        ...
    # etc.

The problem is that message.type == 'error' never happens. The code shows that an exception is raised instead:
https://github.com/anthropics/anthropic-sdk-python/blob/main/src/anthropic/_streaming.py#L110

I would expect a message of type 'error' to also be emitted, like the other event types.

The README gives an example of wrapping the .generate() call and catching that error, but nothing is mentioned for the streaming case.
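
For now, the workaround seems to be catching the exception around the iteration itself, e.g. with the raw SDK (a rough sketch, assuming the error body is the dict shown in the logs above):

import anthropic

client = anthropic.Anthropic()

try:
    with client.messages.stream(
        model="claude-3-haiku-20240307",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello"}],
    ) as stream:
        for text in stream.text_stream:
            print(text, end="", flush=True)
except anthropic.APIStatusError as e:
    # overloaded_error arrives here as an exception raised mid-iteration,
    # not as an 'error' event inside the loop
    body = e.body if isinstance(e.body, dict) else {}
    if body.get("error", {}).get("type") == "overloaded_error":
        ...  # back off and retry, or surface a friendly message
    else:
        raise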
