OpenAIGenerator raises IndexError when using stream_options={"include_usage": True} #8507

Closed · bendavis78 opened this issue Oct 31, 2024 · 0 comments · Fixed by #8558
Labels: P2 (Medium priority, add to the next sprint if no P1 available) · type:feature (New feature or request)


@bendavis78

By default, `OpenAIGenerator` does not include token usage when using a `streaming_callback`. According to OpenAI's documentation, you can include usage in streaming responses by passing `stream_options={"include_usage": True}`.

However, when this option is passed via `generation_kwargs`, the `_connect_chunks` method fails with an `IndexError`:

```python
from IPython.display import Markdown, clear_output, display
from haystack.dataclasses import StreamingChunk

# `pipeline` and `question` are defined earlier: a RAG pipeline with a
# query_embedder, retriever, prompt builder, and OpenAIGenerator.
response = []

def stream_response(chunk: StreamingChunk):
    # Accumulate streamed deltas and re-render the running answer as Markdown.
    response.append(chunk.content)
    clear_output(wait=True)
    display(Markdown("".join(response)))

result = pipeline.run({
    "query_embedder": {"text": question},
    "retriever": {"top_k": 3},
    "prompt": {"query": question, "language": "English"},
    "generator": {
        "streaming_callback": stream_response,
        "generation_kwargs": {"stream_options": {"include_usage": True}},
    }
})
```

Exception:

```
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[37], line 8
      5     clear_output(wait=True)
      6     display(Markdown("".join(response)))
----> 8 result = pipeline.run({
      9     "query_embedder": {"text": question},
     10     "retriever": {"top_k": 3},
     11     "prompt": {"query": question, "language": "English"},
     12     "generator": {
     13         "streaming_callback": stream_response,
     14         "generation_kwargs": {"stream_options": {"include_usage": True}},
     15     }
     16 })

File /usr/local/lib/python3.11/site-packages/haystack/core/pipeline/pipeline.py:229, in Pipeline.run(self, data, include_outputs_from)
    226     msg = f"Maximum run count {self._max_runs_per_component} reached for component '{name}'"
    227     raise PipelineMaxComponentRuns(msg)
--> 229 res: Dict[str, Any] = self._run_component(name, components_inputs[name])
    231 if name in include_outputs_from:
    232     # Deepcopy the outputs to prevent downstream nodes from modifying them
    233     # We don't care about loops - Always store the last output.
    234     extra_outputs[name] = deepcopy(res)

File /usr/local/lib/python3.11/site-packages/haystack/core/pipeline/pipeline.py:67, in Pipeline._run_component(self, name, inputs)
     65 span.set_content_tag("haystack.component.input", inputs)
     66 logger.info("Running component {component_name}", component_name=name)
---> 67 res: Dict[str, Any] = instance.run(**inputs)
     68 self.graph.nodes[name]["visits"] += 1
     70 # After a Component that has variadic inputs is run, we need to reset the variadic inputs that were consumed

File /usr/local/lib/python3.11/site-packages/haystack/components/generators/openai.py:227, in OpenAIGenerator.run(self, prompt, streaming_callback, generation_kwargs)
    225             chunks.append(chunk_delta)
    226             streaming_callback(chunk_delta)  # invoke callback with the chunk_delta
--> 227     completions = [self._connect_chunks(chunk, chunks)]
    228 elif isinstance(completion, ChatCompletion):
    229     completions = [self._build_message(completion, choice) for choice in completion.choices]

File /usr/local/lib/python3.11/site-packages/haystack/components/generators/openai.py:249, in OpenAIGenerator._connect_chunks(self, chunk, chunks)
    241 """
    242 Connects the streaming chunks into a single ChatMessage.
    243 """
    244 complete_response = ChatMessage.from_assistant("".join([chunk.content for chunk in chunks]))
    245 complete_response.meta.update(
    246     {
    247         "model": chunk.model,
    248         "index": 0,
--> 249         "finish_reason": chunk.choices[0].finish_reason,
    250         "usage": {},  # we don't have usage data for streaming responses
    251     }
    252 )
    253 return complete_response

IndexError: list index out of range
```
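
For context: per OpenAI's streaming documentation, when `stream_options={"include_usage": True}` is set, one extra chunk is streamed before `data: [DONE]`; its `usage` field carries the token counts and its `choices` list is always empty. `_connect_chunks` receives that final chunk and indexes `chunk.choices[0]` unconditionally, which is where the `IndexError` comes from. Below is a minimal sketch of a defensive fix; it is illustrative only and not necessarily the change that landed in #8558:

```python
# Sketch of a guarded _connect_chunks for haystack/components/generators/openai.py.
# Assumption: `chunk` is the last raw ChatCompletionChunk from the stream; with
# include_usage enabled it arrives with an empty `choices` list and a populated
# `usage` field (a pydantic model in the openai SDK).
def _connect_chunks(self, chunk, chunks):
    """
    Connects the streaming chunks into a single ChatMessage.
    """
    complete_response = ChatMessage.from_assistant("".join([c.content for c in chunks]))
    # The trailing usage-only chunk has no choices, so don't index it blindly.
    finish_reason = chunk.choices[0].finish_reason if chunk.choices else None
    # Surface the usage data instead of hardcoding {} when the API provides it.
    usage = chunk.usage.model_dump() if getattr(chunk, "usage", None) else {}
    complete_response.meta.update(
        {
            "model": chunk.model,
            "index": 0,
            "finish_reason": finish_reason,
            "usage": usage,
        }
    )
    return complete_response
```

A fuller fix would also recover `finish_reason` from the last chunk that actually has `choices`, since with `include_usage` enabled it arrives one chunk before the usage-only chunk.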
@anakin87 added the type:bug (Something isn't working) and type:feature (New feature or request) labels and removed the type:bug (Something isn't working) label on Oct 31, 2024
@julian-risch added the P2 (Medium priority, add to the next sprint if no P1 available) label on Nov 18, 2024