OpenAIGenerator raises IndexError when using stream_options={"include_usage": True} #8507

Closed · bendavis78 opened this issue Oct 31, 2024 · 0 comments · Fixed by #8558
Labels: P2 (Medium priority, add to the next sprint if no P1 available) · type:feature (New feature or request)


@bendavis78

By default, `OpenAIGenerator` does not include token usage when using a `streaming_callback`. According to OpenAI's documentation, you can include usage in streaming responses by passing `stream_options={"include_usage": True}`.

However, when this option is passed via `generation_kwargs`, the `_connect_chunks` method fails with an `IndexError`:

```python
from IPython.display import Markdown, clear_output, display
from haystack.dataclasses import StreamingChunk

# `pipeline` and `question` are defined earlier: a RAG pipeline with a
# query_embedder, retriever, prompt builder, and OpenAIGenerator.
response = []

def stream_response(chunk: StreamingChunk):
    # Accumulate streamed deltas and re-render the running answer as Markdown.
    response.append(chunk.content)
    clear_output(wait=True)
    display(Markdown("".join(response)))

result = pipeline.run({
    "query_embedder": {"text": question},
    "retriever": {"top_k": 3},
    "prompt": {"query": question, "language": "English"},
    "generator": {
        "streaming_callback": stream_response,
        "generation_kwargs": {"stream_options": {"include_usage": True}},
    }
})
```

Exception:

```
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[37], line 8
      5     clear_output(wait=True)
      6     display(Markdown("".join(response)))
----> 8 result = pipeline.run({
      9     "query_embedder": {"text": question},
     10     "retriever": {"top_k": 3},
     11     "prompt": {"query": question, "language": "English"},
     12     "generator": {
     13         "streaming_callback": stream_response,
     14         "generation_kwargs": {"stream_options": {"include_usage": True}},
     15     }
     16 })

File /usr/local/lib/python3.11/site-packages/haystack/core/pipeline/pipeline.py:229, in Pipeline.run(self, data, include_outputs_from)
    226     msg = f"Maximum run count {self._max_runs_per_component} reached for component '{name}'"
    227     raise PipelineMaxComponentRuns(msg)
--> 229 res: Dict[str, Any] = self._run_component(name, components_inputs[name])
    231 if name in include_outputs_from:
    232     # Deepcopy the outputs to prevent downstream nodes from modifying them
    233     # We don't care about loops - Always store the last output.
    234     extra_outputs[name] = deepcopy(res)

File /usr/local/lib/python3.11/site-packages/haystack/core/pipeline/pipeline.py:67, in Pipeline._run_component(self, name, inputs)
     65 span.set_content_tag("haystack.component.input", inputs)
     66 logger.info("Running component {component_name}", component_name=name)
---> 67 res: Dict[str, Any] = instance.run(**inputs)
     68 self.graph.nodes[name]["visits"] += 1
     70 # After a Component that has variadic inputs is run, we need to reset the variadic inputs that were consumed

File /usr/local/lib/python3.11/site-packages/haystack/components/generators/openai.py:227, in OpenAIGenerator.run(self, prompt, streaming_callback, generation_kwargs)
    225             chunks.append(chunk_delta)
    226             streaming_callback(chunk_delta)  # invoke callback with the chunk_delta
--> 227     completions = [self._connect_chunks(chunk, chunks)]
    228 elif isinstance(completion, ChatCompletion):
    229     completions = [self._build_message(completion, choice) for choice in completion.choices]

File /usr/local/lib/python3.11/site-packages/haystack/components/generators/openai.py:249, in OpenAIGenerator._connect_chunks(self, chunk, chunks)
    241 """
    242 Connects the streaming chunks into a single ChatMessage.
    243 """
    244 complete_response = ChatMessage.from_assistant("".join([chunk.content for chunk in chunks]))
    245 complete_response.meta.update(
    246     {
    247         "model": chunk.model,
    248         "index": 0,
--> 249         "finish_reason": chunk.choices[0].finish_reason,
    250         "usage": {},  # we don't have usage data for streaming responses
    251     }
    252 )
    253 return complete_response

IndexError: list index out of range
```
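
For context: per OpenAI's streaming documentation, when `stream_options={"include_usage": True}` is set, one extra chunk is streamed before `data: [DONE]`; its `usage` field carries the token counts and its `choices` list is always empty. `_connect_chunks` receives that final chunk and indexes `chunk.choices[0]` unconditionally, which is where the `IndexError` comes from. Below is a minimal sketch of a defensive fix; it is illustrative only and not necessarily the change that landed in #8558:

```python
# Sketch of a guarded _connect_chunks for haystack/components/generators/openai.py.
# Assumption: `chunk` is the last raw ChatCompletionChunk from the stream; with
# include_usage enabled it arrives with an empty `choices` list and a populated
# `usage` field (a pydantic model in the openai SDK).
def _connect_chunks(self, chunk, chunks):
    """
    Connects the streaming chunks into a single ChatMessage.
    """
    complete_response = ChatMessage.from_assistant("".join([c.content for c in chunks]))
    # The trailing usage-only chunk has no choices, so don't index it blindly.
    finish_reason = chunk.choices[0].finish_reason if chunk.choices else None
    # Surface the usage data instead of hardcoding {} when the API provides it.
    usage = chunk.usage.model_dump() if getattr(chunk, "usage", None) else {}
    complete_response.meta.update(
        {
            "model": chunk.model,
            "index": 0,
            "finish_reason": finish_reason,
            "usage": usage,
        }
    )
    return complete_response
```

A fuller fix would also recover `finish_reason` from the last chunk that actually has `choices`, since with `include_usage` enabled it arrives one chunk before the usage-only chunk.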
@anakin87 added the type:bug (Something isn't working) and type:feature (New feature or request) labels and removed the type:bug (Something isn't working) label on Oct 31, 2024
@julian-risch added the P2 (Medium priority, add to the next sprint if no P1 available) label on Nov 18, 2024