
feat: Add ComponentTool to Haystack tools #8693

Open · vblagoje wants to merge 6 commits into main

Conversation

@vblagoje (Member) commented Jan 9, 2025

Why:

Adds ComponentTool, enabling Haystack components to be wrapped and used as LLM-compatible tools, thereby integrating them into Haystack's new tooling architecture.

What:

  • Added ComponentTool class to facilitate the use of Haystack components as callable tools.
  • Modified __init__.py to include ComponentTool in the module exports.
  • Updated pyproject.toml to include the docstring-parser dependency for documentation extraction.

How can it be used:

This implementation enables the creation of LLM-compatible tools from existing components. Here’s an example of usage:

from haystack import component, Pipeline
from haystack.dataclasses import ChatMessage
from haystack.tools import ComponentTool
from haystack.components.tools import ToolInvoker
from haystack.components.generators.chat import OpenAIChatGenerator

@component
class WeatherComponent:
    @component.output_types(reply=str)
    def run(self, city: str, units: str = "celsius"):
        return {"reply":  f"Weather in {city}: 20°{units}"}

weather_tool = ComponentTool(component=WeatherComponent())

# Create pipeline with OpenAIChatGenerator and ToolInvoker
pipeline = Pipeline()
pipeline.add_component("llm", OpenAIChatGenerator(model="gpt-4o-mini", tools=[weather_tool]))
pipeline.add_component("tool_invoker", ToolInvoker(tools=[weather_tool]))

# Connect components
pipeline.connect("llm.replies", "tool_invoker.messages")

message = ChatMessage.from_user("What's the weather like in S.F.?")

# Run pipeline
result = pipeline.run({"llm": {"messages": [message]}})

With weather_tool assigned to both the ChatGenerator and the ToolInvoker, the component is invoked automatically whenever a ChatMessage triggers the tool, e.g. ChatMessage.from_user("What's the weather like in S.F.?").
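
Assuming the ToolInvoker's default output socket is named tool_messages (not shown in the PR example), the tool's reply can then be read from the pipeline result roughly like this:

# Sketch: ToolInvoker is assumed to return the executed tool results as
# ChatMessage objects under the "tool_messages" key of the pipeline output.
tool_messages = result["tool_invoker"]["tool_messages"]
print(tool_messages[0])  # the ChatMessage carrying the WeatherComponent reply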

How did you test it:

Extensive unit and integration tests were created, validating the functionality of ComponentTool with various components. Tests included invoking tools and ensuring correct parameter handling and responses from LLM integrations.

Notes for the reviewer:

Review the implementation of the ComponentTool for potential edge cases with non-component inputs. The integration tests require an OpenAI API key to run successfully, so consider verifying this before executing the tests.

@vblagoje vblagoje requested review from a team as code owners January 9, 2025 12:51
@vblagoje vblagoje requested review from dfokina and anakin87 and removed request for a team January 9, 2025 12:51
@vblagoje vblagoje marked this pull request as draft January 9, 2025 12:54
@vblagoje vblagoje marked this pull request as ready for review January 9, 2025 13:14
@julian-risch (Member) left a comment

Looks quite good to me. My main request is to make the docstring-parser dependency optional. Let's discuss if @anakin87 disagrees or if we decide to make jsonschema a required dependency too.

pyproject.toml Outdated
@@ -57,6 +57,7 @@ dependencies = [
"requests",
"numpy",
"python-dateutil",
"docstring-parser",

I suggest using a lazy import for this dependency, similar to how jsonschema is handled.
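
For reference, a rough sketch of that approach, modeled on how LazyImport is used for jsonschema (the message text and the helper function are illustrative):

# Sketch: lazily import docstring-parser so it only needs to be installed
# when ComponentTool actually parses a docstring.
from haystack.lazy_imports import LazyImport

with LazyImport(message="Run 'pip install docstring-parser' to use ComponentTool.") as docstring_parser_import:
    from docstring_parser import parse as parse_docstring


def get_param_descriptions(docstring: str) -> dict:
    # Raise a helpful install hint only when the feature is actually used.
    docstring_parser_import.check()
    parsed = parse_docstring(docstring)
    return {param.arg_name: param.description for param in parsed.params}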

---
features:
- |
Introduced the ComponentTool, a new tool that wraps Haystack components allowing them to be utilized as tools for LLMs (various ChatGenerators). This tool supports automatic function schema generation, input type validation, and offers enhanced integration with various data types.

You mentioned limitations in the call regarding which components are supported. Could you please add them to the release note? If there is a particular Haystack component that is not supported, mention it here as an example.

)

pipeline = Pipeline()
pipeline.add_component("llm", OpenAIChatGenerator(model="gpt-4", tools=[tool]))

gpt-4o-mini here too or is it gpt-4 on purpose?

haystack/tools/component_tool.py (outdated, resolved)
haystack/tools/component_tool.py (outdated, resolved)
haystack/tools/component_tool.py (resolved)
"Use this method to create a Tool only with Haystack component instances."
)
raise ValueError(message)


Here I am contradicting my comment in the experimental PR (https://github.com/deepset-ai/haystack-experimental/pull/159/files/ce864dd8f4f10c2c42f3ded35a515ef71a995585#r1892494701),
but now that we have clarified that components to be converted into tools MUST NOT be part of a Pipeline, I think it is worth adding a check.
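
One possible shape for such a check, assuming Pipeline.add_component marks instances with a __haystack_added_to_pipeline__ attribute (a sketch, not the final implementation):

# Sketch: reject components that already belong to a Pipeline.
if getattr(component, "__haystack_added_to_pipeline__", None) is not None:
    msg = (
        "Component has already been added to a Pipeline and cannot be used to create a ComponentTool. "
        "Create the tool from a standalone component instance instead."
    )
    raise ValueError(msg)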


thanks!


Let's add a simple test to cover this.
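
For example, a minimal test along these lines (EchoComponent and the test name are illustrative):

import pytest

from haystack import Pipeline, component
from haystack.tools import ComponentTool


@component
class EchoComponent:
    @component.output_types(reply=str)
    def run(self, text: str):
        return {"reply": text}


def test_component_tool_rejects_component_already_in_pipeline():
    # A component that already belongs to a Pipeline should be rejected.
    comp = EchoComponent()
    pipeline = Pipeline()
    pipeline.add_component("echo", comp)
    with pytest.raises(ValueError):
        ComponentTool(component=comp)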

Comment on lines 288 to 289
elif is_pydantic_v2_model(python_type):
schema = self._create_pydantic_schema(python_type, description)

Our components don't use Pydantic Models, so I’m unsure why we’re supporting this.
Allowing such flexibility might be counterproductive.

If there's an important perspective or requirement that I'm overlooking, I’d appreciate it if you could elaborate.

haystack/tools/component_tool.py (resolved)
@@ -0,0 +1,549 @@
# SPDX-FileCopyrightText: 2022-present deepset GmbH <[email protected]>
@anakin87 (Member) Jan 9, 2025

Let's add tests for:

  • to_dict/from_dict
  • complete YAML serde (here is a good example)
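
A rough shape for the to_dict/from_dict roundtrip (GreetComponent and the asserted fields are illustrative; the exact serialized format depends on the final implementation):

from haystack import component
from haystack.tools import ComponentTool


@component
class GreetComponent:
    @component.output_types(greeting=str)
    def run(self, name: str):
        return {"greeting": f"Hello, {name}!"}


def test_component_tool_to_dict_from_dict_roundtrip():
    tool = ComponentTool(component=GreetComponent())
    data = tool.to_dict()
    restored = ComponentTool.from_dict(data)
    # The restored tool should expose the same metadata and generated schema.
    assert restored.name == tool.name
    assert restored.description == tool.description
    assert restored.parameters == tool.parameters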

@vblagoje (Member, Author) commented Jan 9, 2025

Thank you @julian-risch and @anakin87, great feedback. I haven't removed support for Pydantic models yet, but if both of you are okay with its removal, I'll proceed. The support is largely effortless thanks to TypeAdapter conversion, but as @anakin87 mentioned, it could be counterproductive. I'll remove it in a separate commit so the implementation remains available as a reference in case users ask for it or an internal need arises in the future.
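
For context, the TypeAdapter conversion mentioned above boils down to something like this (an illustration, not code from the PR):

# Pydantic's TypeAdapter can emit a JSON schema for an arbitrary model or type,
# which is what makes supporting Pydantic inputs nearly free.
from pydantic import BaseModel, TypeAdapter


class City(BaseModel):
    name: str
    country: str = "US"


schema = TypeAdapter(City).json_schema()
print(schema["properties"])  # {'name': {...}, 'country': {...}}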

@anakin87 (Member) commented Jan 9, 2025

Let's also remember to add this module to docs/pydoc/config/tools_api.yml

@coveralls (Collaborator) commented Jan 9, 2025

Pull Request Test Coverage Report for Build 12694409064

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.03%) to 91.122%

Totals Coverage Status:
  • Change from base Build 12694180152: +0.03%
  • Covered Lines: 8765
  • Relevant Lines: 9619

💛 - Coveralls
