feat: Add StringJoiner as a convenience component #8353
Comments
Hey @silvanocerza I made a draft PR with the component here. However, I noticed in testing that the component doesn't actually work in a Pipeline. Specifically, it fails this test:

    @pytest.mark.integration
    def test_with_pipeline(self):
        string_1 = "What's Natural Language Processing?"
        string_2 = "What's is life?"
        pipe = Pipeline()
        pipe.add_component("prompt_builder_1", PromptBuilder("Builder 1: {{query}}"))
        pipe.add_component("prompt_builder_2", PromptBuilder("Builder 2: {{query}}"))
        pipe.add_component("joiner", StringJoiner())
        pipe.connect("prompt_builder_1.prompt", "joiner.strings")
        pipe.connect("prompt_builder_2.prompt", "joiner.strings")
        results = pipe.run(data={"prompt_builder_1": {"query": string_1}, "prompt_builder_2": {"query": string_2}})
        assert "joiner" in results
        assert len(results["joiner"]["strings"]) == 2  # <-- This Fails. List only has one item in it (the string_2).
        assert results["joiner"]["strings"] == [
            "Builder 1: What's Natural Language Processing?",
            "Builder 2: What's is life?",
        ]

so I wanted to ask if you happen to know whether the |
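The symptom in the failing assertion (only the last string survives) is what one would see if the second sender's value replaced the first instead of being appended to it. Below is a toy model of the two behaviours in plain Python. It does not use Haystack's scheduling code, and the `deliver` helper and its socket names are hypothetical, chosen only to illustrate the difference:

```python
from typing import Any, Dict, List, Set, Tuple


def deliver(messages: List[Tuple[str, Any]], variadic_sockets: Set[str]) -> Dict[str, Any]:
    """Toy routing step: hand each (socket, value) message to its input socket.

    Hypothetical helper for illustration only; this is not Haystack code.
    """
    inputs: Dict[str, Any] = {}
    for socket, value in messages:
        if socket in variadic_sockets:
            # Variadic socket: keep every value, in arrival order.
            inputs.setdefault(socket, []).append(value)
        else:
            # Regular socket: a later value replaces an earlier one.
            inputs[socket] = value
    return inputs


messages = [
    ("joiner.strings", "Builder 1: What's Natural Language Processing?"),
    ("joiner.strings", "Builder 2: What's is life?"),
]

# What the test expects: both prompts reach the joiner.
accumulated = deliver(messages, variadic_sockets={"joiner.strings"})

# What the failing assertion resembles: only the last prompt is left.
overwritten = deliver(messages, variadic_sockets=set())
```

Whether the real failure comes from overwriting of this kind is an assumption; the thread only establishes that one of the two strings is missing from the output.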
@sjrl not sure about this but looking at the type definition of

    HAYSTACK_VARIADIC_ANNOTATION = "__haystack__variadic_t"

    # Generic type variable used in the Variadic container
    T = TypeVar("T")

    # Variadic is a custom annotation type we use to mark input types.
    # This type doesn't do anything else than "marking" the contained
    # type so it can be used in the `InputSocket` creation where we
    # check that its annotation equals to CANALS_VARIADIC_ANNOTATION
    Variadic: TypeAlias = Annotated[Iterable[T], HAYSTACK_VARIADIC_ANNOTATION]
|
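To make the "marking" idea in the quoted comment concrete, here is a minimal standard-library sketch of how a marker stored in `Annotated` metadata can be detected on a function parameter. How Haystack's `InputSocket` actually performs the check is not shown in this thread, so the `is_variadic` helper and the `VARIADIC_MARKER` name are hypothetical stand-ins:

```python
from typing import Annotated, Iterable, List, TypeVar, get_args, get_origin, get_type_hints

# Same marker string as in the quoted snippet; the constant name is ours.
VARIADIC_MARKER = "__haystack__variadic_t"

T = TypeVar("T")
Variadic = Annotated[Iterable[T], VARIADIC_MARKER]


def is_variadic(annotation: object) -> bool:
    """Return True if the annotation carries the variadic marker (hypothetical helper)."""
    if get_origin(annotation) is not Annotated:
        return False
    # For Annotated[X, m1, m2], get_args returns (X, m1, m2); skip X.
    return VARIADIC_MARKER in get_args(annotation)[1:]


def run(strings: Variadic[str], separator: str = " ") -> List[str]:
    """Example function whose first parameter is marked as variadic."""
    return list(strings)


# include_extras=True keeps the Annotated wrapper instead of stripping it.
hints = get_type_hints(run, include_extras=True)
strings_is_variadic = is_variadic(hints["strings"])
separator_is_variadic = is_variadic(hints["separator"])
```

The marker changes nothing about the runtime type of the value; it only lets the code that builds the input sockets tell a "collect many values" input apart from a regular one.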
@davidsbatista thanks for the info. When I talked to @silvanocerza offline he thought this should work. @silvanocerza have you had a chance to look at this yet? |
@sjrl have a look at this sample/testing code: https://github.com/deepset-ai/haystack/blob/main/haystack/testing/sample_components/joiner.py I think what you want might already be there |
I've just tried the |
@sjrl @silvanocerza good news - I think "I found the bug", running the
|
Oh interesting, I didn't realize the subgraph branch also changes how the non-cyclic pipelines work. |
Is your feature request related to a problem? Please describe.
As @ju-gu and I have been building pipelines with many branches that feed into many different Prompt Builders, we ran into a need for a component like StringJoiner, which would gather several strings into a single list of strings. In our specific use case, our workflow looks like this (except imagine six branches):

    ConditionalRouter --> PromptBuilder 1 --> Generator --> AnswerBuilder --> AnswerJoiner
                      |-> PromptBuilder 2 --> Generator --> AnswerBuilder -|
Using the new StringJoiner, we would reduce the number of Generator + AnswerBuilder components we need to one of each:

    ConditionalRouter --> PromptBuilder 1 --> StringJoiner --> OutputAdapter --> Generator --> AnswerBuilder
                      |-> PromptBuilder 2 -|
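The core behaviour being requested is small. A plain-Python sketch of it follows; the class name `StringJoinerSketch` is hypothetical, and a real Haystack component would additionally need the `@component` decorator and a variadic input annotation, both left out here so the sketch runs without Haystack installed:

```python
from typing import Dict, Iterable, List


class StringJoinerSketch:
    """Plain-Python stand-in for the proposed component (hypothetical name)."""

    def run(self, strings: Iterable[str]) -> Dict[str, List[str]]:
        # Materialise whatever the upstream branches produced into one list,
        # preserving arrival order.
        return {"strings": list(strings)}


joiner = StringJoinerSketch()
result = joiner.run(["Builder 1: first prompt", "Builder 2: second prompt"])
```

Returning a dictionary keyed by `strings` mirrors the `results["joiner"]["strings"]` lookup used in the test from the comments.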
Describe the solution you'd like
Check with the team that adding this component would be okay. If so, I'll go ahead and make a PR.