
feat!: new ChatMessage #8640

Merged (15 commits, Dec 17, 2024)

Conversation

anakin87 (Member) commented Dec 13, 2024

Related Issues

Proposed Changes:

How did you test it?

CI

Notes for the reviewer

I have prepared the draft docs for the new ChatMessage, including a migration guide. It might be helpful in reviewing this PR.

Checklist

  • I have read the contributors guidelines and the code of conduct
  • I have updated the related issue with new insights and changes
  • I added unit tests and updated the docstrings
  • I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.
  • I documented my code
  • I ran pre-commit hooks and fixed any issue

github-actions bot added the topic:tests and type:documentation labels on Dec 13, 2024
anakin87 changed the title from "New chatmessage" to "feat!: new ChatMessage" on Dec 13, 2024
coveralls (Collaborator) commented Dec 13, 2024

Pull Request Test Coverage Report for Build 12376304805

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 5 unchanged lines in 3 files lost coverage.
  • Overall coverage increased (+0.09%) to 90.565%

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| components/generators/openai_utils.py | 1 | 83.33% |
| components/generators/chat/hugging_face_api.py | 2 | 97.67% |
| dataclasses/chat_message.py | 2 | 98.69% |

Totals (Coverage Status):
Change from base Build 12353137414: 0.09%
Covered Lines: 8207
Relevant Lines: 9062

💛 - Coveralls

formatted_msg["name"] = message.name

return formatted_msg
return {"role": message.role.value, "content": message.text or ""}
anakin87 (Member Author):

these changes to HF API Chat Generator are only temporary: we will override them soon when porting the support for Tools

openai_msg["name"] = message.name

return openai_msg
return {"role": message.role.value, "content": message.text}
anakin87 (Member Author):

these changes to the OpenAI Chat Generator are only temporary: we will override them soon when porting the support for Tools
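
For context, a minimal sketch of the kind of temporary conversion this snippet performs (the helper names and import path below are assumptions for illustration; only the returned dict shape comes from the diff context above):

from typing import Any, Dict, List

from haystack.dataclasses import ChatMessage  # assumed import path

def _to_openai_format(message: ChatMessage) -> Dict[str, Any]:
    # Temporary, text-only conversion mirroring the snippet above; tool calls and
    # other content types are ignored until Tool support is ported.
    return {"role": message.role.value, "content": message.text}

def _to_openai_messages(messages: List[ChatMessage]) -> List[Dict[str, Any]]:
    return [_to_openai_format(m) for m in messages]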

general_msg = (
    "Use the `from_assistant`, `from_user`, `from_system`, and `from_tool` class methods to create a "
    "ChatMessage. For more information about the new API and how to migrate, see the documentation:"
    " https://docs.haystack.deepset.ai/docs/data-classes#chatmessage"
)
anakin87 (Member Author):

this link and the following ones will be updated once we have a new documentation page containing examples and the migration guide

@anakin87 anakin87 marked this pull request as ready for review December 16, 2024 17:49
@anakin87 anakin87 requested review from a team as code owners December 16, 2024 17:49
@anakin87 anakin87 requested review from dfokina, davidsbatista and vblagoje and removed request for a team and davidsbatista December 16, 2024 17:49
haystack/dataclasses/chat_message.py (review thread resolved)
raise TypeError(f"Unsupported type in ChatMessage content: `{type(part).__name__}` for `{part}`.")

serialized["_content"] = content
return serialized
Contributor:

I still do not like the idea that the serialized ChatMessage has underscore prefixes. May I ask what the rationale behind it is?

anakin87 (Member Author):

This simply stems from the new dataclass definition:

from dataclasses import dataclass, field
from typing import Any, Dict, Sequence

@dataclass
class ChatMessage:
    _role: ChatRole
    _content: Sequence[ChatMessageContentT]
    _meta: Dict[str, Any] = field(default_factory=dict, hash=False)

The idea is that these attributes are internal.
For creation, you should use class methods from_user, etc.
For attribute access, several properties are available: text, texts, tool_call, tool_calls, ...
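
A minimal usage sketch of the API described above (the import path is an assumption; the property behaviour follows this comment):

from haystack.dataclasses import ChatMessage  # assumed import path

# Creation goes through the class methods rather than the constructor.
msg = ChatMessage.from_user("What is the capital of France?")

# Attribute access goes through the read-only properties.
assert msg.text == "What is the capital of France?"
assert msg.texts == ["What is the capital of France?"]
assert msg.tool_call is None  # assumption: no tool call on a plain user message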

LastRemote (Contributor) commented Dec 17, 2024:

@anakin87 I get this part, but I still hope the serialized dictionaries can drop the underscore prefixes for easier access. For example, in a deployed chatbot pipeline, I'd prefer something like

{"chat_history": [{"role": "user", "content": [...]}, {"role": "assistant", "content": [...]}, "meta": {...}]}

as a part of the request instead of

{"chat_history": [{"_role": "user", "_content": [...]}, {"_role": "assistant", "_content": [...]}, "_meta": {...}]}

(assuming this takes an input chat_history of type List[ChatMessage]).

Also it makes migration a bit easier.
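
To make the shapes being compared here concrete, a small round-trip sketch under the current PR (the to_dict/from_dict names and the exact output are inferred from this thread, not checked against the diff):

from haystack.dataclasses import ChatMessage  # assumed import path

msg = ChatMessage.from_user("Hello!")
data = msg.to_dict()
# Per this thread, the keys mirror the underscore-prefixed dataclass fields, roughly:
# {"_role": "user", "_content": [{"text": "Hello!"}], "_meta": {}}
restored = ChatMessage.from_dict(data)
assert restored == msg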

LastRemote (Contributor) commented Dec 17, 2024:

Also as a bonus I have something like this in my draft PR:

        if isinstance(data["content"], str):
            content.append(TextContent(text=data["content"]))
        else:
            for part in data["content"]:
                ...  # rest of the logic

This makes the serialized form of a ChatMessage fully backwards compatible, and it synergizes well with common shortcuts in the OpenAI and Anthropic APIs.

LastRemote (Contributor) commented Dec 18, 2024:

@anakin87 @vblagoje wdyt about this? pinging in case this was lost in the threads.

anakin87 (Member Author):

@LastRemote sorry for not responding sooner.

Since this is a substantial change, we decided to make it visible and not backward compatible.

As soon as possible, we will publish a detailed migration guide + GitHub/Discord announcements. (Tracked in #8623 and #8654).

LastRemote (Contributor) commented Dec 19, 2024:

@anakin87 Thanks for the reply, but I'm still not fully convinced. We can keep the internal attributes with underscore prefixes, but just for serialization/deserialization I would suggest dropping them for the sake of simplicity. Backward compatibility is not my top concern here.

In my current use cases, I have many deployed Haystack pipelines that use List[ChatMessage] as part of their input/output (which is very common in chatbots to track chat history and multiple reply candidates), and it makes more sense to have something like

{"chat_history": [{"role": "user", "content": [...], "meta": {...}}, {"role": "assistant", "content": [...], "meta": {...}}]}

instead of adding an underscore prefix to all fields. Of course, this could be mitigated by some customized logic on the server end (not sure if Hayhooks does this), but IMO that adds a layer of unneeded complexity.

vblagoje (Member) left a review comment:

Seems good on the first pass. Only one comment regarding ToolCallResult - I'll do another pass soon after we consolidate on that one

:param error: Whether the Tool invocation resulted in an error.
"""

result: str
Member:

A note here - I think we should have the result field be a Union of TextContent and other future media types that will include ByteStream and perhaps the upcoming MediaContent. Not sure if it is smart to add that immediately, i.e. have:

result: TextContent

and then make it a Union as we add those new types. Or add it all in one batch sometime soon when we introduce these new types. What do you think is the best approach here?

anakin87 (Member Author):

I haven't studied in depth the multimodal support in different API providers, so I would prefer to introduce these changes when we know more, hopefully in a non-breaking way.

Member:

Yeah, neither did I, but @mathislucka mentioned encountering this hurdle in his experiments IIRC. If it is a big change to have TextContent here, then perhaps let's leave it for the next release, as this is admittedly already a big change. I wonder what you think about this @LastRemote - i.e. having tools return non-str content. Have you already encountered that use case?

LastRemote (Contributor) commented Dec 17, 2024:

> I haven't studied in depth the multimodal support in different API providers, so I would prefer to introduce these changes when we know more, hopefully in a non-breaking way.

@anakin87 I did some research on this previously: Anthropic models can accept an image (as an object containing base64 data and a MIME type) as a tool result, as well as text.

Oh, and it can have a list of contents.
https://docs.anthropic.com/en/api/messages#body-messages-content-content

Contributor:

> having tools return non-str content. Have you already encountered that use case?

@vblagoje I am aware of this, but it is not my current use case. To be honest I haven't really thought about this (I didn't change it in my multimodal PR), but this could be another Sequence[ChatMessageContentT] for maximum flexibility (it is up to the models and corresponding utility functions whether to accept certain types in a tool call result).
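
A sketch of the alternative shape floated in this thread, as opposed to the plain result: str that this PR keeps (the types, defaults, and import path below are assumptions about a possible future change, not part of this PR):

from dataclasses import dataclass
from typing import Sequence

from haystack.dataclasses.chat_message import ChatMessageContentT  # assumed import path

@dataclass
class ToolCallResult:
    # Hypothetical future shape: a sequence of content parts instead of a plain string,
    # so a tool could return text, images, or other media (Anthropic already accepts a
    # list of content blocks as a tool result). Other fields of the dataclass are omitted here.
    result: Sequence[ChatMessageContentT]
    error: bool = False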

vblagoje (Member) commented Dec 17, 2024:

Yes, something like that. As this future change will likely break a few specific tool integrations and not affect many users, perhaps we can postpone it to after the integration and keep str as the ToolCallResult field for now.

Contributor:

> ...and not affect many users...

I prefer being a little careful on this one though. If a future change makes result a list (Anthropic already supports this), existing tool implementations may be affected.

anakin87 (Member Author):

In all honesty, I would prefer to keep the PR as it is regarding this aspect, especially since these changes have been extensively tested in recent months.

I am afraid that introducing a change on the fly at this time could cause some bugs, as well as complicate our lives.

For the future, unlike this PR, I think we can introduce the changes gradually (and without breaking changes), because they would be much smaller in scope.

(@vblagoje regarding the problems Mathis encountered, my impression was that they were not directly related to the tool message types, but to the ToolInvoker return types.)

Contributor:

> In all honesty, I would prefer to keep the PR as it is regarding this aspect...

Sure, I am fine with it. Right now I am not seeing a lot of valid use cases for this, and there is still room for future updates.

Member:

Yes @anakin87 agreed

@anakin87 anakin87 requested a review from vblagoje December 17, 2024 11:06
anakin87 (Member Author):

@dfokina please take a look

dfokina (Contributor) left a review comment:

Pushed one tiny fix, all is good ✅

Labels: topic:tests, type:documentation
Projects: None yet
5 participants