community: Correctly handle multi-element rich text #25762

Merged: 6 commits, Dec 16, 2024


@h1ros h1ros commented Aug 27, 2024

Description:

  • Add _concatenate_rich_text method to combine all elements in rich text arrays
  • Update load_page method to use _concatenate_rich_text for rich text properties
  • Ensure all text content is captured, including inline code and formatted text
  • Add unit tests to verify correct handling of multi-element rich text

This fix prevents truncation of content after backticks or other formatting elements.

Issue:

With the Notion DB loader, the text of rich text and title properties was truncated after the first element, because the loader read only the first element of each rich text array.
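
To illustrate the truncation, here is a minimal sketch, not the code from this PR. It assumes the rich text array shape returned by the Notion API, where each element carries a `plain_text` field and inline code or other formatting splits the text into separate elements. The standalone function `concatenate_rich_text` and the sample data are hypothetical stand-ins for the loader's `_concatenate_rich_text` method.

```python
from typing import Any, Dict, List


def concatenate_rich_text(rich_text_array: List[Dict[str, Any]]) -> str:
    """Join the plain_text of every element in a Notion rich text array."""
    return "".join(item.get("plain_text", "") for item in rich_text_array)


# Hypothetical sample: a title such as "Use `pip` to install" arrives as
# three elements, because the inline code span gets its own entry.
rich_text = [
    {"type": "text", "plain_text": "Use ", "annotations": {"code": False}},
    {"type": "text", "plain_text": "pip", "annotations": {"code": True}},
    {"type": "text", "plain_text": " to install", "annotations": {"code": False}},
]

# Reading only the first element drops everything from the backtick onward.
first_only = rich_text[0]["plain_text"]

# Joining all elements keeps the full text.
full_text = concatenate_rich_text(rich_text)
```

In this sketch `first_only` holds just the text before the inline code span, while `full_text` holds the whole title. Using `.get("plain_text", "")` is a defensive choice made here so that an element without the field contributes an empty string instead of raising.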

Dependencies:

None.


vercel bot commented Aug 27, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
langchain ✅ Ready (Inspect) Visit Preview 💬 Add feedback Dec 16, 2024 8:20pm

@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. community Related to langchain-community Ɑ: doc loader Related to document loader module (not documentation) labels Aug 27, 2024
@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Aug 28, 2024
@efriis efriis disabled auto-merge December 16, 2024 19:56
@efriis efriis assigned efriis and unassigned baskaryan Dec 16, 2024
@efriis efriis enabled auto-merge (squash) December 16, 2024 20:04
@efriis efriis merged commit 8f5e72d into langchain-ai:master Dec 16, 2024
18 checks passed
Labels
community Related to langchain-community Ɑ: doc loader Related to document loader module (not documentation) lgtm PR looks good. Use to confirm that a PR is ready for merging. size:L This PR changes 100-499 lines, ignoring generated files.

3 participants