This repository has been archived by the owner on Mar 1, 2024. It is now read-only.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Airbyte based readers #428
Airbyte based readers #428
Changes from 9 commits
970b564
952cc8f
214d0e4
237600c
79623d8
e7313d3
3de44a4
ad9ad4a
21d7c5c
ea60ec3
64e1ad5
d37b137
7c2d32d
2a7294c
4211699
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so we should figure out a better test system on our side, but currently tests are failing because we assume that third-party imports are lazy imports.
i know specifically for AirbyteRecordMessage it's used in the
RecordHandler
type used to typerecord_handler
. For this do you think we could typerecord_handler
asAny
first, and then within__init__
lazy importAirbyteRecordMessage
, importRecordHandler
, and do anisinstance()
check onrecord_handler
?A bit hacky but will at least allow tests to pass. thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would prefer composition vs. multiple inheritance, easier to inspect the specific object (BaseEmbeddedIntegration) you're using
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oof, guessing there's no iterator method available for llamaindex?
Could we implement one anyway / is there any value in doing so?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question, what do you think, @jerryjliu ? Probably something we can split out of the first PR though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by default every loader implements
load_data
but you're free to define custom methods too!There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK cool, let's add a
lazy_load
as well that's returning an iterable.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
out of curiosity where is _load_data defined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's on the base integration class that's coming from the CDK. We chose this approach so we can roll out improvements without having to update the loader itself
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i see, makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add an example for how we expect people to do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I honestly thought of a simple for loop, added an example for that. I couldn't find a concept within llama index that would abstract this away, @jerryjliu what do you think about this?