Handle Single Files and also enable html, pdf file formats for /learn #712
Merged
1. Added single file functionality 2. Added HTML files 3. Added PDF files
for more information, see https://pre-commit.ci
Added pypdf==4.1.0, required for handling pdf files in /learn
…pyter-ai into learn_more_file_formats
JasonWeill force-pushed the learn_more_file_formats branch from f045924 to 6afc81d (April 8, 2024 20:55)
JasonWeill reviewed Apr 8, 2024
Generally looking good! Thanks for demo'ing your change for me as well. Just a couple of suggestions for improvement.
…pyter-ai into learn_more_file_formats
Made two changes: 1. All file extensions are now matched in lower case, so matching is case-insensitive. 2. Streamlined the PDF loader by replacing the loop over pages with a join over a list comprehension.
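The two changes named in this commit can be sketched roughly as follows. This is only an illustration, not the code in directory.py: the names SUPPORTED_EXTENSIONS, is_supported, and join_pages are hypothetical, the extension list is assumed from the formats mentioned in this PR, and pages are assumed to be any objects exposing an extract_text() method (as pypdf page objects do).

```python
from pathlib import Path

# Hypothetical extension set; the actual list in directory.py may differ.
SUPPORTED_EXTENSIONS = {".txt", ".html", ".pdf"}


def is_supported(path: str) -> bool:
    """Compare the suffix in lower case so 'Paper.PDF' matches '.pdf'."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def join_pages(pages) -> str:
    """Concatenate page text with join and a comprehension instead of a loop.

    `pages` is assumed to be an iterable of objects with extract_text(),
    for example pypdf.PdfReader(path).pages.
    """
    # Treat a missing or empty extraction result as an empty string.
    return "\n".join(page.extract_text() or "" for page in pages)
```

With pypdf installed, a call such as join_pages(pypdf.PdfReader("paper.pdf").pages) would return the text of the whole document as one string; the file name here is a placeholder.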
for more information, see https://pre-commit.ci
JasonWeill approved these changes Apr 9, 2024
@meeseeksdev please backport to 1.x
meeseeksmachine pushed a commit to meeseeksmachine/jupyter-ai that referenced this pull request Apr 10, 2024
… pdf file formats for /learn
srdas added a commit that referenced this pull request Apr 10, 2024
…formats for /learn (#723) Co-authored-by: Sanjiv Das <[email protected]>
Marchlak pushed a commit to Marchlak/jupyter-ai that referenced this pull request Oct 28, 2024
…jupyterlab#712)
* Update directory.py to add new file formats
  1. Added single file functionality 2. Added HTML files 3. Added PDF files
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Update dependencies
  Added pypdf==4.1.0, required for handling pdf files in /learn
* Update directory.py to add new file formats
  1. Added single file functionality 2. Added HTML files 3. Added PDF files
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Update dependencies
  Added pypdf==4.1.0, required for handling pdf files in /learn
* Amended directory.py
  Made changes for 1. matching all file extensions in lower case, to ensure no case sensitivity 2. Streamlined the PDF loader to remove a loop over pages, using join and list comprehension.
* Update directory.py
* Update directory.py
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Fixes #641
Fixes #358
Amends PR #663, which may be closed after giving credit to apurvakhatri.
Update directory.py to add new file formats
For a recent working paper (a 37-page PDF) that is not public, see below that it works in /learn as a single file for in-context prompting. This exemplifies 1 and 2 above. On a raw HTML page with my publications (https://srdas.github.io/research.htm) saved as a .html file, it delivers a nice summary shown below, exemplifying 3 above. Note that here /learn was applied to an entire folder containing txt, pdf, and html files.
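The single-file and whole-folder cases described above could be handled by one path-collection step, sketched here under stated assumptions. This is not the code merged in directory.py: the function name collect_paths and the EXTENSIONS set are hypothetical, chosen only to illustrate accepting either a file or a directory and filtering by lower-cased extension.

```python
from pathlib import Path
from typing import List

# Hypothetical extension set, based on the formats mentioned in this PR.
EXTENSIONS = {".txt", ".pdf", ".html"}


def collect_paths(target: str) -> List[Path]:
    """Return the supported files for a target that is a file or a folder."""
    root = Path(target)
    if root.is_file():
        # Single-file case: keep the file only if its extension is supported.
        return [root] if root.suffix.lower() in EXTENSIONS else []
    # Folder case: walk recursively and keep supported files only.
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in EXTENSIONS
    )
```

Each returned path would then be passed to whichever loader matches its extension, so the same downstream code serves both a single document and a mixed folder.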