-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
community[patch]: avoid shouldIgnore on folders for GithubRepoLoader #6255
community[patch]: avoid shouldIgnore on folders for GithubRepoLoader #6255
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing this! Would it be possible/easy to push up a test for this as well? If not no worries.
@bracesproul thanks for checking it. I'd be happy to add a test. What do I use as |
Only add a test if there's a way to mock the API request. If there's no way to do it without an API key then don't bother adding a test. |
Ok, I don't think it can be tested without an access token so this PR is ready to go. |
This PR avoids the
shouldIgnore
check on directories when looping over files in a github repo that is loaded using theGithubRepoLoader
.The issue with checking
shouldIgnore
on a directory is that a directory path might match an ignore pattern while its content does not. For instance,ignorePaths: ["*", "!*/", "!**/*.txt"]
results inignore.ignores("subfolder") == true
butignore.ignores("subfolder/file.txt") == false
. Before this fix, theGithubRepoLoader
will misssubfolder/file.txt
and any other files in subfolders that should be included. My solution here is to avoid theshouldIgnore
test on directories and only checkshouldIgnore
on files. This means all directories will be traversed in a recursive load but only the correct files will be loaded.I was unsure how to add a test case for this as an accessToken is required to do recursive collection of github repos.
Fixes #6214