Support added for a single file and directory #663

Closed

apurvakhatri
Contributor

Hello Team,

Issue number of the reported bug or feature request: #641

Changes
Added support for running Q&A over a single file. The code detects whether the given path is a single file or a directory and processes it accordingly.

Testing performed
The functionality has been tested on my system (macOS).

Additional context
I have tested three cases: a single file, a file inside a directory, and a directory on its own.

Member

@dlqqq dlqqq left a comment

Thank you for this PR! I left some feedback for you to address. Let me know if you have further questions. 🤗

subdirs[:] = [
    d for d in subdirs if not (d[0] == "." or d in EXCLUDE_DIRS)
]
filenames = [f for f in filenames if not f[0] == "."]
Member

There are a few issues with the implementation proposed by this branch:

  • We seek a list of file paths relative to the current directory. However, this branch only adds file names.

  • This branch updates filenames using the assignment operator = instead of .append(), meaning that the list of filenames is dropped with each iteration of the for loop.

  • filenames is also being used by the for block itself. This means that even if the previous issue is fixed, every iteration of this for loop will still delete the value of filenames set by the previous iteration. Take this as a simplified example:

>>> for i in range(5):
...   print(i)
...   i = 1
...
0
1
2
3
4
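To illustrate the three points above, here is a minimal sketch of collecting paths relative to the current directory while walking a tree. The function name collect_filepaths and the contents of EXCLUDE_DIRS are hypothetical and are not taken from this branch; the key idea is that results go into a separate accumulator list rather than into the loop variable filenames.

```python
import os

# Hypothetical exclusion set; the real EXCLUDE_DIRS in the project may differ.
EXCLUDE_DIRS = {"node_modules", "__pycache__"}


def collect_filepaths(path):
    """Return sorted file paths under `path`, relative to the current directory."""
    filepaths = []  # separate accumulator, never reassigned inside the loop
    for dirpath, subdirs, filenames in os.walk(path):
        # Prune hidden and excluded directories in place so os.walk skips them.
        subdirs[:] = [
            d for d in subdirs if not (d[0] == "." or d in EXCLUDE_DIRS)
        ]
        for name in filenames:
            if name[0] == ".":
                continue  # skip hidden files
            # Join with dirpath so we keep a path, not just a bare file name.
            filepaths.append(os.path.relpath(os.path.join(dirpath, name)))
    return sorted(filepaths)
```

Because filepaths is only ever extended with .append(), entries found in earlier iterations survive, and the loop variable filenames is free to be rebound by os.walk on each iteration.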

This implementation can be corrected and simplified greatly. Here are my suggestions.

  1. The logic within the for filename in filenames: ... block on line 69 should be extracted to a separate split_file(path, splitter) function.

  2. Revert the other changes, and simply add this block at the very top of this split() function definition:

if os.path.isfile(path):
    return split_file(path, splitter)
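The two suggestions above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: FixedWidthSplitter is a hypothetical stand-in for whatever splitter object is passed in (it only needs a split_text() method), and the chunk dictionaries returned by split_file are an invented shape chosen to keep the example self-contained.

```python
import os


class FixedWidthSplitter:
    """Hypothetical stand-in splitter that cuts text into fixed-size chunks."""

    def __init__(self, chunk_size):
        self.chunk_size = chunk_size

    def split_text(self, text):
        size = self.chunk_size
        return [text[i:i + size] for i in range(0, len(text), size)]


def split_file(path, splitter):
    """Read one file and return its chunks, each tagged with the source path."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    return [{"path": path, "text": chunk} for chunk in splitter.split_text(text)]


def split(path, splitter):
    """Split a single file, or every visible file under a directory."""
    # Early return handles the single-file case before any directory walk.
    if os.path.isfile(path):
        return split_file(path, splitter)

    chunks = []
    for dirpath, subdirs, filenames in os.walk(path):
        # Prune hidden directories in place so os.walk does not descend into them.
        subdirs[:] = [d for d in subdirs if not d.startswith(".")]
        for name in filenames:
            if not name.startswith("."):
                chunks.extend(split_file(os.path.join(dirpath, name), splitter))
    return chunks
```

With this shape, the directory branch and the single-file branch share one code path for reading and chunking a file, so the directory-walking logic does not need to change to support single files.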

Member

@dlqqq dlqqq Mar 5, 2024

I checked all this info by adding print() statements in the definition of split() to verify the value of filenames. To test, I ran jupyter lab from the root of this Git repo and called /learn docs to learn all of the Jupyter AI documentation.

Can you do the same before I review this again? Thanks in advance!

Collaborator

@dlqqq The code for the function split() can be simplified to the following form of the original function:
[image attachment: not reproduced here]
I have tested this separately and it works for a single file or a directory.

Member

dlqqq commented Apr 15, 2024

Superseded by #712.

@dlqqq dlqqq closed this Apr 15, 2024
Labels
bug Something isn't working