feat: add GitExtractor component #5459

Open: wants to merge 6 commits into base: main

Conversation

raphaelchristi
Contributor

This pull request introduces the GitExtractor component for analyzing Git repositories. Key features:

  • Repository Info: Extracts branch, remote, and commit details
  • Statistics: Calculates file counts, file sizes, and line counts
  • Directory Structure: Generates the complete folder tree
  • File Content: Extracts text files and handles binary files
  • Memory Safe: Truncates content for large repos
  • Error Handling: Recovers gracefully from errors and cleans up resources
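
The PR's actual code is not shown in this conversation, so the following is only a minimal sketch of how the "File Content" and "Memory Safe" points could work together. The names `MAX_BYTES` and `read_text_truncated`, the NUL-byte heuristic, and the placeholder strings are all assumptions made for illustration, not taken from the GitExtractor component.

```python
from pathlib import Path

MAX_BYTES = 10_000  # hypothetical truncation limit, not taken from the PR


def read_text_truncated(path: Path, max_bytes: int = MAX_BYTES) -> str:
    """Return a file's text, a placeholder for binary files, or a truncated prefix."""
    # Read at most max_bytes + 1 bytes so large files never load fully into memory.
    with open(path, "rb") as fh:
        data = fh.read(max_bytes + 1)

    # Assumed heuristic: a NUL byte in the sampled data marks the file as binary.
    if b"\x00" in data:
        return "[binary file]"

    truncated = len(data) > max_bytes
    text = data[:max_bytes].decode("utf-8", errors="replace")
    return text + ("\n[truncated]" if truncated else "")
```

Reading a bounded number of bytes, rather than the whole file, is what keeps memory use flat regardless of repository size.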

@dosubot (bot) added the size:L (this PR changes 100-499 lines, ignoring generated files) and enhancement (new feature or request) labels on Dec 26, 2024
@github-actions github-actions bot added enhancement New feature or request and removed enhancement New feature or request labels Dec 26, 2024
@raphaelchristi raphaelchristi force-pushed the feature/git-extractor-component branch from f2d65d0 to 68737d2 Compare December 26, 2024 18:11
Contributor

@ogabrielluiz ogabrielluiz left a comment

Hey, @raphaelchristi

This is looking good.

The tmpdir calls are blocking and it would be better if they were async. Could you refactor that?

Also, you don't need to delete the folder manually: a context manager will remove it automatically once the code block exits.
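
As a rough illustration of the context-manager suggestion (a sketch, not the PR's code), the standard library's `tempfile.TemporaryDirectory` removes the directory when the `with` block exits, so no explicit `shutil.rmtree` call is needed. The function name `count_files_in_scratch_dir` and the placeholder file are hypothetical.

```python
import tempfile
from pathlib import Path


def count_files_in_scratch_dir() -> tuple[int, Path]:
    """Create a scratch directory, use it, and let the context manager delete it."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)
        # Stand-in for cloning a repository into the scratch directory.
        (tmp_path / "README.md").write_text("placeholder\n")
        file_count = sum(1 for p in tmp_path.rglob("*") if p.is_file())
    # The directory has already been removed here, even if the block had raised.
    return file_count, tmp_path
```

After the call returns, the returned path no longer exists on disk, which is the cleanup guarantee the review comment refers to.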

… cleanup

- Convert methods to async using async/await
- Add asynccontextmanager for automatic tmpdir cleanup
- Remove manual shutil.rmtree calls
@github-actions github-actions bot added enhancement New feature or request and removed enhancement New feature or request labels Dec 28, 2024
@raphaelchristi
Contributor Author

Hi @ogabrielluiz,

Thank you for the review! I've implemented the suggested changes:

  • Converted all methods to async using async/await
  • Added asynccontextmanager for the tmpdir operations
  • Implemented automatic cleanup using the context manager

Let me know if you'd like me to make any additional adjustments to the implementation.
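
For readers following along, the changes listed above could look roughly like the sketch below. It is an assumption about the general shape, not the code in this PR: `temp_dir` and `demo` are hypothetical names, and pushing the blocking `tempfile.mkdtemp` and `shutil.rmtree` calls to worker threads with `asyncio.to_thread` (Python 3.9+) is one of several ways to keep the event loop free.

```python
import asyncio
import shutil
import tempfile
from contextlib import asynccontextmanager
from pathlib import Path


@asynccontextmanager
async def temp_dir():
    """Yield a temporary directory and remove it when the block exits."""
    # mkdtemp blocks on filesystem I/O, so run it in a worker thread.
    path = Path(await asyncio.to_thread(tempfile.mkdtemp))
    try:
        yield path
    finally:
        # Cleanup runs on normal exit and on exceptions; also off the event loop.
        await asyncio.to_thread(shutil.rmtree, path, ignore_errors=True)


async def demo() -> Path:
    async with temp_dir() as path:
        # Stand-in for repository work done inside the scratch directory.
        await asyncio.to_thread((path / "notes.txt").write_text, "x")
    return path
```

Running `asyncio.run(demo())` returns a path that has already been deleted, which shows the cleanup happening without any manual `shutil.rmtree` call at the call site.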

Labels
enhancement New feature or request size:L This PR changes 100-499 lines, ignoring generated files.

2 participants