Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(gitextractor): subtask Clone Git Repo ended unexpectedly #8136

Merged
merged 5 commits into from
Oct 22, 2024

Conversation

caioq
Copy link
Contributor

@caioq caioq commented Oct 9, 2024

Summary

  • According to github documentation the installation access token expire after 1 hour. This commit generates a new installation access token at the start of each gitextractor task avoiding the auth error that happens for pipelines that last more than 1 hour.

Does this close any open issues?

Closes #7958

Screenshots

Before the fix:
image

After the fix: The pipeline lasts more than 1 hour and the gitextractor tasks keep working
image

Other Information

I understand that this solution is not the most efficient, since for each gitextractor task it will generate new tokens even when the current token is still valid. However, I believe it can be used as a temporary solution to enable the use of Github App without problems.

For a more efficient solution, we could generate a new token only when it has reached its expiration time.
Analyzing the code, I believe that the expiration time of this token needs to be persisted in the database in order to be accessed when preparing the task. What do you think? I may be evolving this solution in another new PR.

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. component/plugins This issue or PR relates to plugins pr-type/bug-fix This PR fixes a bug severity/p1 This bug affects functionality or significantly affect ux labels Oct 9, 2024
Copy link
Contributor

@klesh klesh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current implementation assumes all repositories belong to GitHub, which is incorrect. We need to decouple GitExtractor plugin from specific data source platforms like GitHub and GitLab.

Here's a suggested approach:

Define an interface: Create an interface named DynamicGitUrl (or a more descriptive name) within the gitextractor plugin. This interface should define a method to retrieve the latest Git URL based on a given connection ID and scope ID.

Implement PrepareTaskData: In the gitextractor.PrepareTaskData function, if a plugin, connection ID, and scope ID are provided, use core.GetPlugin to fetch the plugin instance and dynamically cast it to the DynamicGitUrl interface. Then, call the interface's method with the connection ID and scope ID to retrieve the latest Git URL.

Implement DynamicGitUrl in Data Source plugins: Each data source plugin (like GitHub) should implement the DynamicGitUrl interface, providing its own logic for determining the Git URL based on connection and scope information.

This approach allows for a more flexible and extensible design. The GitExtractor plugin remains agnostic to specific data sources, while each data source plugin is responsible for providing the appropriate Git URL retrieval logic.

@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Oct 17, 2024
@caioq
Copy link
Contributor Author

caioq commented Oct 17, 2024

The current implementation assumes all repositories belong to GitHub, which is incorrect. We need to decouple GitExtractor plugin from specific data source platforms like GitHub and GitLab.

Here's a suggested approach:

Define an interface: Create an interface named DynamicGitUrl (or a more descriptive name) within the gitextractor plugin. This interface should define a method to retrieve the latest Git URL based on a given connection ID and scope ID.

Implement PrepareTaskData: In the gitextractor.PrepareTaskData function, if a plugin, connection ID, and scope ID are provided, use core.GetPlugin to fetch the plugin instance and dynamically cast it to the DynamicGitUrl interface. Then, call the interface's method with the connection ID and scope ID to retrieve the latest Git URL.

Implement DynamicGitUrl in Data Source plugins: Each data source plugin (like GitHub) should implement the DynamicGitUrl interface, providing its own logic for determining the Git URL based on connection and scope information.

This approach allows for a more flexible and extensible design. The GitExtractor plugin remains agnostic to specific data sources, while each data source plugin is responsible for providing the appropriate Git URL retrieval logic.

Hi @klesh, thanks for the review!
I agree with you and liked your suggestion. I implemented it here 26b716f for the Github plugin, let me know if this is what you had in mind.

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Oct 17, 2024
@caioq caioq requested a review from klesh October 17, 2024 20:08
klesh
klesh previously approved these changes Oct 21, 2024
Copy link
Contributor

@klesh klesh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

YES!!! Exactly. Fantastic.
Have you tested it on your local machine? Is it work as expected?

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Oct 21, 2024
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Oct 22, 2024
@klesh klesh merged commit edd7dc0 into apache:main Oct 22, 2024
10 of 11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/plugins This issue or PR relates to plugins lgtm This PR has been approved by a maintainer pr-type/bug-fix This PR fixes a bug severity/p1 This bug affects functionality or significantly affect ux size:L This PR changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug][gitextractor] subtask Clone Git Repo ended unexpectedly
2 participants