[WIP] bundle: Parallel download and decompression #4504
Force-pushed: 7766bc1 to bb0b17c
This commit does the following:
- Return a reader from the bundle Download function.
- Use the reader to stream the bytes to the Extract function.
This commit replaces the grab client with the net/http client to ensure that the bytes are streamed in the correct order to the Extract func. Currently, only zst decompression is used in the UncompressWithReader function, as it is the primary compression algorithm used in crc.
Force-pushed: bb0b17c to 3a62d1a
Overall, this is great work and it's functional.
My findings:
- cancelling through web api (socket) works
- better logging would be nice (currently it's skipping the download part) so that it's clear that the download and uncompression are being done simultaneously
- progress bar could show more info about both processes
- resuming interrupted download doesn't work - everything starts from the beginning
- golangci-lint issues
Suggestions:
- add (cli/config) option to disable this functionality (revert back to old behavior)
}
client := http.Client{Transport: &http.Transport{}}

if ctx == nil {

This check might need to be moved higher, as http.NewRequestWithContext(ctx, "GET", uri, nil) is already called earlier (and if ctx is nil, it might produce errors).
 	}
 	return downloadInfo.Download(ctx, constants.GetDefaultBundlePath(preset), 0664)
 }

-func Download(ctx context.Context, preset crcPreset.Preset, bundleURI string, enableBundleQuayFallback bool) (string, error) {
+func Download(ctx context.Context, preset crcPreset.Preset, bundleURI string, enableBundleQuayFallback bool) (io.Reader, string, error) {
+	var reader io.Reader

I think this line is kind of "hidden" here above the big comment; I suggest moving it down below it.

The variable does not seem to be required in this block; the last line of the function can return nil instead of reader.
@@ -116,43 +116,43 @@ func GetPresetName(imageName string) crcpreset.Preset {
 	return preset
 }

-func PullBundle(ctx context.Context, imageURI string) (string, error) {
+func PullBundle(ctx context.Context, imageURI string) (io.Reader, string, error) {

Would it be possible to add support for pulling from container image repositories as well?

I'm not sure I understand your question; PullBundle pulls from a container image registry.

If PullBundle never returns a bundle, I would not change its signature, and do something like this in Download instead of return image.PullBundle(ctx, bundleURI):
    path, err := image.PullBundle(ctx, bundleURI)
    return nil, path, err
But maybe you have plans to make it return a reader later?
@@ -198,8 +229,14 @@ func Use(bundleName string) (*CrcBundleInfo, error) {
 	return defaultRepo.Use(bundleName)
 }

-func Extract(ctx context.Context, path string) (*CrcBundleInfo, error) {
-	if err := defaultRepo.Extract(ctx, path); err != nil {
+func Extract(ctx context.Context, reader io.Reader, path string) (*CrcBundleInfo, error) {

We could use just one return statement and one error check in this case:
    var err error
    if reader == nil {
        err = defaultRepo.Extract(ctx, path)
    } else {
        err = defaultRepo.ExtractWithReader(ctx, reader, path)
    }
    if err != nil {
        return nil, err
    }
    return defaultRepo.Get(filepath.Base(path))
Add more blank lines as you see fit.

I was about to make the same suggestion
@@ -124,6 +125,36 @@ func (bundle *CrcBundleInfo) createSymlinkOrCopyPodmanRemote(binDir string) erro
 	return bundle.copyExecutableFromBundle(binDir, PodmanExecutable, constants.PodmanRemoteExecutableName)
 }

+func (repo *Repository) ExtractWithReader(ctx context.Context, reader io.Reader, path string) error {

This function and Extract are very similar; could they be merged in some way?

Extract could probably use os.Open and call into ExtractWithReader
		return nil, err
	}
	return untar(ctx, reader, targetDir, fileFilter, showProgress)

nitpick: empty line 47
@@ -86,6 +101,9 @@ func uncompress(ctx context.Context, tarball, targetDir string, fileFilter func(
 	}
 }

+func Untar(ctx context.Context, reader io.Reader, targetDir string, fileFilter func(string) bool, showProgress bool) ([]string, error) {

I suppose this is for some future functionality?
@@ -163,7 +163,7 @@ func downloadDataFiles(goos string, components []string, destDir string) ([]stri
 		if !shouldDownload(components, componentName) {
 			continue
 		}
-		filename, err := download.Download(context.TODO(), dl.url, destDir, dl.permissions, nil)
+		_, filename, err := download.Download(context.TODO(), dl.url, destDir, dl.permissions, nil)

I think I'd add a download.DownloadFile(…) (string, error) to make it clear when we don't need the reader.
 logging.Infof("Extracting bundle: %s...", bundleName)
-if _, err := bundle.Extract(ctx, bundlePath); err != nil {
+if _, err := bundle.Extract(ctx, reader, bundlePath); err != nil {

Having a bundlePath and a reader feels a bit redundant; ideally we could pass one or the other, but I'm not sure it is currently that easy.

Description
This pull request does the following:
- Return a reader from the bundle Download function.
- Use the reader to stream the bytes to the Extract function.
This commit replaces the grab client with the net/http client to ensure that the bytes are streamed in the correct order to the Extract func. Currently, only zst decompression is used in the UncompressWithReader function, as it is the primary compression algorithm used in crc.
The download progress bar has been removed temporarily and will be added back as part of refactoring the code.
Fixes: #4336