-
-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MAINT Store data in bytes not io.BytesIO #91
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ryanking13. This is a nice improvement. I'd always much rather be handling a bytes
object than a BytesIO
As long as the argument is a bytes
object, creating a BytesIO
is effectively free, it just stores the bytes object:
https://github.com/python/cpython/blob/v3.11.6/Modules/_io/bytesio.c#L945-L949
So using them only once rather than having to seek(0)
all the time is probably a better pattern.
micropip/_compat_in_pyodide.py
Outdated
async def fetch_bytes(url: str, kwargs: dict[str, str]) -> bytes: | ||
parsed_url = urlparse(url) | ||
if parsed_url.scheme == "emfs": | ||
return open(parsed_url.path, "rb") | ||
if parsed_url.scheme == "file": | ||
result_bytes = Path(parsed_url.path).read_bytes() | ||
elif parsed_url.scheme == "file": | ||
result_bytes = (await loadBinaryFile(parsed_url.path)).to_bytes() | ||
else: | ||
result_bytes = await (await pyfetch(url, **kwargs)).bytes() | ||
return BytesIO(result_bytes) | ||
return result_bytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we don't do anything with result_bytes
anymore, maybe tidy as follows:
async def fetch_bytes(url: str, kwargs: dict[str, str]) -> bytes:
parsed_url = urlparse(url)
if parsed_url.scheme == "emfs":
return Path(parsed_url.path).read_bytes()
if parsed_url.scheme == "file":
return (await loadBinaryFile(parsed_url.path)).to_bytes()
return await (await pyfetch(url, **kwargs)).bytes()
micropip/_compat_not_in_pyodide.py
Outdated
response = _fetch(url, kwargs=kwargs) | ||
return BytesIO(response.read()) | ||
return response.read() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe:
return _fetch(url, kwargs=kwargs).read()
@ryanking13 Thanks! Let us know if you plan to address Hood's suggestions, otherwise +1 to merge even as is. |
Thanks for the review! I had forgotten about this PR. |
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
This is a split off of #90.
This PR changes how wheel data is stored in
WheelInfo
class during the package installation: instead of converting the downloaded wheel data toio.BytesIO
and storing it, store it asbytes
and convert it toio.BytesIO
when we need to pass it to other methods that acceptsio.BytesIO
object.No functional change is intended, just to avoid the mistake of having to do a seek(0) after reading the data.
There seems a slight overhead of converting bytes to io.BytesIO, but I think it is not critical.
Benchmark