For MrJob, make unpacking archives optional #93
Comments
Can you try this out @GilHoggarth and see if it works?
The installation line fails to pull in the patch, which I guess is due to the latest policy changes around GitHub access. FYI, I'm running:
This installation works with mrjob:
Hacking the changes directly into
Quite understandably, you'll expect I made a mess of adding your code!
If I change line 693 to
Installation seems to work if the
There were a few issues with the implementation, but it seems to work okay now, as per ukwa/mrjob@e3901a2.
I've opened a PR (Yelp/mrjob#2215) but we can just install our branch for now.
Installed via pip and seen to be working.
For the purposes of our Hadoop data migration, this patch works successfully. However, you might wish to keep this ticket open while the patch is waiting to be included upstream. Consequently, I'm unassigning myself from this ticket.
Hm, I attempted to request a review in https://groups.google.com/g/mrjob but my post isn't turning up. Unless I messed up posting there somehow?
I'm developing a fork of MrJob that makes unpacking archives optional, here: https://github.com/ukwa/mrjob/tree/make-unpacking-archives-optional
It should be possible to install this into a venv using:
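As a rough sketch of what such an install could look like (not necessarily the exact command originally posted), the snippet below assumes pip's standard `git+https://...@branch` VCS syntax, takes the repository and branch name from the fork URL above, and uses a hypothetical venv directory called `venv`. The helper names `pip_requirement` and `install_fork` are made up for illustration.

```python
import os
import subprocess
import venv
from pathlib import Path

# Repository and branch taken from the fork URL in this issue.
REPO_URL = "https://github.com/ukwa/mrjob.git"
BRANCH = "make-unpacking-archives-optional"


def pip_requirement(repo_url=REPO_URL, branch=BRANCH):
    """Build a pip VCS requirement string that pins the install to a branch."""
    return f"git+{repo_url}@{branch}"


def install_fork(venv_dir="venv"):
    """Create a venv (hypothetical path) and pip-install the fork into it."""
    venv_path = Path(venv_dir)
    venv.create(venv_path, with_pip=True)
    # The venv's interpreter lives in Scripts/ on Windows and bin/ elsewhere.
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    python = venv_path / bin_dir / "python"
    subprocess.run(
        [str(python), "-m", "pip", "install", pip_requirement()],
        check=True,
    )
```

Calling `install_fork()` needs network access to GitHub. The shell equivalent would be to activate the venv and run `pip install` with the string returned by `pip_requirement()`.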
If that works, then update the MrJob config as per the updated docs:
Running the job with this configuration should skip the unpacking-archives step and leave the files as they were.
EDIT: If this works, I'll try to contribute the change back upstream.