Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Core feature] Allow to specify the working dir for a container image #4638

Closed
2 tasks done
vkaiser-mb opened this issue Dec 23, 2023 · 7 comments
Closed
2 tasks done
Labels
enhancement New feature or request flytekit FlyteKit Python related issue

Comments

@vkaiser-mb
Copy link

Motivation: Why do you think this is important?

Imagine you have a workflow with 2 tasks A and B. Both tasks have different container images.
Currently, the working dir is per default "/root".
Both container images dont have this folder accessible. Instead it would be great if you could specify for every image the according working dir that is compatible with the image.
You can change the working dir via the CLI, but this has two downsides:

  • You cant specify different working dirs for task A and task B
  • Working dir and container depend on each other and therefore should be defined in the same place in code.

Note: In the local usecase, the specified working dir can be ignored, similar like the container_image is ignored.

Goal: What should the final outcome look like, ideally?

@task(container_image=Container(image="imageFoo", working_dir="/home/foo"))
def foo():

@task(container_image=Container(image="imageBar", working_dir="/home/bar"))
def bar():

and

@task(container_image=Container(image=ImageSpec(....), working_dir="..."))
def foo():

of course it could be compatible with the old interface and still allow do specify only images if no working dir modification is needed:

@task(container_image="imageFoo")
def foo():

@task(container_image=ImageSpec(....))
def foo():

Describe alternatives you've considered

To avoid nesting, Imagespec could also have the option of working dir directly:

@task(container_image=ImageSpec(base_image="imageFoo", working_dir="/home/foo"))
def foo():

Propose: Link/Inline OR Additional context

No response

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@vkaiser-mb vkaiser-mb added enhancement New feature or request untriaged This issues has not yet been looked at by the Maintainers labels Dec 23, 2023
Copy link

dosubot bot commented Dec 23, 2023

🤖

Hey there @vkaiser-mb! I'm Dosu, your friendly neighborhood bot. 🤖 I'm here to help you tackle bugs, answer your questions, and guide you on your journey to becoming a contributor. While we wait for a human maintainer to join us, consider me your guide on the side. Let's debug and learn together, shall we? 🚀

Thank you for your detailed feature request. I understand that you would like to have the ability to specify a working directory for each container image in a Flyte workflow. This feature would indeed provide more flexibility when working with different container images that have different directory structures.

Currently, as you've noted, Flyte does not support specifying a working directory for a container image. The Container type in the Flyte codebase, which is used to define how a task should be executed, does not have a field for specifying a working directory. This type is used in the TaskTemplate struct, which can specify a Container as its target for execution.

Your proposed changes to the Container type and the @task decorator seem feasible and would indeed be backward compatible, allowing users to specify only an image if no working directory modification is needed. However, implementing this feature would require changes to the FlyteIDL (the Flyte Interface Definition Language), the Flytekit (the Python SDK for Flyte), and possibly other components of the Flyte ecosystem.

As a maintainer of the Flyte repository, I can't create pull requests or issues. However, I would recommend you to create a feature request issue in the Flyte repository detailing your proposal. This would allow the Flyte community to discuss and potentially implement this feature.

Please let me know if you have any other questions or need further clarification. I'm here to help!

Best regards,
Dosu

Sources


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

@eapolinario
Copy link
Contributor

We talked about this in slack and there was consensus around letting the working dir be set in the docker images instead.

Closing this issue for now, please re-open if you feel that this might be worth more discussion.

@kumare3
Copy link
Contributor

kumare3 commented Dec 29, 2023

@eapolinario but we need to make a change to not set any working for as default right

@vkaiser-mb
Copy link
Author

How exactly would I specify the working dir in the docker image? Will it take the dockers CWD? Would I specify a Env var? Since I did not check the mechanics, I wonder how the upload works. (My assumption was that is the reason for specifying the working dir in advance. In our workaround we also needed to modify the PYTHONPATH env var.) Arent those things that would need to be adapted in the flyte code?

@vkaiser-mb
Copy link
Author

@eapolinario Happy new year! Will there be another Github issue to resolve this in your proposed way?

@kumare3 kumare3 reopened this Jan 16, 2024
@kumare3
Copy link
Contributor

kumare3 commented Jan 16, 2024

@vkaiser-mb - @eapolinario was actually out on vacation and is back tomorrow. I am re-opening the issue to actually make the change in flytekit.

So the change is as follows

  1. Change pyflyte run's default run path to be current_dir or .
  2. Change base docker images to set WORKDIR to desired location

We will use this issue to track these 2 changes

@vkaiser-mb
Copy link
Author

Awesome! Thanks.

@eapolinario eapolinario added flytekit FlyteKit Python related issue and removed untriaged This issues has not yet been looked at by the Maintainers labels Jan 18, 2024
@kumare3 kumare3 closed this as completed Jan 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request flytekit FlyteKit Python related issue
Projects
None yet
Development

No branches or pull requests

3 participants