Skip to content
This repository has been archived by the owner on Dec 9, 2024. It is now read-only.

Multi-task functionality in cloud manager. #369

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

dubey
Copy link
Member

@dubey dubey commented May 9, 2019

No description provided.

@dubey dubey requested review from lindong28 and tfboyd May 9, 2019 16:34
@tfboyd
Copy link
Member

tfboyd commented May 9, 2019

@lindong28 is the better reviewer. It takes me more time and he is stronger. I will stay on if there is any disagreement or slowdowns.

Copy link
Contributor

@lindong28 lindong28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pach @dubey. Left some comments below.

It is probably hard to make things both simple and intuitive if we add the extra loop in this script. One alternative approach would be having the loop outside this script, so that we can achieve the multi-task functionality while still keeping the cloud_manager.py usage clean for most users.

parser = argparse.ArgumentParser(
usage='cloud_manager.py {} [<args>]'.format(command),
formatter_class=argparse.ArgumentDefaultsHelpFormatter)
epilog='''The supported commands are:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that the helper message is a bit confusing after this change. For example, python perfzero/lib/cloud_manager.py status -h will print arguments such as --accelerator_type that does not below to this command.

It would make sense to update to helper message from cloud_manager.py -h to make it clear that user can run e.g. cloud_manager.py status -h to get specific helper message for each command.

if command == 'create':
create(flags.username, flags.project, flags.zone, flags.machine_type,
create(flags.instance_name, flags.project, flags.zone, flags.machine_type,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The benefit of using user_name is that we can easily identify who created the instance. So that if some instances are running but not used for a long term, we can find the owner and possibly stop those instances. Using user_name as parameter name will make this goal clearer.

If there is no specific benefit of letting user user arbitrary instance name string, may be we can keep it as is.

'port forwarding for tensorboard:\n%s\n', port_forwarding_cmd)


def _instance_names(instance_name, num_instances):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If user creates two instance, the name of the first in stance will be instance_name-1. If user creates one instance, the name of the first instance will be instance_name. These two names are different.

Let's say user executes cloud_manager.py create --num_instances=1 followed by cloud_manager.py create --num_instances=2. Three instances will be created. It seems a bit counter-intuitive, since user probably expects two instances to be created with his/her name.

instance_names = _instance_names(instance_name, num_instances)
logging.debug('Creating gcloud computing instances {}'.format(instance_names))
machine_type = get_machine_type(machine_type, accelerator_count)
cmd = (_instances_cmd('create', instance_names, project, zone) +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's say user runs cloud_manager.py create --num_instances=2 to create two instances. Then user runs cloud_manager.py create --num_instances=3. Does the second command fail, or does it create one extra instance? Either way, it is probably hard to achieve the ideal goal of creating one extra instance for user.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants