-
Notifications
You must be signed in to change notification settings - Fork 634
Multi-task functionality in cloud manager. #369
base: master
Are you sure you want to change the base?
Conversation
@lindong28 is the better reviewer. It takes me more time and he is stronger. I will stay on if there is any disagreement or slowdowns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the pach @dubey. Left some comments below.
It is probably hard to make things both simple and intuitive if we add the extra loop in this script. One alternative approach would be having the loop outside this script, so that we can achieve the multi-task functionality while still keeping the cloud_manager.py usage clean for most users.
parser = argparse.ArgumentParser( | ||
usage='cloud_manager.py {} [<args>]'.format(command), | ||
formatter_class=argparse.ArgumentDefaultsHelpFormatter) | ||
epilog='''The supported commands are: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that the helper message is a bit confusing after this change. For example, python perfzero/lib/cloud_manager.py status -h
will print arguments such as --accelerator_type
that does not below to this command.
It would make sense to update to helper message from cloud_manager.py -h
to make it clear that user can run e.g. cloud_manager.py status -h
to get specific helper message for each command.
if command == 'create': | ||
create(flags.username, flags.project, flags.zone, flags.machine_type, | ||
create(flags.instance_name, flags.project, flags.zone, flags.machine_type, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The benefit of using user_name
is that we can easily identify who created the instance. So that if some instances are running but not used for a long term, we can find the owner and possibly stop those instances. Using user_name
as parameter name will make this goal clearer.
If there is no specific benefit of letting user user arbitrary instance name string, may be we can keep it as is.
'port forwarding for tensorboard:\n%s\n', port_forwarding_cmd) | ||
|
||
|
||
def _instance_names(instance_name, num_instances): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If user creates two instance, the name of the first in stance will be instance_name-1
. If user creates one instance, the name of the first instance will be instance_name
. These two names are different.
Let's say user executes cloud_manager.py create --num_instances=1
followed by cloud_manager.py create --num_instances=2
. Three instances will be created. It seems a bit counter-intuitive, since user probably expects two instances to be created with his/her name.
instance_names = _instance_names(instance_name, num_instances) | ||
logging.debug('Creating gcloud computing instances {}'.format(instance_names)) | ||
machine_type = get_machine_type(machine_type, accelerator_count) | ||
cmd = (_instances_cmd('create', instance_names, project, zone) + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's say user runs cloud_manager.py create --num_instances=2
to create two instances. Then user runs cloud_manager.py create --num_instances=3
. Does the second command fail, or does it create one extra instance? Either way, it is probably hard to achieve the ideal goal of creating one extra instance for user.
No description provided.