modeldb-meta.yaml clone or copy a local repository #111

Open
wants to merge 2 commits into base: master

23 changes: 22 additions & 1 deletion modeldb/modeldb.py
@@ -30,7 +30,7 @@ def download_model(arg_tuple):
# from ModelDB, but it can be overridden to come from GitHub instead.
if "github" in model_run_info:
# This means we should try to download the model content from
# GitHub instead of from ModelDB (but see the local: case below).
github = model_run_info["github"]
organisation = "ModelDBRepository"
suffix = "" # default branch
@@ -57,6 +57,27 @@ def download_model(arg_tuple):
# if you need to test changes to a model that does not exist on
# GitHub under the ModelDBRepository organisation.
organisation = github[1:]
elif github.startswith("local:"):
# Using
# github: "local: /path/to/repository"
# in modeldb-run.yaml implies that we 'git clone /path/to/repository'.
# This is useful to tentatively explore the effect of
# model changes on results with different nrn versions
# without committing to github or before being ready to
# make a pull request
print("\nTo be cloned by runmodel", github)
return model_id, model
elif github.startswith("copy:"):
# Using
# github: "copy: /path/to/parentfolder"
# in modeldb-run.yaml implies that we
# 'cp -R /path/to/parentfolder/<id> <workingdir>'
# A copy differs from a clone in that local changes in
# the checkout are mirrored in the copy without having to
# be committed. Note the copy leaves out the .git and
# x86_64 folders.
print("\nTo be copied by runmodel %s/%s" % (github, model_id))
return model_id, model
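
As an illustration of the two value forms described in the comments above, the prefix can be stripped by its own length and the remainder trimmed, so that both "local: /path" and "local:/path" resolve to the same path. This is a minimal sketch only: the helper name parse_local_spec is hypothetical and is not part of this PR.

```python
def parse_local_spec(github, model_id):
    """Return (mode, source_path) for 'local:' and 'copy:' values, else None.

    Hypothetical helper for illustration; it is not part of the PR.
    """
    for prefix in ("local:", "copy:"):
        if github.startswith(prefix):
            # Strip the prefix by its own length, then trim whitespace.
            path = github[len(prefix):].strip()
            if prefix == "copy:":
                # copy: names a parent folder that holds one directory
                # per model id, so the id is appended to the path.
                path = "{}/{}".format(path.rstrip("/"), model_id)
            return prefix[:-1], path
    return None


print(parse_local_spec("local: /path/to/repository", 12345))
print(parse_local_spec("copy: /path/to/parentfolder", 12345))
```

Values that use neither prefix return None here, which leaves the existing GitHub handling untouched.
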
Member

@nrnhines: Do we need both of these options? As far as I can see, copy: would cover what we typically need in our development workflow. Does adding local: help in any other way?

By the way, as for the naming itself, IMO local: is a slightly better name than copy:.

Member Author
nrnhines, Feb 26, 2024

My experience is that copy is more useful than clone, since a copy also includes the uncommitted changes in the working checkout (so one does not need to commit those changes first). It could easily be called "local:". However, I think the environment variable approach above (export MODELDB_LOCAL_REPOSITORIES=$HOME/models/modeldb) might be better than having to modify modeldb-meta.yaml. We should discuss the detailed pros and cons.
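
To make the environment variable alternative concrete, here is one way it could be resolved. This is a sketch under stated assumptions, not the behaviour of this PR: it assumes MODELDB_LOCAL_REPOSITORIES names a folder containing one sub-directory per model id, and the function name find_local_repository is hypothetical.

```python
import os


def find_local_repository(model_id, environ=None):
    """Return the local checkout directory for model_id, or None.

    Assumption: MODELDB_LOCAL_REPOSITORIES points at a folder that holds
    one sub-directory per model id. Hypothetical helper, not part of the PR.
    """
    environ = os.environ if environ is None else environ
    root = environ.get("MODELDB_LOCAL_REPOSITORIES")
    if not root:
        # Variable unset or empty: fall back to the normal download path.
        return None
    candidate = os.path.join(os.path.expanduser(root), str(model_id))
    return candidate if os.path.isdir(candidate) else None
```

With this shape, a model without a local checkout would still be fetched the usual way, and no yaml file would need to be edited.
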

Member

> I think the environment variable approach above export MODELDB_LOCAL_REPOSITORIES=$HOME/models/modeldb might be better than having to modify modeldb-meta.yaml.

That's also OK. But this will also need local commits.

Anyway, I feel like the scope + impact of this decision is limited. So I think it's perfectly fine if you choose either approach.

Member

The important part is to have this flexibility and to document it properly.

Member Author

I don't think there needs to be any hurry with this. I'll let it ripen for a while and wait until the next round of ModelDB updates to see how some of the details help with the workflow.

else:
raise Exception("Invalid value for github key: {}".format(github))
url = "https://api.github.com/repos/{organisation}/{model_id}/zipball{suffix}".format(
31 changes: 29 additions & 2 deletions modeldb/modelrun.py
@@ -205,23 +205,41 @@ def build_python_runfile(model):


def prepare_model(model):
if "github" in model.keys() and (model['github'].startswith("local:")
or model['github'].startswith("copy:")):
_prepare_model(model, None, clone=True)
return

# unzip model from cache
with zipfile.ZipFile(
os.path.join(MODELS_ZIP_DIR, str(model.id) + ".zip"), "r"
) as zip_ref:
_prepare_model(model, zip_ref, clone=False)

def _prepare_model(model, zip_ref, clone=False):
if clone:
model_dir = os.path.join(
model.working_dir,
str(model.id),
)
else:
model_dir = os.path.join(
model.working_dir,
str(model.id),
os.path.dirname(zip_ref.infolist()[0].filename),
)
if True:
model_run_info_file = os.path.join(model_dir, str(model.id) + ".yaml")
if model._inplace and os.path.isfile(model_run_info_file):
with open(model_run_info_file) as run_info_file:
model["run_info"] = yaml.load(run_info_file, yaml.Loader)
else:
if model._clean and is_dir_non_empty(model_dir):
shutil.rmtree(model_dir)
if clone:
gitclone(model, model_dir)
else:
zip_ref.extractall(os.path.join(model.working_dir, str(model.id)))

# set model_dir
model.run_info["model_dir"] = model_dir
@@ -247,6 +265,15 @@ def prepare_model(model):
yaml.dump(model.run_info, run_info_file, sort_keys=True)


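def gitclone(model, model_dir):
    # Despite the name, this handles both "copy:" (cp -R) and "local:" (git clone).
    if model['github'].startswith('copy:'):
        # Strip the prefix by its own length and trim whitespace.
        parent = model['github'][len('copy:'):].strip()
        cmd = 'cp -R %s/%s %s' % (parent, str(model.id), model_dir)
    else:
        # Works with or without a space after "local:".
        repo = model['github'][len('local:'):].strip()
        cmd = 'git clone %s %s' % (repo, model_dir)
    print("\n", cmd)
    subprocess.run(cmd, shell=True, check=True, capture_output=True)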
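
The comment in modeldb.py says the copy leaves out the .git and x86_64 folders, whereas a plain cp -R as shown here copies everything under the source folder. If that exclusion is wanted, one possible way to get it is shutil.copytree with an ignore filter. This is a sketch, not the PR's implementation: the function name copy_model_tree is hypothetical, and it assumes model_dir does not exist yet (copytree creates it).

```python
import os
import shutil


def copy_model_tree(parent_folder, model_id, model_dir):
    """Copy parent_folder/<model_id> to model_dir, skipping .git and x86_64.

    Hypothetical helper for illustration; model_dir must not already exist.
    """
    src = os.path.join(parent_folder, str(model_id))
    shutil.copytree(
        src,
        model_dir,
        # ignore_patterns matches entry names at every directory level.
        ignore=shutil.ignore_patterns(".git", "x86_64"),
    )
    return model_dir
```

This also avoids building a shell command string, so paths containing spaces need no quoting.
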


def run_model(model):
start_time = time.perf_counter()
# Some models are skipped on purpose