forked from kubeflow/kubeflow
Commit
Create a script to build the jupyter-web-app image and create a PR to update it (kubeflow#3066)

* Create a script to auto build the Jupyter web app image and update the prototype
* The script works as follows
  * Use git to determine the last commit at which the source to the Jupyter web app changed
  * Look for an image tagged with that commit
  * If no such image exists build a new image
  * Update the ksonnet prototype to use the new image
  * Push the commit to git; we will use the kubeflow-bot account
  * Use the hub CLI to create the pull request if one doesn't already exist
* Create a Makefile to build the jupyter web app
* Add a git label to the image so we can compare against the current image.
* Put the new python code in python package kubeflow/kubeflow
* We now have namespace packaging working
* Provide a K8s job to run it. In a follow on PR we will turn this into a cron job

Miscellaneous changes:

* Specify the build and publish projects separately.
* Update pylintrc file to always do no-self-use
* Fix some lint issues
* Update the README.
* Address comments
* Replace regex parsing (which is not very readable) with simpler parsing.
* Resolve conflicts.
* Fix some bugs.
* Fix bug with the image.
* Latest.
* Restore changes to the Jupyter web app image.
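The steps listed in the commit message can be sketched as a small decision function. This is only an illustration of the control flow, not the script's actual code: `plan_update`, its arguments, and the step names are all hypothetical, and the real script performs the git, registry, and hub calls instead of returning labels.

```python
def plan_update(image_exists, current_image, new_image, open_pr_titles, pr_title):
    """Return the ordered steps the updater would take (hypothetical model).

    image_exists:   whether the registry already has an image for the commit
    current_image:  image currently referenced by the ksonnet prototype
    new_image:      image tagged with the last commit to the web app source
    open_pr_titles: titles of pull requests that are already open
    pr_title:       title the new pull request would use
    """
    steps = []
    if not image_exists:
        # No image is tagged with this commit yet, so one has to be built.
        steps.append("build_image")
    if current_image == new_image:
        # The prototype already points at this image; nothing to commit.
        return steps
    steps.extend(["update_prototype", "commit_and_push"])
    if pr_title not in open_pr_titles:
        # Only open a pull request if one with this title is not open already.
        steps.append("create_pull_request")
    return steps
```

Modelling the flow as a pure function makes the three early exits (image already built, prototype already current, PR already open) easy to see and to check in isolation.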
1 parent 308b79f, commit edbeedb
Showing 13 changed files with 550 additions and 2 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
# Copyright 2017 The Kubernetes Authors. | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
# | ||
|
||
# Project used with GCB | ||
PROJECT ?= kubeflow-dev | ||
# Registry where the image should be published | ||
REGISTRY_PROJECT ?= kubeflow-dev | ||
OUTPUT ?= output.yaml | ||
|
||
# We want the git tag to be the last commit to this directory so we don't | ||
# bump the image on unrelated changes. | ||
GIT_TAG ?= $(shell git log -n 1 --pretty=format:"%h" ./) | ||
|
||
info: | ||
echo image: \"gcr.io/$(REGISTRY_PROJECT)/jupyter-web-app:$(GIT_TAG)\" > $(OUTPUT) | ||
|
||
build-gcb: info | ||
gcloud --project=$(PROJECT) \ | ||
builds submit \ | ||
--machine-type=n1-highcpu-32 \ | ||
--substitutions=_GIT_VERSION=$(GIT_TAG),_REGISTRY=$(REGISTRY_PROJECT) \ | ||
--config=cloudbuild.yaml . |
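The Makefile's `GIT_TAG` lookup and the image name written by the `info` target can be mirrored in Python roughly as follows. This is a sketch under assumptions: the helper names `last_commit_command`, `last_commit`, and `image_reference` are hypothetical and do not appear in the commit, and `last_commit` needs git and a checked-out repository to run.

```python
import subprocess


def last_commit_command(path="."):
    """Build the git command that prints the short hash of the last commit touching `path`."""
    return ["git", "log", "-n", "1", "--pretty=format:%h", "--", path]


def last_commit(path=".", cwd=None):
    """Run the command above; requires git and a repository at `cwd`."""
    output = subprocess.check_output(last_commit_command(path), cwd=cwd)
    return output.decode().strip()


def image_reference(registry_project, git_tag):
    """Compose the image name the same way the Makefile's `info` target does."""
    return "gcr.io/{0}/jupyter-web-app:{1}".format(registry_project, git_tag)
```

Restricting `git log` to a path is what keeps the tag stable when unrelated parts of the repository change.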
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
#cloudbuild.yaml | ||
steps: | ||
- name: 'gcr.io/cloud-builders/docker' | ||
args: | ||
- 'build' | ||
- '-t' | ||
- 'gcr.io/${_REGISTRY}/jupyter-web-app:${_GIT_VERSION}' | ||
- '--label=git-version=${_GIT_VERSION}' | ||
- '.' | ||
- name: 'gcr.io/cloud-builders/docker' | ||
args: | ||
- 'tag' | ||
- 'gcr.io/${_REGISTRY}/jupyter-web-app:${_GIT_VERSION}' | ||
- 'gcr.io/${_REGISTRY}/jupyter-web-app:latest' | ||
images: ['gcr.io/${_REGISTRY}/jupyter-web-app:${_GIT_VERSION}', | ||
'gcr.io/${_REGISTRY}/jupyter-web-app:latest'] |
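To see what the build step's arguments look like once `_REGISTRY` and `_GIT_VERSION` are filled in, the substitution can be simulated locally. This is an approximation only: Cloud Build performs its own substitution, and `string.Template` is used here as a stand-in because it accepts the same `${NAME}` syntax. The `expand` helper and the sample values are hypothetical.

```python
from string import Template

# Arguments of the first build step, copied from cloudbuild.yaml.
BUILD_ARGS = [
    "build",
    "-t",
    "gcr.io/${_REGISTRY}/jupyter-web-app:${_GIT_VERSION}",
    "--label=git-version=${_GIT_VERSION}",
    ".",
]


def expand(args, substitutions):
    """Expand ${NAME} placeholders in each argument (local simulation only)."""
    return [Template(arg).substitute(substitutions) for arg in args]


if __name__ == "__main__":
    # Sample values; the real ones come from the Makefile's --substitutions flag.
    print(expand(BUILD_ARGS, {"_REGISTRY": "kubeflow-dev", "_GIT_VERSION": "abc1234"}))
```

The `git-version` label carries the same value as the tag, which is what lets a later run compare an existing image against the current commit.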
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
image: "gcr.io/kubeflow-dev/jupyter-web-app:v0-43-g810b0b46" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
__path__ = __import__('pkgutil').extend_path(__path__, __name__) |
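The one-line `__init__.py` above is the pkgutil-style namespace package idiom: it lets several directories on `sys.path` contribute modules to the same package. The sketch below demonstrates the effect with throwaway directories. The package name `demo_ns_pkg` and the helpers `make_portion` and `demo` are hypothetical and chosen so they cannot clash with the real `kubeflow` package.

```python
import importlib
import os
import sys
import tempfile

# The same line as in the __init__.py added by this commit.
INIT_LINE = "__path__ = __import__('pkgutil').extend_path(__path__, __name__)\n"


def make_portion(root, package, module, body):
    """Create <root>/<package>/ with the namespace __init__.py and one module."""
    pkg_dir = os.path.join(root, package)
    os.makedirs(pkg_dir, exist_ok=True)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write(INIT_LINE)
    with open(os.path.join(pkg_dir, module + ".py"), "w") as f:
        f.write(body)


def demo():
    """Import two modules of one package that live in different directories."""
    first = tempfile.mkdtemp()
    second = tempfile.mkdtemp()
    make_portion(first, "demo_ns_pkg", "alpha", "VALUE = 'alpha'\n")
    make_portion(second, "demo_ns_pkg", "beta", "VALUE = 'beta'\n")
    sys.path[:0] = [first, second]
    importlib.invalidate_caches()
    alpha = importlib.import_module("demo_ns_pkg.alpha")
    beta = importlib.import_module("demo_ns_pkg.beta")
    return alpha.VALUE, beta.VALUE
```

Without the `extend_path` line, only the first `demo_ns_pkg` directory found on `sys.path` would be searched and the import of `demo_ns_pkg.beta` would fail.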
Empty file.
Empty file.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,315 @@ | ||
"""Script to build and update the Jupyter WebApp image. | ||
Requires python3 | ||
hub CLI depends on an OAuth token with repo permissions: | ||
https://hub.github.com/hub.1.html | ||
* It will look for environment variable GITHUB_TOKEN | ||
""" | ||
|
||
import logging | ||
import os | ||
import re | ||
import tempfile | ||
import yaml | ||
|
||
import fire | ||
import git | ||
import httplib2 | ||
|
||
from kubeflow.testing import util # pylint: disable=no-name-in-module | ||
|
||
from containerregistry.client import docker_creds | ||
from containerregistry.client import docker_name | ||
from containerregistry.client.v2_2 import docker_http | ||
from containerregistry.client.v2_2 import docker_image as v2_2_image | ||
from containerregistry.transport import transport_pool | ||
|
||
class WebAppUpdater(object): # pylint: disable=useless-object-inheritance | ||
def __init__(self): | ||
self._last_commit = None | ||
|
||
def build_image(self, build_project, registry_project): | ||
"""Build the image. | ||
Args: | ||
build_project: GCP project used to build the image. | ||
registry_project: GCP project used to host the image. | ||
""" | ||
env = dict() | ||
env.update(os.environ) | ||
env["PROJECT"] = build_project | ||
env["REGISTRY_PROJECT"] = registry_project | ||
env["GIT_TAG"] = self._last_commit | ||
|
||
with tempfile.NamedTemporaryFile() as hf: | ||
name = hf.name | ||
env["OUTPUT"] = name | ||
web_dir = self._component_dir() | ||
util.run(["make", "build-gcb"], env=env, cwd=web_dir) | ||
|
||
# TODO(jlewi): We want to get the actual image produced by GCB. Right | ||
# now this is a bit brittle because we have multiple layers of substitution | ||
# e.g. in the Makefile and then the GCB YAML. | ||
# It might be better to parse the stdout of make-build-gcb to get the | ||
# GCB job name and then fetch the GCB info specifying the images. | ||
with open(name) as hf: | ||
data = yaml.safe_load(hf) | ||
|
||
return data["image"] | ||
|
||
def _replace_parameters(self, lines, values): | ||
"""Replace parameters in ksonnet text. | ||
Args: | ||
lines: Lines of text | ||
values: A dictionary containing the names of parameters and the values | ||
to set them to. | ||
Returns: | ||
lines: Modified lines | ||
old: Dictionary of old values for these parameters | ||
""" | ||
old = {} | ||
for i, line in enumerate(lines): | ||
# Split the line on white space | ||
pieces = re.findall(r'\S+', line) | ||
|
||
# Check if this line is a parameter | ||
# // @optionalParam image string gcr.io/myimage Some image | ||
if len(pieces) < 5: | ||
continue | ||
|
||
if pieces[0] != "//" or pieces[1] != "@optionalParam": | ||
continue | ||
|
||
param_name = pieces[2] | ||
if param_name not in values: | ||
continue | ||
|
||
old[param_name] = pieces[4] | ||
logging.info("Changing param %s from %s to %s", param_name, pieces[4], | ||
values[param_name]) | ||
pieces[4] = values[param_name] | ||
|
||
lines[i] = " ".join(pieces) | ||
|
||
return lines, old | ||
|
||
def update_prototype(self, image): | ||
"""Update the prototype file. | ||
Args: | ||
image: New image to set | ||
Returns: | ||
prototype_file: The modified prototype file or None if the image is | ||
already up to date. | ||
""" | ||
values = {"image": image} | ||
|
||
|
||
prototype_file = os.path.join(self._root_dir(), | ||
"kubeflow/jupyter/prototypes", | ||
"jupyter-web-app.jsonnet") | ||
with open(prototype_file) as f: | ||
prototype = f.read().split("\n") | ||
|
||
new_lines, old_values = self._replace_parameters(prototype, values) | ||
|
||
if old_values["image"] == image: | ||
logging.info("Existing image was already the current image; %s", image) | ||
return None | ||
temp_file = prototype_file + ".tmp" | ||
with open(temp_file, "w") as w: | ||
w.write("\n".join(new_lines)) | ||
os.rename(temp_file, prototype_file) | ||
|
||
return prototype_file | ||
|
||
@property | ||
def last_commit(self): | ||
"""Get the last commit of a change to the source for the jupyter-web-app.""" | ||
if not self._last_commit: | ||
# Get the hash of the last commit to modify the source for the Jupyter web | ||
# app image | ||
self._last_commit = util.run(["git", "log", "-n", "1", | ||
"--pretty=format:\"%h\"", | ||
"components/jupyter-web-app"], | ||
cwd=self._root_dir()).strip("\"") | ||
|
||
return self._last_commit | ||
|
||
def _find_remote_repo(self, repo, remote_url): # pylint: disable=no-self-use | ||
"""Find the remote repo if it has already been added. | ||
Args: | ||
repo: The git python repo object. | ||
remote_url: The URL of the remote repo e.g. | ||
git@github.com:jlewi/kubeflow.git | ||
Returns: | ||
remote: git-python object representing the remote repo or None if it | ||
isn't present. | ||
""" | ||
for r in repo.remotes: | ||
for u in r.urls: | ||
if remote_url == u: | ||
return r | ||
|
||
return None | ||
|
||
def all(self, build_project, registry_project, remote_fork, | ||
add_github_host=False): # pylint: disable=too-many-statements,too-many-branches | ||
"""Build the latest image and update the prototype. | ||
Args: | ||
build_project: GCP project used to build the image. | ||
registry_project: GCP project used to host the image. | ||
remote_fork: URL of the remote fork used to create the PR, | ||
e.g. git@github.com:jlewi/kubeflow.git. Currently only ssh is | ||
supported. | ||
add_github_host: If true will add the github ssh host to known ssh hosts. | ||
""" | ||
repo = git.Repo(self._root_dir()) | ||
util.maybe_activate_service_account() | ||
last_commit = self.last_commit | ||
|
||
# Ensure github.com is in the known hosts | ||
if add_github_host: | ||
output = util.run(["ssh-keyscan", "github.com"]) | ||
with open(os.path.join(os.getenv("HOME"), ".ssh", "known_hosts"), | ||
mode='a') as hf: | ||
hf.write(output) | ||
|
||
if not remote_fork.startswith("git@github.com"): | ||
raise ValueError("Remote fork currently only supports ssh") | ||
|
||
remote_repo = self._find_remote_repo(repo, remote_fork) | ||
|
||
if not remote_repo: | ||
fork_name = remote_fork.split(":", 1)[-1].split("/", 1)[0] | ||
logging.info("Adding remote %s=%s", fork_name, remote_fork) | ||
remote_repo = repo.create_remote(fork_name, remote_fork) | ||
|
||
logging.info("Last change to components-jupyter-web-app was %s", last_commit) | ||
|
||
base = "gcr.io/{0}/jupyter-web-app".format(registry_project) | ||
|
||
# Check if there is already an image tagged with this commit. | ||
image = base + ":" + self.last_commit | ||
transport = transport_pool.Http(httplib2.Http) | ||
src = docker_name.from_string(image) | ||
creds = docker_creds.DefaultKeychain.Resolve(src) | ||
|
||
image_exists = False | ||
try: | ||
with v2_2_image.FromRegistry(src, creds, transport) as src_image: | ||
logging.info("Image %s exists; digest: %s", image, | ||
src_image.digest()) | ||
image_exists = True | ||
except docker_http.V2DiagnosticException as e: | ||
if e.status == 404: | ||
logging.info("%s doesn't exist", image) | ||
else: | ||
raise | ||
|
||
if not image_exists: | ||
logging.info("Building the image") | ||
image = self.build_image(build_project, registry_project) | ||
logging.info("Created image: %s", image) | ||
else: | ||
logging.info("Image %s already exists", image) | ||
|
||
# We should check what the current image is and not update it | ||
# if it's the existing image | ||
prototype_file = self.update_prototype(image) | ||
|
||
if not prototype_file: | ||
logging.info("Prototype not updated so not creating a PR.") | ||
return | ||
|
||
branch_name = "update_jupyter_{0}".format(last_commit) | ||
|
||
if repo.active_branch.name != branch_name: | ||
logging.info("Creating branch %s", branch_name) | ||
|
||
branch_names = [b.name for b in repo.branches] | ||
if branch_name in branch_names: | ||
logging.info("Branch %s exists", branch_name) | ||
util.run(["git", "checkout", branch_name], cwd=self._root_dir()) | ||
else: | ||
util.run(["git", "checkout", "-b", branch_name], cwd=self._root_dir()) | ||
|
||
logging.info("Add file %s to repo", prototype_file) | ||
repo.index.add([prototype_file]) | ||
repo.index.commit("Update the jupyter web app image to {0}".format(image)) | ||
|
||
util.run(["git", "push", "-f", remote_repo.name], cwd=self._root_dir()) | ||
|
||
self.create_pull_request(commit=last_commit) | ||
|
||
def create_pull_request(self, base="kubeflow:master", commit=None): | ||
"""Create a pull request. | ||
Args: | ||
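base: The base to use. Defaults to "kubeflow:master". This should be | ||
in the form <GitHub OWNER>:<branch> | ||
commit: The commit hash to put in the PR title. Defaults to the last | ||
commit that changed the jupyter-web-app source. | ||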
""" | ||
# TODO(jlewi): Modeled on | ||
# https://github.com/kubeflow/examples/blob/master/code_search/docker/ks/update_index.sh | ||
# TODO(jlewi): We should use the GitHub API and check if there is an | ||
# existing open pull request. Or potentially just use the hub CLI. | ||
|
||
if not commit: | ||
commit = self.last_commit | ||
logging.info("No commit specified defaulting to %s", commit) | ||
|
||
pr_title = "[auto PR] Update the jupyter-web-app image to {0}".format(commit) | ||
|
||
# See hub conventions: | ||
# https://hub.github.com/hub.1.html | ||
# The GitHub repository is determined automatically based on the name | ||
# of remote repositories | ||
output = util.run(["hub", "pr", "list", "--format=%U;%t\n"], | ||
cwd=self._root_dir()) | ||
|
||
|
||
lines = output.splitlines() | ||
|
||
prs = {} | ||
for l in lines: | ||
n, t = l.split(";", 1) | ||
prs[t] = n | ||
|
||
if pr_title in prs: | ||
logging.info("PR %s already exists to update the Jupyter web app image " | ||
"to %s", prs[pr_title], commit) | ||
return | ||
|
||
with tempfile.NamedTemporaryFile(delete=False) as hf: | ||
hf.write(pr_title.encode()) | ||
message_file = hf.name | ||
|
||
# TODO(jlewi): -f creates the pull request even if there are local changes; | ||
# this was useful during development but we may want to drop it. | ||
util.run(["hub", "pull-request", "-f", "--base=" + base, "-F", | ||
message_file], | ||
cwd=self._root_dir()) | ||
|
||
def _root_dir(self): | ||
this_dir = os.path.dirname(__file__) | ||
return os.path.abspath(os.path.join(this_dir, "..", "..", "..", "..")) | ||
|
||
def _component_dir(self): | ||
return os.path.join(self._root_dir(), "components", "jupyter-web-app") | ||
|
||
if __name__ == '__main__': | ||
logging.basicConfig(level=logging.INFO, | ||
format=('%(levelname)s|%(asctime)s' | ||
'|%(pathname)s|%(lineno)d| %(message)s'), | ||
datefmt='%Y-%m-%dT%H:%M:%S', | ||
) | ||
logging.getLogger().setLevel(logging.INFO) | ||
fire.Fire(WebAppUpdater) |
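The parameter rewriting in `_replace_parameters` is easiest to follow on a small input. The function below is a standalone sketch of the same logic, not the method itself: it uses `str.split()` instead of `re.findall(r'\S+', ...)` (both split on runs of whitespace) and returns a copy rather than modifying the input list. The sample prototype lines are made up for illustration.

```python
def replace_parameters(lines, values):
    """Rewrite `// @optionalParam <name> <type> <default> ...` lines.

    Returns the modified lines and a dict of the previous default values.
    """
    new_lines = list(lines)
    old = {}
    for i, line in enumerate(lines):
        pieces = line.split()
        # A parameter line has at least: // @optionalParam name type default
        if len(pieces) < 5:
            continue
        if pieces[0] != "//" or pieces[1] != "@optionalParam":
            continue
        name = pieces[2]
        if name not in values:
            continue
        old[name] = pieces[4]
        pieces[4] = values[name]
        # Re-joining with single spaces collapses any wider original spacing.
        new_lines[i] = " ".join(pieces)
    return new_lines, old


if __name__ == "__main__":
    sample = [
        "// @optionalParam image string gcr.io/kubeflow-dev/jupyter-web-app:old The image to use",
        'local k = import "k.libsonnet";',
    ]
    updated, previous = replace_parameters(
        sample, {"image": "gcr.io/kubeflow-dev/jupyter-web-app:new"})
    print(updated[0])
    print(previous)
```

Returning the old values is what lets `update_prototype` detect that the prototype already references the target image and skip the commit.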