Rollup changes with MOE #1545

Merged · 33 commits · Dec 7, 2017
a8f75bb
Fixing the delete cluster operations for Redshift, and permitting spi…
asaksena Nov 15, 2017
3a27056
Import latest changes from GitHub
dlott Nov 16, 2017
3713424
Delete resources before attempting to publish
ehankland Nov 16, 2017
358714f
Edw benchmark improvements to driver and sql script handling.
asaksena Nov 16, 2017
a12bfe6
Add a flag to enable AOF mode in Redis server.
s-deitz Nov 16, 2017
f6f656d
Enable support for high-availability Google CloudSQL Postgres databas…
gareth-ferneyhough Nov 17, 2017
c9dd03b
Refactoring the sql execution driver script to reference sql queries …
asaksena Nov 17, 2017
3d5ced1
Linter fixes.
dlott Nov 20, 2017
69471b5
Changed NewlineDelimitedJSONPublisher (used by BigQueryPublisher) to …
gareth-ferneyhough Nov 20, 2017
85e7884
- Remove i3 image checking
yuyantingzero Nov 21, 2017
2541bc9
Add aws_image_name_filter flag to ease specifying images.
dlott Aug 27, 2017
a373593
Fix uninstall command in Tensorflow package
gareth-ferneyhough Nov 27, 2017
cc9e112
Add V100 GPU support for GCE
gareth-ferneyhough Nov 27, 2017
264ad24
Add support for running on multiple VMs in the Tensorflow benchmark. …
gareth-ferneyhough Nov 28, 2017
d7ae27d
Increase connection timeout for SCP from 5 seconds to 30 seconds.
s-deitz Nov 28, 2017
9eb1756
Adds some DaCapo benchmarks to PKB.
s-deitz Nov 28, 2017
ca9513b
Use Tensorflow environment vars when looking up the Tensorflow versio…
gareth-ferneyhough Nov 30, 2017
8bfbb9a
- Use gcc/fortran/g++-4.7 as default in SPECCPU test (consistent with…
yuyantingzero Nov 30, 2017
18f6812
- Fix bug in build_tool.GetVersion
yuyantingzero Nov 30, 2017
ec6b301
Add daemonset which sets nvidia-smi permissions so that a restricted …
gareth-ferneyhough Nov 30, 2017
751655b
Don't set GPU clock speed or autoboost policy if the requested value …
gareth-ferneyhough Nov 30, 2017
ac912a0
Add AWS Aurora Database as an option for when running pgbench.
NathanTeeuwen Dec 1, 2017
849287c
Add an IMAGE_OWNER constant to AWS virtual machine.
s-deitz Dec 1, 2017
cea000b
Only emit boot samples once per test
ehankland Dec 1, 2017
99ffbf2
Add descending integerlist range capabilities
ehankland Dec 1, 2017
67d3430
Change the NVIDIA permissions daemonset so that it does not consume GPU
gareth-ferneyhough Dec 2, 2017
6c29921
Change the NVIDIA permissions daemonset so that it waits until nvidia…
gareth-ferneyhough Dec 2, 2017
6bfba38
Add gpu_type and num_gpus to container_service resource metadata.
gareth-ferneyhough Dec 5, 2017
b5bd95e
Update GKE cluster version to 1.8.4-gke.0 when creating a cluster with
gareth-ferneyhough Dec 5, 2017
54cb16d
Import changes from GitHub with MOE.
s-deitz Dec 7, 2017
1829648
The model we used is resnet not restnet.
tohaowu Dec 7, 2017
780487d
Fix tox flake8 errors and a unit test.
s-deitz Dec 7, 2017
17c2d90
Merge branch 'master' into moe
s-deitz Dec 7, 2017
1 change: 1 addition & 0 deletions CHANGES.next.md
@@ -12,6 +12,7 @@
 - Support for ProfitBricks API v4:
   - Add `profitbricks_image_alias` flag and support for image aliases
   - Add new location, `us/ewr`
+- Add aws_image_name_filter flag to ease specifying images.

 ###Bug fixes and maintenance updates:
 - Moved GPU-related specs from GceVmSpec to BaseVmSpec
7 changes: 7 additions & 0 deletions perfkitbenchmarker/configs/benchmark_config_spec.py
@@ -551,6 +551,13 @@ def _GetOptionDecoderConstructions(cls):
                managed_relational_db.AURORA_POSTGRES,
            ]
        }),
+       'zones': (option_decoders.ListDecoder, {
+           'item_decoder': option_decoders.StringDecoder(),
+           'default': None
+       }),
+       'machine_type': (option_decoders.StringDecoder, {
+           'default': None
+       }),
        'engine_version': (option_decoders.StringDecoder, {
            'default': None
        }),
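The decoder entries above follow PKB's spec-option pattern: each option maps to a `(decoder class, kwargs)` pair, and a `default` of `None` means the provider's own default wins. A minimal stand-in sketch of that lookup (the `decode` helper and option names here are simplified illustrations, not PKB's real `option_decoders` API):

```python
# Simplified stand-in for the option-decoder pattern: each option maps
# to a (decoder, kwargs) pair; an option missing from the user config
# falls back to the declared default (None = "provider decides").
def decode(option_map, config):
    decoded = {}
    for key, (_decoder, kwargs) in option_map.items():
        decoded[key] = config.get(key, kwargs.get('default'))
    return decoded

OPTION_MAP = {
    'machine_type': (str, {'default': None}),
    'engine_version': (str, {'default': None}),
}

result = decode(OPTION_MAP, {'machine_type': 'db.r4.large'})
print(result)
```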
5 changes: 5 additions & 0 deletions perfkitbenchmarker/container_service.py
@@ -90,6 +90,11 @@ def GetResourceMetadata(self):
        'zone': self.zone,
        'size': self.num_nodes,
    }
+   if self.gpu_count:
+     metadata.update({
+         'gpu_type': self.gpu_type,
+         'num_gpus': self.gpu_count,
+     })
    return metadata


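The change above only emits GPU fields for GPU-enabled clusters, so non-GPU runs keep their metadata unchanged. A self-contained sketch of that conditional update (`FakeCluster` is a hypothetical stand-in for the real `container_service` resource class):

```python
class FakeCluster(object):
    """Hypothetical stand-in for a container cluster resource."""

    def __init__(self, zone, num_nodes, gpu_count=0, gpu_type=None):
        self.zone = zone
        self.num_nodes = num_nodes
        self.gpu_count = gpu_count
        self.gpu_type = gpu_type

    def GetResourceMetadata(self):
        metadata = {'zone': self.zone, 'size': self.num_nodes}
        # GPU fields appear only when the cluster actually has GPUs.
        if self.gpu_count:
            metadata.update({'gpu_type': self.gpu_type,
                             'num_gpus': self.gpu_count})
        return metadata

print(FakeCluster('us-east1-b', 3).GetResourceMetadata())
print(FakeCluster('us-east1-b', 3, 2, 'k80').GetResourceMetadata())
```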
2 changes: 1 addition & 1 deletion perfkitbenchmarker/data/edw/redshift_driver.sh
@@ -30,7 +30,7 @@ START_TIME=$SECONDS

for REDSHIFT_SCRIPT in "${REDSHIFT_SCRIPT_LIST[@]}"
do
-  PGPASSWORD=$REDSHIFT_PASSWORD psql -h $REDSHIFT_HOST -p 5439 -d $REDSHIFT_DB -U $REDSHIFT_USER -f redshift_sql/$REDSHIFT_SCRIPT > /dev/null &
+  PGPASSWORD=$REDSHIFT_PASSWORD psql -h $REDSHIFT_HOST -p 5439 -d $REDSHIFT_DB -U $REDSHIFT_USER -f $REDSHIFT_SCRIPT > /dev/null &
   pid=$!
   pids="$pids $pid"
done
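The driver's loop fans each SQL script out to a background `psql` process and records its PID so the caller can wait for all of them before reading the elapsed time. The pattern in isolation, with a hypothetical placeholder command and script names instead of `psql`:

```shell
#!/bin/bash
# Fan-out/wait pattern: start one background job per item, collect the
# PIDs via $!, then wait for each before reporting completion.
pids=""
for script in q1.sql q2.sql q3.sql; do
  (echo "would run $script") > /dev/null &
  pids="$pids $!"
done
for pid in $pids; do
  wait "$pid"
done
echo "all scripts finished"
```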
@@ -0,0 +1,44 @@
# This file defines a daemonset which runs automatically on all
# kubernetes nodes. It is used like so: kubectl create -f <this file_path>.
# The daemonset does the following:
# - waits until nvidia-smi is mounted and available on PATH
# - enables persistence mode on the nvidia driver
# - allows all users to set the GPU clock speed
# In effect, this allows pods created without a privileged security context to
# set the GPU clock speeds
# This daemonset config does not define GPU resources, because otherwise it
# would consume them, leaving them unavailable to pods. Instead, it runs in
# privileged mode (so it can see all GPUs), and manually mounts the CUDA
# lib and bin directories.

apiVersion: apps/v1beta2
kind: DaemonSet
metadata:
  name: nvidia-add-unrestricted-permissions-dameon-set
spec:
  selector:
    matchLabels:
      name: nvidia-add-unrestricted-permissions
  template:
    metadata:
      labels:
        name: nvidia-add-unrestricted-permissions
    spec:
      containers:
      - name: nvidia-add-unrestricted-permissions
        image: nvidia/cuda:8.0-devel-ubuntu16.04
        securityContext:
          privileged: true
        command: [ "/bin/bash", "-c", "export PATH=$PATH:/usr/local/bin/nvidia/ && while [ ! $(type -p nvidia-smi) ]; do echo waiting for nvidia-smi to mount...; sleep 2; done && nvidia-smi -pm 1 && nvidia-smi --applications-clocks-permission=UNRESTRICTED && nvidia-smi --auto-boost-permission=UNRESTRICTED && tail -f /dev/null" ]
        volumeMounts:
        - name: nvidia-debug-tools
          mountPath: /usr/local/bin/nvidia
        - name: nvidia-libraries
          mountPath: /usr/local/nvidia/lib64
      volumes:
      - name: nvidia-debug-tools
        hostPath:
          path: /home/kubernetes/bin/nvidia/bin
      - name: nvidia-libraries
        hostPath:
          path: /home/kubernetes/bin/nvidia/lib
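The container command above is a one-liner; unrolled, its core is a poll-until-present loop on PATH. A runnable sketch of just that loop, substituting an always-present binary (`ls`) for `nvidia-smi` so it terminates on any machine:

```shell
#!/bin/bash
# Poll until a tool appears on PATH, then proceed. The daemonset polls
# for nvidia-smi; 'ls' is substituted here so the sketch exits at once.
TOOL=ls
while [ ! "$(type -p "$TOOL")" ]; do
  echo "waiting for $TOOL to mount..."
  sleep 2
done
echo "$TOOL is available"
```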
3 changes: 2 additions & 1 deletion perfkitbenchmarker/flag_util.py
@@ -109,8 +109,8 @@ def __str__(self):

  def _CreateXrangeFromTuple(self, input_tuple):
    start = input_tuple[0]
-   stop_inclusive = input_tuple[1] + 1
    step = 1 if len(input_tuple) == 2 else input_tuple[2]
+   stop_inclusive = input_tuple[1] + (1 if step > 0 else -1)
    return xrange(start, stop_inclusive, step)


@@ -187,6 +187,7 @@ def HandleNonIncreasing():
    low = int(match.group(1))
    high = int(match.group(3))
    step = int(match.group(5)) if match.group(5) is not None else 1
+   step = step if low <= high else -step

    if high <= low or (len(result) > 0 and low <= result[-1]):
      HandleNonIncreasing()
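The two `flag_util.py` changes make descending ranges work end to end: the step is computed before the inclusive stop, so the stop can be nudged in the direction of travel, and a parsed step is negated when `low > high`. A standalone Python 3 sketch of the first function (`range` stands in for `xrange`):

```python
def create_range_from_tuple(input_tuple):
    # Mirrors the patched _CreateXrangeFromTuple: pick the step first,
    # then shift the inclusive stop toward the direction of travel so
    # descending ranges still include their endpoint.
    start = input_tuple[0]
    step = 1 if len(input_tuple) == 2 else input_tuple[2]
    stop_inclusive = input_tuple[1] + (1 if step > 0 else -1)
    return list(range(start, stop_inclusive, step))

print(create_range_from_tuple((1, 5)))      # ascending, implicit step 1
print(create_range_from_tuple((5, 1, -2)))  # descending, step -2
```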
97 changes: 97 additions & 0 deletions perfkitbenchmarker/linux_benchmarks/dacapo_benchmark.py
@@ -0,0 +1,97 @@
# Copyright 2016 PerfKitBenchmarker Authors. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Runs DaCapo benchmarks.

This benchmark runs the various DaCapo benchmarks. More information can be found
at: http://dacapobench.org/
"""

import os
import re

from perfkitbenchmarker import configs
from perfkitbenchmarker import errors
from perfkitbenchmarker import flags
from perfkitbenchmarker import linux_packages
from perfkitbenchmarker import sample

flags.DEFINE_string('dacapo_jar_filename', 'dacapo-9.12-bach.jar',
                    'Filename of DaCapo jar file.')
flags.DEFINE_enum('dacapo_benchmark', 'luindex', ['luindex', 'lusearch'],
                  'Name of specific DaCapo benchmark to execute.')
flags.DEFINE_integer('dacapo_num_iters', 1, 'Number of iterations to execute.')

FLAGS = flags.FLAGS

BENCHMARK_NAME = 'dacapo'
BENCHMARK_CONFIG = """
dacapo:
  description: Runs DaCapo benchmarks
  vm_groups:
    default:
      vm_spec: *default_single_core
"""
_PASS_PATTERN = re.compile(r'^=====.*PASSED in (\d+) msec =====$')


def GetConfig(user_config):
  return configs.LoadConfig(BENCHMARK_CONFIG, user_config, BENCHMARK_NAME)


def Prepare(benchmark_spec):
  """Install the DaCapo benchmark suite on the vms.

  Args:
    benchmark_spec: The benchmark specification. Contains all data that is
        required to run the benchmark.
  """
  benchmark_spec.vms[0].Install('dacapo')


def Run(benchmark_spec):
  """Run the DaCapo benchmark on the vms.

  Args:
    benchmark_spec: The benchmark specification. Contains all data that is
        required to run the benchmark.

  Returns:
    A singleton list of sample.Sample objects containing the DaCapo benchmark
        run time (in msec).

  Raises:
    errors.Benchmarks.RunError if the DaCapo benchmark didn't succeed.
  """
  _, stderr = benchmark_spec.vms[0].RemoteCommand(
      'java -jar %s %s -n %i --scratch-directory=%s' %
      (os.path.join(linux_packages.INSTALL_DIR, FLAGS.dacapo_jar_filename),
       FLAGS.dacapo_benchmark, FLAGS.dacapo_num_iters,
       os.path.join(linux_packages.INSTALL_DIR, 'dacapo_scratch')))
  for line in stderr.splitlines():
    m = _PASS_PATTERN.match(line)
    if m:
      return [sample.Sample('run_time', float(m.group(1)), 'ms')]
  raise errors.Benchmarks.RunError(
      'DaCapo benchmark %s failed.' % FLAGS.dacapo_benchmark)


def Cleanup(benchmark_spec):
  """Cleanup the DaCapo benchmark on the target vm (by uninstalling).

  Args:
    benchmark_spec: The benchmark specification. Contains all data that is
        required to run the benchmark.
  """
  benchmark_spec.vms[0].RemoteCommand(
      'rm -rf %s' % os.path.join(linux_packages.INSTALL_DIR, 'dacapo_scratch'))
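`Run` extracts the benchmark's elapsed time by matching DaCapo's pass line on stderr. The parsing step in isolation, against a hypothetical stderr line in the format DaCapo prints on success:

```python
import re

_PASS_PATTERN = re.compile(r'^=====.*PASSED in (\d+) msec =====$')

# Hypothetical stderr output from a successful DaCapo run.
stderr = '===== DaCapo 9.12 luindex PASSED in 1520 msec =====\n'

run_time_ms = None
for line in stderr.splitlines():
    m = _PASS_PATTERN.match(line)
    if m:
        # group(1) captures the elapsed milliseconds.
        run_time_ms = float(m.group(1))
print(run_time_ms)  # 1520.0
```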
16 changes: 9 additions & 7 deletions perfkitbenchmarker/linux_benchmarks/edw_benchmark.py
@@ -20,6 +20,7 @@


import copy
+import os

from perfkitbenchmarker import configs
from perfkitbenchmarker import data
@@ -59,15 +60,16 @@ def Prepare(benchmark_spec):

def Run(benchmark_spec):
  """Run phase executes the sql scripts on edw cluster and collects duration."""
- driver_name = '{}_driver.sh'.format(benchmark_spec.edw_service.SERVICE_TYPE)
- driver_path = data.ResourcePath(driver_name)
-
- scripts_name = '{}_sql'.format(benchmark_spec.edw_service.SERVICE_TYPE)
- scripts_path = data.ResourcePath(scripts_name)
-
  vm = benchmark_spec.vms[0]
+ driver_name = '{}_driver.sh'.format(benchmark_spec.edw_service.SERVICE_TYPE)
+ driver_path = data.ResourcePath(os.path.join('edw', driver_name))
  vm.PushFile(driver_path)
- vm.PushFile(scripts_path)
+
+ scripts_dir = '{}_sql'.format(benchmark_spec.edw_service.SERVICE_TYPE)
+ scripts_list = FLAGS.edw_benchmark_scripts
+ for script in scripts_list:
+   script_path = data.ResourcePath(os.path.join('edw', scripts_dir, script))
+   vm.PushFile(script_path)

  driver_perms_update_cmd = 'chmod 755 {}'.format(driver_name)
  vm.RemoteCommand(driver_perms_update_cmd)
@@ -285,7 +285,7 @@ def Run(benchmark_spec):

  for thread_count in FLAGS.multichase_thread_count:
    if thread_count > vm.num_cpus:
-     break
+     continue
    memory_size_iterator = _IterMemorySizes(
        lambda: vm.total_memory_kb * 1024, FLAGS.multichase_memory_size_min,
        FLAGS.multichase_memory_size_max)
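Switching `break` to `continue` matters because the thread-count flag is a user-supplied list that need not be sorted: one oversized value should skip only itself, not abandon every count after it. In miniature, with hypothetical flag values:

```python
# Hypothetical flag values: an unsorted list of thread counts.
thread_counts = [1, 16, 2, 4]
num_cpus = 8

ran_with_break, ran_with_continue = [], []
for tc in thread_counts:
    if tc > num_cpus:
        break  # old behavior: abandons 2 and 4 as well
    ran_with_break.append(tc)
for tc in thread_counts:
    if tc > num_cpus:
        continue  # new behavior: skips only the oversized count
    ran_with_continue.append(tc)

print(ran_with_break)     # [1]
print(ran_with_continue)  # [1, 2, 4]
```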
26 changes: 22 additions & 4 deletions perfkitbenchmarker/linux_benchmarks/speccpu2006_benchmark.py
@@ -24,18 +24,19 @@

import itertools
import logging
+from operator import mul
import os
import posixpath
import re
import tarfile

-from operator import mul
from perfkitbenchmarker import configs
from perfkitbenchmarker import data
from perfkitbenchmarker import errors
from perfkitbenchmarker import flags
from perfkitbenchmarker import sample
from perfkitbenchmarker import stages
+from perfkitbenchmarker.linux_packages import build_tools

FLAGS = flags.FLAGS
@@ -62,6 +63,14 @@
    'cfg file must be placed in the local PKB data directory and will be '
    'copied to the remote machine prior to executing runspec. See README.md '
    'for instructions if running with a repackaged cpu2006v1.2.tgz file.')
+flags.DEFINE_string(
+    'runspec_build_tool_version', None,
+    'Version of gcc/g++/gfortran. This should match runspec_config. Note: if '
+    'neither runspec_config nor runspec_build_tool_version is set, the test '
+    'installs gcc/g++/gfortran-4.7, since that matches the default config '
+    'version. If runspec_config is set but runspec_build_tool_version is not, '
+    'the default version of the build tools will be installed. This flag only '
+    'works with Debian.')
flags.DEFINE_integer(
    'runspec_iterations', 3,
    'Used by the PKB speccpu2006 benchmark. The number of benchmark iterations '
@@ -119,7 +128,7 @@ def GetConfig(user_config):
  return configs.LoadConfig(BENCHMARK_CONFIG, user_config, BENCHMARK_NAME)


-def CheckPrerequisites(benchmark_config):
+def CheckPrerequisites(unused_benchmark_config):
  """Verifies that the required input files are present."""
  try:
    # Peeking into the tar file is slow. If running in stages, it's
@@ -227,6 +236,7 @@ class _SpecCpu2006SpecificState(object):
        where the SPEC files are stored.
    tar_file_path: Optional string. Path of the tar file on the remote machine.
  """
+
  def __init__(self):
    self.cfg_file_path = None
    self.iso_file_path = None
@@ -246,8 +256,15 @@ def Prepare(benchmark_spec):
  speccpu_vm_state = _SpecCpu2006SpecificState()
  setattr(vm, _BENCHMARK_SPECIFIC_VM_STATE_ATTR, speccpu_vm_state)
  vm.Install('wget')
- vm.Install('build_tools')
  vm.Install('fortran')
+ vm.Install('build_tools')
+
+ # If using the default config file and runspec_build_tool_version is not
+ # set, install gcc/g++/gfortran 4.7. If either flag is set, assume the
+ # user has matched the config and build tool versions deliberately.
+ if not FLAGS['runspec_config'].present or FLAGS.runspec_build_tool_version:
+   build_tool_version = FLAGS.runspec_build_tool_version or '4.7'
+   build_tools.Reinstall(vm, version=build_tool_version)
  if FLAGS.runspec_enable_32bit:
    vm.Install('multilib')
  vm.Install('numactl')
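The `Prepare` change above reinstalls a pinned gcc/g++/gfortran only when the user kept the default config or pinned a version explicitly. The decision logic in isolation (`choose_build_tool_version` is a hypothetical helper, not PKB's real API; the real code reads absl flags):

```python
def choose_build_tool_version(runspec_config_present, requested_version):
    # Mirrors the Prepare() condition: reinstall pinned build tools when
    # the default config is in use or a version was requested; returning
    # None means "keep the distro's default build tools".
    if not runspec_config_present or requested_version:
        return requested_version or '4.7'
    return None

print(choose_build_tool_version(False, None))   # default config -> '4.7'
print(choose_build_tool_version(True, None))    # custom config -> None
print(choose_build_tool_version(True, '4.9'))   # explicit pin -> '4.9'
```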
@@ -329,6 +346,7 @@ def _ExtractScore(stdout, vm, keep_partial_results, estimate_spec):
    keep_partial_results: A boolean indicating whether partial results should
        be extracted in the event that not all benchmarks were successfully
        run. See the "runspec_keep_partial_results" flag for more info.
+   estimate_spec: A boolean indicating whether we should estimate the SPEC score.

  Sample input for SPECint:
  ...
@@ -449,7 +467,7 @@ def _ExtractScore(stdout, vm, keep_partial_results, estimate_spec):


def _GeometricMean(arr):
- "Calculates the geometric mean of the array."
+ """Calculates the geometric mean of the array."""
  return reduce(mul, arr) ** (1.0 / len(arr))
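`_GeometricMean` relies on `reduce`, a builtin in Python 2 (which this codebase targets); under Python 3 the same helper needs `functools.reduce`. A self-contained Python 3 sketch:

```python
from functools import reduce  # builtin in Python 2, moved here in Python 3
from operator import mul

def geometric_mean(arr):
    """Calculates the geometric mean: the n-th root of the product."""
    return reduce(mul, arr) ** (1.0 / len(arr))

print(geometric_mean([2.0, 8.0]))  # 4.0 (sqrt of 16)
```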