
Commit

Merge branch 'GoogleCloudPlatform:master' into gadgilrajeev/hadoop-blocksize-variable
gadgilrajeev authored Feb 15, 2023
2 parents 3757dee + 25b0f0b commit f7bf196
Showing 184 changed files with 3,788 additions and 1,278 deletions.
33 changes: 32 additions & 1 deletion CHANGES.next.md
@@ -30,6 +30,13 @@
- Remove pkb's --placement_group_style cloud-agnostic values 'cluster'/
'cluster_if_supported'/'spread'/'spread_if_supported'.
- Replace flag --ibm_azone with --ibm_region.
- Changed the default benchmark to `cluster_boot` instead of the standard set.
  This makes PKB's default behavior much faster. The standard set was defined
  many years ago; it is neither a reasonable introduction to PKB nor something
  most people should run by default.
- --dpb_export_job_stats is now False by default.
- Validate arguments to IssueCommand. Remove force_info_log & suppress_warning
parameters, add should_pre_log.

### New features:

@@ -95,6 +102,15 @@
- Add Intel MPI benchmark.
- Add support for Azure ARM VMs.
- Add an HTTP endpoint polling utility & incorporate it into app_service.
- Added support for Data Plane Development Kit (DPDK) on Linux VMs to improve
  networking performance.
- Added support for dynamic provisioning of BigQuery flat-rate slots at
  benchmark runtime.
- Create a new subdirectory of linux_packages called provisioning_benchmarks
for benchmarking lifecycle management timings of cloud resources and
operations.
- Add support for using the hbase2 binding in the Cloud Bigtable YCSB
benchmark.

### Enhancements:

@@ -190,9 +206,16 @@
- Add `--dpb_job_poll_interval_secs` flag to control job polling frequency in
DPB benchmarks.
- Add support for more readings in nvidia_power tracking.
- Report benchmark run costs for dpb_sparksql_benchmark runs on Dataproc
Serverless, AWS EMR Serverless & AWS Glue.
- Create a list of resources in benchmark_spec from which common lifecycle
  timing samples are extracted regardless of benchmark. The set is initially
  small but can be expanded to any resource.
- Add per-VM resource metadata for id, name, and IP address.

### Bug fixes and maintenance updates:

- Add 'runcpu --update' and 'runcpu --version' commands to the install phase.
- Make the command that downloads preprovisioned data robust and give it a
  five-minute timeout.
- Make Speccpu17 fail if there are compilation errors that will cause missing
@@ -263,6 +286,7 @@
- Update the performance results of Bigtable testing, which used a more
  appropriate client setup.
- Update the runner's AWS CLI to 1.19.75.
- Upgrade from AWS ecr get-login to ecr get-login-password.
- Minor fix of the Bigtable benchmarking user guide.
- Enable icelake and milan as --gcp_min_cpu_platform options.
- Update the bigtable tutorial readme with the content of batch_testing.md.
@@ -281,11 +305,18 @@
- Add some required types to BaseAppServiceSpec.
- Use a NIC type of GVNIC by default (instead of VIRTIO_NET) on GCE.
- Rename pkb's --placement_group_style values to reflect their cloud-specific
  CLI arguments (GCP - 'COLLOCATED'/'AVAILABILITY-DOMAIN'; AWS -
'cluster'/'spread'/'partition'; Azure -
'proximity-placement-group'/'availability-set'). Cloud-agnostic value
'closest_supported' will choose the most tightly-coupled placement policy
supported.
- Fix how the CBT client is installed for the cloud_bigtable_ycsb_benchmark
(when --google_bigtable_client_version is set) and use the `cbt` CLI instead
of the hbase shell to create and delete tables.
- Update Bigtable benchmarking configs along with the new Docker image release.
  Important dates are added to the user guide.
- Remove `--google_bigtable_enable_table_object_sharing`. Use
`--ycsb_tar_url=https://storage.googleapis.com/cbt_ycsb_client_jar/ycsb-0.14.0.tar.gz`
to retain the previous behavior.
- Remove `--google_bigtable_hbase_jar_url`. Rely on
`--google_bigtable_client_version` instead.
7 changes: 3 additions & 4 deletions README.md
@@ -294,10 +294,9 @@ for an example implementation.

# How to Run All Standard Benchmarks

Run without the `--benchmarks` parameter and every benchmark in the standard set
will run serially which can take a couple of hours (alternatively, run with
`--benchmarks="standard_set"`). Additionally, if you don't specify
`--cloud=...`, all benchmarks will run on the Google Cloud Platform.
Run with `--benchmarks="standard_set"` and every benchmark in the standard set
will run serially, which can take a couple of hours. Additionally, if you don't
specify `--cloud=...`, all benchmarks will run on the Google Cloud Platform.
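
For reference, a minimal way to script this invocation (a sketch assuming a
PerfKitBenchmarker checkout with `pkb.py` at the repository root; the flags are
the ones documented above):

```python
import subprocess

# Run every benchmark in the standard set serially on GCP.
# Expect this to take a couple of hours.
subprocess.run(
    ['./pkb.py', '--cloud=GCP', '--benchmarks=standard_set'],
    check=True,  # Raise CalledProcessError if PKB exits non-zero.
)
```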

# How to Run All Benchmarks in a Named Set

9 changes: 6 additions & 3 deletions cloudbuild.yaml
@@ -17,11 +17,14 @@ steps:
args:
- build
- '-t'
- 'gcr.io/$PROJECT_ID/pkb:$COMMIT_SHA'
- 'us-west1-docker.pkg.dev/$PROJECT_ID/pkb-cloud-build/pkb:$COMMIT_SHA'
- .
- name: gcr.io/cloud-builders/docker
args:
- run
- 'gcr.io/$PROJECT_ID/pkb:$COMMIT_SHA'
- 'us-west1-docker.pkg.dev/$PROJECT_ID/pkb-cloud-build/pkb:$COMMIT_SHA'
serviceAccount: 'projects/$PROJECT_ID/serviceAccounts/[email protected]'
options:
logging: CLOUD_LOGGING_ONLY
images:
- 'gcr.io/$PROJECT_ID/pkb:$COMMIT_SHA'
- 'us-west1-docker.pkg.dev/$PROJECT_ID/pkb-cloud-build/pkb:$COMMIT_SHA'
34 changes: 30 additions & 4 deletions perfkitbenchmarker/benchmark_spec.py
@@ -22,6 +22,7 @@
import os
import pickle
import threading
from typing import List
import uuid

from absl import flags
@@ -33,6 +34,7 @@
from perfkitbenchmarker import data_discovery_service
from perfkitbenchmarker import disk
from perfkitbenchmarker import dpb_service
from perfkitbenchmarker import edw_compute_resource
from perfkitbenchmarker import edw_service
from perfkitbenchmarker import errors
from perfkitbenchmarker import flag_util
@@ -44,6 +46,7 @@
from perfkitbenchmarker import provider_info
from perfkitbenchmarker import providers
from perfkitbenchmarker import relational_db
from perfkitbenchmarker import resource as resource_type
from perfkitbenchmarker import smb_service
from perfkitbenchmarker import spark_service
from perfkitbenchmarker import stages
@@ -128,6 +131,7 @@ def __init__(self, benchmark_module, benchmark_config, benchmark_uid):
self.status_detail = None
BenchmarkSpec.total_benchmarks += 1
self.sequence_number = BenchmarkSpec.total_benchmarks
self.resources: List[resource_type.Resource] = []
self.vms = []
self.regional_networks = {}
self.networks = {}
@@ -151,6 +155,7 @@ def __init__(self, benchmark_module, benchmark_config, benchmark_uid):
self.tpus = []
self.tpu_groups = {}
self.edw_service = None
self.edw_compute_resource = None
self.nfs_service = None
self.smb_service = None
self.messaging_service = None
@@ -237,6 +242,7 @@ def ConstructContainerCluster(self):
cloud, cluster_type)
self.container_cluster = container_cluster_class(
self.config.container_cluster)
self.resources.append(self.container_cluster)

def ConstructContainerRegistry(self):
"""Create the container registry."""
@@ -248,6 +254,7 @@ def ConstructContainerRegistry(self):
cloud)
self.container_registry = container_registry_class(
self.config.container_registry)
self.resources.append(self.container_registry)

def ConstructDpbService(self):
"""Create the dpb_service object and create groups for its vms."""
@@ -361,6 +368,23 @@ def ConstructEdwService(self):
# Check if a new instance needs to be created or restored from snapshot
self.edw_service = edw_service_class(self.config.edw_service)

def ConstructEdwComputeResource(self):
"""Create an edw_compute_resource object."""
if self.config.edw_compute_resource is None:
return
edw_compute_resource_cloud = self.config.edw_compute_resource.cloud
edw_compute_resource_type = self.config.edw_compute_resource.type
providers.LoadProvider(edw_compute_resource_cloud)
edw_compute_resource_class = (
edw_compute_resource.GetEdwComputeResourceClass(
edw_compute_resource_cloud, edw_compute_resource_type
)
)
self.edw_compute_resource = edw_compute_resource_class(
self.config.edw_compute_resource
)
self.resources.append(self.edw_compute_resource)

def ConstructNfsService(self):
"""Construct the NFS service object.
@@ -738,6 +762,8 @@ def Provision(self):
if network.__class__.__name__ == 'AwsNetwork':
self.edw_service.cluster_subnet_group.subnet_id = network.subnet.id
self.edw_service.Create()
if self.edw_compute_resource:
self.edw_compute_resource.Create()
if self.vpn_service:
self.vpn_service.Create()
if hasattr(self, 'messaging_service') and self.messaging_service:
@@ -768,6 +794,8 @@ def Delete(self):
vm_util.RunThreaded(lambda tpu: tpu.Delete(), self.tpus)
if self.edw_service:
self.edw_service.Delete()
if hasattr(self, 'edw_compute_resource') and self.edw_compute_resource:
self.edw_compute_resource.Delete()
if self.nfs_service:
self.nfs_service.Delete()
if self.smb_service:
@@ -824,10 +852,8 @@ def Delete(self):
def GetSamples(self):
"""Returns samples created from benchmark resources."""
samples = []
if self.container_cluster:
samples.extend(self.container_cluster.GetSamples())
if self.container_registry:
samples.extend(self.container_registry.GetSamples())
for resource in self.resources:
samples.extend(resource.GetSamples())
return samples

def StartBackgroundWorkload(self):
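
The `self.resources` list and the rewritten `GetSamples` above move
per-resource timing samples behind a common interface. A minimal sketch of
that pattern, assuming each tracked resource records `create_start_time` and
`resource_ready_time` the way the removed container-cluster branch did
(attribute and sample names are illustrative, not the exact
`perfkitbenchmarker.resource.Resource` API):

```python
from typing import List, Optional, Tuple

Sample = Tuple[str, float, str]  # (metric, value, unit) -- illustrative.


class TimedResource:
  """Stand-in for a PKB resource that records lifecycle timestamps."""

  def __init__(self, name: str):
    self.name = name
    self.create_start_time: Optional[float] = None
    self.resource_ready_time: Optional[float] = None

  def GetSamples(self) -> List[Sample]:
    """Emits a creation-time sample when both timestamps were recorded."""
    samples: List[Sample] = []
    if self.create_start_time and self.resource_ready_time:
      samples.append((f'{self.name} Creation Time',
                      self.resource_ready_time - self.create_start_time,
                      'seconds'))
    return samples


def GetAllSamples(resources: List[TimedResource]) -> List[Sample]:
  """Mirrors BenchmarkSpec.GetSamples: flatten over the resource list."""
  samples: List[Sample] = []
  for resource in resources:
    samples.extend(resource.GetSamples())
  return samples
```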
72 changes: 72 additions & 0 deletions perfkitbenchmarker/configs/benchmark_config_spec.py
@@ -466,6 +466,75 @@ def _ApplyFlags(cls, config_values, flag_values):
config_values['password'] = flag_values.edw_service_cluster_password


class _EdwComputeResourceDecoder(option_decoders.TypeVerifier):
"""Validates the edw compute resource dictionary of a benchmark config object."""

def __init__(self, **kwargs):
super(_EdwComputeResourceDecoder, self).__init__(
valid_types=(dict,), **kwargs
)

def Decode(self, value, component_full_name, flag_values):
"""Verifies edw compute resource dictionary of a benchmark config object.
Args:
value: dict edw_compute_resource config dictionary
component_full_name: string. Fully qualified name of the configurable
component containing the config option.
flag_values: flags.FlagValues. Runtime flag values to be propagated to
BaseSpec constructors.
Returns:
_EdwComputeResourceSpec Built from the config passed in value.
Raises:
errors.Config.InvalidValue upon invalid input value.
"""
edw_compute_resource_config = super(
_EdwComputeResourceDecoder, self
).Decode(value, component_full_name, flag_values)
result = _EdwComputeResourceSpec(
self._GetOptionFullName(component_full_name),
flag_values,
**edw_compute_resource_config,
)
return result


class _EdwComputeResourceSpec(spec.BaseSpec):
"""Configurable options of an EDW compute resource.
Attributes:
type: string. The type of the EDW compute resource (bigquery_slots, etc.)
"""

def __init__(self, component_full_name, flag_values=None, **kwargs):
super(_EdwComputeResourceSpec, self).__init__(
component_full_name, flag_values=flag_values, **kwargs)

@classmethod
def _GetOptionDecoderConstructions(cls):
result = super(
_EdwComputeResourceSpec, cls
)._GetOptionDecoderConstructions()
result.update({
'type': (
option_decoders.StringDecoder,
{'default': 'bigquery_slots', 'none_ok': False},
),
'cloud': (
option_decoders.StringDecoder,
{'default': None, 'none_ok': True},
),
})
return result

@classmethod
def _ApplyFlags(cls, config_values, flag_values):
super(_EdwComputeResourceSpec, cls)._ApplyFlags(config_values, flag_values)
if 'cloud' not in config_values:
config_values['cloud'] = flag_values.cloud


class _SparkServiceSpec(spec.BaseSpec):
"""Configurable options of an Apache Spark Service.
@@ -1591,6 +1660,9 @@ def _GetOptionDecoderConstructions(cls):
'tpu_groups': (_TpuGroupsDecoder, {
'default': {}
}),
'edw_compute_resource': (_EdwComputeResourceDecoder, {
'default': None
}),
'edw_service': (_EdwServiceDecoder, {
'default': None
}),
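
Once the decoder above is wired into the benchmark config spec, a benchmark
config can declare the new section. A hypothetical config in PKB's usual
YAML-in-Python style (the benchmark name and description are illustrative;
`type` and `cloud` mirror the decoder defaults above):

```python
# Hypothetical benchmark config exercising the new edw_compute_resource
# section; 'bigquery_slots' is the decoder's default type, and 'cloud'
# falls back to --cloud when omitted.
BENCHMARK_CONFIG = """
edw_slots_benchmark:
  description: Sample EDW run with dynamically provisioned compute.
  edw_compute_resource:
    type: bigquery_slots
    cloud: GCP
"""
```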
9 changes: 2 additions & 7 deletions perfkitbenchmarker/container_service.py
@@ -414,7 +414,7 @@ def GetOrBuild(self, image):
# manifest inspect inspects the registry's copy
inspect_cmd = ['docker', 'manifest', 'inspect', full_image]
_, _, retcode = vm_util.IssueCommand(
inspect_cmd, suppress_warning=True, raise_on_failure=False)
inspect_cmd, raise_on_failure=False)
if retcode == 0:
return full_image
self._Build(image)
@@ -552,12 +552,7 @@ def DeployContainerService(self, name, container_spec, num_containers):

def GetSamples(self):
"""Return samples with information about deployment times."""
samples = []
if self.resource_ready_time and self.create_start_time:
samples.append(
sample.Sample('Cluster Creation Time',
self.resource_ready_time - self.create_start_time,
'seconds'))
samples = super().GetSamples()
for container in itertools.chain(*list(self.containers.values())):
metadata = {'image': container.image.split('/')[-1]}
if container.resource_ready_time and container.create_start_time:
6 changes: 5 additions & 1 deletion perfkitbenchmarker/data/cloudbigtable/hbase-site.xml.j2
@@ -9,7 +9,11 @@
</property>
<property>
<name>hbase.client.connection.impl</name>
<value>com.google.cloud.bigtable.hbase{{ hbase_version }}.BigtableConnection</value>
<value>com.google.cloud.bigtable.hbase{{ hbase_major_version }}_x.BigtableConnection</value>
</property>
<property>
<name>hbase.client.async.connection.impl</name>
<value>org.apache.hadoop.hbase.client.BigtableAsyncConnection</value>
</property>
<property>
<name>google.bigtable.project.id</name>
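
The template now keys the connection class off the HBase major version. A
quick sketch of the substitution, rendering the relevant snippet with jinja2
directly (how PKB itself fills the template is not shown here):

```python
import jinja2

TEMPLATE = jinja2.Template(
    '<value>com.google.cloud.bigtable.hbase'
    '{{ hbase_major_version }}_x.BigtableConnection</value>')

# hbase_major_version=1 -> ...bigtable.hbase1_x.BigtableConnection
# hbase_major_version=2 -> ...bigtable.hbase2_x.BigtableConnection
for major_version in (1, 2):
  print(TEMPLATE.render(hbase_major_version=major_version))
```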
@@ -1,9 +1,11 @@
<!--
Dummy project to be able to install
com.google.cloud.bigtable:bigtable-hbase-1.x and all its required
com.google.cloud.bigtable:bigtable-hbase-{1,2}.x and all its required
dependencies for cloud_bigtable_ycsb_benchmark. See
https://mvnrepository.com/artifact/com.google.cloud.bigtable/bigtable-hbase-1.x
for the dependencies of the client.
or
https://mvnrepository.com/artifact/com.google.cloud.bigtable/bigtable-hbase-2.x
for the dependencies of the respective client.
-->
<project>
<modelVersion>4.0.0</modelVersion>
@@ -16,7 +18,7 @@
<dependencies>
<dependency>
<groupId>com.google.cloud.bigtable</groupId>
<artifactId>bigtable-hbase-1.x</artifactId>
<artifactId>bigtable-hbase-{{ hbase_major_version }}.x</artifactId>
<version>{{ google_bigtable_client_version }}</version>

<!--Exclude SLF4J as it is included by another dependency.-->
@@ -1,6 +1,6 @@
#!/bin/bash

wget https://github.com/TPC-Council/HammerDB/releases/download/v4.0/HammerDB-4.0-Linux.tar.gz
curl -LO 'https://github.com/TPC-Council/HammerDB/releases/download/v4.0/HammerDB-4.0-Linux.tar.gz'
echo 'fa9c4e2654a49f856cecf63c8ca9be5b HammerDB-4.0-Linux.tar.gz' > hammerdb.md5

if ! md5sum -c hammerdb.md5
@@ -1,6 +1,6 @@
#!/bin/bash

wget https://github.com/TPC-Council/HammerDB/releases/download/v4.3/HammerDB-4.3-Linux.tar.gz
curl -LO 'https://github.com/TPC-Council/HammerDB/releases/download/v4.3/HammerDB-4.3-Linux.tar.gz'

sudo tar -zxvf HammerDB-4.3-Linux.tar.gz -C /var/lib/google

@@ -1,6 +1,6 @@
#!/bin/bash

wget https://github.com/TPC-Council/HammerDB/releases/download/v4.5/HammerDB-4.5-Linux.tar.gz
curl -LO 'https://github.com/TPC-Council/HammerDB/releases/download/v4.5/HammerDB-4.5-Linux.tar.gz'

sudo tar -zxvf HammerDB-4.5-Linux.tar.gz -C /var/lib/google

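
The scripts above swap `wget` for `curl -LO`, which follows GitHub's release
redirects and keeps the remote filename. For reference, a sketch of the same
download-plus-checksum step in Python, using the URL and MD5 from the 4.0
script (the 4.3 and 4.5 scripts skip the checksum):

```python
import hashlib
import urllib.request

URL = ('https://github.com/TPC-Council/HammerDB/releases/download/'
       'v4.0/HammerDB-4.0-Linux.tar.gz')
EXPECTED_MD5 = 'fa9c4e2654a49f856cecf63c8ca9be5b'

# urllib follows the release redirect, like curl -L.
filename, _ = urllib.request.urlretrieve(URL, 'HammerDB-4.0-Linux.tar.gz')
with open(filename, 'rb') as f:
  digest = hashlib.md5(f.read()).hexdigest()
if digest != EXPECTED_MD5:
  raise SystemExit(f'checksum mismatch: {digest}')
```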