Benchmark-runner: How to?

Add any new Python code

If you need to add any new Python code in any directory, you must create an __init__.py file in that directory if it does not already exist. If you don't, that code will not be propagated into the release package.

To check this, run the following command:

$ ls -l $(git ls-files |grep '\.py$' |grep -v '/__init__\.py$' | xargs dirname | sort -n |uniq | sed 's,$,/__init__.py,') 2>&1 >/dev/null

If there is any output, e.g.

ls: cannot access 'tests/unittest/benchmark_runner/common/template_operations/__init__.py': No such file or directory

you need to create an empty file by that name and git add it.
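
The same check can also be expressed in Python. The sketch below is an illustrative equivalent of the shell pipeline above, not a script that ships with the repository:

# Illustrative sketch: list tracked directories that contain .py files
# but are missing an __init__.py (equivalent to the shell pipeline above).
import subprocess
from pathlib import Path

tracked = subprocess.run(['git', 'ls-files'], capture_output=True, text=True, check=True).stdout.splitlines()
py_dirs = {Path(f).parent for f in tracked if f.endswith('.py') and not f.endswith('__init__.py')}
for d in sorted(py_dirs):
    if not (d / '__init__.py').exists():
        print(f"missing: {d / '__init__.py'}")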

Add a new workload, modify workload parameters, or change parameters for any CI job

The unit tests include a check to ensure that the generated .yaml files do not inadvertently change. This check, located in tests/unittest/benchmark_runner/common/templates/test_golden_files.py, compares these files against expected files found in tests/unittest/benchmark_runner/common/workloads_flavors/golden_files and fails if any golden files have been added, modified, or removed.
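
For intuition, a golden-file check of this kind compares each generated file against its stored counterpart and reports anything missing, unexpected, or changed. The sketch below is illustrative only; it is not the contents of test_golden_files.py, and the directory arguments are placeholders:

# Illustrative sketch of a golden-file comparison (not the actual test code).
# `generated_dir` holds freshly rendered .yaml files, `golden_dir` the expected copies.
import filecmp
from pathlib import Path

def compare_golden(generated_dir: Path, golden_dir: Path):
    generated = {p.relative_to(generated_dir) for p in generated_dir.rglob('*.yaml')}
    golden = {p.relative_to(golden_dir) for p in golden_dir.rglob('*.yaml')}
    missing = golden - generated        # expected files that were not generated
    unexpected = generated - golden     # generated files with no golden counterpart
    changed = [p for p in generated & golden
               if not filecmp.cmp(generated_dir / p, golden_dir / p, shallow=False)]
    return missing, unexpected, changed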

If you remove any YAML files, you must identify the changed files and git rm them before committing the result.

If you add or modify any YAML files, you must regenerate and test the golden files by running make from the top level of the tree.

The make command automatically runs a check to verify the golden files; this check is also run as part of the unit tests. The output will look like this:

$ make all
============================== test session starts ===============================
platform linux -- Python 3.9.5, pytest-6.2.2, py-1.10.0, pluggy-0.13.1 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /home/rkrawitz/sandbox/benchmark-runner
plugins: typeguard-2.10.0, venv-0.2
collected 1 item

tests/unittest/benchmark_runner/common/templates/test_golden_files.py::test_golden_files PASSED [100%]

=============================== 1 passed in 1.85s ================================

If the check succeeds, you need to add and commit the golden files:

$ git add tests/unittest/benchmark_runner/common/templates/golden_files
$ git commit -m "Update golden files"

This test uses synthetic environment variables that you do not need to modify. You should never have to edit the golden files by hand, except to remove any that are no longer required.

Examples of changes that require updating the golden files:

  • Adding, removing, or changing any YAML file or template under benchmark_runner
  • Adding a new workload or removing an existing one
  • Adding a new CI flavor or removing an existing one

If the test fails, it reports lists of files that failed the comparison, files that are missing, unexpected files that are present, and files that could not be compared for some reason. If files have changed and you verify that the changes are correct, git add the appropriate files as discussed above (usually you can simply git add the golden_files directory). If you have removed .yaml files, you must git rm them manually.

You should *never* modify the golden files manually.

Add new benchmark operator workload to benchmark runner

This section also applies to modifying an existing workload, including any template .yaml files.

  1. git clone https://github.com/redhat-performance/benchmark-runner
  2. cd benchmark-runner
  3. Install prerequisites (these commands assume RHEL/CentOS/Fedora):
    • dnf install make
    • dnf install python3-pip
  4. Open benchmark_runner/benchmark_operator/benchmark_operator_workloads.py
  5. Create new workload methods for Pod and VM under the BenchmarkOperatorWorkloads class in benchmark_runner/benchmark_operator/benchmark_operator_workloads.py. They can be duplicated from an existing workload method (e.g. def stressng_pod or def stressng_vm), with the workload run steps customized accordingly (see the sketch after this list)
  6. Create a dedicated <workload> class (WorkloadPod or WorkloadVM) in a dedicated module <workload>_pod.py or <workload>_vm.py and customize the workload run steps accordingly, e.g. benchmark_runner/benchmark_operator/stressng_pod.py
  7. Add the workload method names (<workload>_pod/<workload>_vm) to environment_variables_dict['workloads'] in benchmark_runner/main/environment_variables.py
  8. Create a workload folder in the benchmark_runner/common/template_operations/templates directory and create the following files in it:
    1. Add a workload data template for configuration parameters, e.g. benchmark_runner/common/template_operations/templates/stressng/stressng_data_template.yaml.
    2. The data template is structured as discussed below.
    3. Add the workload Pod and VM custom resource templates inside benchmark_runner/common/template_operations/templates/stressng/internal_data
  9. Add the workload folder paths to MANIFEST.in: one path for the workload data template .yaml and one for the 'internal_data' Pod and VM template .yaml files, e.g.
      include benchmark_runner/common/template_operations/templates/stressng/*.yaml
      include benchmark_runner/common/template_operations/templates/stressng/internal_data/*.yaml
    
  10. Add tests for all new methods you write under tests/integration.
  11. Update the golden unit test files as described above
  12. To test and debug a workload, configure benchmark_runner/main/environment_variables.py
  13. Fill in the parameters: workload, kubeadmin_password, pin_node_benchmark_operator, pin_node1, pin_node2, elasticsearch, elasticsearch_port
  14. Run /benchmark_runner/main/main.py and verify that the workload runs correctly
  15. The workload can be monitored and checked through the 'current run' folder inside the run's workload flavor (default flavor: 'test_ci')
  16. Open the Kibana URL and verify that the workload index is populated with data:
  17. Create the workload index: Kibana -> Hamburger tab -> Stack Management -> Index patterns -> Create index pattern -> workload-results -> timestamp -> Done
  18. Verify that the workload-results index is populated: Kibana -> Hamburger tab -> Discover -> workload-results (index) -> verify that there is new data
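
As a rough illustration of steps 5 and 6, a new Pod workload method added to the BenchmarkOperatorWorkloads class might follow the same shape as the existing stressng methods (and the vdbench example shown later in this document). The names my_workload_pod and MyWorkloadPod are placeholders, and the decorators come from the project itself:

# Illustrative sketch only; `my_workload_pod` and `MyWorkloadPod` are placeholder names,
# and @typechecked / @logger_time_stamp are the project's own decorators.
@typechecked
@logger_time_stamp
def my_workload_pod(self, name: str = ''):
    """
    This method runs the (hypothetical) my_workload Pod workload
    """
    if name == '':
        name = self.my_workload_pod.__name__
    run = MyWorkloadPod()  # hypothetical class in benchmark_runner/benchmark_operator/my_workload_pod.py
    run.my_workload_pod(name=name)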

Add new custom workload to benchmark runner

This section also applies to modifying an existing workload, including any template .yaml files.

  1. git clone https://github.com/redhat-performance/benchmark-runner
  2. cd benchmark-runner
  3. Install prerequisites (these commands assume RHEL/CentOS/Fedora):
    • dnf install python3-pip
  4. Create the workload Dockerfile, for example (a skeleton with the requirements as comments; see the Python sketch after this list for the output wrapping):
    FROM quay.io/centos/centos:stream8
    # Entrypoint: a shell or Python script that runs the workload
    # The result must be emitted as JSON (redirected to stdout)
    # The JSON output must be wrapped with begin/end workload stamps:
    #   start_stamp='@@~@@START-WORKLOAD@@~@@'
    #   end_stamp='@@~@@END-WORKLOAD@@~@@'
    
  5. Upload image to quay.io
  6. Create the workload Pod yaml, example:
    kind: Pod
    apiVersion: v1
    metadata:
      name: vdbench-pod
      namespace: default
    spec:
      containers:
        - name: vdbench-pod
          image: quay.io/ebattat/centos-stream8-vdbench5.04.07-pod:latest
          imagePullPolicy: "Always"
          volumeMounts:
            - name: vdbench-pvc
              mountPath: "/workload"
          env:
            - name: BLOCK_SIZES
    
    Example pod output:
    '@@~@@START-WORKLOAD@@~@@'
    {
      "workload": "Name",
      "Run": "1",
      "Thread": 1,
      "IOPS": "30"
    }
    '@@~@@END-WORKLOAD@@~@@'
    
  7. Benchmark-runner - add the workload templates in benchmark_runner/common/template_operations/templates
    1. Create a workload directory, for example benchmark_runner/common/template_operations/templates/vdbench
    2. Create the workload data template, for example benchmark_runner/common/template_operations/templates/vdbench/vdbench_data_template.yaml; put all the data that should be substituted by Jinja here
    3. Create the custom pod template benchmark_runner/common/template_operations/templates/vdbench/internal_data/vdbench_pod_template.yaml
  8. Create the workload method in benchmark_runner/workloads/workloads.py and a dedicated workload class:
    1. Add the custom workload method, example:
       @typechecked
       @logger_time_stamp
       def vdbench_pod(self, name: str = ''):
           """
           This method runs the vdbench pod workload
           :return:
           """
           if name == '':
               name = self.vdbench_pod.__name__
           run = VdbenchPod()
           run.vdbench_pod(name=name)
    
    2. Add the custom workload class in benchmark_runner/workloads/vdbench_pod.py; copy the whole class and adapt its functionality
  9. Add the workload method names (<workload>_pod/<workload>_vm) to environment_variables_dict['workloads'] in benchmark_runner/main/environment_variables.py
  10. Add the workload folder paths to MANIFEST.in: one path for the workload data template .yaml and one for the 'internal_data' Pod and VM template .yaml files, e.g.
      include benchmark_runner/common/template_operations/templates/vdbench/*.yaml
      include benchmark_runner/common/template_operations/templates/vdbench/internal_data/*.yaml
  11. Add tests for all new methods you write under tests/integration.
  12. Update the golden unit test files as described above
  13. To test and debug a workload, configure benchmark_runner/main/environment_variables.py
  14. Fill in the parameters: workload, kubeadmin_password, pin_node_benchmark_operator, pin_node1, pin_node2, elasticsearch, elasticsearch_port
  15. Run /benchmark_runner/main/main.py and verify that the workload runs correctly
  16. The workload can be monitored and checked through the 'current run' folder inside the run's workload flavor (default flavor: 'test_ci')
  17. Open the Kibana URL and verify that the workload index is populated with data:
  18. Create the workload index: Kibana -> Hamburger tab -> Stack Management -> Index patterns -> Create index pattern -> workload-results -> timestamp -> Done
  19. Verify that the workload-results index is populated: Kibana -> Hamburger tab -> Discover -> workload-results (index) -> verify that there is new data
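
As referenced in step 4, the workload container must print its result as JSON wrapped in the begin/end stamps. A minimal illustrative Python sketch of that output wrapping (the result fields are placeholders taken from the example output above) is:

# Illustrative sketch: emit the workload result as JSON on stdout,
# wrapped in the begin/end workload stamps benchmark-runner looks for.
import json

start_stamp = '@@~@@START-WORKLOAD@@~@@'
end_stamp = '@@~@@END-WORKLOAD@@~@@'

result = {'workload': 'Name', 'Run': '1', 'Thread': 1, 'IOPS': '30'}
print(start_stamp)
print(json.dumps(result))
print(end_stamp)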

Add workload to grafana dashboard

  1. Create Elasticsearch data source
    1. Grafana -> Configuration(setting icon) -> Data source -> add data source -> Elasticsearch
      1. Name: Elasticsearch-workload-results
      2. URL: http://elasticsearch.com:port
      3. Index name: workload-results
      4. Time field name: timestamp (remove @)
      5. Version: 7.10+
      6. Save & test
  2. Open grafana dashboard benchmark-runner-report:
    1. Open grafana
    2. Create(+) -> import -> paste grafana/func/benchmark-runner-report.json -> Load
    3. Create a panel from scratch or duplicate an existing one (stressng/uperf)
    4. Configure the workload-related metrics
    5. Save dashboard -> Share -> Export -> View JSON -> Copy to clipboard -> overwrite the existing grafana/func/benchmark-runner-report.json
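
Before building panels, it can help to confirm that the workload-results index actually contains documents. The sketch below is illustrative only and uses Elasticsearch's standard _count endpoint; the host and port are placeholders:

# Illustrative sketch: count documents in the 'workload-results' index via
# Elasticsearch's _count REST endpoint. Host and port below are placeholders.
import json
import urllib.request

elasticsearch = 'elasticsearch.example.com'   # placeholder Elasticsearch host
elasticsearch_port = 80                       # placeholder port
url = f'http://{elasticsearch}:{elasticsearch_port}/workload-results/_count'
with urllib.request.urlopen(url) as resp:
    print('workload-results documents:', json.load(resp)['count'])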

Data template

The data template is a structured YAML file, organized as follows:

shared_data:
  <shared_data>
run_type_data:
  perf_ci:
    <perf_ci_data>
  func_ci:
    <func_ci_data>
  default:
    <data for other run types>
kind_data:
  vm:
    <vm_data>
    run_type_data:
      perf_ci:
        <vm_data_for_perf_ci>
      default:
        <vm_data_for_other_run_types>
  default:
    <data_for_other_kinds>
    run_type_data:
      perf_ci:
        <other_kind_data_for_perf_ci>
      default:
        <other_kind_data_for_other_run_types>

The shared_data section is mandatory, but all other sections are optional. Generally, the run_type data for func_ci and test_ci is identical, so only perf_ci data needs to be specified explicitly, with the otherwise shared data placed under default. Similarly, the kata and pod kinds use identical data, so only vm data needs to be specified separately.

Boilerplate data that is independent of workload has been moved to common.yaml at top level in the templates directory.
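
Conceptually, these sections behave as layered dictionaries: shared values first, then run-type-specific values, then kind-specific values. The sketch below shows one plausible merge order; it is an assumption for illustration only, not the actual resolution logic in the template code:

# Rough illustration of how the data-template sections *might* be layered.
# The merge order shown here is an assumption; the real resolution logic
# lives in benchmark_runner's template code.
def resolve_template_data(template: dict, run_type: str, kind: str) -> dict:
    def run_type_layer(section: dict) -> dict:
        rt = section.get('run_type_data', {})
        return rt.get(run_type, rt.get('default', {}))

    data = dict(template.get('shared_data', {}))       # mandatory base layer
    data.update(run_type_layer(template))              # top-level run_type_data
    kinds = template.get('kind_data', {})
    kind_section = kinds.get(kind, kinds.get('default', {}))
    data.update({k: v for k, v in kind_section.items() if k != 'run_type_data'})
    data.update(run_type_layer(kind_section))          # kind-specific run_type_data
    return data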

Monitor and debug workload

  1. git clone https://github.com/redhat-performance/benchmark-runner
  2. cd benchmark-runner
  3. It is strongly recommended that you create a Python virtual environment for this work:
   $ python3 -m venv venv
   $ . venv/bin/activate
   $ pip3 install -r requirements.txt

When you are finished working, you should deactivate your virtual environment:

   $ deactivate

If you wish to resume work, you merely need to reactivate your virtual environment:

   $ . venv/bin/activate
  4. There are two options to run a workload:

    1. Run workload through /benchmark_runner/main/main.py
      1. Pass all mandatory parameters in benchmark_runner/main/environment_variables.py or set their equivalent variables in the environment (command line options override environment variables):
        1. --workload (WORKLOAD) = e.g. stressng_pod
        2. --runner-path (RUNNER_PATH) = path to local cloned benchmark-operator (e.g. /home/user/)
          1. git clone -b v1.0.2 https://github.com/cloud-bulldozer/benchmark-operator (inside runner_path)
        3. --kubeadmin-password (KUBEADMIN_PASSWORD)
        4. --pin-node-benchmark-operator (PIN_NODE_BENCHMARK_OPERATOR) - benchmark-operator node selector
        5. --pin-node1 (PIN_NODE1) - workload first node selector
        6. --pin-node2 (PIN_NODE2) - workload second node selector (for workloads with a client and server, e.g. uperf)
        7. --elasticsearch (ELASTICSEARCH) - Elasticsearch URL without the http prefix
        8. --elasticsearch-port (ELASTICSEARCH_PORT) - Elasticsearch port
      2. Run /benchmark_runner/main/main.py with appropriate command line options or environment variables. For example:
        python3 benchmark_runner/main/main.py --runner-path=/parent/of/benchmark-runner --workload=stressng_pod --kubeadmin-password=password --pin-node-benchmark-operator=worker-0 --pin-node1=worker-1 --pin-node2=worker-2 --elasticsearch=<elasticsearch-url> --elasticsearch-port=80
        
        or
        RUNNER_PATH=/parent/of/benchmark-runner WORKLOAD=stressng_pod KUBEADMIN_PASSWORD=password PIN_NODE_BENCHMARK_OPERATOR=worker-0 PIN_NODE1=worker-1 PIN_NODE2=worker-2 ELASTICSEARCH=<elasticsearch-url> ELASTICSEARCH_PORT=80 python3 benchmark_runner/main/main.py
        
      3. Verify that benchmark-runner runs the workload
    2. Run workload through integration/unittest tests [using pytest]
      1. Set all mandatory parameters in tests/integration/benchmark_runner/test_environment_variables.py or in the environment:
        1. git clone -b v1.0.2 https://github.com/cloud-bulldozer/benchmark-operator (inside 'RUNNER_PATH')
        2. KUBEADMIN_PASSWORD
        3. PIN_NODE1 - workload first node selector
        4. ELASTICSEARCH - elasticsearch url without http prefix
        5. ELASTICSEARCH_PORT - elasticsearch port
      2. Run the selected test using pytest, e.g. tests/integration/benchmark_runner/common/oc/test_oc.py
        1. Enable pytest in PyCharm (File -> Settings -> Tools -> Python Integrated Tools -> Testing -> pytest -> OK), then run the selected test
        2. Run pytest from the terminal: python -m pytest -v tests/ (install it first with pip install pytest)
  5. There are three separate flavors of test: test-ci, func-ci, and perf-ci. These are intended, respectively, for testing, for automated functional testing of benchmark-runner itself, and for the performance measurements themselves. The default is test-ci. These are distinct from any particular test environment; as noted above under [adding new workloads](#Add-new-benchmark-operator-workload-to-benchmark-runner), they also use different template files. The flavor can be selected via the command line option --run-type or the environment variable RUN_TYPE.

    When using a shared Elasticsearch instance (not documented here), it's important not to use the perf-ci run type, as doing so will contaminate the index of the shared Elasticsearch database. There are two ways to use the perf-ci flavor safely:

    1. Pass --stop-when-workload-finish=true on the command line or set STOP_WHEN_WORKLOAD_FINISH=True in the environment when running the workload.
    2. Use a different, private ElasticSearch instance.

Determine the version of benchmark-runner in the current container image

The version of https://pypi.org/project/benchmark-runner/ should match the version in setup.py, and the container image tags at https://quay.io/repository/ebattat/benchmark-runner?tab=tags should also match that version. However, if the version on PyPI is not picked up quickly enough, the container image may remain stale, which can result in unexpected errors.

To check the version of benchmark-runner in the container image, start a shell in the latest container image and check the version with pip:

# # If the command below results in an error, you may need to
# # podman ps -a |grep benchmark-runner
# # podman rm $(podman ps -a |grep benchmark-runner |awk '{print $1}')
# # and repeat the command
# podman rmi quay.io/ebattat/benchmark-runner
Untagged: quay.io/ebattat/benchmark-runner:latest
# podman run --rm -it quay.io/ebattat/benchmark-runner:latest /bin/bash
Trying to pull quay.io/ebattat/benchmark-runner:latest...
Getting image source signatures
...
[root@ede12c01460d /]# pip show benchmark-runner
Name: benchmark-runner
Version: 1.0.195
Summary: Benchmark Runner Tool
...
[root@ede12c01460d /]# cd /usr/local/lib/python3.9/site-packages/benchmark_runner

*If the version reported via pip does not match the expected version, the image build did not happen correctly. Please contact the development team for assistance.*
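
If you prefer not to compare the versions by eye, the installed package can be checked against the latest release on PyPI from inside the container. The sketch below is illustrative and uses the public PyPI JSON API:

# Illustrative sketch: compare the installed benchmark-runner version with the
# latest release published on PyPI (via the public PyPI JSON API).
import json
import urllib.request
from importlib.metadata import version

installed = version('benchmark-runner')
with urllib.request.urlopen('https://pypi.org/pypi/benchmark-runner/json') as resp:
    latest = json.load(resp)['info']['version']
print(f'installed: {installed}, latest on PyPI: {latest}')
if installed != latest:
    print('Version mismatch: the container image or the installed package may be stale.')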