The Sensitivity Profile Experiment can be run in two modes:
- Standalone - Swan runs workloads as standalone processes.
- Kubernetes - Swan runs workloads as Kubernetes Pods.
The experiment must be run by a privileged user, so that it can apply isolation to the workloads.
When the experiment is run, a UUID like 5df7fa72-add4-44a2-67fa-31668bcafe81
is shown. It is the identifier for this experiment and the key to retrieve the experiment data.
Swan exposes a multitude of configuration flags for fine-grained experiment control. To list all flags, please run `memcached-sensitivity-profile -config-dump`. The dumped config can later be used to run the experiment:

```sh
sudo memcached-sensitivity-profile -config-dump > config.ini
sudo memcached-sensitivity-profile -config config.ini  # Config supplied to the experiment.
```
Below is an example configuration using environment variables to set up the experiment where the machines are configured in the following topology:
Machine | Role |
---|---|
192.168.10.1 | SUT node for Swan Experiment |
192.168.10.2 | Load Generator agent node #1 |
192.168.10.3 | Load Generator agent node #2 |
192.168.10.4 | Load Generator agent node #3 |
192.168.10.5 | Load Generator agent node #4 |
192.168.10.10 | Services node (also Load Generator Master) |
Binaries should be installed on those machines as described in the Installation guide.
In this example, the SUT node has 32 hyper-threads over 16 physical cores on 2 sockets. Per the topology description shown in the Theory section, this leaves 4 CPU threads for the Best Effort workloads and 4 for memcached.
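If you are unsure of your own SUT topology, a quick way to check it (standard Linux tooling, not part of Swan) is:

```sh
# Show total CPUs, threads per core, cores per socket, and sockets on the SUT node.
lscpu | grep -E '^(CPU\(s\)|Thread\(s\) per core|Core\(s\) per socket|Socket\(s\))'
```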
Please paste the configuration snippet below into the config.ini file. Some variables need to be changed before running the experiment:
- `REMOTE_SSH_USER` and `REMOTE_SSH_KEY_PATH` should point to a user and key that are authorized to SSH into every machine in the experiment cluster.
- `MEMCACHED_USER` should be the same as `REMOTE_SSH_USER`.
- `MEMCACHED_LISTENING_ADDRESS` should be the SUT node address.
- `EXPERIMENT_PEAK_LOAD` should be set appropriately for the SUT machine. If a tiny VM is used, it might be as low as 100000; if a high-end server machine is used, it might be as high as millions of requests per second.
- `EXPERIMENT_SLO` should be set appropriately for the SUT machine (in microseconds). In a VM environment, the default value should be increased 10x.
- When the SUT node has HyperThreads, `EXPERIMENT_BE_WORKLOADS` should contain the `stress-ng-cache-l1` entry to examine Memcached sensitivity to L1 cache interference.
```
# Log level for Swan: debug, info, warn, error, fatal, panic
# Default: info
LOG_LEVEL=info

## --- Remote SSH Access ---
REMOTE_SSH_USER=root
REMOTE_SSH_KEY_PATH=/root/.ssh/id_rsa

## --- Best Effort Workloads ---
# Best Effort workloads that will be run sequentially in colocation with the High Priority workload.
# When the experiment is run on a machine with HyperThreads, the user can also add 'stress-ng-cache-l1' to this list.
EXPERIMENT_BE_WORKLOADS=stress-ng-cache-l3,stress-ng-memcpy,stress-ng-stream,caffe
EXPERIMENT_RUN_CAFFE_WITH_L3_CACHE_ISOLATION=false

## --- Experiment Configuration ---
# Highest load that the SUT machine can handle without breaking the SLO.
EXPERIMENT_PEAK_LOAD=600000
# Given SLO for the HP workload in the experiment, in microseconds.
EXPERIMENT_SLO=500
# Duration of each measurement.
EXPERIMENT_LOAD_DURATION=15s
# Each load point is a fraction of the peak load.
EXPERIMENT_LOAD_POINTS=10
# Number of times each load point will be repeated.
EXPERIMENT_REPETITIONS=1

## --- Isolation ---
EXPERIMENT_HP_WORKLOAD_CPU_COUNT=4
EXPERIMENT_BE_WORKLOAD_CPU_COUNT=4

## --- Memcached Configuration ---
MEMCACHED_LISTENING_ADDRESS=192.168.10.1
MEMCACHED_USER=root
MEMCACHED_THREADS=4
MEMCACHED_THREADS_AFFINITY=false

## --- Mutilate Configuration ---
# Master
EXPERIMENT_MUTILATE_MASTER_ADDRESS=192.168.10.10
MUTILATE_MASTER_THREADS=8
MUTILATE_MASTER_CONNECTIONS=8
# Agents
EXPERIMENT_MUTILATE_AGENT_ADDRESSES=192.168.10.2,192.168.10.3,192.168.10.4,192.168.10.5

## --- Snap Configuration ---
SNAPTELD_ADDRESS=http://192.168.10.1:8181

## --- Cassandra Configuration ---
CASSANDRA_ADDRESS=192.168.10.10

## --- Kubernetes Configuration ---
# Uncomment the following flags to run workloads on Kubernetes.
# The experiment will ramp up a cluster on the SUT and Services nodes.
# KUBERNETES=true
# KUBERNETES_CLUSTER_RUN_CONTROL_PLANE_ON_HOST=192.168.10.10
```
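To get a feel for how `EXPERIMENT_PEAK_LOAD` and `EXPERIMENT_LOAD_POINTS` relate, the sketch below assumes load points are evenly spaced fractions of the peak load; the exact spacing Swan uses may differ:

```sh
# Illustrative only: print the target requests per second for each load point,
# assuming even spacing up to the configured peak load.
peak=600000; points=10
for i in $(seq 1 "$points"); do
  echo "load point $i: $(( peak * i / points )) QPS"
done
```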
Before running `memcached-sensitivity-profile`, please ensure that:
- Cassandra is up and running on the Services node.
- Snapteld is running on the SUT node.
- The Mutilate binary is compiled on every host.
- From the SUT node, the user passed in the `REMOTE_SSH_USER` flag can connect via SSH to the other nodes using key-based authorization.
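A few hypothetical one-liners (not part of Swan) that can help verify these prerequisites from the SUT node; adjust addresses, ports, and user to your cluster:

```sh
# Key-based SSH from the SUT node to an agent node (repeat for every node).
ssh -o BatchMode=yes root@192.168.10.2 'echo ssh-ok'
# Cassandra reachable on the Services node (9042 is the default CQL port).
nc -z 192.168.10.10 9042 && echo "Cassandra reachable"
# Snapteld REST API responding on the SUT node (endpoint assumed from snapteld's REST API).
curl -sf http://192.168.10.1:8181/v1/plugins > /dev/null && echo "snapteld reachable"
```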
If everything is ready, simply launch:

```sh
sudo memcached-sensitivity-profile -config config.ini
```

Note the UUID that is printed on stdout and wait for the experiment to finish.
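If you want to keep the UUID for later without copying it by hand, one hypothetical approach is to save the output and extract it afterwards:

```sh
# Save the experiment output; the UUID is printed on stdout.
sudo memcached-sensitivity-profile -config config.ini 2>&1 | tee experiment.log
# Extract the first UUID-shaped string from the saved log.
grep -oE '[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}' experiment.log | head -n 1
```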
When the experiment is complete, the results can be retrieved from Cassandra. Swan ships with a Jupyter Notebook which provides an environment for loading the samples and generating sensitivity profiles.
For instructions on how to run Jupyter Notebook, please refer to the Jupyter user guide.
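To quickly check that samples actually landed in Cassandra, a hedged sketch using `cqlsh` is shown below; the keyspace and table names (`snap.metrics`) are assumptions based on the Snap Cassandra publisher defaults and may differ in your deployment:

```sh
# Query a few raw samples on the Services node (schema details are assumptions).
cqlsh 192.168.10.10 -e "SELECT ns, host, time, doubleval FROM snap.metrics LIMIT 10;"
```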
A few pointers to validate the experiment data:
- Baseline measurements should not violate the SLO at any load point.
- At low loads, the numbers may not differ between the baseline and colocated scenarios. The difference should be in when saturation occurs: for the colocated scenarios, it should become evident at higher loads. If this does not occur, it might mean that Memcached has not been properly baselined.
- Grey boxes show that Memcached or Mutilate could not achieve the requested number of requests per second.
Below is an example of what the sensitivity profile could look like:
The Load row is a percentage of the peak load. Each cell in the table expresses the SLI, the 99th percentile response time at that load, as a percentage of the SLO. For instance, the Baseline at 5% load with an SLO of 500µs tells us that 99 percent of requests responded in no more than 160µs, which is 32% of the SLO. Thus, an SLI above 100% means a violation of the SLO. In the presented table, Caffe and memBW are relatively weak aggressors and lead to SLO violations only at higher loads, while Stream 100M is very aggressive and leads to SLO violations even at low memcached loads.
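The arithmetic behind a single cell, using the numbers from the example above:

```sh
# SLI cell value = 99th percentile latency as a percentage of the SLO.
p99_us=160; slo_us=500
awk -v p99="$p99_us" -v slo="$slo_us" 'BEGIN { printf "SLI = %.0f%% of SLO\n", 100 * p99 / slo }'
# Prints: SLI = 32% of SLO
```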
The most important configuration flags are listed on the Experiment Configuration page. Other experiment flags can be found on the All Experiment Flags page.
Please move on to the Tuning page.