------------------------------------------------------------------
| OVERVIEW |
------------------------------------------------------------------
The stresstest tool is used to connect with and configure remote
machines on the network for running stress tests with sleepgraph.
------------------------------------------------------------------
| USAGE |
------------------------------------------------------------------
The tool provides several commands that configure remote machines
for sleepgraph stress testing.
Usage: stresstest.py -config configfile -kernel version [command]
All the arguments listed by stresstest.py -h can also be set in the
config file, which is the preferred method. Place any arguments that
are unlikely to change into a single config file to keep the command
line short. The tool reads them from the [setup] section.
First create a machines file with a list of machines in it.
Each line should include hostname, ip address, and username.
For example:
otcpl-hp-spectre-tgl 192.168.1.243 labuser
otcpl-lenovo-tix1-tgl 192.168.1.232 labuser
otcpl-galaxy-book-10 192.168.1.210 labuser
otcpl-asus-e300-apl 192.168.1.172 labuser
otcpl-asus-e200-cht 192.168.1.202 labuser
otcpl-hp-x360-bsw 192.168.1.215 labuser
The machines file is passed in with the -machines arg. Each command
takes its own set of arguments, described in the COMMANDS section.
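
As an illustration of the machines file format described above, here is
a minimal Python sketch that reads "hostname ip username" lines. The
Machine type and parse_machines function are hypothetical names made up
for this example; they are not part of stresstest.py, and the sketch
assumes a fresh file with no status flags yet.

```python
from typing import List, NamedTuple


class Machine(NamedTuple):
    host: str
    addr: str
    user: str


def parse_machines(text: str) -> List[Machine]:
    """Parse 'hostname ip username' lines, skipping blank lines."""
    machines = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue
        if len(fields) != 3:
            raise ValueError(f"line {lineno}: expected 3 fields, got {len(fields)}")
        machines.append(Machine(*fields))
    return machines


sample = (
    "otcpl-hp-spectre-tgl 192.168.1.243 labuser\n"
    "otcpl-asus-e300-apl 192.168.1.172 labuser\n"
)

for m in parse_machines(sample):
    # Print the ssh-style target for each entry
    print(f"{m.user}@{m.addr} ({m.host})")
```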
------------------------------------------------------------------
| COMMANDS |
------------------------------------------------------------------
init - initialize the machines file and remove any state data
    Required args:
        -machines file       input/output file with host/ip/user list and status
    Optional args:
        -kernel string       kernel version to install from a package in pkgout

build - build a linux kernel from source into a deb/rpm package
    Required args:
        -pkgfmt type         kernel package format [rpm/deb] (default: deb)
        -ksrc folder         kernel source folder (required to build)
    Optional args:
        -pkgout folder       output folder for kernel packages (default: ksrc/..)
        -kname string        kernel name as "<version>-<name>" (default: <version>)
        -kcfg folder         config & patches folder (default: use .config in ksrc)
        -ktag gittag         kernel source git tag (default: no change)

bisect - bisect a kernel by building locally and testing remotely
    Required args:
        -pkgfmt type         kernel package format [rpm/deb] (default: deb)
        -ksrc folder         kernel source folder (required to build)
        -kcfg folder         folder with patches & kernel config (1st *.config)
        -kgood tag           The good kernel commit/tag
        -kbad tag            The bad kernel commit/tag
        -ktest file          An exec script which determines good or bad on target
        -host hostname       hostname of target machine used for testing
        -addr ip             ip address or hostname.domain of remote machine
        -user username       username to use to ssh to remote machine
    Optional args:
        -userinput           allow user interaction when input is required
        -pkgout folder       output folder for kernel packages (default: ksrc/..)

online - test target machines to verify identity and connectivity
    Required args:
        -machines file       input/output file with host/ip/user list and status
    Optional args:
        -userinput           allow user interaction when executing remote commands

install - install the kernel packages & tools on multiple systems
    Required args:
        -machines file       input/output file with host/ip/user list and status
        -pkgfmt type         kernel package format [rpm/deb] (default: deb)
        -pkgout folder       output folder for kernel packages (default: ksrc/..)
        -kernel string       kernel version to install from a package in pkgout

ready - check target systems to see if they're ready to test
    Required args:
        -machines file       input/output file with host/ip/user list and status
    Optional args:
        -userinput           allow user interaction when executing remote commands

run - run a set of tests on these systems via ssh, store the data on host
    Required args:
        -machines file       input/output file with host/ip/user list and status
        -kernel string       kernel version to install from a package in pkgout
        -mode suspendmode    suspend mode to test with sleepgraph on remote machine
                             can be mem, freeze, or all
        [use either count or duration]
        -count count         maximum sleepgraph iterations to run
        -duration minutes    maximum duration in minutes to iterate sleepgraph
    Optional args:
        -testout folder      output folder for test data (default: .)
        -resetcmd cmdstr     optional command used to reset the remote machine
                             (used on offline/hung machines with "online"/"run")
        -failmax count       maximum consecutive sleepgraph fails before testing
                             stops

runmulti - spawn a sleepgraph -multi run on a target system and exit
    Required args:
        -machines file       input/output file with host/ip/user list and status
        -kernel string       kernel version to install from a package in pkgout
        -mode suspendmode    suspend mode to test with sleepgraph on remote machine
                             can be mem, freeze, or all
        [use either count or duration]
        -count count         maximum sleepgraph iterations to run
        -duration minutes    maximum duration in minutes to iterate sleepgraph
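
The "use either count or duration" rule for run and runmulti can be
pictured with a small argparse sketch. This is only an illustration of
the constraint; how stresstest.py actually parses and validates its
arguments is not shown in this README, and parse_run_args is a
hypothetical name.

```python
import argparse


def parse_run_args(argv):
    """Accept -mode plus exactly one of -count or -duration."""
    parser = argparse.ArgumentParser(prog="run-args-sketch")
    parser.add_argument("-mode", required=True, choices=["mem", "freeze", "all"])
    # Exactly one of the two limits must be supplied
    limit = parser.add_mutually_exclusive_group(required=True)
    limit.add_argument("-count", type=int, help="maximum sleepgraph iterations")
    limit.add_argument("-duration", type=int, help="maximum duration in minutes")
    return parser.parse_args(argv)


args = parse_run_args(["-mode", "freeze", "-count", "2000"])
print(args.mode, args.count, args.duration)
```

Passing both -count and -duration, or neither, makes argparse exit
with a usage error in this sketch.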
------------------------------------------------------------------
| RUNMULTI EXAMPLE |
------------------------------------------------------------------
A typical run involves building a kernel from source, checking
if the machines are online, installing that kernel on online
machines, waiting for reboot, checking if the kernel is running,
and then firing off a run. First begin with the config file:
------------------------------------------------------------------
[setup]
# Kernel package format
# Set kernel package format [deb/rpm] (default: deb)
pkgfmt: deb
# Kernel package output folder
# Place build output files here (default: ksrc/..)
pkgout: ~/workspace/packages
# Kernel source
# Kernel source folder location (default: required to build)
ksrc: ~/workspace/linux
# Kernel config folder
# Folder with config file and patches to apply to ksrc (default: no change)
kcfg: ~/workspace/stressconfig
# Kernel git tag
# If ksrc is a git repo, set to this tag (default: no change)
# Use "latestrc" to select the newest release candidate
ktag: latestrc
# remove kernels
# These are the kernels we want removed prior to install
rmkernel: [5-6]\.[0-9]*\.[0-9]*-rc[0-9]*\+
# Machines file
# Text file with list of machine/ip values for testing
# Lines will be prepended with status as setup/test occurs
machines: ~/workspace/stressconfig/machine.txt
# Test output folder
# Place test output files and logs here (required for run/status)
testout: ~/workspace/pm-graph-test
------------------------------------------------------------------
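
The config above uses an INI-style layout with a [setup] section and
"key: value" lines. As a rough sketch of how such a file can be read,
the following uses Python's standard configparser module. Whether
stresstest.py reads its config this way is an assumption, and
load_setup is a hypothetical helper written for this example.

```python
import configparser
import os

SAMPLE = r"""[setup]
# Kernel package format
pkgfmt: deb
pkgout: ~/workspace/packages
ksrc: ~/workspace/linux
ktag: latestrc
rmkernel: [5-6]\.[0-9]*\.[0-9]*-rc[0-9]*\+
"""


def load_setup(text):
    """Return the [setup] section as a dict, expanding ~ in folder values."""
    # interpolation=None keeps regex values such as rmkernel untouched
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    setup = dict(parser["setup"])
    for key in ("pkgout", "ksrc", "kcfg", "machines", "testout"):
        if key in setup:
            setup[key] = os.path.expanduser(setup[key])
    return setup


for key, value in load_setup(SAMPLE).items():
    print(f"{key} = {value}")
```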
1) Build the kernel package
Build the kernel package in your source tree; the example config selects
the latestrc tag. The tool first fully cleans the kernel source tree with
a make distclean. It then looks in your kcfg folder for the kernel config
(the first file it finds named *.config) and applies any patches it finds
there (anything named *.patch). When it's done building, it moves the
packages to the pkgout folder. Make note of the kernel version for later
calls.
./stresstest.py -config my.cfg build
2) Initialize the machines file to remove data from previous runs
Remember, the machines file is both the input and the status file.
The init command generates a fresh copy of the machines file
named machine-kernel.txt. As you run commands, the tool adds a status
flag to each entry so that it can remember the state for subsequent
command calls. If the machine-kernel.txt file already exists, the
init call clears out all the flags.
./stresstest.py -config my.cfg -kernel 6.1.0-rc2-intel-next+ init
3) Check which machines are online
Loop through the machines file and test whether each machine is online
and reachable over ssh. Each machine that passes is flagged with an O in
the machines file. Each time you run "online", only the unflagged
machines are checked, so you can work on them until all are online.
./stresstest.py -config my.cfg -kernel 6.1.0-rc2-intel-next+ online
4) Install the kernel package on all online systems and reboot
Install the kernel on all systems that have an O in front of them in the
machines file. The "install" command also uninstalls any kernels that
match the "rmkernel" regex argument before installing the new one. This
ensures the system doesn't end up with hundreds of kernels. If successful,
it changes O to I in the machines file. If install failed, it leaves it
as O and provides a log. You can run install multiple times as it will
only run on systems flagged as O.
./stresstest.py -config my.cfg -kernel 6.1.0-rc2-intel-next+ install
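
To see which kernels the "rmkernel" regex from the example config would
select, it can be tried against a list of version strings. The sketch
below assumes the regex must match the whole version string (Python's
re.fullmatch); how the tool itself applies the pattern is not stated
here. The installed list and kernels_to_remove are made up for
illustration.

```python
import re

# rmkernel value from the example config
RMKERNEL = r"[5-6]\.[0-9]*\.[0-9]*-rc[0-9]*\+"


def kernels_to_remove(installed, pattern=RMKERNEL):
    """Return the version strings that the pattern matches in full."""
    rx = re.compile(pattern)
    return [k for k in installed if rx.fullmatch(k)]


installed = [
    "5.19.0-rc7+",
    "6.1.0-rc2+",
    "6.1.0-rc2-intel-next+",
    "5.15.0-52-generic",
]
print(kernels_to_remove(installed))
# → ['5.19.0-rc7+', '6.1.0-rc2+']
```

Under this assumption the pattern selects release-candidate kernels
built without a custom name; a kernel such as 6.1.0-rc2-intel-next+
would need a broader pattern to be removed.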
5) Check that the systems are ready to run
After waiting for a minute or so, check if the installed kernels are running
on each system that is flagged O or I. If the kernel is found, the flag is
changed to R and the system is ready to run tests on. This can be run multiple
times as systems may require time to boot or require maintenance.
./stresstest.py -config my.cfg -kernel 6.1.0-rc2-intel-next+ ready
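
The status flags described in steps 3 to 5 form a small state machine:
unflagged, then O (online), then I (installed), then R (ready). The
following sketch models only the transitions stated in this README. It
is an illustrative model, not the tool's code, and the names
TRANSITIONS and next_flag are hypothetical.

```python
# (current flag, command) -> flag after the command succeeds
TRANSITIONS = {
    ("", "online"): "O",    # unflagged machine found online
    ("O", "install"): "I",  # kernel package installed
    ("O", "ready"): "R",    # expected kernel already running
    ("I", "ready"): "R",    # installed kernel is now running
}


def next_flag(flag, command, succeeded=True):
    """Return the flag after running a command; failures keep the flag."""
    if not succeeded:
        return flag
    return TRANSITIONS.get((flag, command), flag)


flag = ""
for command in ("online", "install", "ready"):
    flag = next_flag(flag, command)
    print(command, "->", flag)
```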
6) Run sleepgraph -multi stress testing on all machines
Run stress tests on the machines flagged as ready (with R). You must
decide which mode to run (freeze/mem/all) and either how many iterations
with the -count arg, or how long to loop for with the -duration arg.
The tool exits once the machines are running sleepgraph -multi.
./stresstest.py -config my.cfg -kernel 6.1.0-rc2-intel-next+ -mode freeze -count 2000 runmulti
7) Get the data from the sleepgraph -multi runs
When testing is complete, you can download the result data from the
machines flagged as ready (R). The data is first compressed into
tarballs and then copied with scp into the testout folder.
./stresstest.py -config my.cfg -kernel 6.1.0-rc2-intel-next+ getmulti
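
The seven steps above share one config file and, after the build, one
kernel version string. A small sketch that assembles the command lines
in order may help when scripting the sequence. It only builds and
prints the argument lists and does not run anything; waiting for
reboots and for the test run to finish is left out, and
walkthrough_commands is a hypothetical helper.

```python
import shlex


def walkthrough_commands(config, kernel, mode, count):
    """Return the argv lists for the walkthrough steps, in order."""
    base = ["./stresstest.py", "-config", config]
    with_kernel = base + ["-kernel", kernel]
    return [
        base + ["build"],
        with_kernel + ["init"],
        with_kernel + ["online"],
        with_kernel + ["install"],
        with_kernel + ["ready"],
        with_kernel + ["-mode", mode, "-count", str(count), "runmulti"],
        with_kernel + ["getmulti"],
    ]


for argv in walkthrough_commands("my.cfg", "6.1.0-rc2-intel-next+", "freeze", 2000):
    print(shlex.join(argv))
```

In practice the kernel version is only known once the build step has
finished, so a real script would read it from the build output before
issuing the remaining commands.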
------------------------------------------------------------------
| BISECT EXAMPLE |
------------------------------------------------------------------
Bisect a kernel source tree, built locally, tested remotely
./stresstest.py
-kgood 4ce1b97949cbf46e847722461386170e0f709c59 # the known good commit
-kbad b7270c69a36efc61ed6ebd31a8a458f354a6edc0 # the known bad commit
-pkgfmt deb # package format to build the kernel in
-ktest bisect-test.sh # executable script to test good/bad on remote system
# last output line should be "GOOD" or "BAD"
-ksrc ~/workspace/linux # a kernel source tree with no patches applied
-kcfg ~/workspace/stressconfig # a folder that includes the latest.config file
# bisect loads the first *.config file it sees
-host mytestmachine # target machine hostname
-user myusername # target machine username
-addr 192.168.1.23 # target machine ip addr
-pkgout ~/workspace/bisecttest # local folder to store the kernel bisect packages
# bisect builds each step N as linux-version-bisectN.deb
-userinput # allow the tool to ask for input if things go awry
# required if you don't include -ktest, it also
# allows you to fix issues without having to restart
bisect
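
The -ktest script reports its result on the last line of its output,
which should be "GOOD" or "BAD". A minimal sketch of interpreting such
output is shown below. The parse_verdict function is hypothetical, and
ignoring trailing blank lines is an assumption made for this example.

```python
def parse_verdict(output):
    """Return True for GOOD, False for BAD, None if the verdict is unclear."""
    lines = [ln.strip() for ln in output.splitlines() if ln.strip()]
    if not lines:
        return None
    last = lines[-1]
    if last == "GOOD":
        return True
    if last == "BAD":
        return False
    return None


print(parse_verdict("suspend ok\nresume ok\nGOOD\n"))
print(parse_verdict("suspend ok\nresume hung\nBAD\n"))
print(parse_verdict("ssh: connection refused\n"))
```

An unclear verdict (None here) is the kind of situation where the
-userinput option lets you step in instead of restarting the bisect.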
The following is the basic process that stresstest bisect follows:
Fully clean and reset the kernel source tree
1) git -C [ksrc] checkout .
2) git -C [ksrc] checkout master
3) git -C [ksrc] pull
4) make -C [ksrc] distclean
5) git -C [ksrc] bisect reset
Start the bisect
6) git -C [ksrc] bisect start
7) git -C [ksrc] bisect good [kgood]
8) git -C [ksrc] bisect bad [kbad]
Apply any patches found in -kcfg and copy the config
9) patch -N -d [ksrc] -i [kcfg]/*.patch -p1
10) cp [kcfg]/latest.config [ksrc]/.config
Build the kernel package (loop from here for each bisect step <N>)
11) make -C [ksrc] olddefconfig
12) make -C [ksrc] bin[pkgfmt]-pkg LOCALVERSION=-bisect<N>
Send the packages to the target, install and reboot (ubuntu example)
13) scp packages to [user]@[address]:/tmp/
14) ssh [user]@[address] "sudo dpkg -i /tmp/*pkg"
15) ssh [user]@[address] "sudo grub-set-default kver"
16) ssh [user]@[address] "sudo reboot"
Send the -ktest script to the target and run it (output is GOOD or BAD)
17) scp [ktest] to [user]@[address]:/tmp/
18) ssh [user]@[address] "/tmp/[ktest]"
Proceed with the bisect; clean the tree and reapply the patches to avoid conflicts
19) git -C [ksrc] checkout .
20) git -C [ksrc] bisect good/bad
21) patch -N -d [ksrc] -i [kcfg]/*.patch -p1
Build the next kernel package or end a successful bisect
22) Go to step 11 and loop until the bisect is done
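
The build/test loop in steps 11 to 22 is driven by git bisect, which
narrows the range between the good and bad commits by roughly half at
each step. The sketch below simulates that idea on a plain linear list
of commits. It is a simplified model: real git bisect works on the
commit graph and chooses the test commits itself, and bisect_commits
and the sample commit names are made up for illustration.

```python
def bisect_commits(commits, is_good):
    """Find the first bad commit in a list ordered oldest to newest.

    commits[0] is known good and commits[-1] is known bad. is_good stands
    in for one build, install, reboot and -ktest cycle on the target.
    Returns the first bad commit and the number of test cycles used.
    """
    lo, hi = 0, len(commits) - 1
    steps = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        steps += 1
        if is_good(commits[mid]):
            lo = mid   # like "git bisect good"
        else:
            hi = mid   # like "git bisect bad"
    return commits[hi], steps


commits = [f"c{i}" for i in range(16)]
# Pretend the regression was introduced in c11
culprit, steps = bisect_commits(commits, lambda c: int(c[1:]) < 11)
print(culprit, steps)
# → c11 4
```

Each step costs a full kernel build and a reboot of the target, which
is why the number of steps matters more than the search logic itself.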