Rework model/batch size configuration again #14

Status: Open. Wants to merge 30 commits into base branch `dev`.

Changes from all 30 commits:
- a3eb3c0: update benchmarks repo (jeremybobbin, Jul 4, 2020)
- f442879: report average - not total (jeremybobbin, Jul 4, 2020)
- 33dd6fc: make adjustments for reworked report.sh script (jeremybobbin, Jul 4, 2020)
- 3a3e9a5: commentary (Apr 17, 2020)
- c21198f: report gigabytes instead of gibibytes (Apr 20, 2020)
- dada317: Change GPU VRAM's(given in MiB) divisor to 1024(to GiB) (jeremybobbin, Jun 12, 2020)
- 1680bed: Panic when GPU_RAM is undefined or doesn't have a configuration (jeremybobbin, Jun 12, 2020)
- 0b1521e: Use lambda repo for benchmarks (chuanli11, Jun 11, 2020)
- 94fb779: Update README.md (chuanli11, Jun 12, 2020)
- 48bd745: run only python3 (jeremybobbin, Jun 24, 2020)
- c00df0e: Add amd support to tf2 (chuanli11, Jun 12, 2020)
- 043088d: calculate optimal batchsize (jeremybobbin, Jul 3, 2020)
- c68f645: fmt (jeremybobbin, Jul 3, 2020)
- 5f9ba51: consider precision in batchsize calculation (jeremybobbin, Jul 3, 2020)
- 8fecc5c: expose fn to calculate batch size instead of manipulating directly (jeremybobbin, Jul 3, 2020)
- f8ff897: move batch_size function to main script (jeremybobbin, Jul 3, 2020)
- b44486e: Pass the name of the config file as a parameter (jeremybobbin, Jul 4, 2020)
- 0387e05: add fp32 resnet50 config (jeremybobbin, Jul 4, 2020)
- a305b75: assert that GPUs are available before running benchmarks (jeremybobbin, Jul 3, 2020)
- 5ac3702: write benchmark entries to log.csv (jeremybobbin, Jul 3, 2020)
- 1503085: add timestamp (jeremybobbin, Jul 3, 2020)
- d05d847: better CPU name (jeremybobbin, Jul 3, 2020)
- 950d5f2: don't allow different GPU models to be run (jeremybobbin, Jul 3, 2020)
- f544630: refactor GPU homogeny check & setting of GPU_NAME (jeremybobbin, Jul 3, 2020)
- 8de9c6f: only swap whitespace in CPU_NAME for log dir (jeremybobbin, Jul 3, 2020)
- f89cfe5: add CPU_NAME to csv (jeremybobbin, Jul 3, 2020)
- 7378643: log motherboard name (jeremybobbin, Jul 3, 2020)
- bf6b04f: adjust Tensorflow version in README (jeremybobbin, Jul 4, 2020)
- 077d165: add options to benchmark - adjust readme accordingly (jeremybobbin, Jul 4, 2020)
- 02e4d1d: wrap min to max GPU seq loop around main loop (jeremybobbin, Jul 4, 2020)
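Several of the commits above (043088d "calculate optimal batchsize", 5f9ba51 "consider precision in batchsize calculation", 8fecc5c "expose fn to calculate batch size") derive the batch size from per-GPU VRAM and numeric precision. The PR's actual formula is not visible in this page, so the sketch below only illustrates the shape of such a calculation; the function name and the per-GiB constants are invented:

```shell
#!/bin/sh
# Illustrative only: pick a batch size from VRAM (in MiB, as nvidia-smi
# reports it) and precision. The per-GiB constants are made up.
batch_size() {
  vram_mib=$1
  precision=$2

  # Mirrors commit 1680bed: fail loudly when GPU RAM is unknown.
  [ -n "$vram_mib" ] || { echo "error: GPU_RAM is undefined" >&2; return 1; }

  gib=$((vram_mib / 1024))   # MiB -> GiB, as in commit dada317

  # Mirrors commit 5f9ba51: fp16 samples take roughly half the memory,
  # so twice as many fit per GiB.
  case $precision in
    fp16) per_gib=16 ;;
    fp32) per_gib=8 ;;
    *) echo "error: unknown precision: $precision" >&2; return 1 ;;
  esac

  echo $((gib * per_gib))
}

batch_size 11019 fp32   # an 11 GiB 2080 Ti -> prints 80 with these constants
```

The point of exposing this as a function (commit 8fecc5c) is that callers ask for a batch size rather than having the script mutate configuration directly.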
3 changes: 2 additions & 1 deletion .gitmodules
```diff
@@ -1,3 +1,4 @@
 [submodule "benchmarks"]
 	path = benchmarks
-	url = http://github.com/lambdal/benchmarks
+	url = https://github.com/tensorflow/benchmarks.git
+	branch = master
```
50 changes: 41 additions & 9 deletions README.md
```diff
@@ -5,7 +5,7 @@ This is the code used for a few of the blog posts on: https://lambdalabs.com/blo

 Environment:
 - OS: Ubuntu 18.04
-- TensorFlow version: 1.14.0
+- TensorFlow version: 1.15.3
 - CUDA Version 10.0
 - CUDNN Version 7.6.2
```
````diff
@@ -19,15 +19,47 @@ git clone https://github.com/lambdal/lambda-tensorflow-benchmark.git
 #### Step Two: Run benchmark with thermal profile

 ```
-TF_XLA_FLAGS=--tf_xla_auto_jit=2 ./batch_benchmark.sh min_num_gpus max_num_gpus num_runs num_batches_per_run thermal_sampling_frequency
+./benchmark.sh -l <min_num_gpus> -h <max_num_gpus> -n <num_runs> -b <num_batches_per_run> -t <thermal_sampling_frequency>
 python display_thermal.py path-to-thermal.log --thermal_threshold

-# example of benchmarking 4 2080_Ti (all used), 1 run, 200 batches per run, measuring thermal every 2 second. 2080_Ti throttles at 89 C.
-TF_XLA_FLAGS=--tf_xla_auto_jit=2 ./batch_benchmark.sh 4 4 1 200 2
-python display_thermal.py i9-7920X-GeForce_RTX_2080_Ti.logs/resnet152-syn-replicated-fp32-4gpus-32-1-thermal.log --thermal_threshold 89
+# example of benchmarking 4 2080_Ti (all used), 1 run, 100 batches per run, measuring thermals every 2 seconds. The 2080_Ti throttles at 89 C.
+./benchmark.sh -l 4 -h 4 -n 1 -b 100 -t 2 -c config_resnet50_replicated_fp32_train_syn
+python display_thermal.py path-to-thermal/1 --thermal_threshold 89

 ```

+#### AMD
+
+Follow the guidance [here](https://github.com/ROCmSoftwarePlatform/tensorflow-upstream)
+
+```
+alias drun='sudo docker run \
+      -it \
+      --network=host \
+      --device=/dev/kfd \
+      --device=/dev/dri \
+      --ipc=host \
+      --shm-size 16G \
+      --group-add video \
+      --cap-add=SYS_PTRACE \
+      --security-opt seccomp=unconfined \
+      -v $HOME/dockerx:/dockerx'
+
+drun rocm/tensorflow:latest
+
+apt install rocm-libs hipcub miopen-hip
+pip3 install --user tensorflow-rocm --upgrade
+pip3 install tensorflow
+
+cd /home/dockerx
+git clone https://github.com/lambdal/lambda-tensorflow-benchmark.git --recursive
+git checkout tf2
+git submodule update --init --recursive
+
+./benchmark.sh -l 1 -h 1 -n 1 -b 100 -t 2 -c config_resnet50_replicated_fp32_train_syn
+```
+
 #### Note

 Use large num_batches_per_run for a thorough test.
````
````diff
@@ -38,23 +70,23 @@ Use large num_batches_per_run for a thorough test.
 * Input proper gpu_indices (a comma separated list, default 0) and num_iterations (default 10)
 ```
 cd lambda-tensorflow-benchmark
-./benchmark.sh gpu_indices num_iterations
+./benchmark.sh -i <gpu_indices> -n <num_iterations>
 ```

 #### Step Three: Report results

 * Check the repo directory for folder \<cpu>-\<gpu>.logs (generated by benchmark.sh)
 * Use the same num_iterations and gpu_indices for both benchmarking and reporting.
 ```
-./report.sh <cpu>-<gpu>.logs num_iterations gpu_indices
+./report.sh <cpu>-<gpu>.logs
 ```

 #### Batch process:

 ```
-TF_XLA_FLAGS=--tf_xla_auto_jit=2 ./batch_benchmark.sh min_num_gpus max_num_gpus num_iterations
+TF_XLA_FLAGS=--tf_xla_auto_jit=2 ./benchmark.sh -l <min_num_gpus> -h <max_num_gpus> -n <num_iterations>

-./batch_report.sh <cpu>-<gpu>.logs min_num_gpus max_num_gpus num_iterations
+./report.sh <cpu>-<gpu>.logs

 ./gether.sh
 ```
````
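Beyond the workflow changes shown in the README diff, commits 5ac3702, 1503085, and f89cfe5 in this PR add per-run CSV logging (benchmark entries, a timestamp, and CPU_NAME). A hedged sketch of what such logging might look like; the column set and function name here are illustrative guesses, not the PR's actual schema:

```shell
#!/bin/sh
# Illustrative only: append one benchmark entry to log.csv.
log_entry() {
  csv=$1 cpu=$2 gpu=$3 config=$4 throughput=$5
  # Write the header once, then append one row per benchmark run.
  [ -f "$csv" ] || echo "timestamp,cpu_name,gpu_name,config,throughput" > "$csv"
  printf '%s,%s,%s,%s,%s\n' "$(date +%s)" "$cpu" "$gpu" "$config" "$throughput" >> "$csv"
}

log_entry log.csv i9-7920X GeForce_RTX_2080_Ti config_resnet50_replicated_fp32_train_syn 314.2
```

An append-only CSV like this lets `gether.sh` (or any spreadsheet) aggregate results across machines without parsing the per-run log directories.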
25 changes: 0 additions & 25 deletions batch_benchmark.sh

This file was deleted.

26 changes: 0 additions & 26 deletions batch_report.sh

This file was deleted.
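Finally, commits 950d5f2 and f544630 make the benchmark refuse to run across mixed GPU models and factor that check out of the main loop while setting GPU_NAME. A minimal sketch of such a homogeneity check; the function name and structure are illustrative, not the PR's code:

```shell
#!/bin/sh
# Illustrative only: fail unless every reported GPU model matches,
# then record the single model name in GPU_NAME.
check_gpu_homogeneity() {
  # Args: GPU model names, e.g. from:
  #   nvidia-smi --query-gpu=name --format=csv,noheader
  distinct=$(printf '%s\n' "$@" | sort -u | wc -l)
  if [ "$distinct" -ne 1 ]; then
    echo "error: mixed GPU models detected; refusing to run" >&2
    return 1
  fi
  GPU_NAME=$1
}
```

Refusing mixed models keeps the log directory name `<cpu>-<gpu>.logs` and the per-model batch-size choice unambiguous.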
