Performance Regression or Improvement: pytorch_image_classification_benchmarks-resnet152-GPU-mean_inference_batch_latency_micro_secs:mean_inference_batch_latency_micro_secs #27986

Closed
github-actions bot opened this issue Aug 13, 2023 · 4 comments
Labels: awaiting triage, perf-alert (Automatically filed performance-related alerts.)

Comments

@github-actions
Contributor

Performance change found in the test: pytorch_image_classification_benchmarks-resnet152-GPU-mean_inference_batch_latency_micro_secs for the metric: mean_inference_batch_latency_micro_secs.

For more information on how to triage the alerts, please look at the Triage performance alert issues section of the README.

Test description: Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU.
Test link - test : 'apache_beam.testing.benchmarks.inference.pytorch_image_classification_benchmarks'
Test dashboard - http://104.154.241.245/d/ZpS8Uf44z/python-ml-runinference-benchmarks?from=now-90d&to=now&viewPanel=2


timestamp: Sun Aug 13 20:17:18 2023, metric_value: 5302767.89
timestamp: Fri Aug 11 20:23:28 2023, metric_value: 3739466.59
timestamp: Thu Aug 10 20:41:05 2023, metric_value: 3760460.05
timestamp: Wed Aug  9 20:22:26 2023, metric_value: 4072217.74 <---- Anomaly
timestamp: Sun Aug  6 20:11:28 2023, metric_value: 3332504.52
timestamp: Sat Aug  5 20:15:50 2023, metric_value: 3040610.81
timestamp: Fri Aug  4 20:17:43 2023, metric_value: 2852621.14
timestamp: Thu Aug  3 20:33:48 2023, metric_value: 2628943.45
timestamp: Wed Aug  2 15:48:49 2023, metric_value: 2590113.42
timestamp: Tue Aug  1 20:21:43 2023, metric_value: 2911238.56
timestamp: Tue Jul 18 20:18:01 2023, metric_value: 3031364.80
timestamp: Mon Jul 17 20:17:35 2023, metric_value: 3884104.27
timestamp: Sun Jul 16 20:12:41 2023, metric_value: 2915562.42
timestamp: Fri Jul 14 20:20:19 2023, metric_value: 3197632.95
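
For triage, it can help to put a number on the shift around the flagged point. The following is a minimal sketch, not the detection method the alerting tool itself uses: it parses the listing above (newest run first) and compares the median latency of the runs before the anomaly with the median from the anomaly onward. The names `parse_alert` and `change_at_anomaly` are hypothetical helpers introduced only for this illustration.

```python
import re
from statistics import median

# Alert listing copied from this comment; newest run first.
ALERT_TEXT = """\
timestamp: Sun Aug 13 20:17:18 2023, metric_value: 5302767.89
timestamp: Fri Aug 11 20:23:28 2023, metric_value: 3739466.59
timestamp: Thu Aug 10 20:41:05 2023, metric_value: 3760460.05
timestamp: Wed Aug  9 20:22:26 2023, metric_value: 4072217.74 <---- Anomaly
timestamp: Sun Aug  6 20:11:28 2023, metric_value: 3332504.52
timestamp: Sat Aug  5 20:15:50 2023, metric_value: 3040610.81
timestamp: Fri Aug  4 20:17:43 2023, metric_value: 2852621.14
timestamp: Thu Aug  3 20:33:48 2023, metric_value: 2628943.45
timestamp: Wed Aug  2 15:48:49 2023, metric_value: 2590113.42
timestamp: Tue Aug  1 20:21:43 2023, metric_value: 2911238.56
timestamp: Tue Jul 18 20:18:01 2023, metric_value: 3031364.80
timestamp: Mon Jul 17 20:17:35 2023, metric_value: 3884104.27
timestamp: Sun Jul 16 20:12:41 2023, metric_value: 2915562.42
timestamp: Fri Jul 14 20:20:19 2023, metric_value: 3197632.95
"""

LINE_RE = re.compile(
    r"timestamp:\s*(?P<ts>.+?),\s*metric_value:\s*(?P<value>\d+(?:\.\d+)?)"
    r"(?P<flag>\s*<-+\s*Anomaly)?"
)


def parse_alert(text):
    """Return (timestamp_string, value_in_microseconds, is_anomaly) per line."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            rows.append((m["ts"], float(m["value"]), m["flag"] is not None))
    return rows


def change_at_anomaly(rows):
    """Compare medians before and from the flagged run (rows are newest first)."""
    idx = next(i for i, row in enumerate(rows) if row[2])
    after = [value for _, value, _ in rows[: idx + 1]]   # anomaly and newer runs
    before = [value for _, value, _ in rows[idx + 1:]]   # runs older than the anomaly
    med_before, med_after = median(before), median(after)
    return med_before, med_after, (med_after - med_before) / med_before


rows = parse_alert(ALERT_TEXT)
med_before, med_after, rel = change_at_anomaly(rows)
print(f"median before: {med_before / 1e6:.2f} s")
print(f"median from anomaly: {med_after / 1e6:.2f} s")
print(f"relative change: {rel:+.1%}")
```

A positive relative change means latency went up (a regression); a negative one means it went down. With so few runs on either side of the flagged point, treat the figure as a rough indicator rather than a significance test.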

github-actions bot added the awaiting triage and perf-alert labels Aug 13, 2023
riteshghorse self-assigned this Aug 16, 2023
@tvalentyn
Contributor

I left a comment in #27077.
cc: @AnandInguva. Should we disable the GPU benchmarks until we stabilize the signal?

@github-actions
Contributor Author

Performance change found in the test: pytorch_image_classification_benchmarks-resnet152-GPU-mean_inference_batch_latency_micro_secs for the metric: mean_inference_batch_latency_micro_secs.

For more information on how to triage the alerts, please look at the Triage performance alert issues section of the README.

Test description: Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU.
Test link - test : 'apache_beam.testing.benchmarks.inference.pytorch_image_classification_benchmarks'
Test dashboard - http://metrics.beam.apache.org/d/ZpS8Uf44z/python-ml-runinference-benchmarks?from=now-90d&to=now&viewPanel=2


timestamp: Fri Oct 20 07:08:05 2023, metric_value: 2688509.52
timestamp: Fri Oct 20 06:39:18 2023, metric_value: 2458050.99
timestamp: Thu Oct 19 07:17:14 2023, metric_value: 2043164.31 <---- Anomaly
timestamp: Tue Oct 17 20:40:46 2023, metric_value: 3899543.47
timestamp: Tue Oct 17 06:53:08 2023, metric_value: 3570869.75
timestamp: Sun Oct 15 20:34:39 2023, metric_value: 3960132.67
timestamp: Sat Oct 14 20:44:21 2023, metric_value: 3336088.22
timestamp: Sat Oct 14 06:28:48 2023, metric_value: 4056443.40
timestamp: Thu Oct 12 20:23:43 2023, metric_value: 5029367.36
timestamp: Wed Oct 11 20:24:49 2023, metric_value: 3878865.37
timestamp: Wed Oct 11 11:30:25 2023, metric_value: 3437622.54
timestamp: Tue Oct 10 21:59:48 2023, metric_value: 3985866.40
timestamp: Tue Oct 10 18:42:44 2023, metric_value: 3502661.44

Contributor Author

github-actions bot commented Feb 3, 2024

Performance change found in the test: pytorch_image_classification_benchmarks-resnet152-GPU-mean_inference_batch_latency_micro_secs for the metric: mean_inference_batch_latency_micro_secs.

For more information on how to triage the alerts, please look at the Triage performance alert issues section of the README.

Test description: Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU.
Test link - test : 'apache_beam.testing.benchmarks.inference.pytorch_image_classification_benchmarks'
Test dashboard - http://metrics.beam.apache.org/d/ZpS8Uf44z/python-ml-runinference-benchmarks?from=now-90d&to=now&viewPanel=2


timestamp: Sat Feb  3 06:45:12 2024, metric_value: 3653228.19
timestamp: Thu Feb  1 06:48:04 2024, metric_value: 4523925.36
timestamp: Tue Jan 30 06:47:30 2024, metric_value: 5123653.76 <---- Anomaly
timestamp: Mon Jan 29 06:45:18 2024, metric_value: 3329238.64
timestamp: Sun Jan 28 06:39:42 2024, metric_value: 2567131.22
timestamp: Sat Jan 27 06:43:41 2024, metric_value: 3449347.82
timestamp: Fri Jan 26 06:45:21 2024, metric_value: 2972590.06
timestamp: Thu Jan 25 06:44:42 2024, metric_value: 3099430.59
timestamp: Wed Jan 24 06:55:31 2024, metric_value: 3949223.63
timestamp: Mon Jan 22 06:44:51 2024, metric_value: 3136144.88
timestamp: Sun Jan 21 06:45:18 2024, metric_value: 3054494.27
timestamp: Sat Jan 20 06:38:09 2024, metric_value: 2811992.30
timestamp: Fri Jan 19 06:52:12 2024, metric_value: 3285322.64

ritchiepapirin mentioned this issue Feb 4, 2024 (15 tasks)
Contributor Author

Performance change found in the test: pytorch_image_classification_benchmarks-resnet152-GPU-mean_inference_batch_latency_micro_secs for the metric: mean_inference_batch_latency_micro_secs.

For more information on how to triage the alerts, please look at the Triage performance alert issues section of the README.

Test description: Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU.
Test link - test : 'apache_beam.testing.benchmarks.inference.pytorch_image_classification_benchmarks'
Test dashboard - http://metrics.beam.apache.org/d/ZpS8Uf44z/python-ml-runinference-benchmarks?from=now-90d&to=now&viewPanel=2


timestamp: Fri Feb 23 06:53:41 2024, metric_value: 4560157.85
timestamp: Wed Feb 21 07:12:25 2024, metric_value: 4883101.24
timestamp: Tue Feb 20 06:53:46 2024, metric_value: 5398728.80 <---- Anomaly
timestamp: Sun Feb 18 06:41:31 2024, metric_value: 3060535.64
timestamp: Sat Feb 17 06:52:20 2024, metric_value: 2565324.57
timestamp: Fri Feb 16 06:55:15 2024, metric_value: 3789397.17
timestamp: Thu Feb 15 06:58:53 2024, metric_value: 4431068.78
timestamp: Wed Feb 14 06:50:20 2024, metric_value: 3481961.73
timestamp: Mon Feb 12 06:42:22 2024, metric_value: 3489616.96
timestamp: Sat Feb 10 06:45:00 2024, metric_value: 3158261.25
timestamp: Fri Feb  9 06:47:02 2024, metric_value: 2655775.48
timestamp: Thu Feb  8 06:47:05 2024, metric_value: 2885917.85
timestamp: Wed Feb  7 06:52:04 2024, metric_value: 3010852.66

liferoad closed this as completed Mar 5, 2024
github-actions bot added this to the 2.55.0 Release milestone Mar 5, 2024
3 participants