
Add TF MNIST classification cost benchmark #33391

Merged: 6 commits into master on Dec 17, 2024
Conversation

jrmccluskey (Author):
Adds the TensorFlow MNIST classification example as a Dataflow cost benchmark workflow, a second example of a benchmark implementation. Unlike the wordcount example, it has the added wrinkle of additional dependencies that need to be installed on the workers.
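
For context, a minimal sketch of the shape such a pipeline takes, assuming Beam's TensorFlow model handler (TFModelHandlerNumpy). The GCS paths are copied from the workflow hunk reviewed below; the CSV parsing and output path are illustrative assumptions, not the benchmark's exact code:

```python
# Sketch of a TF MNIST RunInference pipeline of the kind this benchmark wraps.
import numpy as np

import apache_beam as beam
from apache_beam.ml.inference.base import KeyedModelHandler, RunInference
from apache_beam.ml.inference.tensorflow_inference import TFModelHandlerNumpy


def parse_line(line):
  # Assumed CSV layout: label first, then the 784 pixel values.
  fields = line.split(',')
  label = int(fields[0])
  pixels = np.array([int(v) for v in fields[1:]], dtype=np.float32) / 255.0
  return label, pixels


model_handler = KeyedModelHandler(
    TFModelHandlerNumpy(model_uri='gs://apache-beam-ml/models/tensorflow/mnist/'))

with beam.Pipeline() as p:
  _ = (
      p
      | 'Read' >> beam.io.ReadFromText(
          'gs://apache-beam-ml/testing/inputs/it_mnist_data.csv')
      | 'Parse' >> beam.Map(parse_line)
      | 'RunInference' >> RunInference(model_handler)
      | 'Format' >> beam.Map(str)
      | 'Write' >> beam.io.WriteToText('/tmp/tf_mnist_predictions'))  # placeholder output
```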



PR bot (Contributor):
Assigning reviewers. If you would like to opt out of this review, comment `assign to next reviewer`:

R: @shunping for label python.
R: @damccorm for label build.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

damccorm (Contributor) left a review:
Just had one comment, otherwise LGTM

The review thread below is attached to this hunk of the new workflow file:

```
-PloadTest.mainClass=apache_beam.testing.benchmarks.inference.tensorflow_mnist_classification_cost_benchmark \
-Prunner=DataflowRunner \
-PpythonVersion=3.10 \
'-PloadTest.args=${{ env.beam_Inference_Python_Benchmarks_Dataflow_test_arguments_1 }} --job_name=benchmark-tests-tf-mnist-classification-python-${{env.NOW_UTC}} --input_file=gs://apache-beam-ml/testing/inputs/it_mnist_data.csv --output_file=gs://temp-storage-for-end-to-end-tests/wordcount/result_tf_mnist-${{env.NOW_UTC}}.txt --model=gs://apache-beam-ml/models/tensorflow/mnist/' \
```
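
For readers unfamiliar with the load-test harness: the -PloadTest.args string is handed to the Python entry point, which would typically split off benchmark-specific flags like --input_file, --output_file, and --model and forward the rest to Beam as pipeline options. A hedged sketch of that common pattern (flag names mirror the hunk above; the actual entry point may differ):

```python
import argparse

# Sketch of the usual Beam pattern for separating benchmark flags from
# pipeline options; not the benchmark's actual entry point.
parser = argparse.ArgumentParser()
parser.add_argument('--input_file', required=True)
parser.add_argument('--output_file', required=True)
parser.add_argument('--model', required=True)
known_args, pipeline_args = parser.parse_known_args()
# pipeline_args (e.g. --job_name) would be passed on to PipelineOptions.
```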
damccorm (Contributor):

Should we consider running multiple benchmarks in the same workflow instead of a workflow per test? The advantage would be fewer things to monitor and maintain.

jrmccluskey (Author):

Eventually we can bundle them either by framework or put them all together in one workflow. This is largely just me building on the wordcount example benchmark with a RunInference-specific instance; the most important distinction is the need to include a requirements file for the Dataflow workers, but the pattern will largely hold for custom containers with CUDA dependencies too.

If we wanted to choose one of those routes, we could do that now and set the workflow up for cron scheduling; I'm not opposed to that.
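
For illustration, the requirements-file mechanism mentioned above is just a standard pipeline option. A minimal sketch, assuming placeholder project, region, and bucket names:

```python
# Sketch only: installing extra dependencies (e.g. TensorFlow) on Dataflow
# workers via a requirements file. All names below are placeholders, not
# the benchmark's actual configuration.
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    runner='DataflowRunner',
    project='my-gcp-project',              # placeholder
    region='us-central1',                  # placeholder
    temp_location='gs://my-bucket/tmp',    # placeholder
    requirements_file='requirements.txt',  # installed on each worker at startup
)
```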

damccorm (Contributor):

I'd be in favor of just doing that now - I think a single workflow will end up being easier to manage, and we can always parallelize via jobs within the workflow if needed

jrmccluskey (Author):

Updated

jrmccluskey (Author):

Easy enough, done.

damccorm (Contributor) left a review:
Thanks!

jrmccluskey (Author):
Build wheel failure is unrelated, merging

jrmccluskey merged commit 0e37501 into master on Dec 17, 2024. 89 of 94 checks passed.
Polber pushed a commit to Polber/beam that referenced this pull request on Dec 17, 2024:

* Add TF MNIST classification cost benchmark
* linting
* Generalize to single workflow file for cost benchmarks
* fix incorrect UTC time in comment
* move wordcount to same workflow
* update workflow job name