docs: fix the header of the scaling test table #4507

Merged
merged 1 commit into from
Dec 26, 2024
15 changes: 8 additions & 7 deletions doc/train/parallel-training.md
@@ -27,13 +27,14 @@ In some cases, it won't work well when scaling the learning rate by worker count
### Scaling test

Testing `examples/water/se_e2_a` on an 8-GPU host, linear acceleration can be observed with the increasing number of cards.

| Num of GPU cards | Seconds every 100 samples | Samples per second | Speed up |
| ---------------- | ------------------------- | ------------------ | -------- |
| 1 | 1.4515 | 68.89 | 1.00 |
| 2 | 1.5962 | 62.65\*2 | 1.82 |
| 4 | 1.7635 | 56.71\*4 | 3.29 |
| 8 | 1.7267 | 57.91\*8 | 6.72 |
In this example, the number of samples per batch on a single GPU card ({ref}`batch_size <training/training_data/batch_size>`) is set to `1`.

| Num of GPU cards | Samples per batch | Seconds every 100 batches | Samples per second | Speed up |
| ---------------- | ----------------- | ------------------------- | ------------------ | -------- |
| 1 | 1 | 1.4515 | 68.89 | 1.00 |
| 2 | 2 | 1.5962 | 62.65\*2 | 1.82 |
| 4 | 4 | 1.7635 | 56.71\*4 | 3.29 |
| 8 | 8 | 1.7267 | 57.91\*8 | 6.72 |
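
The derived columns of the updated table can be reproduced from the timings alone. The following is a minimal sketch, assuming that each card processes one sample per batch (as stated above), that the per-card rate is 100 batches divided by the measured seconds, and that the speed-up is the aggregate rate relative to the single-card rate. The names `seconds_per_100_batches`, `per_card_rate`, and `scaling_rows` are hypothetical and are not part of DeePMD-kit.

```python
# Timings copied from the table: seconds per 100 batches, keyed by GPU count.
seconds_per_100_batches = {1: 1.4515, 2: 1.5962, 4: 1.7635, 8: 1.7267}


def per_card_rate(seconds: float) -> float:
    """Samples per second on one card, assuming one sample per batch per card."""
    return 100.0 / seconds


def scaling_rows(timings: dict) -> list:
    """Return (num_cards, per_card_rate, speed_up) tuples, rounded to 2 decimals."""
    baseline = per_card_rate(timings[1])  # single-card throughput
    rows = []
    for num_cards, seconds in sorted(timings.items()):
        rate = per_card_rate(seconds)
        aggregate = rate * num_cards  # all cards work on different samples
        rows.append((num_cards, round(rate, 2), round(aggregate / baseline, 2)))
    return rows


for num_cards, rate, speed_up in scaling_rows(seconds_per_100_batches):
    print(f"{num_cards} card(s): {rate} samples/s per card, speed-up {speed_up}")
```

Under these assumptions, dividing the speed-up by the number of cards gives the parallel efficiency, which works out to roughly 84% on 8 cards.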

### How to use
