From bd2395cf7a40afd90dc0f203583bdb836f06feda Mon Sep 17 00:00:00 2001
From: Jinzhe Zeng
Date: Thu, 26 Dec 2024 00:22:39 -0500
Subject: [PATCH] docs: fix the header of the scaling test table (#4507)

Fix #4494.

## Summary by CodeRabbit

- **Documentation**
  - Updated the parallel training documentation for TensorFlow and PyTorch to enhance clarity.
  - Expanded explanations on parallel training processes and data loading utilities.
  - Introduced a flowchart to illustrate data flow and modified the scaling tests table format for better understanding.

Signed-off-by: Jinzhe Zeng
---
 doc/train/parallel-training.md | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/doc/train/parallel-training.md b/doc/train/parallel-training.md
index 9ea92b4751..00df0a63f0 100644
--- a/doc/train/parallel-training.md
+++ b/doc/train/parallel-training.md
@@ -27,13 +27,14 @@ In some cases, it won't work well when scaling the learning rate by worker count
 ### Scaling test

 Testing `examples/water/se_e2_a` on an 8-GPU host, linear acceleration can be observed with the increasing number of cards.
-
-| Num of GPU cards | Seconds every 100 samples | Samples per second | Speed up |
-| ---------------- | ------------------------- | ------------------ | -------- |
-| 1                | 1.4515                    | 68.89              | 1.00     |
-| 2                | 1.5962                    | 62.65\*2           | 1.82     |
-| 4                | 1.7635                    | 56.71\*4           | 3.29     |
-| 8                | 1.7267                    | 57.91\*8           | 6.72     |
+In this example, the number of samples per batch on a single GPU card ({ref}`batch_size `) is set to `1`.
+
+| Num of GPU cards | Samples per batch | Seconds every 100 batches | Samples per second | Speed up |
+| ---------------- | ----------------- | ------------------------- | ------------------ | -------- |
+| 1                | 1                 | 1.4515                    | 68.89              | 1.00     |
+| 2                | 2                 | 1.5962                    | 62.65\*2           | 1.82     |
+| 4                | 4                 | 1.7635                    | 56.71\*4           | 3.29     |
+| 8                | 8                 | 1.7267                    | 57.91\*8           | 6.72     |

 ### How to use
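
As a quick sanity check on the revised table (not part of the patch itself), the two derived columns follow directly from the measured timings: samples per second is `samples per batch * 100 / seconds every 100 batches`, and the speed up is that throughput divided by the single-card throughput. The short Python sketch below reproduces the table's numbers; the variable names are illustrative only.

```python
# Illustrative sketch (not part of the patch): recompute the derived columns of
# the scaling-test table from the measured "Seconds every 100 batches" values.
# One sample per batch per GPU card, as stated in the revised documentation.

timings = {1: 1.4515, 2: 1.5962, 4: 1.7635, 8: 1.7267}  # cards -> seconds per 100 batches

single_card_rate = None
for cards, seconds in timings.items():
    per_card_rate = 100 / seconds       # samples per second on each card
    total_rate = per_card_rate * cards  # samples per second across all cards
    if single_card_rate is None:
        single_card_rate = total_rate   # 1-card reference throughput
    print(f"{cards} card(s): {per_card_rate:.2f}*{cards} samples/s, "
          f"speed up {total_rate / single_card_rate:.2f}")
```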