diff --git a/docs/data/how-to/cooperative_groups/thread_hierarchy_coop.svg b/docs/data/how-to/cooperative_groups/thread_hierarchy_coop.svg deleted file mode 100644 index 199f92306a..0000000000 --- a/docs/data/how-to/cooperative_groups/thread_hierarchy_coop.svg +++ /dev/null @@ -1 +0,0 @@ -Grid
Block
Block
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Block
Block
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Block
Block
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Block
Block
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Block
Block
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Block
Block
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
GridMulti Grid
Block
Block
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Text is not SVG - cannot display
\ No newline at end of file diff --git a/docs/data/how-to/cooperative_groups/thread_hierarchy_coop_bottom.drawio b/docs/data/how-to/cooperative_groups/thread_hierarchy_coop_bottom.drawio new file mode 100644 index 0000000000..4f1ff494f2 --- /dev/null +++ b/docs/data/how-to/cooperative_groups/thread_hierarchy_coop_bottom.drawio @@ -0,0 +1,904 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + \ No newline at end of file diff --git a/docs/data/how-to/cooperative_groups/thread_hierarchy_coop_bottom.svg b/docs/data/how-to/cooperative_groups/thread_hierarchy_coop_bottom.svg new file mode 100644 index 0000000000..298cd48218 --- /dev/null +++ b/docs/data/how-to/cooperative_groups/thread_hierarchy_coop_bottom.svg @@ -0,0 +1 @@ +Block
Thread-block tile
Thread-block tile
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Thread-block tile
Thread-block tile
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Block
Coalesced group
Coalesced group
Warp
Warp
Warp
Warp
Warp
Warp
Coalesced group
Coalesced group
Warp
Warp
Warp
Warp
Warp
Warp
Grid
Text is not SVG - cannot display
\ No newline at end of file diff --git a/docs/data/how-to/cooperative_groups/thread_hierarchy_coop.drawio b/docs/data/how-to/cooperative_groups/thread_hierarchy_coop_top.drawio similarity index 100% rename from docs/data/how-to/cooperative_groups/thread_hierarchy_coop.drawio rename to docs/data/how-to/cooperative_groups/thread_hierarchy_coop_top.drawio diff --git a/docs/data/how-to/cooperative_groups/thread_hierarchy_coop_top.svg b/docs/data/how-to/cooperative_groups/thread_hierarchy_coop_top.svg new file mode 100644 index 0000000000..ebe4794576 --- /dev/null +++ b/docs/data/how-to/cooperative_groups/thread_hierarchy_coop_top.svg @@ -0,0 +1 @@ +Grid
Block
Block
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Block
Block
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Block
Block
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Block
Block
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Block
Block
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Block
Block
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
GridMulti Grid
Block
Block
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Warp
Text is not SVG - cannot display
\ No newline at end of file diff --git a/docs/how-to/cooperative_groups.rst b/docs/how-to/cooperative_groups.rst index e70c9d718b..9568ff9f9d 100644 --- a/docs/how-to/cooperative_groups.rst +++ b/docs/how-to/cooperative_groups.rst @@ -21,19 +21,18 @@ The API is accessible in the ``cooperative_groups`` namespace after the ``hip_c Cooperative groups thread model =============================== -The thread hierarchy abstraction of cooperative groups is in -:numref:`coop_thread_hierarchy`. +The thread hierarchy abstraction of cooperative groups are in :ref:`grid hierarchy ` and :ref:`block hierarchy `. -.. _coop_thread_hierarchy: +.. _coop_thread_top_hierarchy: -.. figure:: ../data/how-to/cooperative_groups/thread_hierarchy_coop.svg +.. figure:: ../data/how-to/cooperative_groups/thread_hierarchy_coop_top.svg :alt: Diagram depicting nested rectangles of varying color. The outermost one titled "Grid", inside sets of different sized rectangles layered on one another titled "Block". Each "Block" containing sets of uniform rectangles layered on one another titled "Warp". Each of the "Warp" titled rectangles filled with downward pointing arrows inside. - Cooperative group thread hierarchy. + Cooperative group thread hierarchy in grids. The **multi grid** is an abstraction of potentially multiple simultaneous launches of the same kernel over multiple devices (Deprecated since 5.0). The **grid** in cooperative groups is a single dispatch of kernels for execution like the original grid. @@ -47,8 +46,19 @@ The **block** is the same as the :ref:`inherent_thread_model` block entity. Explicit warp-level thread handling is absent from the Cooperative Groups API. In order to exploit the known hardware SIMD width on which built-in functionality translates to simpler logic, you can use the group partitioning part of the API, such as ``tiled_partition``. +.. _coop_thread_bottom_hierarchy: + +.. figure:: ../data/how-to/cooperative_groups/thread_hierarchy_coop_bottom.svg + :alt: The new level between block thread and threads. + + Cooperative group thread hierarchy in blocks. + +The cooperative groups API introduce a new level between block thread and threads. The :ref:`thread-block tile ` give the opportunity to have tiles in the thread block, while the :ref:`coalesced group ` holds the active threads of the parent group. These groups further discussed in the :ref:`groups types ` section. + For details on memory model, check the :ref:`memory model description `. +.. _coop_group_types: + Group types =========== @@ -109,6 +119,8 @@ The ``num_grids()`` , ``grid_rank()`` , ``thread_rank()``, ``size()``, ``cg_type and ``sync()`` member functions are public of the ``multi_grid_group`` class. For further details check the :ref:`multi_grid_group references ` . +.. _coop_thread_block_tile: + Thread-block tile ------------------ @@ -135,10 +147,20 @@ Constructed via: The ``thread_rank()`` , ``size()``, ``cg_type()``, ``is_valid()``, ``sync()``, ``meta_group_rank()``, ``meta_group_size()``, ``shfl()``, ``shfl_down()``, ``shfl_up()``, ``shfl_xor()``, ``ballot()``, ``any()``, ``all()``, ``match_any()`` and ``match_all()`` member functions are public of the ``thread_block_tile`` class. For further details, check the :ref:`thread_block_tile references ` . +.. _coop_coalesced_groups: + Coalesced groups ------------------ -Threads (64 threads on CDNA and 32 threads on RDNA or NVIDIA GPUs) in a warp cannot execute different instructions simultaneously, so conditional branches are executed serially within the warp. When threads encounter a conditional branch, they can diverge, resulting in some threads being disabled, if they do not meet the condition to execute that branch. The active threads referred as coalesced, and coalesced group represents an active thread group within a warp. +Threads (64 threads on CDNA and 32 threads on RDNA) in a warp cannot execute different instructions simultaneously, so conditional branches are executed serially within the warp. When threads encounter a conditional branch, they can diverge, resulting in some threads being disabled, if they do not meet the condition to execute that branch. The active threads referred as coalesced, and coalesced group represents an active thread group within a warp. + +.. note:: + + The NVIDIA GPU's independent thread scheduling presents the appearance that threads on different branches execute concurrently. + +.. warning:: + + AMD GPUs do not support independent thread scheduling. Some CUDA application can rely on this feature and the ported HIP version on AMD GPUs can deadlock, when they try to make use of independent thread scheduling. This group type also supports sub-wave level intrinsics.