From bff9e7d0233ecfce4d626e0aa07a888ad359401a Mon Sep 17 00:00:00 2001
From: Istvan Kiss
Date: Mon, 18 Nov 2024 21:19:02 +0100
Subject: [PATCH] Fix doc links and fix spelling

---
 docs/how-to/hip_runtime_api.rst                      |  4 ++--
 docs/how-to/hip_runtime_api/error_handling.rst       | 14 +++++++-------
 docs/how-to/hip_runtime_api/initialization.rst       |  8 ++++----
 .../memory_management/coherence_control.rst          | 10 ++++------
 4 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/docs/how-to/hip_runtime_api.rst b/docs/how-to/hip_runtime_api.rst
index f065554cbf..0dcdb857a9 100644
--- a/docs/how-to/hip_runtime_api.rst
+++ b/docs/how-to/hip_runtime_api.rst
@@ -10,7 +10,7 @@ HIP runtime API
 
 The HIP runtime API provides C and C++ functionalities to manage event, stream,
 and memory on GPUs. On the AMD platform, the HIP runtime uses
-:doc:`Compute Language Runtime (CLR) `, while on NVIDIA
+:doc:`Compute Language Runtime (CLR) <./understand/amd_clr>`, while on NVIDIA
 CUDA platform, it is only a thin layer over the CUDA runtime or Driver API.
 
 - **CLR** contains source code for AMD's compute language runtimes: ``HIP`` and
@@ -23,7 +23,7 @@ CUDA platform, it is only a thin layer over the CUDA runtime or Driver API.
   implementation.
 - The **CUDA runtime** is built on top of the CUDA driver API, which is a C API
   with lower-level access to NVIDIA GPUs. For details about the CUDA driver and
-  runtime API with reference to HIP, see :doc:`CUDA driver API porting guide `.
+  runtime API with reference to HIP, see :doc:`CUDA driver API porting guide <./how-to/hip_porting_driver_api>`.
 
 The backends of HIP runtime API under AMD and NVIDIA platform are summarized
 in the following figure:
diff --git a/docs/how-to/hip_runtime_api/error_handling.rst b/docs/how-to/hip_runtime_api/error_handling.rst
index 564020ff7b..e28420dfc7 100644
--- a/docs/how-to/hip_runtime_api/error_handling.rst
+++ b/docs/how-to/hip_runtime_api/error_handling.rst
@@ -9,20 +9,20 @@ Error handling
 HIP provides functionality to detect, report, and manage errors that occur
 during the execution of HIP runtime functions or when launching kernels. Every
 HIP runtime function, apart from launching kernels, has :cpp:type:`hipError_t`
-as return type. :cpp:func:`hipGetLastError()` and :cpp:func:`hipPeekAtLastError()`
+as return type. :cpp:func:`hipGetLastError` and :cpp:func:`hipPeekAtLastError`
 can be used for catching errors from kernel launches, as kernel launches don't
 return an error directly. HIP maintains an internal state, that includes the
 last error code. :cpp:func:`hipGetLastError` returns and resets that error to
-hipSuccess, while :cpp:func:`hipPeekAtLastError` just returns the error without
-changing it. To get a human readable version of the errors,
-:cpp:func:`hipGetErrorString()` and :cpp:func:`hipGetErrorName()` can be used.
+``hipSuccess``, while :cpp:func:`hipPeekAtLastError` just returns the error
+without changing it. To get a human readable version of the errors,
+:cpp:func:`hipGetErrorString` and :cpp:func:`hipGetErrorName` can be used.
 
 .. note::
 
     :cpp:func:`hipGetLastError` returns the returned error code of the last HIP
-    runtime API call even if it's hipSuccess, while ``cudaGetLastError`` returns
-    the error returned by any of the preceding CUDA APIs in the same host thread.
-    :cpp:func:`hipGetLastError` behavior will be matched with
+    runtime API call even if it's ``hipSuccess``, while ``cudaGetLastError``
+    returns the error returned by any of the preceding CUDA APIs in the same
+    host thread. :cpp:func:`hipGetLastError` behavior will be matched with
     ``cudaGetLastError`` in ROCm release 7.0.
 
 Best practices of HIP error handling:
diff --git a/docs/how-to/hip_runtime_api/initialization.rst b/docs/how-to/hip_runtime_api/initialization.rst
index 0e88c1895e..846932681c 100644
--- a/docs/how-to/hip_runtime_api/initialization.rst
+++ b/docs/how-to/hip_runtime_api/initialization.rst
@@ -39,7 +39,7 @@ your program.
 
 .. note::
 
-    You can use :cpp:func:`hipDeviceReset()` to delete all streams created, memory
+    You can use :cpp:func:`hipDeviceReset` to delete all streams created, memory
     allocated, kernels running and events created by the current process. Any new
     HIP API call initializes the HIP runtime again.
 
@@ -55,9 +55,9 @@ Querying GPUs
 --------------------------------------------------------------------------------
 
 The properties of a GPU can be queried using :cpp:func:`hipGetDeviceProperties`,
-which returns a struct of :cpp:struct:`hipDeviceProp_t`. The properties in the struct can be
-used to identify a device or give an overview of hardware characteristics, that
-might make one GPU better suited for the task than others.
+which returns a struct of :cpp:struct:`hipDeviceProp_t`. The properties in the
+struct can be used to identify a device or give an overview of hardware
+characteristics, that might make one GPU better suited for the task than others.
 
 The :cpp:func:`hipGetDeviceCount` function returns the number of available
 GPUs, which can be used to loop over the available GPUs.
diff --git a/docs/how-to/hip_runtime_api/memory_management/coherence_control.rst b/docs/how-to/hip_runtime_api/memory_management/coherence_control.rst
index 8cacdf6809..7754add29a 100644
--- a/docs/how-to/hip_runtime_api/memory_management/coherence_control.rst
+++ b/docs/how-to/hip_runtime_api/memory_management/coherence_control.rst
@@ -31,8 +31,6 @@ two different types of coherence:
 To achieve fine-grained coherence, many AMD GPUs use a limited cache policy,
 such as leaving these allocations uncached by the GPU or making them read-only.
 
-.. TODO: Is this still valid? What about Mi300?
-
 Mi200 accelerator's hardware based floating point instructions work on
 coarse-grained memory regions. Coarse-grained coherence is typically useful in
 reducing host-device interconnect communication.
@@ -161,10 +159,10 @@ fine- and coarse-grained memory types are listed here:
      - Depends on the used event.
      - No
 
-You can control the release scope for hipEvents. By default, the GPU performs a
-device-scope acquire and release operation with each recorded event. This makes
-the host and device memory visible to other commands executing on the same
-device.
+You can control the release scope for ``hipEvents``. By default, the GPU
+performs a device-scope acquire and release operation with each recorded event.
+This makes the host and device memory visible to other commands executing on the
+same device.
 
 :cpp:func:`hipEventCreateWithFlags`: You can specify a stronger system-level
 fence by creating the event with ``hipEventCreateWithFlags``: