diff --git a/docs/data/programming_model/understand/cdna3_cu_dark.png b/docs/data/programming_model/understand/cdna3_cu_dark.png
deleted file mode 100644
index 3fada0d43f..0000000000
Binary files a/docs/data/programming_model/understand/cdna3_cu_dark.png and /dev/null differ
diff --git a/docs/data/programming_model/understand/rdna3_cu.drawio b/docs/data/programming_model/understand/rdna3_cu.drawio
deleted file mode 100644
index e69de29bb2..0000000000
diff --git a/docs/data/hardware_implementation/cdna2_gcd.png b/docs/data/understand/hardware_implementation/cdna2_gcd.png
similarity index 100%
rename from docs/data/hardware_implementation/cdna2_gcd.png
rename to docs/data/understand/hardware_implementation/cdna2_gcd.png
diff --git a/docs/data/hardware_implementation/cdna3_cu.png b/docs/data/understand/hardware_implementation/cdna3_cu.png
similarity index 100%
rename from docs/data/hardware_implementation/cdna3_cu.png
rename to docs/data/understand/hardware_implementation/cdna3_cu.png
diff --git a/docs/data/hardware_implementation/compute_unit.drawio b/docs/data/understand/hardware_implementation/compute_unit.drawio
similarity index 100%
rename from docs/data/hardware_implementation/compute_unit.drawio
rename to docs/data/understand/hardware_implementation/compute_unit.drawio
diff --git a/docs/data/hardware_implementation/compute_unit.svg b/docs/data/understand/hardware_implementation/compute_unit.svg
similarity index 100%
rename from docs/data/hardware_implementation/compute_unit.svg
rename to docs/data/understand/hardware_implementation/compute_unit.svg
diff --git a/docs/data/hardware_implementation/rdna3_cu.png b/docs/data/understand/hardware_implementation/rdna3_cu.png
similarity index 100%
rename from docs/data/hardware_implementation/rdna3_cu.png
rename to docs/data/understand/hardware_implementation/rdna3_cu.png
diff --git a/docs/data/programming_model/understand/cdna2_gcd.png b/docs/data/understand/programming_model/cdna2_gcd.png
similarity index 100%
rename from docs/data/programming_model/understand/cdna2_gcd.png
rename to docs/data/understand/programming_model/cdna2_gcd.png
diff --git a/docs/data/programming_model/understand/cdna3_cu.png b/docs/data/understand/programming_model/cdna3_cu.png
similarity index 100%
rename from docs/data/programming_model/understand/cdna3_cu.png
rename to docs/data/understand/programming_model/cdna3_cu.png
diff --git a/docs/data/programming_model/understand/rdna3_cu.png b/docs/data/understand/programming_model/rdna3_cu.png
similarity index 100%
rename from docs/data/programming_model/understand/rdna3_cu.png
rename to docs/data/understand/programming_model/rdna3_cu.png
diff --git a/docs/data/programming_model/understand/simt.drawio b/docs/data/understand/programming_model/simt.drawio
similarity index 100%
rename from docs/data/programming_model/understand/simt.drawio
rename to docs/data/understand/programming_model/simt.drawio
diff --git a/docs/data/programming_model/understand/simt.svg b/docs/data/understand/programming_model/simt.svg
similarity index 100%
rename from docs/data/programming_model/understand/simt.svg
rename to docs/data/understand/programming_model/simt.svg
diff --git a/docs/data/programming_model/reference/memory_hierarchy.drawio b/docs/data/understand/programming_model_reference/memory_hierarchy.drawio
similarity index 100%
rename from docs/data/programming_model/reference/memory_hierarchy.drawio
rename to docs/data/understand/programming_model_reference/memory_hierarchy.drawio
diff --git a/docs/data/programming_model/reference/memory_hierarchy.svg b/docs/data/understand/programming_model_reference/memory_hierarchy.svg
similarity index 100%
rename from docs/data/programming_model/reference/memory_hierarchy.svg
rename to docs/data/understand/programming_model_reference/memory_hierarchy.svg
diff --git a/docs/data/programming_model/reference/thread_hierarchy.drawio b/docs/data/understand/programming_model_reference/thread_hierarchy.drawio
similarity index 100%
rename from docs/data/programming_model/reference/thread_hierarchy.drawio
rename to docs/data/understand/programming_model_reference/thread_hierarchy.drawio
diff --git a/docs/data/programming_model/reference/thread_hierarchy.svg b/docs/data/understand/programming_model_reference/thread_hierarchy.svg
similarity index 100%
rename from docs/data/programming_model/reference/thread_hierarchy.svg
rename to docs/data/understand/programming_model_reference/thread_hierarchy.svg
diff --git a/docs/data/programming_model/reference/thread_hierarchy_coop.drawio b/docs/data/understand/programming_model_reference/thread_hierarchy_coop.drawio
similarity index 100%
rename from docs/data/programming_model/reference/thread_hierarchy_coop.drawio
rename to docs/data/understand/programming_model_reference/thread_hierarchy_coop.drawio
diff --git a/docs/data/programming_model/reference/thread_hierarchy_coop.svg b/docs/data/understand/programming_model_reference/thread_hierarchy_coop.svg
similarity index 100%
rename from docs/data/programming_model/reference/thread_hierarchy_coop.svg
rename to docs/data/understand/programming_model_reference/thread_hierarchy_coop.svg
diff --git a/docs/understand/hardware_implementation.rst b/docs/understand/hardware_implementation.rst
index f95d8fc6b4..8ee3e0e08c 100644
--- a/docs/understand/hardware_implementation.rst
+++ b/docs/understand/hardware_implementation.rst
@@ -46,7 +46,7 @@ The amount of warps that can reside concurrently on a CU, known
 as occupancy, is determined by the warp's resource usage of registers and
 shared memory.
 
-.. figure:: ../data/hardware_implementation/compute_unit.svg
+.. figure:: ../data/understand/hardware_implementation/compute_unit.svg
     :alt: Diagram depicting the general structure of a compute unit of an AMD
           GPU.
 
@@ -110,9 +110,9 @@ The general structure of CUs stays mostly as it is in GCN
 architectures. The most prominent change is the addition of matrix ALUs, which
 can greatly improve the performance of algorithms involving matrix
 multiply-accumulate operations for
-:doc:`int8, float16, bfloat16 or float32<rocm:about/compatibility/precision-support>`.
+:doc:`int8, float16, bfloat16 or float32<rocm:compatibility/precision-support>`.
 
-.. figure:: ../data/hardware_implementation/cdna3_cu.png
+.. figure:: ../data/understand/hardware_implementation/cdna3_cu.png
   :alt: Block diagram showing the structure of a CDNA3 compute unit. It includes
         Shader Cores, the Matrix Core Unit, a Local Data Share used for sharing
         memory between threads in a block, an L1 Cache and a Scheduler. The
@@ -136,7 +136,7 @@ It also adds an extra layer of cache to the WGP, shared by the CUs
 within it. This cache is referred to as L1 cache, promoting the per-CU cache to
 an L0 cache.
 
-.. figure:: ../data/hardware_implementation/rdna3_cu.png
+.. figure:: ../data/understand/hardware_implementation/rdna3_cu.png
   :alt: Block diagram showing the structure of an RDNA3 Compute Unit. It
         consists of four SIMD units, each including a vector and scalar register
         file, with the corresponding scalar and vector ALUs. All four SIMDs
@@ -152,7 +152,7 @@ For hardware implementation's sake, multiple CUs are grouped
 together into a Shader Engine or Compute Engine, typically sharing some fixed
 function units or memory subsystem resources.
 
-.. figure:: ../data/hardware_implementation/cdna2_gcd.png
+.. figure:: ../data/understand/hardware_implementation/cdna2_gcd.png
   :alt: Block diagram showing four Compute Engines each with 28 Compute Units
         inside. These four Compute Engines share one block of L2 Cache. Around
         them are four Memory Controllers. To the top and bottom of all these are
diff --git a/docs/understand/programming_model.rst b/docs/understand/programming_model.rst
index 092cf6796c..4307226064 100644
--- a/docs/understand/programming_model.rst
+++ b/docs/understand/programming_model.rst
@@ -30,7 +30,7 @@ AMD block diagrams, or as streaming multiprocessor (SM).
 
 .. _rdna3_cu:
 
-.. figure:: ../data/programming_model/understand/rdna3_cu.png
+.. figure:: ../data/understand/programming_model/rdna3_cu.png
   :alt: Block diagram showing the structure of an RDNA3 Compute Unit. It
         consists of four SIMD units, each including a vector and scalar register
         file, with the corresponding scalar and vector ALUs. All four SIMDs
@@ -41,7 +41,7 @@ AMD block diagrams, or as streaming multiprocessor (SM).
 
 .. _cdna3_cu:
 
-.. figure:: ../data/programming_model/understand/cdna3_cu.png
+.. figure:: ../data/understand/programming_model/cdna3_cu.png
   :alt: Block diagram showing the structure of a CDNA3 compute unit. It includes
         Shader Cores, the Matrix Core Unit, a Local Data Share used for sharing
         memory between threads in a block, an L1 Cache and a Scheduler. The
@@ -56,7 +56,7 @@ memory subsystem resources.
 
 .. _cdna2_gcd:
 
-.. figure:: ../data/programming_model/understand/cdna2_gcd.png
+.. figure:: ../data/understand/programming_model/cdna2_gcd.png
   :alt: Block diagram showing four Compute Engines each with 28 Compute Units
         inside. These four Compute Engines share one block of L2 Cache. Around
         them are four Memory Controllers. To the top and bottom of all these are
@@ -103,7 +103,7 @@ typically look the following:
 
 .. _simt:
 
-.. figure:: ../data/programming_model/understand/simt.svg
+.. figure:: ../data/understand/programming_model/simt.svg
   :alt: Image representing the instruction flow of a SIMT program. Two identical
         arrows pointing downward with blocks representing the instructions
         inside and ellipsis between the arrows. The instructions represented in
diff --git a/docs/understand/programming_model_reference.rst b/docs/understand/programming_model_reference.rst
index 600fcad3da..1120728dad 100644
--- a/docs/understand/programming_model_reference.rst
+++ b/docs/understand/programming_model_reference.rst
@@ -34,7 +34,7 @@ The thread hierarchy inherent to how AMD GPUs operate is depicted in
 
 .. _inherent_thread_hierarchy:
 
-.. figure:: ../data/programming_model/reference/thread_hierarchy.svg
+.. figure:: ../data/understand/programming_model_reference/thread_hierarchy.svg
   :alt: Diagram depicting nested rectangles of varying color. The outermost one
         titled "Grid", inside sets of uniform rectangles layered on one another
         titled "Block". Each "Block" containing sets of uniform rectangles
@@ -93,7 +93,7 @@ The thread hierarchy abstraction of Cooperative Groups manifest as depicted in
 
 .. _coop_thread_hierarchy:
 
-.. figure:: ../data/programming_model/reference/thread_hierarchy_coop.svg
+.. figure:: ../data/understand/programming_model_reference/thread_hierarchy_coop.svg
   :alt: Diagram depicting nested rectangles of varying color. The outermost one
         titled "Grid", inside sets of different sized rectangles layered on
         one another titled "Block". Each "Block" containing sets of uniform
@@ -134,7 +134,7 @@ how they relate to the various levels of the threading model.
 
 .. _memory_hierarchy:
 
-.. figure:: ../data/programming_model/reference/memory_hierarchy.svg
+.. figure:: ../data/understand/programming_model_reference/memory_hierarchy.svg
   :alt: Diagram depicting nested rectangles of varying color. The outermost one
         titled "Grid", inside on the upper half a rectangle titled "Cluster".
         Inside it are two identical rectangles titled "Block", inside them are