
Commit c7d5d4a
fix code format
Signed-off-by: youkaichao <[email protected]>
youkaichao committed Nov 24, 2024
1 parent da41521 commit c7d5d4a
Showing 1 changed file with 8 additions and 2 deletions.
10 changes: 8 additions & 2 deletions docs/source/models/adding_model.rst
@@ -43,7 +43,10 @@ For instance, vLLM's `OPT model <https://github.com/vllm-project/vllm/blob/main/
 
 To ensure compatibility with vLLM, your model must meet the following requirements:
 
-1. Initialization Code: All vLLM modules within the model must include a ``prefix`` argument in their constructor. This ``prefix`` is typically the full name of the module in the model's state dictionary and is crucial for:
+Initialization Code
+^^^^^^^^^^^^^^^^^^^
+
+All vLLM modules within the model must include a ``prefix`` argument in their constructor. This ``prefix`` is typically the full name of the module in the model's state dictionary and is crucial for:
 
 * Runtime support: vLLM's attention operators are registered in a model's state by their full names. Each attention operator must have a unique prefix as its layer name to avoid conflicts.
 * Non-uniform quantization support: A quantized checkpoint can selectively quantize certain layers while keeping others in full precision. By providing the ``prefix`` during initialization, vLLM can match the current layer's ``prefix`` with the quantization configuration to determine if the layer should be initialized in quantized mode.
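
The hunk above says every vLLM module constructor must accept and propagate a ``prefix``. As a minimal sketch of what that threading can look like (``MyAttention`` and ``MyDecoderLayer`` are hypothetical names for illustration, not part of this commit):

.. code-block:: python

    import torch.nn as nn

    class MyAttention(nn.Module):
        def __init__(self, vllm_config, prefix: str = ""):
            super().__init__()
            # The full dotted name (e.g. "model.layers.0.self_attn") keeps each
            # attention operator's registration unique and lets vLLM match this
            # layer against a per-layer quantization config.
            self.prefix = prefix

    class MyDecoderLayer(nn.Module):
        def __init__(self, vllm_config, prefix: str = ""):
            super().__init__()
            # Pass the child module its own full name in the state dictionary.
            self.self_attn = MyAttention(vllm_config, prefix=f"{prefix}.self_attn")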
@@ -78,7 +81,10 @@ The initialization code should look like this:
             super().__init__()
             self.model = MyModel(vllm_config, prefix=f"{prefix}.model")
 
-2. Computation Code: Rewrite the :meth:`~torch.nn.Module.forward` method of your model to remove any unnecessary code, such as training-specific code. Modify the input parameters to treat `input_ids` and `positions` as flattened tensors with a single batch size dimension, without a max-sequence length dimension.
+Computation Code
+^^^^^^^^^^^^^^^^
+
+Rewrite the :meth:`~torch.nn.Module.forward` method of your model to remove any unnecessary code, such as training-specific code. Modify the input parameters to treat `input_ids` and `positions` as flattened tensors with a single batch size dimension, without a max-sequence length dimension.
 
 .. code-block:: python
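
The body of that final code block is collapsed in this diff view. As a hedged sketch only (``MyModel`` here is illustrative, not the commit's code), a forward pass over flattened inputs could look like:

.. code-block:: python

    import torch
    import torch.nn as nn

    class MyModel(nn.Module):
        def __init__(self, vocab_size: int = 100, hidden_size: int = 16):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, hidden_size)

        def forward(self, input_ids: torch.Tensor, positions: torch.Tensor) -> torch.Tensor:
            # input_ids and positions are flattened to shape [num_tokens]:
            # all sequences in the batch are concatenated, so there is no
            # [batch, max_seq_len] padding dimension.
            return self.embed(input_ids)

For example, two prompts of lengths 3 and 2 would arrive as ``input_ids`` of shape ``[5]`` with ``positions = tensor([0, 1, 2, 0, 1])``.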
