diff --git a/wasmedge-ggml-llama-interactive/README.md b/wasmedge-ggml-llama-interactive/README.md
index a3124f2..59785b8 100644
--- a/wasmedge-ggml-llama-interactive/README.md
+++ b/wasmedge-ggml-llama-interactive/README.md
@@ -143,7 +143,7 @@ Supported parameters include:
 - `stream-stdout`: Set it to true to print the inferred tokens to standard output. (default: `false`)
 - `ctx-size`: Set the context size, the same as the `--ctx-size` parameter in llama.cpp. (default: `512`)
 - `n-predict`: Set the number of tokens to predict, the same as the `--n-predict` parameter in llama.cpp. (default: `512`)
-- `n-gpu-layers`: Set the number of layers to store in VRAM, the same as the `--n-gpu-layers` parameter in llama.cpp. (default: `0`)
+- `n-gpu-layers`: Set the number of layers to store in VRAM, the same as the `--n-gpu-layers` parameter in llama.cpp. When using Metal support on macOS, set `n-gpu-layers` to `0` or leave it unset to use the default value. (default: `0`)
 - `reverse-prompt`: Set it to the token at which you want to halt the generation. Similar to the `--reverse-prompt` parameter in llama.cpp. (default: `""`)
 - `batch-size`: Set the number of batch size for prompt processing, the same as the `--batch-size` parameter in llama.cpp. (default: `512`)
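
For context, the parameters documented in the hunk above are metadata options consumed by the WASI-NN GGML backend. The sketch below shows one way such options could be assembled and passed as the graph builder config from a Rust guest. It is a minimal illustration, assuming the `wasmedge-wasi-nn` and `serde_json` crates and a model preloaded under the name `default`; it is not the exact code of this example, and the option values shown are placeholders.

```rust
use wasmedge_wasi_nn::{ExecutionTarget, GraphBuilder, GraphEncoding};

fn main() {
    // Collect the metadata options documented in the README; values are illustrative.
    let options = serde_json::json!({
        "stream-stdout": true,
        "ctx-size": 1024,
        "n-predict": 512,
        "n-gpu-layers": 0,   // keep 0 (or omit) when relying on Metal support on macOS
        "reverse-prompt": "",
        "batch-size": 512
    });

    // Pass the options as the builder config and load the model registered
    // under the name "default" (assumed to be preloaded by the runtime).
    let graph = GraphBuilder::new(GraphEncoding::Ggml, ExecutionTarget::AUTO)
        .config(options.to_string())
        .build_from_cache("default")
        .expect("failed to build the graph from the preloaded model");

    // An execution context would then be used for set_input/compute/get_output.
    let _ctx = graph
        .init_execution_context()
        .expect("failed to create an execution context");
}
```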