[Example] Update README for n-gpu-layers in macOS
Signed-off-by: dm4 <[email protected]>
dm4 authored and hydai committed Nov 3, 2023
1 parent e57a678 commit 3f93dd0
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion wasmedge-ggml-llama-interactive/README.md
@@ -143,7 +143,7 @@ Supported parameters include:
- `stream-stdout`: Set it to true to print the inferred tokens to standard output. (default: `false`)
- `ctx-size`: Set the context size, the same as the `--ctx-size` parameter in llama.cpp. (default: `512`)
- `n-predict`: Set the number of tokens to predict, the same as the `--n-predict` parameter in llama.cpp. (default: `512`)
- `n-gpu-layers`: Set the number of layers to store in VRAM, the same as the `--n-gpu-layers` parameter in llama.cpp. (default: `0`)
- `n-gpu-layers`: Set the number of layers to store in VRAM, the same as the `--n-gpu-layers` parameter in llama.cpp. When using Metal support on macOS, set `n-gpu-layers` to `0` or leave it unset so the default value applies (see the example below this list). (default: `0`)
- `reverse-prompt`: Set it to the token at which you want to halt the generation. Similar to the `--reverse-prompt` parameter in llama.cpp. (default: `""`)
- `batch-size`: Set the batch size for prompt processing, the same as the `--batch-size` parameter in llama.cpp. (default: `512`)
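
A minimal usage sketch for the `n-gpu-layers` option: the commands below assume the example is launched with the WasmEdge CLI and that the options above are supplied as environment variables via `--env`; the model file name and Wasm file name are placeholders. Refer to the run instructions earlier in this README for the authoritative invocation.

```bash
# Hypothetical invocation: offload 35 layers to the GPU (VRAM).
# The option-passing mechanism (--env) and the file names below are
# assumptions for illustration only.
wasmedge --dir .:. \
  --env n-gpu-layers=35 \
  --nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf \
  wasmedge-ggml-llama-interactive.wasm default

# On macOS with Metal support, leave n-gpu-layers unset (or set it to 0)
# so the default value of 0 is used.
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf \
  wasmedge-ggml-llama-interactive.wasm default
```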
