update README.md
hahuyhoang411 committed Nov 9, 2023
1 parent 5667ea8 commit fba6fe5
Showing 1 changed file with 19 additions and 3 deletions.
22 changes: 19 additions & 3 deletions README.md
@@ -5,7 +5,7 @@
</p>

<p align="center">
-<a href="https://docs.jan.ai/">Getting Started</a> - <a href="https://docs.jan.ai">Docs</a>
+<a href="https://jan.ai/nitro">Getting Started</a> - <a href="https://jan.ai/nitro">Docs</a>
- <a href="https://docs.jan.ai/changelog/">Changelog</a> - <a href="https://github.com/janhq/nitro/issues">Bug reports</a> - <a href="https://discord.gg/AsJ8krTT3N">Discord</a>
</p>

@@ -67,11 +67,27 @@ curl -X POST 'http://localhost:3928/inferences/llamacpp/loadmodel' \
"llama_model_path": "/path/to/your_model.gguf",
"ctx_len": 2048,
"ngl": 100,
-"embedding": true
+"embedding": true,
+"n_parallel": 4,
+"pre_prompt": "A chat between a curious user and an artificial intelligence",
+"user_prompt": "what is AI?"
}'
```

-`ctx_len` and `ngl` are typical llama C++ parameters, and `embedding` determines whether to enable the embedding endpoint or not.
+Table of parameters
+
+| Parameter | Type | Description |
+|------------------|---------|--------------------------------------------------------------|
+| `llama_model_path` | String | The file path to the LLaMA model. |
+| `ngl` | Integer | The number of GPU layers to use. |
+| `ctx_len` | Integer | The context length for the model operations. |
+| `embedding` | Boolean | Whether to use embedding in the model. |
+| `n_parallel` | Integer | The number of parallel operations. Uses Drogon thread count if not set. |
+| `cont_batching` | Boolean | Whether to use continuous batching. |
+| `user_prompt` | String | The prompt to use for the user. |
+| `ai_prompt` | String | The prompt to use for the AI assistant. |
+| `system_prompt` | String | The prompt to use for system rules. |
+| `pre_prompt` | String | The prompt to use for internal configuration. |
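
As a rough illustration of how the parameter table maps onto a request body, the sketch below builds the same `loadmodel` call in Python. It is not part of this commit: the helper name `build_loadmodel_request`, the `PARAM_TYPES` mapping, the client-side type check, and the `Content-Type` header are assumptions made for the example. Only the endpoint URL, the parameter names, and their types come from the README text.

```python
import json
from urllib import request

# Endpoint taken from the curl example in the README.
LOADMODEL_URL = "http://localhost:3928/inferences/llamacpp/loadmodel"

# Expected JSON types, copied from the parameter table.
PARAM_TYPES = {
    "llama_model_path": str,
    "ngl": int,
    "ctx_len": int,
    "embedding": bool,
    "n_parallel": int,
    "cont_batching": bool,
    "user_prompt": str,
    "ai_prompt": str,
    "system_prompt": str,
    "pre_prompt": str,
}


def build_loadmodel_request(params):
    """Hypothetical helper: check params against the table, return an unsent POST."""
    for name, value in params.items():
        expected = PARAM_TYPES.get(name)
        if expected is None:
            raise ValueError(f"unknown parameter: {name}")
        # bool is a subclass of int in Python, so reject True/False for Integer fields.
        if expected is int and isinstance(value, bool):
            raise TypeError(f"{name} must be an integer, got a boolean")
        if not isinstance(value, expected):
            raise TypeError(f"{name} must be {expected.__name__}")
    body = json.dumps(params).encode("utf-8")
    # The Content-Type header is an assumption; the README excerpt does not show it.
    return request.Request(
        LOADMODEL_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_loadmodel_request({
    "llama_model_path": "/path/to/your_model.gguf",
    "ctx_len": 2048,
    "ngl": 100,
    "embedding": True,
    "n_parallel": 4,
    "pre_prompt": "A chat between a curious user and an artificial intelligence",
    "user_prompt": "what is AI?",
})
```

With a Nitro server listening on port 3928, passing `req` to `urllib.request.urlopen` would send the request. The example stops short of that so it can be run without a server.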

**Step 4: Perform Inference on Nitro for the First Time**
