diff --git a/docs/docs/features/embed.md b/docs/docs/features/embed.md
index 3e571996a..58bde7269 100644
--- a/docs/docs/features/embed.md
+++ b/docs/docs/features/embed.md
@@ -89,4 +89,4 @@ The example response used the output from model [llama2 Chat 7B Q5 (GGUF)](https
 
-The embedding feature in Nitro demonstrates a high level of compatibility with OpenAI, simplifying the transition between using OpenAI and local AI models. For more detailed information and advanced use cases, refer to the comprehensive [API Reference](https://nitro.jan.ai/api-reference)).
+The embedding feature in Nitro demonstrates a high level of compatibility with OpenAI, simplifying the transition between using OpenAI and local AI models. For more detailed information and advanced use cases, refer to the comprehensive [API Reference](https://nitro.jan.ai/api-reference).
 
diff --git a/docs/docs/features/prompt.md b/docs/docs/features/prompt.md
index bf2a07d3c..0dbefc663 100644
--- a/docs/docs/features/prompt.md
+++ b/docs/docs/features/prompt.md
@@ -22,7 +22,7 @@ Nitro enables developers to configure dialogs and implement advanced prompt engi
 
 To illustrate, let's create a "Pirate assistant":
 
-> NOTE: "ai_prompt" and "user_prompt" are prefixes indicating the role. Configure them based on your model.
+> NOTE: "ai_prompt", "user_prompt" and "system_prompt" are prefixes indicating the role. Configure them based on your model.
 
 ### Prompt Configuration
 
@@ -33,6 +33,7 @@ curl http://localhost:3928/inferences/llamacpp/loadmodel \
     "ctx_len": 128,
     "ngl": 100,
     "pre_prompt": "You are a Pirate. Using drunk language with a lot of Arr...",
+    "system_prompt": "ASSISTANT'S RULE: ",
     "user_prompt": "USER:",
     "ai_prompt": "ASSISTANT: "
   }'
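For reference, a request against the "Pirate assistant" configured above could look like the sketch below. It assumes Nitro exposes chat completions at `/inferences/llamacpp/chat_completion` on the same port as `loadmodel` (an assumption: this patch only shows the `loadmodel` call), and the message content is illustrative.

```bash
# Sketch: chat request to the model loaded with the prompt configuration above.
# The /inferences/llamacpp/chat_completion path is assumed; only
# /inferences/llamacpp/loadmodel appears in this patch.
curl http://localhost:3928/inferences/llamacpp/chat_completion \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Where be the treasure?"}
    ]
  }'
```

The `pre_prompt`, `system_prompt`, `user_prompt`, and `ai_prompt` prefixes configured at load time are applied on the server side, which is what lets the request body itself stay OpenAI-shaped.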
diff --git a/docs/docs/new/about.md b/docs/docs/new/about.md
index e33bf9873..aee8ad51b 100644
--- a/docs/docs/new/about.md
+++ b/docs/docs/new/about.md
@@ -119,3 +119,7 @@ Nitro welcomes contributions in various forms, not just coding. Here are some wa
 
 - [drogon](https://github.com/drogonframework/drogon): The fast C++ web framework
 - [llama.cpp](https://github.com/ggerganov/llama.cpp): Inference of LLaMA model in pure C/C++
+
+## FAQ
+:::info COMING SOON
+:::
\ No newline at end of file
diff --git a/docs/docs/new/architecture.md b/docs/docs/new/architecture.md
index 01acb8093..ea30fc94e 100644
--- a/docs/docs/new/architecture.md
+++ b/docs/docs/new/architecture.md
@@ -4,8 +4,6 @@ title: Architecture
 
 ![Nitro Architecture](img/architecture.drawio.png)
 
-### Details element example
-
 ## Key Concepts
 
 ## Inference Server
diff --git a/docs/docs/new/quickstart.md b/docs/docs/new/quickstart.md
index 9ec8dec69..61745ef6c 100644
--- a/docs/docs/new/quickstart.md
+++ b/docs/docs/new/quickstart.md
@@ -7,18 +7,16 @@ title: Quickstart
 
 ### For Linux and MacOS
 
 Open your terminal and enter the following command. This will download and install Nitro on your system.
-
-```bash
-curl -sfL https://raw.githubusercontent.com/janhq/nitro/main/install.sh -o /tmp/install.sh && chmod +x /tmp/install.sh && sudo bash /tmp/install.sh --gpu && rm /tmp/install.sh
-```
+  ```bash
+  curl -sfL https://raw.githubusercontent.com/janhq/nitro/main/install.sh | sudo /bin/bash -
+  ```
 
 ### For Windows
 
 Open PowerShell and execute the following command. This will perform the same actions as for Linux and MacOS but is tailored for Windows.
-
-```bash
-powershell -Command "& { Invoke-WebRequest -Uri 'https://raw.githubusercontent.com/janhq/nitro/main/install.bat' -OutFile 'install.bat'; .\install.bat --gpu; Remove-Item -Path 'install.bat' }"
-```
+  ```bash
+  powershell -Command "& { Invoke-WebRequest -Uri 'https://raw.githubusercontent.com/janhq/nitro/main/install.bat' -OutFile 'install.bat'; .\install.bat; Remove-Item -Path 'install.bat' }"
+  ```
 
 > **NOTE:**Installing Nitro will add new files and configurations to your system to enable it to run.
diff --git a/docs/openapi/NitroAPI.yaml b/docs/openapi/NitroAPI.yaml
index 3b85557d3..3ddb32a06 100644
--- a/docs/openapi/NitroAPI.yaml
+++ b/docs/openapi/NitroAPI.yaml
@@ -167,7 +167,7 @@ paths:
       operationId: createChatCompletion
       tags:
         - Chat Completion
-      summary: Create an chat with the model.
+      summary: Create a chat with the model.
       requestBody:
         content:
           application/json:
@@ -544,11 +544,11 @@ components:
         stream:
           type: boolean
           default: true
-          description: Enables continuous output generation, allowing for streaming of model responses.
+          description: Enables continuous output generation, allowing for streaming of model responses
         model:
           type: string
           example: "gpt-3.5-turbo"
-          description: Specifies the model being used for inference or processing tasks.
+          description: Specifies the model being used for inference or processing tasks
         max_tokens:
           type: number
           default: 2048
@@ -556,11 +556,11 @@ components:
         stop:
           type: arrays
           example: ["hello"]
-          description: Defines specific tokens or phrases at which the model will stop generating further output.
+          description: Defines specific tokens or phrases at which the model will stop generating further output
         frequency_penalty:
          type: number
          default: 0
-          description: Adjusts the likelihood of the model repeating words or phrases in its output.
+          description: Adjusts the likelihood of the model repeating words or phrases in its output
         presence_penalty:
           type: number
           default: 0
@@ -571,6 +571,12 @@ components:
           min: 0
           max: 1
           description: Controls the randomness of the model's output
+        top_p:
+          type: number
+          default: 0.95
+          min: 0
+          max: 1
+          description: Sets the probability threshold for top-p (nucleus) sampling
 
     ChatCompletionResponse:
       type: object
diff --git a/docs/openapi/OpenAIAPI.yaml b/docs/openapi/OpenAIAPI.yaml
index d6a9faa17..3402eb5a6 100644
--- a/docs/openapi/OpenAIAPI.yaml
+++ b/docs/openapi/OpenAIAPI.yaml
@@ -9869,3 +9869,4 @@ x-oaiMeta:
       - type: endpoint
         key: createEdit
         path: create
+
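To exercise the new `top_p` field from NitroAPI.yaml, a chat-completion request can pass it alongside the existing sampling parameters. A minimal sketch, reusing the defaults and examples from the schema above; the endpoint path, model name, and message are illustrative assumptions rather than values taken from this patch:

```bash
# Sketch: chat-completion request using the new top_p field.
# top_p, max_tokens, and stream values mirror the schema defaults above;
# the endpoint path and model name are assumptions.
curl http://localhost:3928/inferences/llamacpp/chat_completion \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Explain top-p sampling in one sentence."}],
    "top_p": 0.95,
    "max_tokens": 2048,
    "stream": true
  }'
```

Lowering `top_p` restricts sampling to the smallest set of tokens whose cumulative probability exceeds the threshold, trading diversity for relevance; that is the behavior the new description summarizes.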