810 docs add modeljson and revamp models specs page #816

Merged (12 commits) on Dec 4, 2023
60 changes: 60 additions & 0 deletions docs/docs/specs/engineering/engine.md
@@ -0,0 +1,60 @@
---
title: Engine
slug: /specs/engine
---

:::caution

Currently Under Development

:::

## Overview

In the Jan application, engines serve as primary entities with the following capabilities:

- Engines are installed through `inference-extensions`.
- Models depend on engines to perform [inference](https://en.wikipedia.org/wiki/Inference_engine).
- Engine configuration and required metadata are stored in a JSON file.

## Folder Structure

- Default parameters for engines are stored in JSON files located in the `/engines` folder.
- Each parameter file is named after its unique `engine_id` (for example, `nitro.json`).
- Engines are referenced directly using `engine_id` in the `model.json` file.

```yaml
jan/
  engines/
    nitro.json
    openai.json
    .....
```
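The naming convention above can be sketched as a small lookup helper. This is only an illustration: `engineFilePath`, its `janRoot` argument, and the restriction on which characters an `engine_id` may contain are assumptions, not part of Jan's API.

```js
// Hypothetical helper: maps an engine_id to its default-parameter file.
// The allowed character set for engine_id is an assumption made for safety.
function engineFilePath(janRoot, engineId) {
  // Reject ids that could point outside the engines folder (e.g. "../x").
  if (!/^[A-Za-z0-9_-]+$/.test(engineId)) {
    throw new Error(`Invalid engine_id: ${engineId}`);
  }
  // Strip trailing slashes so the result has exactly one separator.
  const root = janRoot.replace(/\/+$/, "");
  return `${root}/engines/${engineId}.json`;
}

console.log(engineFilePath("jan", "nitro"));   // jan/engines/nitro.json
console.log(engineFilePath("jan/", "openai")); // jan/engines/openai.json
```

Because `model.json` refers to an engine only by `engine_id`, a lookup like this is all that is needed to find the matching defaults file.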

## Engine Default Parameter Files

- Each inference engine requires default parameters to function in cases where user-provided parameters are absent.
- These parameters are stored in JSON files, structured as simple key-value pairs.
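A minimal sketch of the fallback described above, assuming the defaults and the user-provided parameters are both plain key-value objects. The function name `withEngineDefaults` is hypothetical, and Jan's actual merge logic may differ.

```js
// Hypothetical helper: fills in engine defaults for any parameter
// the user did not provide.
function withEngineDefaults(defaults, userParams) {
  const merged = { ...defaults };
  for (const [key, value] of Object.entries(userParams ?? {})) {
    // Treat undefined as "not provided" so the default survives.
    if (value !== undefined) merged[key] = value;
  }
  return merged;
}

const exampleDefaults = { ctx_len: 512, ngl: 100, embedding: false };
const effective = withEngineDefaults(exampleDefaults, { ctx_len: 2048, ngl: undefined });
console.log(effective); // { ctx_len: 2048, ngl: 100, embedding: false }
```

Skipping `undefined` values is a design choice in this sketch: a plain object spread would let an explicitly `undefined` user value erase the default.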

### Example

Here is an example of an engine file for `engine_id` `nitro`:

```js
{
  "ctx_len": 512,
  "ngl": 100,
  "embedding": false,
  "n_parallel": 1,
  "cont_batching": false,
  "prompt_template": "<|im_start|>system\n{system_message}<|im_end|>\n<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant"
}
```

For detailed engine parameters, refer to: [Nitro's Model Settings](https://nitro.jan.ai/features/load-unload#table-of-parameters)
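The `prompt_template` value in the example contains `{system_message}` and `{prompt}` placeholders. The sketch below shows one way such placeholders could be expanded. It is an assumption for illustration only: this spec does not describe how Nitro performs the substitution, and `fillPromptTemplate` is a hypothetical name.

```js
// Hypothetical helper: replaces each {name} placeholder with values[name].
// Placeholders without a matching value are left untouched.
function fillPromptTemplate(template, values) {
  return template.replace(/\{(\w+)\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(values, name) ? String(values[name]) : match
  );
}

const chatTemplate =
  "<|im_start|>system\n{system_message}<|im_end|>\n<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant";

const filledPrompt = fillPromptTemplate(chatTemplate, {
  system_message: "You are a helpful assistant.",
  prompt: "Hello",
});
console.log(filledPrompt);
```

A replacer function is used instead of a replacement string so that characters such as `$` in user input are inserted literally.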

## Adding an Engine

- Engine parameter files are automatically generated upon installing an `inference-extension` in the Jan application.

---
12 changes: 7 additions & 5 deletions docs/docs/specs/engineering/models.md
@@ -51,9 +51,9 @@ jan/ # Jan root folder

Here's a standard example `model.json` for a GGUF model.

- `source_url`: https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/.

```js
{
"id": "zephyr-7b", // Defaults to foldername
"object": "model", // Defaults to "model"
"source_url": "https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/blob/main/zephyr-7b-beta.Q4_K_M.gguf",
@@ -64,13 +64,14 @@ Here's a standard example `model.json` for a GGUF model.
"description": null, // Defaults to null
"state": enum[null, "downloading", "ready", "starting", "stopping", ...]
"format": "ggufv3", // Defaults to "ggufv3"
"settings": { // Models are initialized with settings
"ctx_len": 2048,
"egine": "nitro", // engine_id specified in jan/engine folder
"engine_parameters": { // Engine parameters inside model.json can override
"ctx_len": 2048, // the value inside the base engine.json
"ngl": 100,
"embedding": true,
"n_parallel": 4,
},
"parameters": { // Models are called parameters
"model_parameters": { // Models are called parameters
"stream": true,
"max_tokens": 2048,
"stop": ["<endofstring>"], // This usually can be left blank, only used with specific need from model author
@@ -83,9 +84,10 @@ Here's a standard example `model.json` for a GGUF model.
"assets": [ // Defaults to current dir
"file://.../zephyr-7b-q4_k_m.bin",
]
}
```

The model settings in the example can be found at: [Nitro's model settings](https://nitro.jan.ai/features/load-unload#table-of-parameters)
The engine parameters in the example can be found at: [Nitro's model settings](https://nitro.jan.ai/features/load-unload#table-of-parameters)

The model parameters in the example can be found at: [Nitro's model parameters](https://nitro.jan.ai/api-reference#tag/Chat-Completion)
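The override rule noted in the example (engine parameters inside `model.json` take precedence over the base engine file) can be sketched as follows. This is an illustration under stated assumptions: `resolveEngineSettings` is a hypothetical name, the engine files are assumed to be already loaded into an object keyed by `engine_id`, and the model field is assumed to be spelled `engine`.

```js
// Hypothetical helper: looks up the engine named in model.json and
// layers the model's engine_parameters over that engine's defaults.
function resolveEngineSettings(model, engineDefaults) {
  const engineId = model.engine;
  if (!Object.prototype.hasOwnProperty.call(engineDefaults, engineId)) {
    throw new Error(`Unknown engine_id: ${engineId}`);
  }
  return {
    engineId,
    // Later spreads win, so model-level values override the base file.
    parameters: { ...engineDefaults[engineId], ...model.engine_parameters },
  };
}

// Assumed in-memory form of jan/engines/nitro.json (values from the engine spec example).
const loadedEngines = {
  nitro: { ctx_len: 512, ngl: 100, embedding: false, n_parallel: 1, cont_batching: false },
};

const zephyr = {
  id: "zephyr-7b",
  engine: "nitro",
  engine_parameters: { ctx_len: 2048, ngl: 100, embedding: true, n_parallel: 4 },
};

console.log(resolveEngineSettings(zephyr, loadedEngines));
```

Here `cont_batching` is not set in the model file, so it keeps the engine default, while `ctx_len`, `embedding`, and `n_parallel` take the model's values.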

1 change: 1 addition & 0 deletions docs/sidebars.js
@@ -81,6 +81,7 @@ const sidebars = {
items: [
"specs/engineering/chats",
"specs/engineering/models",
"specs/engineering/engine",
"specs/engineering/threads",
"specs/engineering/messages",
"specs/engineering/assistants",