OpenAI has a pre-defined ctx_len for each of their models (for example, gpt-3.5-turbo has a ctx_len of 4096) and all the models are already loaded, which means you only need to call their chat/completions endpoint.
-> No setting for ctx_len because each model already has a fixed ctx_len
-> With Nitro you can set your own ctx_len because you load your own model
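Below is a minimal sketch of that difference, assuming Nitro's default port (3928) and its llamacpp loadmodel route; the API key and model path are placeholders, so verify the exact route and fields against the current Nitro docs:

```python
import requests

# OpenAI-hosted model: ctx_len is fixed server-side, so you only call chat/completions.
openai_resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_OPENAI_API_KEY"},  # placeholder key
    json={
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)

# Nitro-hosted model: you load your own GGUF file first and pick ctx_len at load time.
nitro_resp = requests.post(
    "http://localhost:3928/inferences/llamacpp/loadmodel",  # assumed default Nitro port/route
    json={
        "llama_model_path": "/path/to/your-model.gguf",  # placeholder path
        "ctx_len": 2048,  # you decide the context window here
        "ngl": 100,       # GPU layers to offload, hardware dependent
    },
)
print(openai_resp.status_code, nitro_resp.status_code)
```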
Q: What is the difference between ctx_len and max_token?
max_token: the maximum number of tokens the model is allowed to generate during inference
Example: if max_token = 10 and the chat output runs past 10 tokens, the output is truncated (this is what caused the truncation issue last time)
ctx_len: the upper limit on the number of tokens the backend can process during inference; the relationship between ctx_len and max_token is that the chat tokens plus max_token must fit within ctx_len (chat_token + max_token <= ctx_len)
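A hedged sketch of how the two limits interact, assuming an OpenAI-compatible chat/completions route on a local Nitro server (the port, route, and token counts below are assumptions, not confirmed values):

```python
import requests

CTX_LEN = 2048        # set when the model was loaded
MAX_TOKEN = 10        # generation cap for this request
prompt_tokens = 1500  # hypothetical token count of the chat history

# Budget that must hold for a request with no truncation or context shift:
assert prompt_tokens + MAX_TOKEN <= CTX_LEN

resp = requests.post(
    "http://localhost:3928/v1/chat/completions",  # assumed OpenAI-compatible route
    json={
        "messages": [{"role": "user", "content": "Explain ctx_len in one paragraph."}],
        "max_tokens": MAX_TOKEN,  # generation stops after 10 tokens, even mid-sentence
    },
)
choice = resp.json()["choices"][0]
# finish_reason == "length" signals that the max_tokens cap cut the answer short.
print(choice["finish_reason"], choice["message"]["content"])
```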
Q: What happens if I input a chat that is longer than ctx_len (max_token + chat_token > ctx_len)?
In this scenario a context shift happens: inference drops the extra context that does not fit into ctx_len and keeps generating normally, but the model may lose its memory of anything that falls outside ctx_len.
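A simplified sketch of the idea (not the backend's actual implementation, which handles this inside llama.cpp at the KV-cache level): the oldest part of the chat is dropped so that what remains, plus the generation budget, fits in ctx_len.

```python
def fit_into_ctx(prompt_tokens: list[str], ctx_len: int, max_token: int) -> list[str]:
    """Drop the oldest tokens so the prompt plus generation budget fits in ctx_len."""
    budget = ctx_len - max_token
    if len(prompt_tokens) <= budget:
        return prompt_tokens
    # Everything before this index is effectively forgotten by the model.
    return prompt_tokens[len(prompt_tokens) - budget:]

tokens = [f"t{i}" for i in range(5000)]  # hypothetical 5000-token chat history
kept = fit_into_ctx(tokens, ctx_len=4096, max_token=512)
print(len(kept))  # 3584 tokens survive; the oldest 1416 are lost
```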
Q: What value should I set as the default max_token parameter?
Problem
We do not have a clear example on the Jan docs page right now.
Success Criteria
A simple example of a single model-loading case