diff --git a/README.md b/README.md index 0e7933ddc..724fc3a1d 100644 --- a/README.md +++ b/README.md @@ -60,6 +60,7 @@ Download a llama model to try running the llama C++ integration. You can find a Double-click on Nitro to run it. After downloading your model, make sure it's saved to a specific path. Then, make an API call to load your model into Nitro. + ```zsh curl -X POST 'http://localhost:3928/inferences/llamacpp/loadmodel' \ -H 'Content-Type: application/json' \ @@ -90,6 +91,15 @@ Table of parameters | `system_prompt` | String | The prompt to use for system rules. | | `pre_prompt` | String | The prompt to use for internal configuration. | + +***OPTIONAL***: You can run Nitro on a different port like 5000 instead of 3928 by running it manually in terminal +```zsh +./nitro 1 127.0.0.1 5000 ([thread_num] [host] [port]) +``` +- thread_num : the number of thread that nitro webserver needs to have +- host : host value normally 127.0.0.1 or 0.0.0.0 +- port : the port that nitro got deployed onto + **Step 4: Perform Inference on Nitro for the First Time** ```zsh diff --git a/docs/docs/api.md b/docs/docs/api.md new file mode 100644 index 000000000..e69de29bb diff --git a/docs/docs/api/overview.md b/docs/docs/api/overview.md deleted file mode 100644 index 2ed3c8202..000000000 --- a/docs/docs/api/overview.md +++ /dev/null @@ -1,7 +0,0 @@ ---- -title: Overview ---- - -:::info Comming Soon -Updating... -::: \ No newline at end of file diff --git a/docs/docs/community/changelog.md b/docs/docs/community/changelog.md deleted file mode 100644 index 66b750854..000000000 --- a/docs/docs/community/changelog.md +++ /dev/null @@ -1,4 +0,0 @@ ---- -title: Nitro Changelog ---- - diff --git a/docs/docs/community/coc.md b/docs/docs/community/coc.md deleted file mode 100644 index b74e5a9ec..000000000 --- a/docs/docs/community/coc.md +++ /dev/null @@ -1,54 +0,0 @@ ---- -title: Code of Conduct ---- - -# Maximize Signal-to-Noise Ratio - -## 1. 
Don't Waste Time -- **Efficient Contributions**: Ensure your posts, commits, or comments are efficient and valuable. Avoid low-effort or trivial contributions. -- **Consequence of Non-Compliance**: Repeated low-value contributions may lead to removal or a ban. -- **Impact**: This policy aims to minimize distractions and time-wasting (Noise). - -## 2. Add Value -- **Contribute Meaningfully**: Your contributions should be concise, impactful, and helpful. -- **Focus on Clarity**: Be clear and to the point. Avoid lengthy, complex explanations where simpler ones will do. - -## 3. Do No Harm -- **Respectful Interaction**: No tolerance for insults, trolling, or disruptive debates. -- **Maintain Positive Environment**: Harassing behavior or needless argumentation will lead to an immediate, irrevocable ban. - -### Guidelines for Constructive Contributions - -**Effective Use of Time**: -- Time is invaluable. Use it wisely and respect others' time. -- Make sure reading your contribution is worthwhile and not a drain on cognitive resources. - -**What to Do**: -- **High-Quality Posts**: Well-organized, clear, concise, and to the point. -- **Efficient Demonstrations**: Convey information quickly and clearly. -- **Useful Examples**: Illustrate how and why things work without unnecessary detail. - -**What to Avoid**: -- **Unproductive Debates**: Avoid arguing over minor details or off-topic issues. -- **Shifting Focus**: Don’t constantly change topics or goals. -- **Excessive Length or Irrelevance**: Long-winded, opinionated, or irrelevant posts. - -### Encouraging Valuable Contributions - -**Focus on Adding Value**: -- Contribute only when it enhances understanding, solves problems, or is genuinely helpful. -- If unsure whether your contribution adds value, consider refining or omitting it. - -**Positive Contributions Include**: -- Solving problems efficiently. -- Adding or improving code. -- Sharing resources that benefit the collective understanding. 
- -### Maintaining a Respectful Environment - -**Zero Tolerance for Disruptive Behavior**: -- Any form of trolling, flaming, or griefing is strictly prohibited. -- Disruptive behavior leads to immediate and permanent exclusion. - -**Participation is a Privilege**: -- Remember that being part of this project and community is a privilege. Act responsibly and respectfully to maintain a productive and inclusive environment. diff --git a/docs/docs/community/contribuiting.md b/docs/docs/community/contribuiting.md deleted file mode 100644 index 215ac9d65..000000000 --- a/docs/docs/community/contribuiting.md +++ /dev/null @@ -1,41 +0,0 @@ ---- -title: Contributing to Nitro ---- - -Nitro is an open-source, fast, lightweight, and embeddable inference engine. It's used in [Jan](https://jan.ai/). This document guides you through the process of contributing to Nitro, whether you’re new to open source or an experienced contributor. - -- For New Contributors, please check out [How to Contribute to Open Source](https://opensource.guide/how-to-contribute/) for a general guide on open-source contribution. - -## Code of Conduct -Before contributing, please read our [Code of Conduct](coc) to understand the rules and expectations in our community. - -## Get Involved -There are many ways to contribute to Nitro, and not all involve coding. Here's a few ideas to get started: - -- Begin by going through the [Getting Started](nitro/overview) guide. If you encounter issues or have suggestions, let us know by [opening an issue](https://github.com/janhq/nitro/issues). - -- Browse [open issues](https://github.com/janhq/nitro/issues). You can offer workarounds, clarification, or suggest labels to help organize issues. If you find an issue you’d like to resolve, feel free to [open a pull request](https://github.com/janhq/nitro/pulls). Start with issues tagged as `Good first issue`. - -- Read through Nitro's documentation. 
If something is confusing or can be improved, click “Edit this page” at the bottom of most docs to propose changes directly on GitHub. - -- Check out feature requests from the community. You can contribute by opening a [pull request](https://github.com/janhq/nitro/pulls) for something you’re interested in working on. - -### Join our Discord Channel -We have the [#nitro-dev](https://discord.gg/FTk2MvZwJH) channel on [Discord](https://discord.gg/FTk2MvZwJH) to discuss all things about Nitro development. You can also be of great help by helping other users in the help channel. - -## How to Contribute -### Reporting Issues -- If you encounter problems with Nitro, create a [GitHub issue](https://github.com/janhq/nitro). -- Describe the issue in detail, including error logs and steps to reproduce it. - -### Feature Requests -- For new features, submit a request on [Nitro’s official GitHub](https://github.com/janhq/nitro). Avoid duplicate requests and clearly explain the benefits of your proposed feature. - -### Pull Requests -- You can submit one Pull Request (PR) per day. -- Make sure your PR has a clear description and adheres to Nitro's code style and structure. -- Avoid unnecessary reformatting or refactoring. PRs not following these guidelines will be considered non-compliant and may be rejected. - -### Triaging Issues and PRs -- Help manage incoming issues and PRs by asking for more information, suggesting labels, flagging stale issues, or asking for test plans. -- Review code if you can and provide constructive feedback. \ No newline at end of file diff --git a/docs/docs/community/support.md b/docs/docs/community/support.md deleted file mode 100644 index e0b701971..000000000 --- a/docs/docs/community/support.md +++ /dev/null @@ -1,27 +0,0 @@ ---- -title: Support ---- - -On this page we've listed some Nitro-related communities that you can be a part of; see the other pages in this section for additional online and in-person learning materials. 
- -## Discord - -Join our [Discord community](https://discord.gg/FTk2MvZwJH) for real-time discussions and support: -- Use `#nitro-dev` for development-related questions. - -## Github - -Join our [Github](https://github.com/janhq/nitro) for understanding codebase. Here you can: - -- Browse source code and documentation. -- Track development progress and upcoming features. -- Report bugs or issues, and suggest improvements. - -## News -For the latest news about Nitro, follow [Nitro Discord](https://discord.gg/FTk2MvZwJH) and the [official Nitro blog](https://nitro.jan.ai) on this website. - ---- - -To understand the full potential of Nitro, we recommend experiencing it in action through [Jan](https://jan.ai/). Jan utilizes Nitro's capabilities, offering a practical demonstration of its efficiency and versatility. Explore Jan to see how Nitro powers real-world applications. - -[Explore Jan](https://jan.ai/) \ No newline at end of file diff --git a/docs/docs/examples/llm.md b/docs/docs/examples/llm.md new file mode 100644 index 000000000..3e06dde3c --- /dev/null +++ b/docs/docs/examples/llm.md @@ -0,0 +1,54 @@ +--- +title: Simple chatbot with Nitro +--- + +This guide provides instructions to create a chatbot powered by Nitro using the GGUF model. + +## Step 1: Download the Model + +First, you'll need to download the chatbot model. + +1. **Navigate to the Models Folder** + - Open your project directory. + - Locate and open the `models` folder within the directory. + +2. **Select a GGUF Model** + - Visit the Hugging Face repository at [TheBloke's Models](https://huggingface.co/TheBloke). + - Browse through the available models. + - Choose the model that best fits your needs. + +3. **Download the Model** + - Once you've selected a model, download it using a command like the one below. Replace `` with the path of your chosen model. 
+
+
+```bash title="Downloading Zephyr 7B Model"
+wget "https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/resolve/main/zephyr-7b-beta.Q5_K_M.gguf?download=true"
+```
+
+## Step 2: Load model
+Now, you'll set up the model in your application.
+
+1. **Open `app.py` File**
+
+    - In your project directory, find and open the `app.py` file.
+
+2. **Configure the Model Path**
+
+    - Modify the model path in `app.py` to point to your downloaded model.
+    - Update the configuration parameters as necessary.
+
+```python title="Example Configuration" {2}
+dat = {
+    "llama_model_path": "nitro/interface/models/zephyr-7b-beta.Q5_K_M.gguf",
+    "ctx_len": 2048,
+    "ngl": 100,
+    "embedding": True,
+    "n_parallel": 4,
+    "pre_prompt": "A chat between a curious user and an artificial intelligence",
+    "user_prompt": "USER: ",
+    "ai_prompt": "ASSISTANT: "}
+```
+
+Congratulations! Your Nitro chatbot is now set up. Feel free to experiment with different configuration parameters to tailor the chatbot to your needs.
+
+For more information on parameter settings and their effects, please refer to [Run Nitro](using-nitro) for a comprehensive parameters table.
\ No newline at end of file
diff --git a/docs/docs/features/chat.md b/docs/docs/features/chat.md
new file mode 100644
index 000000000..b880ccbc3
--- /dev/null
+++ b/docs/docs/features/chat.md
@@ -0,0 +1,172 @@
+---
+title: Chat Completion
+---
+
+The Chat Completion feature in Nitro provides a flexible way to interact with any local Large Language Model (LLM).
+
+## Single Request Example
+
+To send a single query to your chosen LLM, follow these steps:
+
+
+
+```bash title="Nitro"
+curl http://localhost:3928/inferences/llamacpp/chat_completion \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "",
+    "messages": [
+      {
+        "role": "user",
+        "content": "Hello"
+      }
+    ]
+  }'
+
+```
+
+ +
+ +```bash title="OpenAI" +curl https://api.openai.com/v1/chat/completions \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -d '{ + "model": "gpt-3.5-turbo", + "messages": [ + { + "role": "user", + "content": "Hello" + } + ] + }' +``` +
+
+This command sends a single user message ("Hello") to your local LLM and returns the model's reply.
+
+### Dialog Request Example
+
+For ongoing conversations or multiple queries, the dialog request feature is ideal. Here’s how to structure a multi-turn conversation:
+
+
+ +```bash title="Nitro" +curl http://localhost:3928/inferences/llamacpp/chat_completion \ + -H "Content-Type: application/json" \ + -d '{ + "messages": [ + { + "role": "system", + "content": "You are a helpful assistant." + }, + { + "role": "user", + "content": "Who won the world series in 2020?" + }, + { + "role": "assistant", + "content": "The Los Angeles Dodgers won the World Series in 2020." + }, + { + "role": "user", + "content": "Where was it played?" + } + ] + }' + +``` +
+ +
+ +```bash title="OpenAI" +curl https://api.openai.com/v1/chat/completions \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -d '{ + "messages": [ + { + "role": "system", + "content": "You are a helpful assistant." + }, + { + "role": "user", + "content": "Who won the world series in 2020?" + }, + { + "role": "assistant", + "content": "The Los Angeles Dodgers won the World Series in 2020." + }, + { + "role": "user", + "content": "Where was it played?" + } + ] + }' +``` +
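The multi-turn request bodies above can also be assembled in code rather than written by hand. The following is a minimal Python sketch under stated assumptions: `build_payload` is a hypothetical helper name, not part of Nitro, and it only builds and serializes the `messages` body used in the curl examples. Sending the body to the endpoint is left to whichever HTTP client you prefer.

```python
import json


def build_payload(system_prompt, turns):
    """Build a chat_completion request body from (role, content) turns.

    `build_payload` is an illustrative helper, not a Nitro API.
    """
    messages = [{"role": "system", "content": system_prompt}]
    for role, content in turns:
        if role not in ("user", "assistant"):
            raise ValueError(f"unexpected role: {role}")
        messages.append({"role": role, "content": content})
    return {"messages": messages}


payload = build_payload(
    "You are a helpful assistant.",
    [
        ("user", "Who won the world series in 2020?"),
        ("assistant", "The Los Angeles Dodgers won the World Series in 2020."),
        ("user", "Where was it played?"),
    ],
)

# json.dumps never emits trailing commas, so the body is always valid JSON.
body = json.dumps(payload, indent=2)
print(body)
```

Building the list in code keeps earlier turns intact as the conversation grows: append the assistant's reply and the next user message, then serialize again.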
+ +### Chat Completion Response + +Below are examples of responses from both the Nitro server and OpenAI: + +
+ +```js title="Nitro" +{ + "choices": [ + { + "finish_reason": null, + "index": 0, + "message": { + "content": "Hello, how may I assist you this evening?", + "role": "assistant" + } + } + ], + "created": 1700215278, + "id": "sofpJrnBGUnchO8QhA0s", + "model": "_", + "object": "chat.completion", + "system_fingerprint": "_", + "usage": { + "completion_tokens": 13, + "prompt_tokens": 90, + "total_tokens": 103 + } +} +``` +
+ +
+
+```js title="OpenAI"
+{
+  "choices": [
+    {
+      "finish_reason": "stop",
+      "index": 0,
+      "message": {
+        "role": "assistant",
+        "content": "Hello there, how may I assist you today?"
+      }
+    }
+  ],
+  "created": 1677652288,
+  "id": "chatcmpl-123",
+  "model": "gpt-3.5-turbo-0613",
+  "object": "chat.completion",
+  "system_fingerprint": "fp_44709d6fcb",
+  "usage": {
+    "completion_tokens": 12,
+    "prompt_tokens": 9,
+    "total_tokens": 21
+  }
+}
+```
+
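Because both responses share the same shape, one piece of client code can read either. Here is a small Python sketch, assuming the response body has already been received as text: `extract_reply` is a hypothetical helper name, and the sample body is copied from the Nitro response example above.

```python
import json

# Sample body copied from the Nitro response example above.
raw = """
{
  "choices": [
    {
      "finish_reason": null,
      "index": 0,
      "message": {
        "content": "Hello, how may I assist you this evening?",
        "role": "assistant"
      }
    }
  ],
  "created": 1700215278,
  "id": "sofpJrnBGUnchO8QhA0s",
  "model": "_",
  "object": "chat.completion",
  "system_fingerprint": "_",
  "usage": {
    "completion_tokens": 13,
    "prompt_tokens": 90,
    "total_tokens": 103
  }
}
"""


def extract_reply(response):
    """Return (text, total_tokens) from a chat.completion-shaped dict."""
    message = response["choices"][0]["message"]
    return message["content"], response["usage"]["total_tokens"]


reply, total = extract_reply(json.loads(raw))
print(reply)
print(total)
```

Note that `finish_reason` is `null` in the Nitro sample, so client code that branches on it should probably treat a missing value as acceptable rather than as an error.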
+
+
+The chat completion feature in Nitro showcases compatibility with OpenAI, making the transition between OpenAI and local AI models more straightforward. For further details and advanced usage, please refer to the [API reference](https://nitro.jan.ai/api).
diff --git a/docs/docs/features/cont-batch.md b/docs/docs/features/cont-batch.md
new file mode 100644
index 000000000..537c50dd4
--- /dev/null
+++ b/docs/docs/features/cont-batch.md
@@ -0,0 +1,37 @@
+---
+title: Continuous Batching
+---
+
+## What is continuous batching?
+
+Continuous batching is a powerful technique that significantly boosts throughput in large language model (LLM) inference while minimizing latency. This process dynamically groups multiple inference requests, allowing for more efficient GPU utilization.
+
+## Why Continuous Batching?
+
+Traditional static batching methods can lead to underutilization of GPU resources, as they wait for all sequences in a batch to complete before moving on. Continuous batching overcomes this by allowing new sequences to start processing as soon as others finish, ensuring more consistent and efficient GPU usage.
+
+## Benefits of Continuous Batching
+
+- **Increased Throughput:** Improvement over traditional batching methods.
+- **Reduced Latency:** Lower p50 latency, leading to faster response times.
+- **Efficient Resource Utilization:** Maximizes GPU memory and computational capabilities.
+
+## How to use continuous batching
+Nitro's `continuous batching` feature allows you to combine multiple requests for the same model execution, enhancing throughput and efficiency.
+
+```bash title="Enable Batching" {6,7}
+curl http://localhost:3928/inferences/llamacpp/loadmodel \
+  -H 'Content-Type: application/json' \
+  -d '{
+    "llama_model_path": "/path/to/your_model.gguf",
+    "ctx_len": 512,
+    "cont_batching": true,
+    "n_parallel": 4
+  }'
+```
+
+For optimal performance, ensure that the `n_parallel` value is set to match the `thread_num`, as detailed in the [Multithreading](features/multi-thread.md) documentation.
+
+### Benchmark and Compare
+
+To understand the impact of continuous batching on your system, perform benchmarks comparing it with traditional batching methods. This [article](https://www.anyscale.com/blog/continuous-batching-llm-inference) will help you quantify improvements in throughput and latency.
\ No newline at end of file
diff --git a/docs/docs/features/embed.md b/docs/docs/features/embed.md
new file mode 100644
index 000000000..a27978e00
--- /dev/null
+++ b/docs/docs/features/embed.md
@@ -0,0 +1,89 @@
+---
+title: Embedding
+---
+
+## What are embeddings?
+
+Embeddings are lists of numbers (floats). To find how similar two embeddings are, we measure the [distance](https://en.wikipedia.org/wiki/Cosine_similarity) between them. Shorter distances mean they're more similar; longer distances mean less similarity.
+
+## Activating Embedding Feature
+
+To utilize the embedding feature, include the JSON parameter `"embedding": true` in your [load model request](features/load-unload.md). This action enables Nitro to process inferences with embedding capabilities.
+
+### Embedding Request
+
+Here’s an example showing how to get the embedding result from the model:
+
+
+ +```bash title="Nitro" {1} +curl http://localhost:3928/inferences/llamacpp/embedding \ + -H 'Content-Type: application/json' \ + -d '{ + "input": "Hello", + "model":"Llama-2-7B-Chat-GGUF", + "encoding_format": "float" + }' + +``` +
+
+ +```bash title="OpenAI request" {1} +curl https://api.openai.com/v1/embeddings \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "input": "Hello", + "model": "text-embedding-ada-002", + "encoding_format": "float" + }' +``` +
+
+## Embedding Response
+
+The example response below is the output of the [llama2 Chat 7B Q5 (GGUF)](https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/tree/main) model loaded on the Nitro server.
+
+
+ +```js title="Nitro" +{ + "data": [ + { + "embedding": [ + -0.9874749, + 0.2965493, + ... + -0.253227 + ], + "index": 0, + "object": "embedding" + } + ] +} +``` +
+ +
+
+```js title="OpenAI"
+{
+  "embedding": [
+    0.0023064255,
+    -0.009327292,
+    ... (1536 floats total for ada-002)
+    -0.0028842222
+  ],
+  "index": 0,
+  "object": "embedding"
+}
+```
+
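Embedding vectors like the ones in these responses are usually compared with cosine similarity, the measure linked from the "What are embeddings?" section. The Python sketch below is illustrative only: `cosine_similarity` is a hypothetical helper name, and the toy 3-dimensional vectors stand in for real embeddings. A value near 1 means the vectors point the same way (small distance); a value near 0 means they are unrelated.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length float vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors; real embeddings have many more dimensions.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
print(same_direction, orthogonal)
```

If you need a distance rather than a similarity, `1 - cosine_similarity(a, b)` is a common choice, so that smaller values mean more similar inputs.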
+
+
+The embedding feature in Nitro demonstrates a high level of compatibility with OpenAI, simplifying the transition between OpenAI and local AI models. For more detailed information and advanced use cases, refer to the comprehensive [API Reference](https://nitro.jan.ai/api).
\ No newline at end of file
diff --git a/docs/docs/features/feat.md b/docs/docs/features/feat.md
new file mode 100644
index 000000000..334a0daa7
--- /dev/null
+++ b/docs/docs/features/feat.md
@@ -0,0 +1,22 @@
+---
+title: Nitro Features
+---
+
+Nitro enhances the `llama.cpp` research base, optimizing it for production environments with advanced features:
+
+### Ease of Use
+- **1-Click Install**: Simplified setup process, making it accessible for non-technical users.
+- **HTTP Interface**: Easy integration with no complex bindings required.
+
+### Cross-Platform and Hardware Compatibility
+- **Runs on Multiple OS**: Supports Windows, macOS, and Linux.
+- **Wide Hardware Support**: Compatible with arm64, x86 CPUs, and NVIDIA GPUs.
+
+### Performance and Scalability
+- **Separate Process Operation**: Runs independently, ensuring no interference with main app processes.
+- **Multi-Threaded Server**: Capable of handling multiple users concurrently.
+- **Efficient Binary Size**: Lightweight footprint with a small binary size (~3 MB compressed).
+
+### Developer and Industry Compatibility
+- **OpenAI Compatibility**: Seamless integration with OpenAI models and standards.
+- **No Hardware Dependencies**: Flexibility in deployment without specific hardware requirements.
diff --git a/docs/docs/features/load-unload.md b/docs/docs/features/load-unload.md
new file mode 100644
index 000000000..536a13690
--- /dev/null
+++ b/docs/docs/features/load-unload.md
@@ -0,0 +1,76 @@
+---
+title: Load and Unload models
+---
+
+## Load model
+
+The `loadmodel` endpoint in Nitro lets you load a local model into the server. It's an upgrade from `llama.cpp`, offering more features and customization options.
+
+You can load the model using:
+
+```bash title="Load Model" {1}
+curl http://localhost:3928/inferences/llamacpp/loadmodel \
+  -H 'Content-Type: application/json' \
+  -d '{
+    "llama_model_path": "/path/to/your_model.gguf",
+    "ctx_len": 512
+  }'
+```
+
+For more detail on the load parameters, please refer to the [Table of parameters](#table-of-parameters).
+
+### Enabling GPU Inference
+
+To enable GPU inference in Nitro, a simple POST request is used. This request will instruct Nitro to load the specified model into the GPU, significantly boosting the inference throughput.
+
+```bash title="GPU enable" {6}
+curl http://localhost:3928/inferences/llamacpp/loadmodel \
+  -H 'Content-Type: application/json' \
+  -d '{
+    "llama_model_path": "/path/to/your_model.gguf",
+    "ctx_len": 512,
+    "ngl": 100
+  }'
+```
+
+You can adjust the `ngl` parameter based on your requirements and GPU capabilities.
+
+## Unload model
+To unload a model, use a `curl` command similar to the one for loading, with the endpoint changed to `/unloadmodel`.
+
+```bash title="Unload the model" {1}
+curl http://localhost:3928/inferences/llamacpp/unloadmodel
+```
+
+## Status
+The `modelStatus` function provides the current status of the model, including whether it is loaded and its properties. This function offers improved monitoring capabilities compared to `llama.cpp`.
+
+```bash title="Check Model Status" {1}
+curl http://localhost:3928/inferences/llamacpp/modelstatus
+```
+
+If the model loaded correctly, the response is:
+
+```js title="Load Model Successfully"
+{"message":"Model loaded successfully", "code": "ModelloadedSuccessfully"}
+```
+
+If you get an error while loading a model, check that the model path is correct.
+```js title="Load Model Failed" +{"message":"No model loaded", "code": "NoModelLoaded"} +``` + +### Table of parameters + +| Parameter | Type | Description | +|------------------|---------|--------------------------------------------------------------| +| `llama_model_path` | String | The file path to the LLaMA model. | +| `ngl` | Integer | The number of GPU layers to use. | +| `ctx_len` | Integer | The context length for the model operations. | +| `embedding` | Boolean | Whether to use embedding in the model. | +| `n_parallel` | Integer | The number of parallel operations. Uses Drogon thread count if not set. | +| `cont_batching` | Boolean | Whether to use continuous batching. | +| `user_prompt` | String | The prompt to use for the user. | +| `ai_prompt` | String | The prompt to use for the AI assistant. | +| `system_prompt` | String | The prompt for system rules. | +| `pre_prompt` | String | The prompt to use for internal configuration. | \ No newline at end of file diff --git a/docs/docs/features/multi-thread.md b/docs/docs/features/multi-thread.md new file mode 100644 index 000000000..5ea4328ef --- /dev/null +++ b/docs/docs/features/multi-thread.md @@ -0,0 +1,52 @@ +--- +title: Multithreading +--- + +## What is Multithreading? + +Multithreading is a programming concept where a process executes multiple threads simultaneously, improving efficiency and performance. It allows concurrent execution of tasks, such as data processing or user interface updates. This technique is crucial for optimizing hardware usage and enhancing application responsiveness. + +## Drogon's Threading Model + +Nitro powered by Drogon, a high-speed C++ web application framework, utilizes a thread pool where each thread possesses its own event loop. These event loops are central to Drogon's functionality: + +- **Main Loop**: Runs on the main thread, responsible for starting worker loops. 
+- **Worker Loops**: Handle tasks and network events, ensuring efficient task execution without blocking.
+
+## Why it's important
+
+Understanding and effectively using multithreading in Drogon is crucial for several reasons:
+
+1. **Optimized Performance**: Multithreading enhances application efficiency by enabling simultaneous task execution for faster response times.
+
+2. **Non-blocking IO Operations**: Utilizing multiple threads prevents long-running tasks from blocking the entire application, ensuring high responsiveness.
+
+3. **Deadlock Avoidance**: Event loops and threads help prevent deadlocks, ensuring smoother and uninterrupted application operation.
+
+4. **Effective Resource Utilization**: Distributing tasks across multiple threads leads to more efficient use of server resources, improving overall performance.
+
+5. **Async Programming**
+
+6. **Scalability**
+
+## Enabling More Threads on Nitro
+
+To increase the number of threads used by Nitro, use the following command syntax:
+
+```bash
+nitro [thread_num] [host] [port]
+```
+
+- **thread_num:** Specifies the number of threads for the Nitro server.
+- **host:** The host address, normally `127.0.0.1` (localhost) or `0.0.0.0` (all interfaces).
+- **port:** The port number where Nitro is to be deployed.
+
+To launch Nitro with 4 threads, enter this command in the terminal:
+```bash
+nitro 4 127.0.0.1 5000
+```
+
+> After enabling multithreading, monitor your system's performance. Adjust the `thread_num` as needed to optimize throughput and latency based on your workload.
+
+## Acknowledgements
+For more information on Drogon's threading, visit [Drogon's Documentation](https://github.com/drogonframework/drogon/wiki/ENG-FAQ-1-Understanding-drogon-threading-model).
\ No newline at end of file
diff --git a/docs/docs/features/prompt.md b/docs/docs/features/prompt.md
new file mode 100644
index 000000000..53ea4be7b
--- /dev/null
+++ b/docs/docs/features/prompt.md
@@ -0,0 +1,57 @@
+---
+title: Prompt Role Support
+---
+
+System, user, and assistant prompts are crucial for using a Large Language Model effectively. These prompts work together to create a coherent and functional conversational flow.
+
+Nitro enables developers to configure dialogs and implement advanced prompt engineering, such as [few-shot learning](https://arxiv.org/abs/2005.14165).
+
+## System Prompt
+- **Definition**: Sets up the assistant's behavior.
+- **Example**: `pre_prompt: "You are a Pirate"`
+
+## User Prompt
+- **Definition**: Requests or comments directed towards the assistant, forming the conversation's core.
+- **Example**: `user_prompt: "USER:"`
+
+## Assistant Prompt
+- **Definition**: Responses generated by the assistant, including stored responses or developer-provided examples.
+- **Example**: `ai_prompt: "ASSISTANT:"`
+
+## Example usage
+
+To illustrate, let's create a "Pirate assistant":
+
+> NOTE: "ai_prompt" and "user_prompt" are prefixes indicating the role. Configure them based on your model.
+
+### Prompt Configuration
+
+```bash title="Prompt Configuration" {6,7,8}
+curl http://localhost:3928/inferences/llamacpp/loadmodel \
+  -H 'Content-Type: application/json' \
+  -d '{
+    "ctx_len": 128,
+    "ngl": 100,
+    "pre_prompt": "You are a Pirate. Using drunk language with a lot of Arr...",
+    "user_prompt": "USER:",
+    "ai_prompt": "ASSISTANT: "
+  }'
+```
+
+### Testing the Assistant
+
+```bash title="Pirate Assistant"
+curl http://localhost:3928/inferences/llamacpp/chat_completion \
+  -H "Content-Type: application/json" \
+  -d '{
+    "messages": [
+      {
+        "role": "user",
+        "content": "Hello, who is your captain?"
+ }, + ] + }' +``` + + + diff --git a/docs/docs/features/warmup.md b/docs/docs/features/warmup.md new file mode 100644 index 000000000..d39c16193 --- /dev/null +++ b/docs/docs/features/warmup.md @@ -0,0 +1,18 @@ +--- +title: Warming Up Model +--- + +## What is Model Warming Up? + +Model warming up is the process of running pre-requests through a model to optimize its components for production use. This step is crucial for reducing initialization and optimization delays during the first few inference requests. + +## What are the Benefits? + +Warming up an AI model offers several key benefits: + +- **Enhanced Initial Performance:** Unlike in `llama.cpp`, where the first inference can be very slow, warming up reduces initial latency, ensuring quicker response times from the start. +- **Consistent Response Times:** Especially beneficial for systems updating models frequently, like those with real-time training, to avoid performance lags with new snapshots. + +## How to Enable Model Warming Up? + +On the Nitro server, model warming up is automatically enabled whenever a new model is loaded. This means that the server handles the warm-up process behind the scenes, ensuring that the model is ready for efficient and effective performance from the first inference request. diff --git a/docs/docs/guides/overview.md b/docs/docs/guides/overview.md deleted file mode 100644 index b989559e3..000000000 --- a/docs/docs/guides/overview.md +++ /dev/null @@ -1,7 +0,0 @@ ---- -title: Overview ---- - -:::info Comming Soon -Updating... -::: diff --git a/docs/docs/guides/troubleshooting.md b/docs/docs/guides/troubleshooting.md deleted file mode 100644 index d2db9d72a..000000000 --- a/docs/docs/guides/troubleshooting.md +++ /dev/null @@ -1,7 +0,0 @@ ---- -title: Troubleshooting Nitro ---- - -:::info Comming Soon -Updating... 
-:::
\ No newline at end of file
diff --git a/docs/docs/new/about.md b/docs/docs/new/about.md
new file mode 100644
index 000000000..54a27ce85
--- /dev/null
+++ b/docs/docs/new/about.md
@@ -0,0 +1,114 @@
+---
+title: About Nitro
+slug: /docs
+---
+
+Nitro is a high-efficiency C++ inference engine for edge computing, powering [Jan](https://jan.ai/). It is lightweight and embeddable, ideal for product integration.
+
+Learn more on [GitHub](https://github.com/janhq/nitro).
+
+## Why Nitro?
+
+- **Fast Inference:** Built on top of the cutting-edge inference library `llama.cpp`, modified to be production ready.
+- **Lightweight:** Only 3 MB, ideal for resource-sensitive environments.
+- **Easily Embeddable:** Simple integration into existing applications, offering flexibility.
+- **Quick Setup:** Approximately 10-second initialization for swift deployment.
+- **Enhanced Web Framework:** Incorporates `drogon cpp` to boost web service efficiency.
+
+### OpenAI-compatible API
+
+One of the significant advantages of using Nitro is its compatibility with OpenAI's API structure. The command format for making inference calls with Nitro is very similar to that used with OpenAI's API. This similarity ensures a smooth transition for users who are already familiar with OpenAI's system.
+
+For instance, compare the Nitro inference call with the equivalent OpenAI API call:
+
+
+
+```bash title="Nitro chat completion"
+curl http://localhost:3928/inferences/llamacpp/chat_completion \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "gpt-3.5-turbo",
+    "messages": [
+      {
+        "role": "system",
+        "content": "You are a helpful assistant."
+      },
+      {
+        "role": "user",
+        "content": "Who won the world series in 2020?"
+      }
+    ]
+  }'
+
+```
+
+ +
+
+```bash title="OpenAI API chat completion"
+curl https://api.openai.com/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer $OPENAI_API_KEY" \
+  -d '{
+    "model": "gpt-3.5-turbo",
+    "messages": [
+      {
+        "role": "system",
+        "content": "You are a helpful assistant."
+      },
+      {
+        "role": "user",
+        "content": "Who won the world series in 2020?"
+      }
+    ]
+  }'
+```
+
+ +- **Extends OpenAI's API with helpful model methods:** + - [Unload model](features/load-unload#unload-model) + - [Checking model status](features/load-unload/#status) + +### Cross-Platform + +- **Operating Systems**: Nitro Supports Windows, Linux, and MacOS. +- **Hardware Compatibility**: + - CPUs: ARM, x86. + - GPUs: Nvidia, AMD. +- **Detailed Resources**: [Windows Installation Guide](install/#windows), [Linux and MacOS Installation Guide](install/#linux-and-macos). + +### Multi-modal Capabilities + +- **Coming Soon**: Expansion to multi-modal functionalities - enabling Nitro to process and generate images, and audio. +- **Features to Expect**: + - Large Language-and-Vision Assistant. + - Speech recognition and transcription. + +## Architecture + +- **Overview**: Nitro's architecture is designed for scalability and efficiency, utilizing a modular framework that supports diverse AI functionalities. +- **Detailed Specifications**: For an in-depth understanding of Nitro's internal workings, components, and design philosophy, refer to our [Architecture Specifications](architecture.md). + +## Support +### GitHub Issue Tracking +- **Report Problems**: Encounter an issue with Nitro? File a [GitHub issue](https://github.com/janhq/nitro). Please include detailed error logs and steps to reproduce the problem. + +### Discord Community +- **Join the Conversation**: Discuss Nitro development and seek peer support in our [#nitro-dev](https://discord.gg/FTk2MvZwJH) channel on Discord. + +## Contributing + +### How to Contribute +Nitro welcomes contributions in various forms, not just coding. Here are some ways you can get involved: + +- **Understand Nitro**: Start with the [Getting Started](nitro/overview) guide. Found an issue or have a suggestion? [Open an issue](https://github.com/janhq/nitro/issues) to let us know. + +- **Feature Development**: Engage with community feature requests. 
Bring ideas to life by opening a [pull request](https://github.com/janhq/nitro/pulls) for features that interest you. + +### Links +- [Nitro GitHub Repository](https://github.com/janhq/nitro) + +## Acknowledgements + +- [drogon](https://github.com/drogonframework/drogon): The fast C++ web framework +- [llama.cpp](https://github.com/ggerganov/llama.cpp): Inference of LLaMA model in pure C/C++ \ No newline at end of file diff --git a/docs/docs/nitro/key-concepts.md b/docs/docs/new/architecture.md similarity index 88% rename from docs/docs/nitro/key-concepts.md rename to docs/docs/new/architecture.md index 361bccee1..510b4e103 100644 --- a/docs/docs/nitro/key-concepts.md +++ b/docs/docs/new/architecture.md @@ -1,7 +1,12 @@ --- -title: Key Concepts +title: Architecture --- +![Nitro Architecture](img/architecture.drawio.png) + +### Details element example + +## Key Concepts ## Inference Server An inference server is a type of server designed to process requests for running large language models and to return predictions. This server acts as the backbone for AI-powered applications, providing real-time execution of models to analyze data and make decisions. @@ -24,4 +29,10 @@ Drogon is an HTTP application framework based on C++14/17, designed for its spee - **Asynchronous Operations**: The framework supports non-blocking operations, permitting the server to continue processing other tasks while awaiting responses from databases or external services. -- **Scalability**: Drogon's architecture is built to scale, capable of managing numerous connections at once, suitable for applications with high traffic loads. \ No newline at end of file +- **Scalability**: Drogon's architecture is built to scale, capable of managing numerous connections at once, suitable for applications with high traffic loads. 
+ + + +We should only have 1 document +- [ ] Refactor system/architecture +- [ ] Refactor system/key-concepts \ No newline at end of file diff --git a/docs/docs/new/build-source.md b/docs/docs/new/build-source.md new file mode 100644 index 000000000..e9c87cd8d --- /dev/null +++ b/docs/docs/new/build-source.md @@ -0,0 +1,102 @@ +--- +title: Build From Source +--- + +This guide provides step-by-step instructions for building Nitro from source on Linux, macOS, and Windows systems. + +## Clone the Repository + +First, you need to clone the Nitro repository: + +```bash +git clone --recurse https://github.com/janhq/nitro +``` + +If you don't have git, you can download the source code as a file archive from [Nitro GitHub](https://github.com/janhq/nitro). Each [release](https://github.com/janhq/nitro/releases) also has source snapshots. + +## Install Dependencies +Next, let's install the necessary dependencies. + +- **On MacOS with Apple Silicon:** + ```bash + ./install_deps.sh + ``` + +- **On Windows:** + + ```bash + cmake -S ./nitro_deps -B ./build_deps/nitro_deps + cmake --build ./build_deps/nitro_deps --config Release + ``` + +This creates a `build_deps` folder. + +## Generate build file + +Now, let's generate the build files. + +- **On MacOS, Linux, and Windows:** + + ```bash + mkdir build && cd build + cmake .. + ``` + +- **On MacOS with Intel processors:** + + ```bash + mkdir build && cd build + cmake -DLLAMA_METAL=OFF .. + ``` + +- **On Linux with CUDA:** + + ```bash + mkdir build && cd build + cmake -DLLAMA_CUBLAS=ON .. + ``` + +## Build the Application + +Time to build Nitro! + +- **On MacOS:** + + ```bash + make -j $(sysctl -n hw.physicalcpu) + ``` + +- **On Linux:** + + ```bash + make -j $(nproc) + ``` + +- **On Windows:** + + ```bash + cmake --build . --config Release + ``` + +## Start process + +Finally, let's start Nitro.
+ +- **On MacOS and Linux:** + + ```bash + ./nitro + ``` + +- **On Windows:** + + ```bash + cd Release + copy ..\..\build_deps\_install\bin\zlib.dll . + nitro.exe + ``` + +To verify if the build was successful: +```bash +curl http://localhost:3928/healthz +``` \ No newline at end of file diff --git a/docs/docs/nitro/img/architecture.drawio.png b/docs/docs/new/img/architecture.drawio.png similarity index 100% rename from docs/docs/nitro/img/architecture.drawio.png rename to docs/docs/new/img/architecture.drawio.png diff --git a/docs/docs/new/install.md b/docs/docs/new/install.md new file mode 100644 index 000000000..41262a595 --- /dev/null +++ b/docs/docs/new/install.md @@ -0,0 +1,182 @@ +--- +title: Installation +slug: /install +--- + +# Nitro Installation Guide + +This guide provides instructions for installing Nitro using the provided [install.sh](https://github.com/janhq/nitro/blob/main/install.sh) and [install.bat](https://github.com/janhq/nitro/blob/main/install.bat) scripts for Linux, macOS, and Windows systems. + +## Features + +The installation script offers the following features: + +1. **Root Privilege Check**: Ensures the script is run with root privileges to avoid permission issues. +2. **Dependency Check**: Checks for and advises on the installation of `jq` and `unzip`. +3. **Automated Nitro Installation**: Downloads and installs the appropriate Nitro version based on the user's OS and architecture. +4. **Uninstall Script Creation**: Generates an uninstall script for easy removal of Nitro if needed. +5. **Enhanced User Experience**: Offers clear and colored output messages during the installation process. + +## Prerequisites + +- **Linux and macOS**: `jq`, `curl` and `sudo` are required. If `sudo` is not available, the user must have passwordless sudo privileges. If `jq` or `curl` are not available, the script will attempt to suggest installation commands for these packages. +- **Windows**: `PowerShell` is required.
+ +- **GPU Version**: GPU is supported on Linux and Windows only. [nvidia-cuda-toolkits-12.x](https://developer.nvidia.com/cuda-toolkit) is required on both Linux and Windows. + +## Installation Instructions + +### Linux and macOS + +- **Latest version (CPU is default):** + + ```bash + curl -sfL https://raw.githubusercontent.com/janhq/nitro/main/install.sh | sudo /bin/bash - + ``` + +- **Specific Version Installation:** + ```bash + curl -sfL https://raw.githubusercontent.com/janhq/nitro/main/install.sh -o /tmp/install.sh && chmod +x /tmp/install.sh && sudo bash /tmp/install.sh --version 0.1.7 && rm /tmp/install.sh + ``` + +- **GPU Version Installation:** + ```bash + curl -sfL https://raw.githubusercontent.com/janhq/nitro/main/install.sh -o /tmp/install.sh && chmod +x /tmp/install.sh && sudo bash /tmp/install.sh --gpu && rm /tmp/install.sh + ``` + +- **GPU Version Installation Specific Version:** + ```bash + curl -sfL https://raw.githubusercontent.com/janhq/nitro/main/install.sh -o /tmp/install.sh && chmod +x /tmp/install.sh && sudo bash /tmp/install.sh --gpu --version 0.1.7 && rm /tmp/install.sh + ``` + +- **Manual Installation by downloading the script locally and running it with different arguments:** + + ```bash + # Download the script + curl -sfL https://raw.githubusercontent.com/janhq/nitro/main/install.sh -o ./install.sh + + # Make the script executable + chmod +x ./install.sh + + # Arguments supported + # --version: Specify the version to install, for example "--version 0.1.7"; the default is the latest version. The list of Nitro versions can be found at https://github.com/janhq/nitro/releases + # --gpu: Install the GPU version of nitro, default is CPU version + + # Run one of the following commands + + # Download and install the latest version of nitro + sudo ./install.sh + + # Download and install the specific version of nitro + sudo ./install.sh --version 0.1.7 + + # Download and install the GPU version of nitro + sudo ./install.sh --gpu + + # Download and install the GPU version of
nitro with specific version + sudo ./install.sh --gpu --version 0.1.7 + ``` +### Windows +- **Latest version (CPU is default)** + ```bash + powershell -Command "& { Invoke-WebRequest -Uri 'https://raw.githubusercontent.com/janhq/nitro/main/install.bat' -OutFile 'install.bat'; .\install.bat; Remove-Item -Path 'install.bat' }" + ``` + +- **Specific Version Installation:** + ```bash + powershell -Command "& { Invoke-WebRequest -Uri 'https://raw.githubusercontent.com/janhq/nitro/main/install.bat' -OutFile 'install.bat'; .\install.bat --version 0.1.7; Remove-Item -Path 'install.bat' }" + ``` + +- **GPU Version Installation:** + ```bash + powershell -Command "& { Invoke-WebRequest -Uri 'https://raw.githubusercontent.com/janhq/nitro/main/install.bat' -OutFile 'install.bat'; .\install.bat --gpu; Remove-Item -Path 'install.bat' }" + ``` + +- **GPU Version Installation Specific Version:** + ```bash + powershell -Command "& { Invoke-WebRequest -Uri 'https://raw.githubusercontent.com/janhq/nitro/main/install.bat' -OutFile 'install.bat'; .\install.bat --gpu --version 0.1.7; Remove-Item -Path 'install.bat' }" + ``` +- **Manual Installation by downloading the script locally and running it with different arguments** + + ```bash + # Download the script + Invoke-WebRequest -Uri 'https://raw.githubusercontent.com/janhq/nitro/main/install.bat' -OutFile 'install.bat' + + # Arguments supported + # --version: Specify the version to install, for example "--version 0.1.7"; the default is the latest version. The list of Nitro versions can be found at https://github.com/janhq/nitro/releases + # --gpu: Install the GPU version of nitro, default is CPU version + # Run one of the following commands + # Download and install the latest version of nitro + .\install.bat + + # Download and install the specific version of nitro + .\install.bat --version 0.1.7 + + # Download and install the GPU version of nitro + .\install.bat --gpu + + # Download and install the GPU version of nitro with specific version + .\install.bat --gpu
--version 0.1.7 + ``` +## Usage +After installation, launch Nitro by typing `nitro` (or `nitro.exe` on Windows) in a new terminal or PowerShell window. This will start the Nitro server. + +Simple testcase with nitro, after starting the server, you can run the following command to test the server in a new terminal or powershell session: + +- **On Linux and MacOS:** + ```bash title="Linux and Macos" + # Download tiny model + DOWNLOAD_URL=https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v0.3-GGUF/resolve/main/tinyllama-1.1b-chat-v0.3.Q2_K.gguf + # Check if /tmp/testmodel exists, if not, download it + if [[ ! -f "/tmp/testmodel" ]]; then + wget $DOWNLOAD_URL -O /tmp/testmodel + fi + # Load the model to nitro + curl -s --location 'http://localhost:3928/inferences/llamacpp/loadModel' \ + --header 'Content-Type: application/json' \ + --data '{ + "llama_model_path": "/tmp/testmodel", + "ctx_len": 2048, + "ngl": 32, + "embedding": false + }' + # Send a prompt request to nitro + curl -s --location 'http://localhost:3928/inferences/llamacpp/chat_completion' \ + --header 'Content-Type: application/json' \ + --data '{ + "messages": [ + {"content": "Hello there", "role": "assistant"}, + {"content": "Write a long and sad story for me", "role": "user"} + ], + "stream": true, + "max_tokens": 100, + "stop": ["hello"], + "frequency_penalty": 0, + "presence_penalty": 0, + "temperature": 0.7 + }' + ``` + +- **On Windows:** + ```bash title="Windows" + # Download tiny model + set "MODEL_PATH=%TEMP%\testmodel" + if not exist "%MODEL_PATH%" ( + bitsadmin.exe /transfer "DownloadTestModel" %DOWNLOAD_URL% "%MODEL_PATH%" + ) + + # Load the model to nitro + call set "MODEL_PATH_STRING=%%MODEL_PATH:\=\\%%" + set "curl_data1={\"llama_model_path\":\"%MODEL_PATH_STRING%\"}" + curl.exe -s -w "%%{http_code}" --location "http://localhost:3928/inferences/llamacpp/loadModel" --header "Content-Type: application/json" --data "%curl_data1%" + + # Send a prompt request to nitro + set 
"curl_data2={\"messages\":[{\"content\":\"Hello there\",\"role\":\"assistant\"},{\"content\":\"Write a long and sad story for me\",\"role\":\"user\"}],\"stream\":true,\"model\":\"gpt-3.5-turbo\",\"max_tokens\":100,\"stop\":[\"hello\"],\"frequency_penalty\":0,\"presence_penalty\":0,\"temperature\":0.7}" + curl.exe -s -w "%%{http_code}" --location "http://localhost:3928/inferences/llamacpp/chat_completion" ^ + --header "Content-Type: application/json" ^ + --data "%curl_data2%" + ``` + +## Uninstallation +- **Linux and macOS**: Run `sudo uninstall_nitro.sh` from anywhere (the script is added to PATH). +- **Windows**: Open PowerShell and run `uninstallnitro.bat` from anywhere (the script is added to PATH). \ No newline at end of file diff --git a/docs/docs/new/model-cycle.md b/docs/docs/new/model-cycle.md new file mode 100644 index 000000000..85ed3f6ed --- /dev/null +++ b/docs/docs/new/model-cycle.md @@ -0,0 +1,16 @@ +--- +title: Model Life Cycle +--- + +## Load model + +### Warm up model + +## Inference + +## Unload model + +## Load model + +## Shut down server + diff --git a/docs/docs/new/quickstart.md b/docs/docs/new/quickstart.md new file mode 100644 index 000000000..542e8e36e --- /dev/null +++ b/docs/docs/new/quickstart.md @@ -0,0 +1,67 @@ +--- +title: Quickstart +--- +## Step 1: Install Nitro + +### For Linux and MacOS +Open your terminal and enter the following command. This will download and install Nitro on your system. + ```bash + curl -sfL https://raw.githubusercontent.com/janhq/nitro/main/install.sh -o /tmp/install.sh && chmod +x /tmp/install.sh && sudo bash /tmp/install.sh --gpu && rm /tmp/install.sh + ``` + +### For Windows +Open PowerShell and execute the following command. This will perform the same actions as for Linux and MacOS but is tailored for Windows. 
+ ```bash + powershell -Command "& { Invoke-WebRequest -Uri 'https://raw.githubusercontent.com/janhq/nitro/main/install.bat' -OutFile 'install.bat'; .\install.bat --gpu; Remove-Item -Path 'install.bat' }" + ``` + +> **NOTE:** Installing Nitro will add new files and configurations to your system to enable it to run. + +For a manual installation process, see the [Installation guide](install.md). + +## Step 2: Downloading a Model + +Next, we need to download a model. For this example, we'll use the [Llama2 7B chat model](https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/tree/main). + +- Create a `model` directory, navigate into it, and download the model: +```bash +mkdir model && cd model +wget https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_M.gguf +``` + +## Step 3: Run Nitro server + +To start using Nitro, you need to run its server. + +```bash title="Run Nitro server" +nitro +``` + +To check if the Nitro server is running: + +```bash title="Nitro Health Status" +curl http://localhost:3928/healthz +``` + +## Step 4: Making an Inference + +Finally, let's make an actual inference call using Nitro. + +- In your terminal, execute: + +```bash title="Nitro Inference" +curl http://localhost:3928/inferences/llamacpp/chat_completion \ + -H "Content-Type: application/json" \ + -d '{ + "messages": [ + { + "role": "user", + "content": "Who won the world series in 2020?" + } + ] + }' +``` + +This command sends a request to Nitro, asking it about the 2020 World Series winner. + +- As you can see, a key benefit of Nitro is its alignment with [OpenAI's API structure](https://platform.openai.com/docs/guides/text-generation?lang=curl). Its inference call syntax closely mirrors that of OpenAI's API, which eases the transition for those accustomed to OpenAI's framework.
\ No newline at end of file diff --git a/docs/docs/nitro/architecture.md b/docs/docs/nitro/architecture.md deleted file mode 100644 index a127910ca..000000000 --- a/docs/docs/nitro/architecture.md +++ /dev/null @@ -1,33 +0,0 @@ ---- -title: Architecture ---- - -:::info -This document is being updated. Please stay tuned. -::: - -![Nitro Architecture](img/architecture.drawio.png) - -### Components - -- **Nitro CLI**: A command-line interface that manages model conversion and compilation for deployment. - - - **Converter**: Transforms the model into a compatible format (GGUF) for the Nitro system. - - **Compiler**: Optimizes the converted model for efficient execution. - -- **TensorRT - LLM**: A specialized component for large language models using NVIDIA's TensorRT optimization. - -- **Triton Inference Server**: Serves the optimized model, facilitating scalable and efficient inference requests via gRPC. - -- **Nitro.cpp**: The C++ implementation handling the deployment and interfacing of the models. - - - **Adapters**: - - **llama.cpp**: - - **Triton LMDeploy**:. - - **Inference File Server**: Manages files necessary for inference operations. - - **Cache**: Stores temporary data to improve performance. - - **Apps**: - - **Interface**: - - **JanAPI**: - - **OpenAI Compatible**: Ensures compatibility with OpenAI standards for ease of integration. - diff --git a/docs/docs/nitro/installation.mdx b/docs/docs/nitro/installation.mdx deleted file mode 100644 index 0118a62bd..000000000 --- a/docs/docs/nitro/installation.mdx +++ /dev/null @@ -1,101 +0,0 @@ -import Tabs from '@theme/Tabs'; -import TabItem from '@theme/TabItem'; - -# Installation -## How to install - -This guide provides step-by-step instructions for installing the software on various operating systems. Please follow the steps that correspond to your system. - -### Step 1: Install dependencies - -First, you need to install the necessary dependencies. - - - -

Open the terminal, navigate to the project directory, and run:

-
./install_deps.sh
-
- -

Open Command Prompt as Administrator, navigate to the project directory, and execute:

-
cmake -S ./nitro_deps -B ./build_deps/nitro_deps
-cmake --build ./build_deps/nitro_deps --config Release
-
-
- -A new folder named `build_deps` will be created. - -### Step 2: Generate Build Files - -You will now generate the build files needed to compile the software. - - - -

In the terminal, run:

-
mkdir build && cd build
-cmake -DLLAMA_METAL=OFF ..
-
- -

In the terminal, run:

-
mkdir build && cd build
-cmake -DLLAMA_CUBLAS=ON ..
-
- -

In the terminal, run:

-
mkdir build && cd build
-cmake ..
-
- -

In the terminal (or Command Prompt for Windows), run:

-
mkdir build && cd build
-cmake ..
-
-
- -### Step 3: Build the Application - -Now it's time to compile the software. - - - -

Open the terminal and enter:

-
make -j $(sysctl -n hw.physicalcpu)
-
- -

Open the terminal and enter:

-
make -j $(nproc)
-
- -

In Command Prompt, run:

-
cmake --build . --config Release
-
-
- -### Step 4: Run the Application - -To start the application: - - - -

In the terminal, navigate to the build directory and run:

-
./nitro
-
- -

In the terminal, navigate to the build directory and run:

-
./nitro
-
- -

In Command Prompt, navigate to the Release directory within the build folder and execute:

-
copy ..\..\build_deps\_install\bin\zlib.dll .
-nitro.exe
-
-
- -### Verification: - -To verify if the build was successful, open your web browser and go to: - -``` -localhost:8080/test -``` - -You should see a confirmation page if the installation was successful. diff --git a/docs/docs/nitro/overview.md b/docs/docs/nitro/overview.md deleted file mode 100644 index 876b4b7e6..000000000 --- a/docs/docs/nitro/overview.md +++ /dev/null @@ -1,48 +0,0 @@ ---- -title: Introduction ---- - -Nitro, is the lightweight inference engine that powers Jan. Nitro is written in C++, optimized for edge deployment. - -⚡ Explore Nitro's codebase: [GitHub](https://github.com/janhq/nitro) - -## Dependencies and Acknowledgements: - -- [llama.cpp](https://github.com/ggerganov/llama.cpp): Nitro wraps Llama.cpp, which runs Llama models in C++ -- [drogon](https://github.com/drogonframework/drogon): Nitro runs Drogon, which is a fast, C++17/20 HTTP application framework. -- (Coming soon) tensorrt-llm support. - -## Features - -In addition to the above features, Nitro also provides: - -- OpenAI compatibility -- HTTP interface with no bindings needed -- Runs as a separate process, not interfering with main app processes -- Multi-threaded server supporting concurrent users -- 1-click install -- No hardware dedendencies -- Ships as a small binary (~3mb compressed on average) -- Runs on Windows, MacOS, and Linux -- Compatible with arm64, x86, and NVIDIA GPUs - -## HTTP Interface - -Nitro offers a straightforward HTTP interface. With compatibility for multiple standard APIs, including OpenAI formats. 
- -```bash -curl --location 'http://localhost:3928/inferences/llamacpp/chat_completion' \ - --header 'Content-Type: application/json' \ - --header 'Accept: text/event-stream' \ - --header 'Access-Control-Allow-Origin: *' \ - --data '{ - "messages": [ - {"content": "Hello there 👋", "role": "assistant"}, - {"content": "Can you write a long story", "role": "user"} - ], - "stream": true, - "model": "gpt-3.5-turbo", - "max_tokens": 2000 - }' -``` - diff --git a/docs/docs/nitro/using-nitro.md b/docs/docs/nitro/using-nitro.md deleted file mode 100644 index 573ea9975..000000000 --- a/docs/docs/nitro/using-nitro.md +++ /dev/null @@ -1,71 +0,0 @@ ---- -title: Quick start ---- - -## Step 1: Download Nitro - -To use Nitro, download the released binaries from the release page below: - -🔗 [Download Nitro](https://github.com/janhq/nitro/releases) - -After downloading the release, double-click on the Nitro binary. - -## Step 2: Download a Model - -Download a llama model to try running the llama C++ integration. You can find a "GGUF" model on The Bloke's page below: - -🔗 [Download Model](https://huggingface.co/TheBloke) - -## Step 3: Run Nitro - -Double-click on Nitro to run it. After downloading your model, make sure it's saved to a specific path. Then, make an API call to load your model into Nitro. - -```zsh -curl -X POST 'http://localhost:3928/inferences/llamacpp/loadmodel' \ - -H 'Content-Type: application/json' \ - -d '{ - "llama_model_path": "/path/to/your_model.gguf", - "ctx_len": 2048, - "ngl": 100, - "embedding": true, - "n_parallel": 4, - "pre_prompt": "A chat between a curious user and an artificial intelligence", - "user_prompt": "USER: ", - "ai_prompt": "ASSISTANT: " - }' -``` - -Configure your system by setting the following parameters in your `jsonBody`. 
The table below outlines each parameter and its description: - -| Parameter | Type | Description | -|------------------|---------|--------------------------------------------------------------| -| `llama_model_path` | String | The file path to the LLaMA model. | -| `ngl` | Integer | The number of GPU layers to use. | -| `ctx_len` | Integer | The context length for the model operations. | -| `embedding` | Boolean | Whether to use embedding in the model. | -| `n_parallel` | Integer | The number of parallel operations. Uses Drogon thread count if not set. | -| `cont_batching` | Boolean | Whether to use continuous batching. | -| `user_prompt` | String | The prompt to use for the user. | -| `ai_prompt` | String | The prompt to use for the AI assistant. | -| `system_prompt` | String | The prompt to use for system rules. | -| `pre_prompt` | String | The prompt to use for internal configuration. | - -## Step 4: Nitro Inference - -```zsh -curl --location 'http://localhost:3928/inferences/llamacpp/chat_completion' \ - --header 'Content-Type: application/json' \ - --header 'Accept: text/event-stream' \ - --header 'Access-Control-Allow-Origin: *' \ - --data '{ - "messages": [ - {"content": "Hello there 👋", "role": "assistant"}, - {"content": "Can you write a long story", "role": "user"} - ], - "stream": true, - "model": "gpt-3.5-turbo", - "max_tokens": 2000 - }' -``` - -Nitro server is compatible with the OpenAI format, so you can expect the same output as the OpenAI ChatGPT API. 
\ No newline at end of file diff --git a/docs/docusaurus.config.js b/docs/docusaurus.config.js index 90884e843..6f9f2d25a 100644 --- a/docs/docusaurus.config.js +++ b/docs/docusaurus.config.js @@ -51,8 +51,8 @@ const config = { [ "posthog-docusaurus", { - apiKey: process.env.POSTHOG_PROJECT_API_KEY, - appUrl: process.env.POSTHOG_APP_URL, // optional + apiKey: process.env.POSTHOG_PROJECT_API_KEY || "XXX", + appUrl: process.env.POSTHOG_APP_URL || "XXX", // optional enableInDevelopment: false, // optional }, ], @@ -101,7 +101,8 @@ const config = { { specs: [ { - spec: "openapi/OpenAPISpec.json", // can be local file, url, or parsed json object + spec: "openapi/NitroAPI.yaml", // can be local file, url, or parsed json object + // spec: "openapi/OpenAIAPI.yaml", route: "/api/", }, ], @@ -143,19 +144,19 @@ const config = { position: "left", label: "API Reference", }, - { - type: "docSidebar", - sidebarId: "communitySidebar", - position: "left", - label: "Community", - }, + // { + // type: "docSidebar", + // sidebarId: "communitySidebar", + // position: "left", + // label: "Community", + // }, // Navbar right - { - type: "docSidebar", - sidebarId: "blogSidebar", - position: "right", - label: "Blog", - }, + // { + // type: "docSidebar", + // sidebarId: "blogSidebar", + // position: "right", + // label: "Blog", + // }, ], }, prism: { diff --git a/docs/openapi/NitroAPI.yaml b/docs/openapi/NitroAPI.yaml new file mode 100644 index 000000000..f86863df1 --- /dev/null +++ b/docs/openapi/NitroAPI.yaml @@ -0,0 +1,636 @@ +openapi: 3.0.0 +info: + title: Nitro API + description: Please see https://nitro.jan.ai/ for documentation. +version: "0.1.8" +contact: + name: Nitro Discord + url: https://github.com/janhq/nitro +license: + name: AGPLv3 + url: https://github.com/janhq/nitro/blob/main/LICENSE +servers: + - url: https://localhost:3928/ +tags: + - name: Chat Completion + description: Given a list of messages comprising a conversation, the model will return a response. 
+ - name: Embeddings + description: Get a vector representation of a given input. + - name: Health Check + description: Check current status of the Nitro server. + - name: Load Model + description: Load model to Nitro Inference Server. + - name: Unload Model + description: Unload model out of Nitro Inference Server. + - name: Status + description: Check current status of the model. +x-tagGroups: + - name: OpenAI Compatible + tags: + - Chat Completion + - Embeddings + - name: Nitro Operations + tags: + - Health Check + - Load Model + - Unload Model + - Status +paths: + # Note: When adding an endpoint, make sure you also add it in the `groups` section, in the end of this file, + # under the appropriate group + /healthz: + get: + operationId: HeathCheck + tags: + - Health Check + summary: Check the status of Nitro Server. + # requestBody: + # content: + # application/json: + # schema: + # $ref: "#/components/schemas/HealthcheckRequest" + x-codeSamples: + - lang: "curl" + source: | + curl http://localhost:3928/healthz + responses: + "200": + description: Nitro health check + content: + application/json: + schema: + $ref: "#/components/schemas/HealthcheckResponse" + + /inferences/llamacpp/loadmodel: + post: + operationId: loadmodel + tags: + - Load Model + summary: Load model to Nitro Inference Server. + requestBody: + content: + application/json: + schema: + $ref: "#/components/schemas/LoadModelRequest" + x-codeSamples: + - lang: "curl" + source: | + curl http://localhost:3928/inferences/llamacpp/loadmodel \ + -H 'Content-Type: application/json' \ + -d '{ + "llama_model_path": "/path/to/your_model.gguf", + "ctx_len": 512, + }' + responses: + "200": + description: Model loaded + content: + application/json: + schema: + $ref: "#/components/schemas/LoadModelResponse" + + /inferences/llamacpp/unloadmodel: + get: + operationId: unloadmodel + tags: + - Unload Model + summary: Unload model from Nitro Inference Server. 
+ # requestBody: + # content: + # application/json: + # schema: + # $ref: "#/components/schemas/UnloadModelRequest" + x-codeSamples: + - lang: "curl" + source: | + curl http://localhost:3928/inferences/llamacpp/unloadmodel + responses: + "200": + description: Model unloaded + content: + application/json: + schema: + $ref: "#/components/schemas/UnloadModelResponse" + + /inferences/llamacpp/modelstatus: + get: + operationId: modelstatus + tags: + - Status + summary: Check status of the model on Nitro server + content: + application/json: + schema: + $ref: "#/components/schemas/StatusRequest" + x-codeSamples: + - lang: "curl" + source: | + curl http://localhost:3928/inferences/llamacpp/modelstatus + responses: + "200": + description: Check status + content: + application/json: + schema: + $ref: "#/components/schemas/StatusResponse" + + /inferences/llamacpp/embedding: + post: + operationId: createEmbedding + tags: + - Embeddings + summary: Creates an embedding vector representing the input text. + requestBody: + content: + application/json: + schema: + $ref: "#/components/schemas/CreateEmbeddingRequest" + x-codeSamples: + - lang: "curl" + source: | + curl http://localhost:3928/inferences/llamacpp/embedding \ + -H 'Content-Type: application/json' \ + -d '{ + "input": "hello", + "encoding_format": "float" + }' + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/CreateEmbeddingResponse" + + /inferences/llamacpp/chat_completion: + post: + operationId: createChatCompletion + tags: + - Chat Completion + summary: Create an chat with the model. 
+ requestBody: + content: + application/json: + schema: + $ref: "#/components/schemas/ChatCompletionRequest" + x-codeSamples: + - lang: "curl" + source: | + curl -X POST 'http://localhost:3928/inferences/llamacpp/chat_completion' \ + -H "Content-Type: application/json" \ + -d '{ + "llama_model_path": "/path/to/your/model.gguf", + "messages": [ + { + "role": "user", + "content": "hello" + } + ] + }' + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/ChatCompletionResponse" + +####################################################### +####################################################### +components: + schemas: + LoadModelRequest: + type: object + properties: + llama_model_path: + type: string + required: true + description: Path to your local LLM + example: "nitro/model/zephyr-7b-beta.Q5_K_M.gguf" + ngl: + type: number + default: 100 + minimum: 0 + maximum: 100 + nullable: true + description: The number of layers to load onto the GPU for acceleration. + ctx_len: + type: number + default: 2048 + nullable: true + description: The context length for model operations varies; the maximum depends on the specific model used. + embedding: + default: true + type: boolean + nullable: true + description: Whether to enable embedding. + cont_batching: + type: boolean + default: false + nullable: true + description: Whether to use continuous batching. + n_parallel: + type: integer + default: Automatically set to Drogon threads + example: 4 + nullable: true + description: The number of parallel operations. Only set when continuous batching is enabled. + pre_prompt: + type: string + default: A chat between a curious user and an artificial intelligence assistant. The assistant follows the given rules no matter what. + nullable: true + description: The prompt to use for internal configuration.
+ system_prompt: + type: string + default: "ASSISTANT's RULE:" + nullable: true + description: The prefix for system prompt + user_prompt: + type: string + default: "USER:" + nullable: true + description: The prefix for user prompt. + ai_prompt: + type: string + default: "ASSISTANT:" + nullable: true + description: The prefix for assistant prompt. + + required: + - llama_model_path + + LoadModelResponse: + type: object + properties: + message: + example: Model loaded successfully + description: A status indicator for when the model is successfully loaded. + anyOf: + - type: string + title: Success + description: The output will be "Model loaded successfully" + - type: string + title: Failed + description: The output will be "No model loaded" + code: + example: Model loaded successfully + description: A response code for Localization Support. + anyOf: + - type: string + title: Success + description: The output will be "Model loaded successfully" + - type: string + title: Failed + description: The output will be "No model loaded" + + HealthcheckRequest: + type: object + + HealthcheckResponse: + type: object + properties: + message: + example: Nitro is alive!!! + description: A status indicator for when the model is successfully loaded. + anyOf: + - type: string + title: Success + description: The output will be "Nitro is alive!!!" + - type: string + title: Failed + description: "curl: (7) Failed to connect to localhost port 3928 after 0 ms: Connection refused" + + UnloadModelRequest: + type: object + properties: + message: + example: TODO + description: TODO + + UnloadModelResponse: + type: object + properties: + message: + example: Model unloaded successfully + description: A status for successful model unloading. 
+ anyOf: + - type: string + title: Success + description: The output will be "Model unloaded successfully" + - type: string + title: Failed + description: The output will be "No model loaded" + + StatusRequest: + type: object + properties: + message: + example: Model unloaded successfully + description: A status for successful model unloading. + + StatusResponse: + type: object + description: State of the loaded model + properties: + model_data: + type: object + description: Configuration data of the model + properties: + model_loaded: + type: boolean + example: true + nullable: true + description: Whether a model is currently loaded on the Nitro server. + frequency_penalty: + type: number + description: Adjusts the likelihood of repeating words in the output. Higher values discourage repetition. + default: 0 + nullable: true + maximum: 2 + minimum: 0 + grammar: + type: string + default: "" + nullable: true + description: Specifies grammar constraints to apply. An empty string means no constraints. + ignore_eos: + type: boolean + default: false + nullable: true + description: Whether to ignore end-of-sequence tokens. When false, generation stops at an end-of-sequence token. + logit_bias: + type: array + default: [] + description: An array of biases applied to the logits of specific tokens to affect their selection probability. + mirostat: + type: number + default: 0 + nullable: true + description: Selects the Mirostat sampling mode for controlling output diversity. 0 disables it. + mirostat_eta: + type: number + default: 0.1 + nullable: true + description: The Mirostat learning rate (eta). + mirostat_tau: + type: number + default: 5.0 + nullable: true + description: The Mirostat target entropy (tau). + model: + type: string + example: "nitro/model/zephyr-7b-beta.Q5_K_M.gguf" + nullable: true + description: This is automatically set to the model you've loaded on the Nitro server.
+ n_ctx: + type: number + default: 42 + nullable: true + description: Number of tokens in the model's context window. + n_keep: + type: number + default: 0 + nullable: true + description: Number of tokens to keep from the beginning of the input. + n_predict: + type: number + default: 100 + nullable: true + description: Number of tokens the model should predict, with -1 indicating no specific limit. + n_probs: + type: number + default: 0 + nullable: true + description: Controls the number of probabilities returned by the model. + penalize_nl: + type: boolean + default: true + nullable: true + description: Penalizes new lines in the output to make them less likely. + presence_penalty: + type: number + default: 0 + nullable: true + description: Adjusts the likelihood of introducing new concepts in the output. + repeat_last_n: + type: number + default: 64 + nullable: true + description: Number of tokens to check for repetition. + repeat_penalty: + type: number + default: 1.1 + nullable: true + description: Penalizes repetitions of phrases in the last `repeat_last_n` tokens. + seed: + type: number + default: 4294967295 + nullable: true + description: Random seed for ensuring reproducibility. + stop: + type: array + default: ["hello", "USER: "] + nullable: true + description: A list of tokens that signal the model to stop generating further output. + stream: + type: boolean + default: true + nullable: true + description: Whether the output is generated as a stream. + temp: + type: number + default: 0.7 + minimum: 0 + maximum: 1 + nullable: true + description: Controls the randomness of the output. + tfs_z: + type: number + default: 1.0 + nullable: true + description: The tail-free sampling parameter z. A value of 1.0 disables tail-free sampling. + top_k: + type: number + default: 40 + nullable: true + description: Limits the number of highest probability tokens considered at each generation step.
+ top_p: + type: number + default: 0.95 + minimum: 0 + maximum: 1 + nullable: true + description: Restricts sampling to the smallest set of top tokens whose cumulative probability reaches `top_p`. + typical_p: + type: number + default: 1.0 + nullable: true + description: Controls output diversity, typically used alongside `top_p`. + + CreateEmbeddingRequest: + type: object + additionalProperties: false + properties: + input: + description: Input text to embed, encoded as a string or array of tokens. To embed multiple inputs in a single request, pass an array of strings or array of token arrays. + example: "hello" + encoding_format: + description: Encoding format + example: float + + CreateEmbeddingResponse: + type: object + description: Response containing embeddings and related information + properties: + data: + type: array + description: Array of embedding objects + items: + type: object + properties: + embedding: + type: array + description: Array representing the embedding vector + items: + type: number + example: [0.067819312214851379,0.17273959517478943,-0.31053683161735535, ...
,0.36176943778991699] + index: + type: integer + description: Index of the embedding in the array + example: 0 + object: + type: string + description: Type of the object + example: embedding + model: + type: string + description: Model identifier + example: "_" + object: + type: string + description: Type of the overall response object + example: list + usage: + type: object + description: Information about token usage in the request + properties: + prompt_tokens: + type: integer + description: Number of tokens used in the prompt + example: 33 + total_tokens: + type: integer + description: Total number of tokens involved in the operation + example: 533 + + + + ChatCompletionRequest: + type: object + properties: + messages: + type: array + description: Contains input data or prompts for the model to process + example: [{"content": "Hello there :wave:","role": "assistant"},{"content": "Can you write a long story","role": "user"}] + stream: + type: boolean + default: true + description: Enables continuous output generation, allowing for streaming of model responses. + model: + type: string + example: "gpt-3.5-turbo" + description: Specifies the model being used for inference or processing tasks. + max_tokens: + type: number + default: 2048 + description: The maximum number of tokens the model will generate in a single response + stop: + type: array + example: ['hello'] + description: Defines specific tokens or phrases at which the model will stop generating further output. + frequency_penalty: + type: number + default: 0 + description: Adjusts the likelihood of the model repeating words or phrases in its output.
+ presence_penalty: + type: number + default: 0 + description: Influences the generation of new and varied concepts in the model's output + temperature: + type: number + default: 0.7 + minimum: 0 + maximum: 1 + description: Controls the randomness of the model's output + + ChatCompletionResponse: + type: object + description: Description of the response structure + properties: + choices: + type: array + description: Array of choice objects + items: + type: object + properties: + finish_reason: + type: string + nullable: true + example: null + description: Reason for finishing the response, if applicable + index: + type: integer + example: 0 + description: Index of the choice + message: + type: object + properties: + content: + type: string + example: "Hello user. What can I help you with?" + description: Content of the message + role: + type: string + example: assistant + description: Role of the sender + created: + type: integer + example: 1700193928 + description: Timestamp of when the response was created + id: + type: string + example: ebwd2niJvJB1Q2Whyvkz + description: Unique identifier of the response + model: + type: string + nullable: true + example: _ + description: Model used for generating the response + object: + type: string + example: chat.completion + description: Type of the response object + system_fingerprint: + type: string + nullable: true + example: _ + description: System fingerprint + usage: + type: object + description: Information about the usage of tokens + properties: + completion_tokens: + type: integer + example: 500 + description: Number of tokens used for completion + prompt_tokens: + type: integer + example: 33 + description: Number of tokens used in the prompt + total_tokens: + type: integer + example: 533 + description: Total number of tokens used + +################################### +################################# diff --git a/docs/openapi/OpenAIAPI.yaml b/docs/openapi/OpenAIAPI.yaml new file mode 100644 index 000000000..bd4a9cab1
--- /dev/null +++ b/docs/openapi/OpenAIAPI.yaml @@ -0,0 +1,9871 @@ +openapi: 3.0.0 +info: + title: OpenAI API + description: The OpenAI REST API. Please see https://platform.openai.com/docs/api-reference for more details. + version: "2.0.0" + termsOfService: https://openai.com/policies/terms-of-use + contact: + name: OpenAI Support + url: https://help.openai.com/ + license: + name: MIT + url: https://github.com/openai/openai-openapi/blob/master/LICENSE +servers: + - url: https://api.openai.com/v1 +tags: + - name: Assistants + description: Build Assistants that can call models and use tools. + - name: Audio + description: Learn how to turn audio into text or text into audio. + - name: Chat + description: Given a list of messages comprising a conversation, the model will return a response. + - name: Completions + description: Given a prompt, the model will return one or more predicted completions, and can also return the probabilities of alternative tokens at each position. + - name: Embeddings + description: Get a vector representation of a given input that can be easily consumed by machine learning models and algorithms. + - name: Fine-tuning + description: Manage fine-tuning jobs to tailor a model to your specific training data. + - name: Files + description: Files are used to upload documents that can be used with features like Assistants and Fine-tuning. + - name: Images + description: Given a prompt and/or an input image, the model will generate a new image. + - name: Models + description: List and describe the various models available in the API. + - name: Moderations + description: Given a input text, outputs if the model classifies it as violating OpenAI's content policy. + - name: Fine-tunes + description: Manage legacy fine-tuning jobs to tailor a model to your specific training data. + - name: Edits + description: Given a prompt and an instruction, the model will return an edited version of the prompt. 
+paths: + # Note: When adding an endpoint, make sure you also add it in the `groups` section, in the end of this file, + # under the appropriate group + /chat/completions: + post: + operationId: createChatCompletion + tags: + - Chat + summary: Creates a model response for the given chat conversation. + requestBody: + required: true + content: + application/json: + schema: + $ref: "#/components/schemas/CreateChatCompletionRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/CreateChatCompletionResponse" + + x-oaiMeta: + name: Create chat completion + group: chat + returns: | + Returns a [chat completion](/docs/api-reference/chat/object) object, or a streamed sequence of [chat completion chunk](/docs/api-reference/chat/streaming) objects if the request is streamed. + path: create + examples: + - title: Default + request: + curl: | + curl https://api.openai.com/v1/chat/completions \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -d '{ + "model": "VAR_model_id", + "messages": [ + { + "role": "system", + "content": "You are a helpful assistant." + }, + { + "role": "user", + "content": "Hello!" + } + ] + }' + python: | + from openai import OpenAI + client = OpenAI() + + completion = client.chat.completions.create( + model="VAR_model_id", + messages=[ + {"role": "system", "content": "You are a helpful assistant."}, + {"role": "user", "content": "Hello!"} + ] + ) + + print(completion.choices[0].message) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const completion = await openai.chat.completions.create({ + messages: [{ role: "system", content: "You are a helpful assistant." 
}], + model: "VAR_model_id", + }); + + console.log(completion.choices[0]); + } + + main(); + response: &chat_completion_example | + { + "id": "chatcmpl-123", + "object": "chat.completion", + "created": 1677652288, + "model": "gpt-3.5-turbo-0613", + "system_fingerprint": "fp_44709d6fcb", + "choices": [{ + "index": 0, + "message": { + "role": "assistant", + "content": "\n\nHello there, how may I assist you today?", + }, + "finish_reason": "stop" + }], + "usage": { + "prompt_tokens": 9, + "completion_tokens": 12, + "total_tokens": 21 + } + } + - title: Image input + request: + curl: | + curl https://api.openai.com/v1/chat/completions \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -d '{ + "model": "gpt-4-vision-preview", + "messages": [ + { + "role": "user", + "content": [ + { + "type": "text", + "text": "What’s in this image?" + }, + { + "type": "image_url", + "image_url": { + "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg" + } + } + ] + } + ], + "max_tokens": 300 + }' + python: | + from openai import OpenAI + + client = OpenAI() + + response = client.chat.completions.create( + model="gpt-4-vision-preview", + messages=[ + { + "role": "user", + "content": [ + {"type": "text", "text": "What’s in this image?"}, + { + "type": "image_url", + "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg", + }, + ], + } + ], + max_tokens=300, + ) + + print(response.choices[0]) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const response = await openai.chat.completions.create({ + model: "gpt-4-vision-preview", + messages: [ + { + role: "user", + content: [ + { type: "text", text: "What’s in this image?" 
}, + { + type: "image_url", + image_url: + "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg", + }, + ], + }, + ], + }); + console.log(response.choices[0]); + } + main(); + response: &chat_completion_image_example | + { + "id": "chatcmpl-123", + "object": "chat.completion", + "created": 1677652288, + "model": "gpt-3.5-turbo-0613", + "system_fingerprint": "fp_44709d6fcb", + "choices": [{ + "index": 0, + "message": { + "role": "assistant", + "content": "\n\nHello there, how may I assist you today?", + }, + "finish_reason": "stop" + }], + "usage": { + "prompt_tokens": 9, + "completion_tokens": 12, + "total_tokens": 21 + } + } + - title: Streaming + request: + curl: | + curl https://api.openai.com/v1/chat/completions \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -d '{ + "model": "VAR_model_id", + "messages": [ + { + "role": "system", + "content": "You are a helpful assistant." + }, + { + "role": "user", + "content": "Hello!" 
+ } + ], + "stream": true + }' + python: | + from openai import OpenAI + client = OpenAI() + + completion = client.chat.completions.create( + model="VAR_model_id", + messages=[ + {"role": "system", "content": "You are a helpful assistant."}, + {"role": "user", "content": "Hello!"} + ], + stream=True + ) + + for chunk in completion: + print(chunk.choices[0].delta) + + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const completion = await openai.chat.completions.create({ + model: "VAR_model_id", + messages: [ + {"role": "system", "content": "You are a helpful assistant."}, + {"role": "user", "content": "Hello!"} + ], + stream: true, + }); + + for await (const chunk of completion) { + console.log(chunk.choices[0].delta.content); + } + } + + main(); + response: &chat_completion_chunk_example | + {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-3.5-turbo-0613", "system_fingerprint": "fp_44709d6fcb", "choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]} + + {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-3.5-turbo-0613", "system_fingerprint": "fp_44709d6fcb", "choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]} + + {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-3.5-turbo-0613", "system_fingerprint": "fp_44709d6fcb", "choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]} + + .... 
+ + {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-3.5-turbo-0613", "system_fingerprint": "fp_44709d6fcb", "choices":[{"index":0,"delta":{"content":" today"},"finish_reason":null}]} + + {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-3.5-turbo-0613", "system_fingerprint": "fp_44709d6fcb", "choices":[{"index":0,"delta":{"content":"?"},"finish_reason":null}]} + + {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-3.5-turbo-0613", "system_fingerprint": "fp_44709d6fcb", "choices":[{"index":0,"delta":{},"finish_reason":"stop"}]} + - title: Function calling + request: + curl: | + curl https://api.openai.com/v1/chat/completions \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -d '{ + "model": "gpt-3.5-turbo", + "messages": [ + { + "role": "user", + "content": "What is the weather like in Boston?" + } + ], + "functions": [ + { + "name": "get_current_weather", + "description": "Get the current weather in a given location", + "parameters": { + "type": "object", + "properties": { + "location": { + "type": "string", + "description": "The city and state, e.g. San Francisco, CA" + }, + "unit": { + "type": "string", + "enum": ["celsius", "fahrenheit"] + } + }, + "required": ["location"] + } + } + ], + "function_call": "auto" + }' + python: | + from openai import OpenAI + client = OpenAI() + + functions = [ + { + "name": "get_current_weather", + "description": "Get the current weather in a given location", + "parameters": { + "type": "object", + "properties": { + "location": { + "type": "string", + "description": "The city and state, e.g. 
San Francisco, CA", + }, + "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}, + }, + "required": ["location"], + }, + } + ] + messages = [{"role": "user", "content": "What's the weather like in Boston today?"}] + completion = client.chat.completions.create( + model="VAR_model_id", + messages=messages, + functions=functions, + function_call="auto" + ) + + print(completion) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const messages = [{"role": "user", "content": "What's the weather like in Boston today?"}]; + const functions = [ + { + "name": "get_current_weather", + "description": "Get the current weather in a given location", + "parameters": { + "type": "object", + "properties": { + "location": { + "type": "string", + "description": "The city and state, e.g. San Francisco, CA", + }, + "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}, + }, + "required": ["location"], + }, + } + ]; + + const response = await openai.chat.completions.create({ + model: "gpt-3.5-turbo", + messages: messages, + functions: functions, + function_call: "auto", // auto is default, but we'll be explicit + }); + + console.log(response); + } + + main(); + response: &chat_completion_function_example | + { + "choices": [ + { + "finish_reason": "function_call", + "index": 0, + "message": { + "content": null, + "function_call": { + "arguments": "{\n \"location\": \"Boston, MA\"\n}", + "name": "get_current_weather" + }, + "role": "assistant" + } + } + ], + "created": 1694028367, + "model": "gpt-3.5-turbo-0613", + "system_fingerprint": "fp_44709d6fcb", + "object": "chat.completion", + "usage": { + "completion_tokens": 18, + "prompt_tokens": 82, + "total_tokens": 100 + } + } + /completions: + post: + operationId: createCompletion + tags: + - Completions + summary: Creates a completion for the provided prompt and parameters. 
+ requestBody: + required: true + content: + application/json: + schema: + $ref: "#/components/schemas/CreateCompletionRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/CreateCompletionResponse" + x-oaiMeta: + name: Create completion + returns: | + Returns a [completion](/docs/api-reference/completions/object) object, or a sequence of completion objects if the request is streamed. + legacy: true + examples: + - title: No streaming + request: + curl: | + curl https://api.openai.com/v1/completions \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -d '{ + "model": "VAR_model_id", + "prompt": "Say this is a test", + "max_tokens": 7, + "temperature": 0 + }' + python: | + from openai import OpenAI + client = OpenAI() + + client.completions.create( + model="VAR_model_id", + prompt="Say this is a test", + max_tokens=7, + temperature=0 + ) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const completion = await openai.completions.create({ + model: "VAR_model_id", + prompt: "Say this is a test.", + max_tokens: 7, + temperature: 0, + }); + + console.log(completion); + } + main(); + response: | + { + "id": "cmpl-uqkvlQyYK7bGYrRHQ0eXlWi7", + "object": "text_completion", + "created": 1589478378, + "model": "VAR_model_id", + "system_fingerprint": "fp_44709d6fcb", + "choices": [ + { + "text": "\n\nThis is indeed a test", + "index": 0, + "logprobs": null, + "finish_reason": "length" + } + ], + "usage": { + "prompt_tokens": 5, + "completion_tokens": 7, + "total_tokens": 12 + } + } + - title: Streaming + request: + curl: | + curl https://api.openai.com/v1/completions \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -d '{ + "model": "VAR_model_id", + "prompt": "Say this is a test", + "max_tokens": 7, + "temperature": 0, + "stream": true + }' + python: | + from openai import 
OpenAI + client = OpenAI() + + for chunk in client.completions.create( + model="VAR_model_id", + prompt="Say this is a test", + max_tokens=7, + temperature=0, + stream=True + ): + print(chunk.choices[0].text) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const stream = await openai.completions.create({ + model: "VAR_model_id", + prompt: "Say this is a test.", + stream: true, + }); + + for await (const chunk of stream) { + console.log(chunk.choices[0].text) + } + } + main(); + response: | + { + "id": "cmpl-7iA7iJjj8V2zOkCGvWF2hAkDWBQZe", + "object": "text_completion", + "created": 1690759702, + "choices": [ + { + "text": "This", + "index": 0, + "logprobs": null, + "finish_reason": null + } + ], + "model": "gpt-3.5-turbo-instruct", + "system_fingerprint": "fp_44709d6fcb" + } + /edits: + post: + operationId: createEdit + deprecated: true + tags: + - Edits + summary: Creates a new edit for the provided input, instruction, and parameters. + requestBody: + required: true + content: + application/json: + schema: + $ref: "#/components/schemas/CreateEditRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/CreateEditResponse" + x-oaiMeta: + name: Create edit + returns: | + Returns an [edit](/docs/api-reference/edits/object) object.
+ group: edits + examples: + request: + curl: | + curl https://api.openai.com/v1/edits \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -d '{ + "model": "VAR_model_id", + "input": "What day of the wek is it?", + "instruction": "Fix the spelling mistakes" + }' + python: | + from openai import OpenAI + client = OpenAI() + + client.edits.create( + model="VAR_model_id", + input="What day of the wek is it?", + instruction="Fix the spelling mistakes" + ) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const edit = await openai.edits.create({ + model: "VAR_model_id", + input: "What day of the wek is it?", + instruction: "Fix the spelling mistakes.", + }); + + console.log(edit); + } + + main(); + response: &edit_example | + { + "object": "edit", + "created": 1589478378, + "choices": [ + { + "text": "What day of the week is it?", + "index": 0, + } + ], + "usage": { + "prompt_tokens": 25, + "completion_tokens": 32, + "total_tokens": 57 + } + } + + /images/generations: + post: + operationId: createImage + tags: + - Images + summary: Creates an image given a prompt. + requestBody: + required: true + content: + application/json: + schema: + $ref: "#/components/schemas/CreateImageRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/ImagesResponse" + x-oaiMeta: + name: Create image + returns: Returns a list of [image](/docs/api-reference/images/object) objects. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/images/generations \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -d '{ + "model": "dall-e-3", + "prompt": "A cute baby sea otter", + "n": 1, + "size": "1024x1024" + }' + python: | + from openai import OpenAI + client = OpenAI() + + client.images.generate( + model="dall-e-3", + prompt="A cute baby sea otter", + n=1, + size="1024x1024" + ) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const image = await openai.images.generate({ model: "dall-e-3", prompt: "A cute baby sea otter" }); + + console.log(image.data); + } + main(); + response: | + { + "created": 1589478378, + "data": [ + { + "url": "https://..." + }, + { + "url": "https://..." + } + ] + } + /images/edits: + post: + operationId: createImageEdit + tags: + - Images + summary: Creates an edited or extended image given an original image and a prompt. + requestBody: + required: true + content: + multipart/form-data: + schema: + $ref: "#/components/schemas/CreateImageEditRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/ImagesResponse" + x-oaiMeta: + name: Create image edit + returns: Returns a list of [image](/docs/api-reference/images/object) objects. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/images/edits \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -F image="@otter.png" \ + -F mask="@mask.png" \ + -F prompt="A cute baby sea otter wearing a beret" \ + -F n=2 \ + -F size="1024x1024" + python: | + from openai import OpenAI + client = OpenAI() + + client.images.edit( + image=open("otter.png", "rb"), + mask=open("mask.png", "rb"), + prompt="A cute baby sea otter wearing a beret", + n=2, + size="1024x1024" + ) + node.js: |- + import fs from "fs"; + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const image = await openai.images.edit({ + image: fs.createReadStream("otter.png"), + mask: fs.createReadStream("mask.png"), + prompt: "A cute baby sea otter wearing a beret", + }); + + console.log(image.data); + } + main(); + response: | + { + "created": 1589478378, + "data": [ + { + "url": "https://..." + }, + { + "url": "https://..." + } + ] + } + /images/variations: + post: + operationId: createImageVariation + tags: + - Images + summary: Creates a variation of a given image. + requestBody: + required: true + content: + multipart/form-data: + schema: + $ref: "#/components/schemas/CreateImageVariationRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/ImagesResponse" + x-oaiMeta: + name: Create image variation + returns: Returns a list of [image](/docs/api-reference/images/object) objects. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/images/variations \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -F image="@otter.png" \ + -F n=2 \ + -F size="1024x1024" + python: | + from openai import OpenAI + client = OpenAI() + + response = client.images.create_variation( + image=open("image_edit_original.png", "rb"), + n=2, + size="1024x1024" + ) + node.js: |- + import fs from "fs"; + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const image = await openai.images.createVariation({ + image: fs.createReadStream("otter.png"), + }); + + console.log(image.data); + } + main(); + response: | + { + "created": 1589478378, + "data": [ + { + "url": "https://..." + }, + { + "url": "https://..." + } + ] + } + + /embeddings: + post: + operationId: createEmbedding + tags: + - Embeddings + summary: Creates an embedding vector representing the input text. + requestBody: + required: true + content: + application/json: + schema: + $ref: "#/components/schemas/CreateEmbeddingRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/CreateEmbeddingResponse" + x-oaiMeta: + name: Create embeddings + returns: A list of [embedding](/docs/api-reference/embeddings/object) objects. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/embeddings \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "input": "The food was delicious and the waiter...", + "model": "text-embedding-ada-002", + "encoding_format": "float" + }' + python: | + from openai import OpenAI + client = OpenAI() + + client.embeddings.create( + model="text-embedding-ada-002", + input="The food was delicious and the waiter...", + encoding_format="float" + ) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const embedding = await openai.embeddings.create({ + model: "text-embedding-ada-002", + input: "The quick brown fox jumped over the lazy dog", + encoding_format: "float", + }); + + console.log(embedding); + } + + main(); + response: | + { + "object": "list", + "data": [ + { + "object": "embedding", + "embedding": [ + 0.0023064255, + -0.009327292, + .... (1536 floats total for ada-002) + -0.0028842222, + ], + "index": 0 + } + ], + "model": "text-embedding-ada-002", + "usage": { + "prompt_tokens": 8, + "total_tokens": 8 + } + } + + /audio/speech: + post: + operationId: createSpeech + tags: + - Audio + summary: Generates audio from the input text. + requestBody: + required: true + content: + application/json: + schema: + $ref: "#/components/schemas/CreateSpeechRequest" + responses: + "200": + description: OK + headers: + Transfer-Encoding: + schema: + type: string + description: chunked + content: + application/octet-stream: + schema: + type: string + format: binary + x-oaiMeta: + name: Create speech + returns: The audio file content. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/audio/speech \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "model": "tts-1", + "input": "The quick brown fox jumped over the lazy dog.", + "voice": "alloy" + }' \ + --output speech.mp3 + python: | + from pathlib import Path + import openai + + speech_file_path = Path(__file__).parent / "speech.mp3" + response = openai.audio.speech.create( + model="tts-1", + voice="alloy", + input="The quick brown fox jumped over the lazy dog." + ) + response.stream_to_file(speech_file_path) + node: | + import fs from "fs"; + import path from "path"; + import OpenAI from "openai"; + + const openai = new OpenAI(); + + const speechFile = path.resolve("./speech.mp3"); + + async function main() { + const mp3 = await openai.audio.speech.create({ + model: "tts-1", + voice: "alloy", + input: "Today is a wonderful day to build something people love!", + }); + console.log(speechFile); + const buffer = Buffer.from(await mp3.arrayBuffer()); + await fs.promises.writeFile(speechFile, buffer); + } + main(); + /audio/transcriptions: + post: + operationId: createTranscription + tags: + - Audio + summary: Transcribes audio into the input language. + requestBody: + required: true + content: + multipart/form-data: + schema: + $ref: "#/components/schemas/CreateTranscriptionRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/CreateTranscriptionResponse" + x-oaiMeta: + name: Create transcription + returns: The transcribed text. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/audio/transcriptions \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "Content-Type: multipart/form-data" \ + -F file="@/path/to/file/audio.mp3" \ + -F model="whisper-1" + python: | + from openai import OpenAI + client = OpenAI() + + audio_file = open("speech.mp3", "rb") + transcript = client.audio.transcriptions.create( + model="whisper-1", + file=audio_file + ) + node: | + import fs from "fs"; + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const transcription = await openai.audio.transcriptions.create({ + file: fs.createReadStream("audio.mp3"), + model: "whisper-1", + }); + + console.log(transcription.text); + } + main(); + response: | + { + "text": "Imagine the wildest idea that you've ever had, and you're curious about how it might scale to something that's a 100, a 1,000 times bigger. This is a place where you can get to do that." + } + /audio/translations: + post: + operationId: createTranslation + tags: + - Audio + summary: Translates audio into English. + requestBody: + required: true + content: + multipart/form-data: + schema: + $ref: "#/components/schemas/CreateTranslationRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/CreateTranslationResponse" + x-oaiMeta: + name: Create translation + returns: The translated text. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/audio/translations \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "Content-Type: multipart/form-data" \ + -F file="@/path/to/file/german.m4a" \ + -F model="whisper-1" + python: | + from openai import OpenAI + client = OpenAI() + + audio_file = open("speech.mp3", "rb") + transcript = client.audio.translations.create( + model="whisper-1", + file=audio_file + ) + node: | + import fs from "fs"; + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const translation = await openai.audio.translations.create({ + file: fs.createReadStream("audio.mp3"), + model: "whisper-1", + }); + + console.log(translation.text); + } + main(); + response: | + { + "text": "Hello, my name is Wolfgang and I come from Germany. Where are you heading today?" + } + + /files: + get: + operationId: listFiles + tags: + - Files + summary: Returns a list of files that belong to the user's organization. + parameters: + - in: query + name: purpose + required: false + schema: + type: string + description: Only return files with the given purpose. + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/ListFilesResponse" + x-oaiMeta: + name: List files + returns: A list of [File](/docs/api-reference/files/object) objects.
+ examples: + request: + curl: | + curl https://api.openai.com/v1/files \ + -H "Authorization: Bearer $OPENAI_API_KEY" + python: | + from openai import OpenAI + client = OpenAI() + + client.files.list() + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const list = await openai.files.list(); + + for await (const file of list) { + console.log(file); + } + } + + main(); + response: | + { + "data": [ + { + "id": "file-abc123", + "object": "file", + "bytes": 175, + "created_at": 1613677385, + "filename": "salesOverview.pdf", + "purpose": "assistants" + }, + { + "id": "file-abc123", + "object": "file", + "bytes": 140, + "created_at": 1613779121, + "filename": "puppy.jsonl", + "purpose": "fine-tune" + } + ], + "object": "list" + } + post: + operationId: createFile + tags: + - Files + summary: | + Upload a file that can be used across various endpoints/features. The size of all the files uploaded by one organization can be up to 100 GB. + + The size of individual files can be a maximum of 512 MB. See the [Assistants Tools guide](/docs/assistants/tools) to learn more about the types of files supported. The Fine-tuning API only supports `.jsonl` files. + + Please [contact us](https://help.openai.com/) if you need to increase these storage limits. + requestBody: + required: true + content: + multipart/form-data: + schema: + $ref: "#/components/schemas/CreateFileRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/OpenAIFile" + x-oaiMeta: + name: Upload file + returns: The uploaded [File](/docs/api-reference/files/object) object.
+ examples: + request: + curl: | + curl https://api.openai.com/v1/files \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -F purpose="fine-tune" \ + -F file="@mydata.jsonl" + python: | + from openai import OpenAI + client = OpenAI() + + client.files.create( + file=open("mydata.jsonl", "rb"), + purpose="fine-tune" + ) + node.js: |- + import fs from "fs"; + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const file = await openai.files.create({ + file: fs.createReadStream("mydata.jsonl"), + purpose: "fine-tune", + }); + + console.log(file); + } + + main(); + response: | + { + "id": "file-BK7bzQj3FfZFXr7DbL6xJwfo", + "object": "file", + "bytes": 120000, + "created_at": 1677610602, + "filename": "mydata.jsonl", + "purpose": "fine-tune", + } + /files/{file_id}: + delete: + operationId: deleteFile + tags: + - Files + summary: Delete a file. + parameters: + - in: path + name: file_id + required: true + schema: + type: string + description: The ID of the file to use for this request. + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/DeleteFileResponse" + x-oaiMeta: + name: Delete file + returns: Deletion status. + examples: + request: + curl: | + curl https://api.openai.com/v1/files/file-abc123 \ + -X DELETE \ + -H "Authorization: Bearer $OPENAI_API_KEY" + python: | + from openai import OpenAI + client = OpenAI() + + client.files.delete("file-oaG6vwLtV3v3mWpvxexWDKxq") + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const file = await openai.files.del("file-abc123"); + + console.log(file); + } + + main(); + response: | + { + "id": "file-abc123", + "object": "file", + "deleted": true + } + get: + operationId: retrieveFile + tags: + - Files + summary: Returns information about a specific file. 
+ parameters: + - in: path + name: file_id + required: true + schema: + type: string + description: The ID of the file to use for this request. + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/OpenAIFile" + x-oaiMeta: + name: Retrieve file + returns: The [File](/docs/api-reference/files/object) object matching the specified ID. + examples: + request: + curl: | + curl https://api.openai.com/v1/files/file-BK7bzQj3FfZFXr7DbL6xJwfo \ + -H "Authorization: Bearer $OPENAI_API_KEY" + python: | + from openai import OpenAI + client = OpenAI() + + client.files.retrieve("file-BK7bzQj3FfZFXr7DbL6xJwfo") + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const file = await openai.files.retrieve("file-BK7bzQj3FfZFXr7DbL6xJwfo"); + + console.log(file); + } + + main(); + response: | + { + "id": "file-BK7bzQj3FfZFXr7DbL6xJwfo", + "object": "file", + "bytes": 120000, + "created_at": 1677610602, + "filename": "mydata.jsonl", + "purpose": "fine-tune", + } + /files/{file_id}/content: + get: + operationId: downloadFile + tags: + - Files + summary: Returns the contents of the specified file. + parameters: + - in: path + name: file_id + required: true + schema: + type: string + description: The ID of the file to use for this request. + responses: + "200": + description: OK + content: + application/json: + schema: + type: string + x-oaiMeta: + name: Retrieve file content + returns: The file content. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/files/file-BK7bzQj3FfZFXr7DbL6xJwfo/content \ + -H "Authorization: Bearer $OPENAI_API_KEY" > file.jsonl + python: | + from openai import OpenAI + client = OpenAI() + + content = client.files.retrieve_content("file-BK7bzQj3FfZFXr7DbL6xJwfo") + node.js: | + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const file = await openai.files.retrieveContent("file-BK7bzQj3FfZFXr7DbL6xJwfo"); + + console.log(file); + } + + main(); + + /fine_tuning/jobs: + post: + operationId: createFineTuningJob + tags: + - Fine-tuning + summary: | + Creates a job that fine-tunes a specified model from a given dataset. + + Response includes details of the enqueued job including job status and the name of the fine-tuned models once complete. + + [Learn more about fine-tuning](/docs/guides/fine-tuning) + requestBody: + required: true + content: + application/json: + schema: + $ref: "#/components/schemas/CreateFineTuningJobRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/FineTuningJob" + x-oaiMeta: + name: Create fine-tuning job + returns: A [fine-tuning.job](/docs/api-reference/fine-tuning/object) object. 
+ examples: + - title: No hyperparameters + request: + curl: | + curl https://api.openai.com/v1/fine_tuning/jobs \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -d '{ + "training_file": "file-abc123", + "model": "gpt-3.5-turbo" + }' + python: | + from openai import OpenAI + client = OpenAI() + + client.fine_tuning.jobs.create( + training_file="file-abc123", + model="gpt-3.5-turbo" + ) + node.js: | + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const fineTune = await openai.fineTuning.jobs.create({ + training_file: "file-abc123", + model: "gpt-3.5-turbo" + }); + + console.log(fineTune); + } + + main(); + response: | + { + "object": "fine_tuning.job", + "id": "ftjob-abc123", + "model": "gpt-3.5-turbo-0613", + "created_at": 1614807352, + "fine_tuned_model": null, + "organization_id": "org-123", + "result_files": [], + "status": "queued", + "validation_file": null, + "training_file": "file-abc123" + } + - title: Hyperparameters + request: + curl: | + curl https://api.openai.com/v1/fine_tuning/jobs \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -d '{ + "training_file": "file-abc123", + "model": "gpt-3.5-turbo", + "hyperparameters": { + "n_epochs": 2 + } + }' + python: | + from openai import OpenAI + client = OpenAI() + + client.fine_tuning.jobs.create( + training_file="file-abc123", + model="gpt-3.5-turbo", + hyperparameters={ + "n_epochs":2 + } + ) + node.js: | + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const fineTune = await openai.fineTuning.jobs.create({ + training_file: "file-abc123", + model: "gpt-3.5-turbo", + hyperparameters: { n_epochs: 2 } + }); + + console.log(fineTune); + } + + main(); + response: | + { + "object": "fine_tuning.job", + "id": "ftjob-abc123", + "model": "gpt-3.5-turbo-0613", + "created_at": 1614807352, + "fine_tuned_model": null, + "organization_id": "org-123", +
"result_files": [], + "status": "queued", + "validation_file": null, + "training_file": "file-abc123", + "hyperparameters": {"n_epochs": 2}, + } + - title: Validation file + request: + curl: | + curl https://api.openai.com/v1/fine_tuning/jobs \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -d '{ + "training_file": "file-abc123", + "validation_file": "file-abc123", + "model": "gpt-3.5-turbo" + }' + python: | + from openai import OpenAI + client = OpenAI() + + client.fine_tuning.jobs.create( + training_file="file-abc123", + validation_file="file-def456", + model="gpt-3.5-turbo" + ) + node.js: | + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const fineTune = await openai.fineTuning.jobs.create({ + training_file: "file-abc123", + validation_file: "file-abc123" + }); + + console.log(fineTune); + } + + main(); + response: | + { + "object": "fine_tuning.job", + "id": "ftjob-abc123", + "model": "gpt-3.5-turbo-0613", + "created_at": 1614807352, + "fine_tuned_model": null, + "organization_id": "org-123", + "result_files": [], + "status": "queued", + "validation_file": "file-abc123", + "training_file": "file-abc123", + } + get: + operationId: listPaginatedFineTuningJobs + tags: + - Fine-tuning + summary: | + List your organization's fine-tuning jobs + parameters: + - name: after + in: query + description: Identifier for the last job from the previous pagination request. + required: false + schema: + type: string + - name: limit + in: query + description: Number of fine-tuning jobs to retrieve. + required: false + schema: + type: integer + default: 20 + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/ListPaginatedFineTuningJobsResponse" + x-oaiMeta: + name: List fine-tuning jobs + returns: A list of paginated [fine-tuning job](/docs/api-reference/fine-tuning/object) objects. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/fine_tuning/jobs?limit=2 \ + -H "Authorization: Bearer $OPENAI_API_KEY" + python: | + from openai import OpenAI + client = OpenAI() + + client.fine_tuning.jobs.list() + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const list = await openai.fineTuning.jobs.list(); + + for await (const fineTune of list) { + console.log(fineTune); + } + } + + main(); + response: | + { + "object": "list", + "data": [ + { + "object": "fine_tuning.job.event", + "id": "ft-event-TjX0lMfOniCZX64t9PUQT5hn", + "created_at": 1689813489, + "level": "warn", + "message": "Fine tuning process stopping due to job cancellation", + "data": null, + "type": "message" + }, + { ... }, + { ... } + ], "has_more": true + } + /fine_tuning/jobs/{fine_tuning_job_id}: + get: + operationId: retrieveFineTuningJob + tags: + - Fine-tuning + summary: | + Get info about a fine-tuning job. + + [Learn more about fine-tuning](/docs/guides/fine-tuning) + parameters: + - in: path + name: fine_tuning_job_id + required: true + schema: + type: string + example: ft-AF1WoRqd3aJAHsqc9NY7iL8F + description: | + The ID of the fine-tuning job. + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/FineTuningJob" + x-oaiMeta: + name: Retrieve fine-tuning job + returns: The [fine-tuning](/docs/api-reference/fine-tunes/object) object with the given ID. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/fine_tuning/jobs/ft-AF1WoRqd3aJAHsqc9NY7iL8F \ + -H "Authorization: Bearer $OPENAI_API_KEY" + python: | + from openai import OpenAI + client = OpenAI() + + client.fine_tuning.jobs.retrieve("ftjob-abc123") + node.js: | + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const fineTune = await openai.fineTuning.jobs.retrieve("ftjob-abc123"); + + console.log(fineTune); + } + + main(); + response: &fine_tuning_example | + { + "object": "fine_tuning.job", + "id": "ftjob-abc123", + "model": "davinci-002", + "created_at": 1692661014, + "finished_at": 1692661190, + "fine_tuned_model": "ft:davinci-002:my-org:custom_suffix:7q8mpxmy", + "organization_id": "org-123", + "result_files": [ + "file-abc123" + ], + "status": "succeeded", + "validation_file": null, + "training_file": "file-abc123", + "hyperparameters": { + "n_epochs": 4, + }, + "trained_tokens": 5768 + } + /fine_tuning/jobs/{fine_tuning_job_id}/events: + get: + operationId: listFineTuningEvents + tags: + - Fine-tuning + summary: | + Get status updates for a fine-tuning job. + parameters: + - in: path + name: fine_tuning_job_id + required: true + schema: + type: string + example: ft-AF1WoRqd3aJAHsqc9NY7iL8F + description: | + The ID of the fine-tuning job to get events for. + - name: after + in: query + description: Identifier for the last event from the previous pagination request. + required: false + schema: + type: string + - name: limit + in: query + description: Number of events to retrieve. + required: false + schema: + type: integer + default: 20 + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/ListFineTuningJobEventsResponse" + x-oaiMeta: + name: List fine-tuning events + returns: A list of fine-tuning event objects. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/fine_tuning/jobs/ftjob-abc123/events \ + -H "Authorization: Bearer $OPENAI_API_KEY" + python: | + from openai import OpenAI + client = OpenAI() + + client.fine_tuning.jobs.list_events( + fine_tuning_job_id="ftjob-abc123", + limit=2 + ) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const list = await openai.fineTuning.jobs.listEvents("ftjob-abc123", { limit: 2 }); + + for await (const fineTune of list) { + console.log(fineTune); + } + } + + main(); + response: | + { + "object": "list", + "data": [ + { + "object": "fine_tuning.job.event", + "id": "ft-event-ddTJfwuMVpfLXseO0Am0Gqjm", + "created_at": 1692407401, + "level": "info", + "message": "Fine tuning job successfully completed", + "data": null, + "type": "message" + }, + { + "object": "fine_tuning.job.event", + "id": "ft-event-tyiGuB72evQncpH87xe505Sv", + "created_at": 1692407400, + "level": "info", + "message": "New fine-tuned model created: ft:gpt-3.5-turbo:openai::7p4lURel", + "data": null, + "type": "message" + } + ], + "has_more": true + } + /fine_tuning/jobs/{fine_tuning_job_id}/cancel: + post: + operationId: cancelFineTuningJob + tags: + - Fine-tuning + summary: | + Immediately cancel a fine-tune job. + parameters: + - in: path + name: fine_tuning_job_id + required: true + schema: + type: string + example: ft-AF1WoRqd3aJAHsqc9NY7iL8F + description: | + The ID of the fine-tuning job to cancel. + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/FineTuningJob" + x-oaiMeta: + name: Cancel fine-tuning + returns: The cancelled [fine-tuning](/docs/api-reference/fine-tuning/object) object.
+ examples: + request: + curl: | + curl -X POST https://api.openai.com/v1/fine_tuning/jobs/ftjob-abc123/cancel \ + -H "Authorization: Bearer $OPENAI_API_KEY" + python: | + from openai import OpenAI + client = OpenAI() + + client.fine_tuning.jobs.cancel("ftjob-abc123") + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const fineTune = await openai.fineTuning.jobs.cancel("ftjob-abc123"); + + console.log(fineTune); + } + main(); + response: | + { + "object": "fine_tuning.job", + "id": "ftjob-abc123", + "model": "gpt-3.5-turbo-0613", + "created_at": 1689376978, + "fine_tuned_model": null, + "organization_id": "org-123", + "result_files": [], + "hyperparameters": { + "n_epochs": "auto" + }, + "status": "cancelled", + "validation_file": "file-abc123", + "training_file": "file-abc123" + } + + /fine-tunes: + post: + operationId: createFineTune + deprecated: true + tags: + - Fine-tunes + summary: | + Creates a job that fine-tunes a specified model from a given dataset. + + Response includes details of the enqueued job including job status and the name of the fine-tuned models once complete. + + [Learn more about fine-tuning](/docs/guides/legacy-fine-tuning) + requestBody: + required: true + content: + application/json: + schema: + $ref: "#/components/schemas/CreateFineTuneRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/FineTune" + x-oaiMeta: + name: Create fine-tune + returns: A [fine-tune](/docs/api-reference/fine-tunes/object) object. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/fine-tunes \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -d '{ + "training_file": "file-abc123" + }' + python: | + # deprecated + node.js: | + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const fineTune = await openai.fineTunes.create({ + training_file: "file-abc123" + }); + + console.log(fineTune); + } + + main(); + response: | + { + "id": "ft-AF1WoRqd3aJAHsqc9NY7iL8F", + "object": "fine-tune", + "model": "curie", + "created_at": 1614807352, + "events": [ + { + "object": "fine-tune-event", + "created_at": 1614807352, + "level": "info", + "message": "Job enqueued. Waiting for jobs ahead to complete. Queue number: 0." + } + ], + "fine_tuned_model": null, + "hyperparams": { + "batch_size": 4, + "learning_rate_multiplier": 0.1, + "n_epochs": 4, + "prompt_loss_weight": 0.1, + }, + "organization_id": "org-123", + "result_files": [], + "status": "pending", + "validation_files": [], + "training_files": [ + { + "id": "file-abc123", + "object": "file", + "bytes": 1547276, + "created_at": 1610062281, + "filename": "my-data-train.jsonl", + "purpose": "fine-tune-results" + } + ], + "updated_at": 1614807352, + } + get: + operationId: listFineTunes + deprecated: true + tags: + - Fine-tunes + summary: | + List your organization's fine-tuning jobs + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/ListFineTunesResponse" + x-oaiMeta: + name: List fine-tunes + returns: A list of [fine-tune](/docs/api-reference/fine-tunes/object) objects. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/fine-tunes \ + -H "Authorization: Bearer $OPENAI_API_KEY" + python: | + # deprecated + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const list = await openai.fineTunes.list(); + + for await (const fineTune of list) { + console.log(fineTune); + } + } + + main(); + response: | + { + "object": "list", + "data": [ + { + "id": "ft-AF1WoRqd3aJAHsqc9NY7iL8F", + "object": "fine-tune", + "model": "curie", + "created_at": 1614807352, + "fine_tuned_model": null, + "hyperparams": { ... }, + "organization_id": "org-123", + "result_files": [], + "status": "pending", + "validation_files": [], + "training_files": [ { ... } ], + "updated_at": 1614807352, + }, + { ... }, + { ... } + ] + } + /fine-tunes/{fine_tune_id}: + get: + operationId: retrieveFineTune + deprecated: true + tags: + - Fine-tunes + summary: | + Gets info about the fine-tune job. + + [Learn more about fine-tuning](/docs/guides/legacy-fine-tuning) + parameters: + - in: path + name: fine_tune_id + required: true + schema: + type: string + example: ft-AF1WoRqd3aJAHsqc9NY7iL8F + description: | + The ID of the fine-tune job + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/FineTune" + x-oaiMeta: + name: Retrieve fine-tune + returns: The [fine-tune](/docs/api-reference/fine-tunes/object) object with the given ID. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/fine-tunes/ft-AF1WoRqd3aJAHsqc9NY7iL8F \ + -H "Authorization: Bearer $OPENAI_API_KEY" + python: | + # deprecated + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const fineTune = await openai.fineTunes.retrieve("ft-AF1WoRqd3aJAHsqc9NY7iL8F"); + + console.log(fineTune); + } + + main(); + response: &fine_tune_example | + { + "id": "ft-AF1WoRqd3aJAHsqc9NY7iL8F", + "object": "fine-tune", + "model": "curie", + "created_at": 1614807352, + "events": [ + { + "object": "fine-tune-event", + "created_at": 1614807352, + "level": "info", + "message": "Job enqueued. Waiting for jobs ahead to complete. Queue number: 0." + }, + { + "object": "fine-tune-event", + "created_at": 1614807356, + "level": "info", + "message": "Job started." + }, + { + "object": "fine-tune-event", + "created_at": 1614807861, + "level": "info", + "message": "Uploaded snapshot: curie:ft-acmeco-2021-03-03-21-44-20." + }, + { + "object": "fine-tune-event", + "created_at": 1614807864, + "level": "info", + "message": "Uploaded result files: file-abc123." + }, + { + "object": "fine-tune-event", + "created_at": 1614807864, + "level": "info", + "message": "Job succeeded." 
+ } + ], + "fine_tuned_model": "curie:ft-acmeco-2021-03-03-21-44-20", + "hyperparams": { + "batch_size": 4, + "learning_rate_multiplier": 0.1, + "n_epochs": 4, + "prompt_loss_weight": 0.1, + }, + "organization_id": "org-123", + "result_files": [ + { + "id": "file-abc123", + "object": "file", + "bytes": 81509, + "created_at": 1614807863, + "filename": "compiled_results.csv", + "purpose": "fine-tune-results" + } + ], + "status": "succeeded", + "validation_files": [], + "training_files": [ + { + "id": "file-abc123", + "object": "file", + "bytes": 1547276, + "created_at": 1610062281, + "filename": "my-data-train.jsonl", + "purpose": "fine-tune" + } + ], + "updated_at": 1614807865, + } + /fine-tunes/{fine_tune_id}/cancel: + post: + operationId: cancelFineTune + deprecated: true + tags: + - Fine-tunes + summary: | + Immediately cancel a fine-tune job. + parameters: + - in: path + name: fine_tune_id + required: true + schema: + type: string + example: ft-AF1WoRqd3aJAHsqc9NY7iL8F + description: | + The ID of the fine-tune job to cancel + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/FineTune" + x-oaiMeta: + name: Cancel fine-tune + returns: The cancelled [fine-tune](/docs/api-reference/fine-tunes/object) object. + examples: + request: + curl: | + curl https://api.openai.com/v1/fine-tunes/ft-AF1WoRqd3aJAHsqc9NY7iL8F/cancel \ + -H "Authorization: Bearer $OPENAI_API_KEY" + python: | + # deprecated + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const fineTune = await openai.fineTunes.cancel("ft-AF1WoRqd3aJAHsqc9NY7iL8F"); + + console.log(fineTune); + } + main(); + response: | + { + "id": "ft-xhrpBbvVUzYGo8oUO1FY4nI7", + "object": "fine-tune", + "model": "curie", + "created_at": 1614807770, + "events": [ { ... } ], + "fine_tuned_model": null, + "hyperparams": { ... 
}, + "organization_id": "org-123", + "result_files": [], + "status": "cancelled", + "validation_files": [], + "training_files": [ + { + "id": "file-abc123", + "object": "file", + "bytes": 1547276, + "created_at": 1610062281, + "filename": "my-data-train.jsonl", + "purpose": "fine-tune" + } + ], + "updated_at": 1614807789, + } + /fine-tunes/{fine_tune_id}/events: + get: + operationId: listFineTuneEvents + deprecated: true + tags: + - Fine-tunes + summary: | + Get fine-grained status updates for a fine-tune job. + parameters: + - in: path + name: fine_tune_id + required: true + schema: + type: string + example: ft-AF1WoRqd3aJAHsqc9NY7iL8F + description: | + The ID of the fine-tune job to get events for. + - in: query + name: stream + required: false + schema: + type: boolean + default: false + description: | + Whether to stream events for the fine-tune job. If set to true, + events will be sent as data-only + [server-sent events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#Event_stream_format) + as they become available. The stream will terminate with a + `data: [DONE]` message when the job is finished (succeeded, cancelled, + or failed). + + If set to false, only events generated so far will be returned. + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/ListFineTuneEventsResponse" + x-oaiMeta: + name: List fine-tune events + returns: A list of fine-tune event objects. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/fine-tunes/ft-AF1WoRqd3aJAHsqc9NY7iL8F/events \ + -H "Authorization: Bearer $OPENAI_API_KEY" + python: | + # deprecated + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const fineTune = await openai.fineTunes.listEvents("ft-AF1WoRqd3aJAHsqc9NY7iL8F"); + + console.log(fineTune); + } + main(); + response: | + { + "object": "list", + "data": [ + { + "object": "fine-tune-event", + "created_at": 1614807352, + "level": "info", + "message": "Job enqueued. Waiting for jobs ahead to complete. Queue number: 0." + }, + { + "object": "fine-tune-event", + "created_at": 1614807356, + "level": "info", + "message": "Job started." + }, + { + "object": "fine-tune-event", + "created_at": 1614807861, + "level": "info", + "message": "Uploaded snapshot: curie:ft-acmeco-2021-03-03-21-44-20." + }, + { + "object": "fine-tune-event", + "created_at": 1614807864, + "level": "info", + "message": "Uploaded result files: file-abc123" + }, + { + "object": "fine-tune-event", + "created_at": 1614807864, + "level": "info", + "message": "Job succeeded." + } + ] + } + + /models: + get: + operationId: listModels + tags: + - Models + summary: Lists the currently available models, and provides basic information about each one such as the owner and availability. + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/ListModelsResponse" + x-oaiMeta: + name: List models + returns: A list of [model](/docs/api-reference/models/object) objects. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/models \ + -H "Authorization: Bearer $OPENAI_API_KEY" + python: | + from openai import OpenAI + client = OpenAI() + + client.models.list() + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const list = await openai.models.list(); + + for await (const model of list) { + console.log(model); + } + } + main(); + response: | + { + "object": "list", + "data": [ + { + "id": "model-id-0", + "object": "model", + "created": 1686935002, + "owned_by": "organization-owner" + }, + { + "id": "model-id-1", + "object": "model", + "created": 1686935002, + "owned_by": "organization-owner", + }, + { + "id": "model-id-2", + "object": "model", + "created": 1686935002, + "owned_by": "openai" + }, + ], + "object": "list" + } + /models/{model}: + get: + operationId: retrieveModel + tags: + - Models + summary: Retrieves a model instance, providing basic information about the model such as the owner and permissioning. + parameters: + - in: path + name: model + required: true + schema: + type: string + # ideally this will be an actual ID, so this will always work from browser + example: gpt-3.5-turbo + description: The ID of the model to use for this request + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/Model" + x-oaiMeta: + name: Retrieve model + returns: The [model](/docs/api-reference/models/object) object matching the specified ID. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/models/VAR_model_id \ + -H "Authorization: Bearer $OPENAI_API_KEY" + python: | + from openai import OpenAI + client = OpenAI() + + client.models.retrieve("VAR_model_id") + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const model = await openai.models.retrieve("gpt-3.5-turbo"); + + console.log(model); + } + + main(); + response: &retrieve_model_response | + { + "id": "VAR_model_id", + "object": "model", + "created": 1686935002, + "owned_by": "openai" + } + delete: + operationId: deleteModel + tags: + - Models + summary: Delete a fine-tuned model. You must have the Owner role in your organization to delete a model. + parameters: + - in: path + name: model + required: true + schema: + type: string + example: ft:gpt-3.5-turbo:acemeco:suffix:abc123 + description: The model to delete + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/DeleteModelResponse" + x-oaiMeta: + name: Delete fine-tune model + returns: Deletion status. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/models/ft:gpt-3.5-turbo:acemeco:suffix:abc123 \ + -X DELETE \ + -H "Authorization: Bearer $OPENAI_API_KEY" + python: | + from openai import OpenAI + client = OpenAI() + + client.models.delete("ft:gpt-3.5-turbo:acemeco:suffix:abc123") + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const model = await openai.models.del("ft:gpt-3.5-turbo:acemeco:suffix:abc123"); + + console.log(model); + } + main(); + response: | + { + "id": "ft:gpt-3.5-turbo:acemeco:suffix:abc123", + "object": "model", + "deleted": true + } + + /moderations: + post: + operationId: createModeration + tags: + - Moderations + summary: Classifies if text violates OpenAI's Content Policy + requestBody: + required: true + content: + application/json: + schema: + $ref: "#/components/schemas/CreateModerationRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/CreateModerationResponse" + x-oaiMeta: + name: Create moderation + returns: A [moderation](/docs/api-reference/moderations/object) object. + examples: + request: + curl: | + curl https://api.openai.com/v1/moderations \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -d '{ + "input": "I want to kill them." + }' + python: | + from openai import OpenAI + client = OpenAI() + + client.moderations.create(input="I want to kill them.") + node.js: | + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const moderation = await openai.moderations.create({ input: "I want to kill them." 
}); + + console.log(moderation); + } + main(); + response: &moderation_example | + { + "id": "modr-XXXXX", + "model": "text-moderation-005", + "results": [ + { + "flagged": true, + "categories": { + "sexual": false, + "hate": false, + "harassment": false, + "self-harm": false, + "sexual/minors": false, + "hate/threatening": false, + "violence/graphic": false, + "self-harm/intent": false, + "self-harm/instructions": false, + "harassment/threatening": true, + "violence": true + }, + "category_scores": { + "sexual": 1.2282071e-06, + "hate": 0.010696256, + "harassment": 0.29842457, + "self-harm": 1.5236925e-08, + "sexual/minors": 5.7246268e-08, + "hate/threatening": 0.0060676364, + "violence/graphic": 4.435014e-06, + "self-harm/intent": 8.098441e-10, + "self-harm/instructions": 2.8498655e-11, + "harassment/threatening": 0.63055265, + "violence": 0.99011886 + } + } + ] + } + + # Assistants + /assistants: + get: + operationId: listAssistants + tags: + - Assistants + summary: Returns a list of assistants. + parameters: + - name: limit + in: query + description: &pagination_limit_param_description | + A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20. + required: false + schema: + type: integer + default: 20 + - name: order + in: query + description: &pagination_order_param_description | + Sort order by the `created_at` timestamp of the objects. `asc` for ascending order and `desc` for descending order. + schema: + type: string + default: desc + enum: ["asc", "desc"] + - name: after + in: query + description: &pagination_after_param_description | + A cursor for use in pagination. `after` is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list.
+ schema: + type: string + - name: before + in: query + description: &pagination_before_param_description | + A cursor for use in pagination. `before` is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list. + schema: + type: string + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/ListAssistantsResponse" + x-oaiMeta: + name: List assistants + beta: true + returns: A list of [assistant](/docs/api-reference/assistants/object) objects. + examples: + request: + curl: | + curl "https://api.openai.com/v1/assistants?order=desc&limit=20" \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "OpenAI-Beta: assistants=v1" + python: | + from openai import OpenAI + client = OpenAI() + + my_assistants = client.beta.assistants.list( + order="desc", + limit=20, + ) + print(my_assistants.data) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const myAssistants = await openai.beta.assistants.list({ + order: "desc", + limit: 20, + }); + + console.log(myAssistants.data); + } + + main(); + response: &list_assistants_example | + { + "object": "list", + "data": [ + { + "id": "asst_abc123", + "object": "assistant", + "created_at": 1698982736, + "name": "Coding Tutor", + "description": null, + "model": "gpt-4", + "instructions": "You are a helpful assistant designed to make me better at coding!", + "tools": [], + "file_ids": [], + "metadata": {} + }, + { + "id": "asst_abc456", + "object": "assistant", + "created_at": 1698982718, + "name": "My Assistant", + "description": null, + "model": "gpt-4", + "instructions": "You are a helpful assistant designed to make me better at coding!", + "tools": [], + "file_ids": [], + "metadata": {} + }, + { + "id":
"asst_abc789", + "object": "assistant", + "created_at": 1698982643, + "name": null, + "description": null, + "model": "gpt-4", + "instructions": null, + "tools": [], + "file_ids": [], + "metadata": {} + } + ], + "first_id": "asst_abc123", + "last_id": "asst_abc789", + "has_more": false + } + post: + operationId: createAssistant + tags: + - Assistants + summary: Create an assistant with a model and instructions. + requestBody: + required: true + content: + application/json: + schema: + $ref: "#/components/schemas/CreateAssistantRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/AssistantObject" + x-oaiMeta: + name: Create assistant + beta: true + returns: An [assistant](/docs/api-reference/assistants/object) object. + examples: + - title: Code Interpreter + request: + curl: | + curl "https://api.openai.com/v1/assistants" \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "OpenAI-Beta: assistants=v1" \ + -d '{ + "instructions": "You are a personal math tutor. When asked a question, write and run Python code to answer the question.", + "name": "Math Tutor", + "tools": [{"type": "code_interpreter"}], + "model": "gpt-4" + }' + + python: | + from openai import OpenAI + client = OpenAI() + + my_assistant = client.beta.assistants.create( + instructions="You are a personal math tutor. When asked a question, write and run Python code to answer the question.", + name="Math Tutor", + tools=[{"type": "code_interpreter"}], + model="gpt-4", + ) + print(my_assistant) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const myAssistant = await openai.beta.assistants.create({ + instructions: + "You are a personal math tutor.
When asked a question, write and run Python code to answer the question.", + name: "Math Tutor", + tools: [{ type: "code_interpreter" }], + model: "gpt-4", + }); + + console.log(myAssistant); + } + + main(); + response: &create_assistants_example | + { + "id": "asst_abc123", + "object": "assistant", + "created_at": 1698984975, + "name": "Math Tutor", + "description": null, + "model": "gpt-4", + "instructions": "You are a personal math tutor. When asked a question, write and run Python code to answer the question.", + "tools": [ + { + "type": "code_interpreter" + } + ], + "file_ids": [], + "metadata": {} + } + - title: Files + request: + curl: | + curl https://api.openai.com/v1/assistants \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "OpenAI-Beta: assistants=v1" \ + -d '{ + "instructions": "You are an HR bot, and you have access to files to answer employee questions about company policies.", + "tools": [{"type": "retrieval"}], + "model": "gpt-4", + "file_ids": ["file-abc123"] + }' + python: | + from openai import OpenAI + client = OpenAI() + + my_assistant = client.beta.assistants.create( + instructions="You are an HR bot, and you have access to files to answer employee questions about company policies.", + name="HR Helper", + tools=[{"type": "retrieval"}], + model="gpt-4", + file_ids=["file-abc123"], + ) + print(my_assistant) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const myAssistant = await openai.beta.assistants.create({ + instructions: + "You are an HR bot, and you have access to files to answer employee questions about company policies.", + name: "HR Helper", + tools: [{ type: "retrieval" }], + model: "gpt-4", + file_ids: ["file-abc123"], + }); + + console.log(myAssistant); + } + + main(); + response: | + { + "id": "asst_abc123", + "object": "assistant", + "created_at": 1699009403, + "name": "HR Helper", + "description": null, + "model": "gpt-4", + 
"instructions": "You are an HR bot, and you have access to files to answer employee questions about company policies.", + "tools": [ + { + "type": "retrieval" + } + ], + "file_ids": [ + "file-abc123" + ], + "metadata": {} + } + + /assistants/{assistant_id}: + get: + operationId: getAssistant + tags: + - Assistants + summary: Retrieves an assistant. + parameters: + - in: path + name: assistant_id + required: true + schema: + type: string + description: The ID of the assistant to retrieve. + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/AssistantObject" + x-oaiMeta: + name: Retrieve assistant + beta: true + returns: The [assistant](/docs/api-reference/assistants/object) object matching the specified ID. + examples: + request: + curl: | + curl https://api.openai.com/v1/assistants/asst_abc123 \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "OpenAI-Beta: assistants=v1" + python: | + from openai import OpenAI + client = OpenAI() + + my_assistant = client.beta.assistants.retrieve("asst_abc123") + print(my_assistant) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const myAssistant = await openai.beta.assistants.retrieve( + "asst_abc123" + ); + + console.log(myAssistant); + } + + main(); + response: | + { + "id": "asst_abc123", + "object": "assistant", + "created_at": 1699009709, + "name": "HR Helper", + "description": null, + "model": "gpt-4", + "instructions": "You are an HR bot, and you have access to files to answer employee questions about company policies.", + "tools": [ + { + "type": "retrieval" + } + ], + "file_ids": [ + "file-abc123" + ], + "metadata": {} + } + post: + operationId: modifyAssistant + tags: + - Assistants + summary: Modifies an assistant. + parameters: + - in: path + name: assistant_id + required: true + schema: + type: string + description: The ID of the assistant to modify.
+ requestBody: + required: true + content: + application/json: + schema: + $ref: "#/components/schemas/ModifyAssistantRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/AssistantObject" + x-oaiMeta: + name: Modify assistant + beta: true + returns: The modified [assistant](/docs/api-reference/assistants/object) object. + examples: + request: + curl: | + curl https://api.openai.com/v1/assistants/asst_abc123 \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "OpenAI-Beta: assistants=v1" \ + -d '{ + "instructions": "You are an HR bot, and you have access to files to answer employee questions about company policies. Always respond with info from either of the files.", + "tools": [{"type": "retrieval"}], + "model": "gpt-4", + "file_ids": ["file-abc123", "file-abc456"] + }' + python: | + from openai import OpenAI + client = OpenAI() + + my_updated_assistant = client.beta.assistants.update( + "asst_abc123", + instructions="You are an HR bot, and you have access to files to answer employee questions about company policies. Always respond with info from either of the files.", + name="HR Helper", + tools=[{"type": "retrieval"}], + model="gpt-4", + file_ids=["file-abc123", "file-abc456"], + ) + + print(my_updated_assistant) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const myUpdatedAssistant = await openai.beta.assistants.update( + "asst_abc123", + { + instructions: + "You are an HR bot, and you have access to files to answer employee questions about company policies.
Always respond with info from either of the files.", + name: "HR Helper", + tools: [{ type: "retrieval" }], + model: "gpt-4", + file_ids: [ + "file-abc123", + "file-abc456", + ], + } + ); + + console.log(myUpdatedAssistant); + } + + main(); + response: | + { + "id": "asst_abc123", + "object": "assistant", + "created_at": 1699009709, + "name": "HR Helper", + "description": null, + "model": "gpt-4", + "instructions": "You are an HR bot, and you have access to files to answer employee questions about company policies. Always respond with info from either of the files.", + "tools": [ + { + "type": "retrieval" + } + ], + "file_ids": [ + "file-abc123", + "file-abc456" + ], + "metadata": {} + } + delete: + operationId: deleteAssistant + tags: + - Assistants + summary: Delete an assistant. + parameters: + - in: path + name: assistant_id + required: true + schema: + type: string + description: The ID of the assistant to delete. + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/DeleteAssistantResponse" + x-oaiMeta: + name: Delete assistant + beta: true + returns: Deletion status + examples: + request: + curl: | + curl https://api.openai.com/v1/assistants/asst_abc123 \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "OpenAI-Beta: assistants=v1" \ + -X DELETE + python: | + from openai import OpenAI + client = OpenAI() + + response = client.beta.assistants.delete("asst_abc123") + print(response) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const response = await openai.beta.assistants.del("asst_abc123"); + + console.log(response); + } + main(); + response: | + { + "id": "asst_abc123", + "object": "assistant.deleted", + "deleted": true + } + + /threads: + post: + operationId: createThread + tags: + - Assistants + summary: Create a thread.
+ requestBody: + content: + application/json: + schema: + $ref: "#/components/schemas/CreateThreadRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/ThreadObject" + x-oaiMeta: + name: Create thread + beta: true + returns: A [thread](/docs/api-reference/threads) object. + examples: + - title: Empty + request: + curl: | + curl https://api.openai.com/v1/threads \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "OpenAI-Beta: assistants=v1" \ + -d '' + python: | + from openai import OpenAI + client = OpenAI() + + empty_thread = client.beta.threads.create() + print(empty_thread) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const emptyThread = await openai.beta.threads.create(); + + console.log(emptyThread); + } + + main(); + response: | + { + "id": "thread_abc123", + "object": "thread", + "created_at": 1699012949, + "metadata": {} + } + - title: Messages + request: + curl: | + curl https://api.openai.com/v1/threads \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "OpenAI-Beta: assistants=v1" \ + -d '{ + "messages": [{ + "role": "user", + "content": "Hello, what is AI?", + "file_ids": ["file-abc123"] + }, { + "role": "user", + "content": "How does AI work? Explain it in simple terms." + }] + }' + python: | + from openai import OpenAI + client = OpenAI() + + message_thread = client.beta.threads.create( + messages=[ + { + "role": "user", + "content": "Hello, what is AI?", + "file_ids": ["file-abc123"], + }, + { + "role": "user", + "content": "How does AI work? Explain it in simple terms." 
+ }, + ] + ) + + print(message_thread) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const messageThread = await openai.beta.threads.create({ + messages: [ + { + role: "user", + content: "Hello, what is AI?", + file_ids: ["file-abc123"], + }, + { + role: "user", + content: "How does AI work? Explain it in simple terms.", + }, + ], + }); + + console.log(messageThread); + } + + main(); + response: | + { + "id": "thread_abc123", + "object": "thread", + "created_at": 1699014083, + "metadata": {} + } + + /threads/{thread_id}: + get: + operationId: getThread + tags: + - Assistants + summary: Retrieves a thread. + parameters: + - in: path + name: thread_id + required: true + schema: + type: string + description: The ID of the thread to retrieve. + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/ThreadObject" + x-oaiMeta: + name: Retrieve thread + beta: true + returns: The [thread](/docs/api-reference/threads/object) object matching the specified ID. + examples: + request: + curl: | + curl https://api.openai.com/v1/threads/thread_abc123 \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "OpenAI-Beta: assistants=v1" + python: | + from openai import OpenAI + client = OpenAI() + + my_thread = client.beta.threads.retrieve("thread_abc123") + print(my_thread) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const myThread = await openai.beta.threads.retrieve( + "thread_abc123" + ); + + console.log(myThread); + } + + main(); + response: | + { + "id": "thread_abc123", + "object": "thread", + "created_at": 1699014083, + "metadata": {} + } + post: + operationId: modifyThread + tags: + - Assistants + summary: Modifies a thread. + parameters: + - in: path + name: thread_id + required: true + schema: + type: string + description: The ID of the thread to modify.
Only the `metadata` can be modified. + requestBody: + required: true + content: + application/json: + schema: + $ref: "#/components/schemas/ModifyThreadRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/ThreadObject" + x-oaiMeta: + name: Modify thread + beta: true + returns: The modified [thread](/docs/api-reference/threads/object) object matching the specified ID. + examples: + request: + curl: | + curl https://api.openai.com/v1/threads/thread_abc123 \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "OpenAI-Beta: assistants=v1" \ + -d '{ + "metadata": { + "modified": "true", + "user": "abc123" + } + }' + python: | + from openai import OpenAI + client = OpenAI() + + my_updated_thread = client.beta.threads.update( + "thread_abc123", + metadata={ + "modified": "true", + "user": "abc123" + } + ) + print(my_updated_thread) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const updatedThread = await openai.beta.threads.update( + "thread_abc123", + { + metadata: { modified: "true", user: "abc123" }, + } + ); + + console.log(updatedThread); + } + + main(); + response: | + { + "id": "thread_abc123", + "object": "thread", + "created_at": 1699014083, + "metadata": { + "modified": "true", + "user": "abc123" + } + } + delete: + operationId: deleteThread + tags: + - Assistants + summary: Delete a thread. + parameters: + - in: path + name: thread_id + required: true + schema: + type: string + description: The ID of the thread to delete. 
+ responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/DeleteThreadResponse" + x-oaiMeta: + name: Delete thread + beta: true + returns: Deletion status + examples: + request: + curl: | + curl https://api.openai.com/v1/threads/thread_abc123 \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "OpenAI-Beta: assistants=v1" \ + -X DELETE + python: | + from openai import OpenAI + client = OpenAI() + + response = client.beta.threads.delete("thread_abc123") + print(response) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const response = await openai.beta.threads.del("thread_abc123"); + + console.log(response); + } + main(); + response: | + { + "id": "thread_abc123", + "object": "thread.deleted", + "deleted": true + } + + /threads/{thread_id}/messages: + get: + operationId: listMessages + tags: + - Assistants + summary: Returns a list of messages for a given thread. + parameters: + - in: path + name: thread_id + required: true + schema: + type: string + description: The ID of the [thread](/docs/api-reference/threads) the messages belong to. + - name: limit + in: query + description: *pagination_limit_param_description + required: false + schema: + type: integer + default: 20 + - name: order + in: query + description: *pagination_order_param_description + schema: + type: string + default: desc + enum: ["asc", "desc"] + - name: after + in: query + description: *pagination_after_param_description + schema: + type: string + - name: before + in: query + description: *pagination_before_param_description + schema: + type: string + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/ListMessagesResponse" + x-oaiMeta: + name: List messages + beta: true + returns: A list of [message](/docs/api-reference/messages) objects. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/threads/thread_abc123/messages \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "OpenAI-Beta: assistants=v1" + python: | + from openai import OpenAI + client = OpenAI() + + thread_messages = client.beta.threads.messages.list("thread_abc123") + print(thread_messages.data) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const threadMessages = await openai.beta.threads.messages.list( + "thread_abc123" + ); + + console.log(threadMessages.data); + } + + main(); + response: | + { + "object": "list", + "data": [ + { + "id": "msg_abc123", + "object": "thread.message", + "created_at": 1699016383, + "thread_id": "thread_abc123", + "role": "user", + "content": [ + { + "type": "text", + "text": { + "value": "How does AI work? Explain it in simple terms.", + "annotations": [] + } + } + ], + "file_ids": [], + "assistant_id": null, + "run_id": null, + "metadata": {} + }, + { + "id": "msg_abc456", + "object": "thread.message", + "created_at": 1699016383, + "thread_id": "thread_abc123", + "role": "user", + "content": [ + { + "type": "text", + "text": { + "value": "Hello, what is AI?", + "annotations": [] + } + } + ], + "file_ids": [ + "file-abc123" + ], + "assistant_id": null, + "run_id": null, + "metadata": {} + } + ], + "first_id": "msg_abc123", + "last_id": "msg_abc456", + "has_more": false + } + post: + operationId: createMessage + tags: + - Assistants + summary: Create a message. + parameters: + - in: path + name: thread_id + required: true + schema: + type: string + description: The ID of the [thread](/docs/api-reference/threads) to create a message for.
+ requestBody: + required: true + content: + application/json: + schema: + $ref: "#/components/schemas/CreateMessageRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/MessageObject" + x-oaiMeta: + name: Create message + beta: true + returns: A [message](/docs/api-reference/messages/object) object. + examples: + request: + curl: | + curl https://api.openai.com/v1/threads/thread_abc123/messages \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "OpenAI-Beta: assistants=v1" \ + -d '{ + "role": "user", + "content": "How does AI work? Explain it in simple terms." + }' + python: | + from openai import OpenAI + client = OpenAI() + + thread_message = client.beta.threads.messages.create( + "thread_abc123", + role="user", + content="How does AI work? Explain it in simple terms.", + ) + print(thread_message) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const threadMessages = await openai.beta.threads.messages.create( + "thread_abc123", + { role: "user", content: "How does AI work? Explain it in simple terms." } + ); + + console.log(threadMessages); + } + + main(); + response: | + { + "id": "msg_abc123", + "object": "thread.message", + "created_at": 1699017614, + "thread_id": "thread_abc123", + "role": "user", + "content": [ + { + "type": "text", + "text": { + "value": "How does AI work? Explain it in simple terms.", + "annotations": [] + } + } + ], + "file_ids": [], + "assistant_id": null, + "run_id": null, + "metadata": {} + } + + /threads/{thread_id}/messages/{message_id}: + get: + operationId: getMessage + tags: + - Assistants + summary: Retrieve a message. + parameters: + - in: path + name: thread_id + required: true + schema: + type: string + description: The ID of the [thread](/docs/api-reference/threads) to which this message belongs. 
+ - in: path + name: message_id + required: true + schema: + type: string + description: The ID of the message to retrieve. + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/MessageObject" + x-oaiMeta: + name: Retrieve message + beta: true + returns: The [message](/docs/api-reference/threads/messages/object) object matching the specified ID. + examples: + request: + curl: | + curl https://api.openai.com/v1/threads/thread_abc123/messages/msg_abc123 \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "OpenAI-Beta: assistants=v1" + python: | + from openai import OpenAI + client = OpenAI() + + message = client.beta.threads.messages.retrieve( + message_id="msg_abc123", + thread_id="thread_abc123", + ) + print(message) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const message = await openai.beta.threads.messages.retrieve( + "thread_abc123", + "msg_abc123" + ); + + console.log(message); + } + + main(); + response: | + { + "id": "msg_abc123", + "object": "thread.message", + "created_at": 1699017614, + "thread_id": "thread_abc123", + "role": "user", + "content": [ + { + "type": "text", + "text": { + "value": "How does AI work? Explain it in simple terms.", + "annotations": [] + } + } + ], + "file_ids": [], + "assistant_id": null, + "run_id": null, + "metadata": {} + } + post: + operationId: modifyMessage + tags: + - Assistants + summary: Modifies a message. + parameters: + - in: path + name: thread_id + required: true + schema: + type: string + description: The ID of the thread to which this message belongs. + - in: path + name: message_id + required: true + schema: + type: string + description: The ID of the message to modify. 
+ requestBody: + required: true + content: + application/json: + schema: + $ref: "#/components/schemas/ModifyMessageRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/MessageObject" + x-oaiMeta: + name: Modify message + beta: true + returns: The modified [message](/docs/api-reference/threads/messages/object) object. + examples: + request: + curl: | + curl https://api.openai.com/v1/threads/thread_abc123/messages/msg_abc123 \ + -H "Content-Type: application/json" \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "OpenAI-Beta: assistants=v1" \ + -d '{ + "metadata": { + "modified": "true", + "user": "abc123" + } + }' + python: | + from openai import OpenAI + client = OpenAI() + + message = client.beta.threads.messages.update( + message_id="msg_abc123", + thread_id="thread_abc123", + metadata={ + "modified": "true", + "user": "abc123", + }, + ) + print(message) + node.js: |- + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const message = await openai.beta.threads.messages.update( + "thread_abc123", + "msg_abc123", + { + metadata: { + modified: "true", + user: "abc123", + }, + } + ); + + console.log(message); + } + + main(); + response: | + { + "id": "msg_abc123", + "object": "thread.message", + "created_at": 1699017614, + "thread_id": "thread_abc123", + "role": "user", + "content": [ + { + "type": "text", + "text": { + "value": "How does AI work? Explain it in simple terms.", + "annotations": [] + } + } + ], + "file_ids": [], + "assistant_id": null, + "run_id": null, + "metadata": { + "modified": "true", + "user": "abc123" + } + } + + /threads/runs: + post: + operationId: createThreadAndRun + tags: + - Assistants + summary: Create a thread and run it in one request.
+ requestBody: + required: true + content: + application/json: + schema: + $ref: "#/components/schemas/CreateThreadAndRunRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/RunObject" + x-oaiMeta: + name: Create thread and run + beta: true + returns: A [run](/docs/api-reference/runs/object) object. + examples: + request: + curl: | + curl https://api.openai.com/v1/threads/runs \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "Content-Type: application/json" \ + -H "OpenAI-Beta: assistants=v1" \ + -d '{ + "assistant_id": "asst_IgmpQTah3ZfPHCVZjTqAY8Kv", + "thread": { + "messages": [ + {"role": "user", "content": "Explain deep learning to a 5 year old."} + ] + } + }' + python: | + from openai import OpenAI + client = OpenAI() + + run = client.beta.threads.create_and_run( + assistant_id="asst_IgmpQTah3ZfPHCVZjTqAY8Kv", + thread={ + "messages": [ + {"role": "user", "content": "Explain deep learning to a 5 year old."} + ] + } + ) + node.js: | + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const run = await openai.beta.threads.createAndRun({ + assistant_id: "asst_IgmpQTah3ZfPHCVZjTqAY8Kv", + thread: { + messages: [ + { role: "user", content: "Explain deep learning to a 5 year old."
}, + ], + }, + }); + + console.log(run); + } + + main(); + response: | + { + "id": "run_3Qudf05GGhCleEg9ggwfJQih", + "object": "thread.run", + "created_at": 1699076792, + "assistant_id": "asst_IgmpQTah3ZfPHCVZjTqAY8Kv", + "thread_id": "thread_Ec3eKZcWI00WDZRC7FZci8hP", + "status": "queued", + "started_at": null, + "expires_at": 1699077392, + "cancelled_at": null, + "failed_at": null, + "completed_at": null, + "last_error": null, + "model": "gpt-4", + "instructions": "You are a helpful assistant.", + "tools": [], + "file_ids": [], + "metadata": {} + } + + /threads/{thread_id}/runs: + get: + operationId: listRuns + tags: + - Assistants + summary: Returns a list of runs belonging to a thread. + parameters: + - name: thread_id + in: path + required: true + schema: + type: string + description: The ID of the thread the run belongs to. + - name: limit + in: query + description: *pagination_limit_param_description + required: false + schema: + type: integer + default: 20 + - name: order + in: query + description: *pagination_order_param_description + schema: + type: string + default: desc + enum: ["asc", "desc"] + - name: after + in: query + description: *pagination_after_param_description + schema: + type: string + - name: before + in: query + description: *pagination_before_param_description + schema: + type: string + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/ListRunsResponse" + x-oaiMeta: + name: List runs + beta: true + returns: A list of [run](/docs/api-reference/runs/object) objects. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/threads/thread_BDDwIqM4KgHibXX3mqmN3Lgs/runs \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H "Content-Type: application/json" \ + -H "OpenAI-Beta: assistants=v1" + python: | + from openai import OpenAI + client = OpenAI() + + runs = client.beta.threads.runs.list( + "thread_BDDwIqM4KgHibXX3mqmN3Lgs" + ) + print(runs) + node.js: | + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const runs = await openai.beta.threads.runs.list( + "thread_BDDwIqM4KgHibXX3mqmN3Lgs" + ); + + console.log(runs); + } + + main(); + response: | + { + "object": "list", + "data": [ + { + "id": "run_5pyUEwhaPk11vCKiDneUWXXY", + "object": "thread.run", + "created_at": 1699075072, + "assistant_id": "asst_nGl00s4xa9zmVY6Fvuvz9wwQ", + "thread_id": "thread_BDDwIqM4KgHibXX3mqmN3Lgs", + "status": "completed", + "started_at": 1699075072, + "expires_at": null, + "cancelled_at": null, + "failed_at": null, + "completed_at": 1699075073, + "last_error": null, + "model": "gpt-3.5-turbo", + "instructions": null, + "tools": [ + { + "type": "code_interpreter" + } + ], + "file_ids": [ + "file-9F1ex49ipEnKzyLUNnCA0Yzx", + "file-dEWwUbt2UGHp3v0e0DpCzemP" + ], + "metadata": {} + }, + { + "id": "run_UWvV94U0FQYiT2rlbBrdEVmC", + "object": "thread.run", + "created_at": 1699063290, + "assistant_id": "asst_nGl00s4xa9zmVY6Fvuvz9wwQ", + "thread_id": "thread_BDDwIqM4KgHibXX3mqmN3Lgs", + "status": "completed", + "started_at": 1699063290, + "expires_at": null, + "cancelled_at": null, + "failed_at": null, + "completed_at": 1699063291, + "last_error": null, + "model": "gpt-3.5-turbo", + "instructions": null, + "tools": [ + { + "type": "code_interpreter" + } + ], + "file_ids": [ + "file-9F1ex49ipEnKzyLUNnCA0Yzx", + "file-dEWwUbt2UGHp3v0e0DpCzemP" + ], + "metadata": {} + } + ], + "first_id": "run_5pyUEwhaPk11vCKiDneUWXXY", + "last_id": "run_UWvV94U0FQYiT2rlbBrdEVmC", + "has_more": false + } + post: + operationId:
createRun + tags: + - Assistants + summary: Create a run. + parameters: + - in: path + name: thread_id + required: true + schema: + type: string + description: The ID of the thread to run. + requestBody: + required: true + content: + application/json: + schema: + $ref: "#/components/schemas/CreateRunRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/RunObject" + x-oaiMeta: + name: Create run + beta: true + returns: A [run](/docs/api-reference/runs/object) object. + examples: + request: + curl: | + curl https://api.openai.com/v1/threads/thread_BDDwIqM4KgHibXX3mqmN3Lgs/runs \ + -H 'Authorization: Bearer $OPENAI_API_KEY' \ + -H 'Content-Type: application/json' \ + -H 'OpenAI-Beta: assistants=v1' \ + -d '{ + "assistant_id": "asst_nGl00s4xa9zmVY6Fvuvz9wwQ" + }' + python: | + from openai import OpenAI + client = OpenAI() + + run = client.beta.threads.runs.create( + thread_id="thread_BDDwIqM4KgHibXX3mqmN3Lgs", + assistant_id="asst_nGl00s4xa9zmVY6Fvuvz9wwQ" + ) + print(run) + node.js: | + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const run = await openai.beta.threads.runs.create( + "thread_BDDwIqM4KgHibXX3mqmN3Lgs", + { assistant_id: "asst_nGl00s4xa9zmVY6Fvuvz9wwQ" } + ); + + console.log(run); + } + + main(); + response: &run_object_example | + { + "id": "run_UWvV94U0FQYiT2rlbBrdEVmC", + "object": "thread.run", + "created_at": 1699063290, + "assistant_id": "asst_nGl00s4xa9zmVY6Fvuvz9wwQ", + "thread_id": "thread_BDDwIqM4KgHibXX3mqmN3Lgs", + "status": "queued", + "started_at": 1699063290, + "expires_at": null, + "cancelled_at": null, + "failed_at": null, + "completed_at": 1699063291, + "last_error": null, + "model": "gpt-4", + "instructions": null, + "tools": [ + { + "type": "code_interpreter" + } + ], + "file_ids": [ + "file-9F1ex49ipEnKzyLUNnCA0Yzx", + "file-dEWwUbt2UGHp3v0e0DpCzemP" + ], + "metadata": {} + } + + /threads/{thread_id}/runs/{run_id}: + get: 
+ operationId: getRun + tags: + - Assistants + summary: Retrieves a run. + parameters: + - in: path + name: thread_id + required: true + schema: + type: string + description: The ID of the [thread](/docs/api-reference/threads) that was run. + - in: path + name: run_id + required: true + schema: + type: string + description: The ID of the run to retrieve. + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/RunObject" + x-oaiMeta: + name: Retrieve run + beta: true + returns: The [run](/docs/api-reference/runs/object) object matching the specified ID. + examples: + request: + curl: | + curl https://api.openai.com/v1/threads/thread_BDDwIqM4KgHibXX3mqmN3Lgs/runs/run_5pyUEwhaPk11vCKiDneUWXXY \ + -H 'Authorization: Bearer $OPENAI_API_KEY' \ + -H 'OpenAI-Beta: assistants=v1' + python: | + from openai import OpenAI + client = OpenAI() + + run = client.beta.threads.runs.retrieve( + thread_id="thread_BDDwIqM4KgHibXX3mqmN3Lgs", + run_id="run_5pyUEwhaPk11vCKiDneUWXXY" + ) + print(run) + node.js: | + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const run = await openai.beta.threads.runs.retrieve( + "thread_BDDwIqM4KgHibXX3mqmN3Lgs", + "run_5pyUEwhaPk11vCKiDneUWXXY" + ); + + console.log(run); + } + + main(); + response: | + { + "id": "run_5pyUEwhaPk11vCKiDneUWXXY", + "object": "thread.run", + "created_at": 1699075072, + "assistant_id": "asst_nGl00s4xa9zmVY6Fvuvz9wwQ", + "thread_id": "thread_BDDwIqM4KgHibXX3mqmN3Lgs", + "status": "completed", + "started_at": 1699075072, + "expires_at": null, + "cancelled_at": null, + "failed_at": null, + "completed_at": 1699075073, + "last_error": null, + "model": "gpt-3.5-turbo", + "instructions": null, + "tools": [ + { + "type": "code_interpreter" + } + ], + "file_ids": [ + "file-9F1ex49ipEnKzyLUNnCA0Yzx", + "file-dEWwUbt2UGHp3v0e0DpCzemP" + ], + "metadata": {} + } + post: + operationId: modifyRun + tags: + - Assistants + summary: Modifies a 
run. + parameters: + - in: path + name: thread_id + required: true + schema: + type: string + description: The ID of the [thread](/docs/api-reference/threads) that was run. + - in: path + name: run_id + required: true + schema: + type: string + description: The ID of the run to modify. + requestBody: + required: true + content: + application/json: + schema: + $ref: "#/components/schemas/ModifyRunRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/RunObject" + x-oaiMeta: + name: Modify run + beta: true + returns: The modified [run](/docs/api-reference/runs/object) object matching the specified ID. + examples: + request: + curl: | + curl https://api.openai.com/v1/threads/thread_BDDwIqM4KgHibXX3mqmN3Lgs/runs/run_5pyUEwhaPk11vCKiDneUWXXY \ + -H 'Authorization: Bearer $OPENAI_API_KEY' \ + -H 'Content-Type: application/json' \ + -H 'OpenAI-Beta: assistants=v1' \ + -d '{ + "metadata": { + "user_id": "user_zmVY6FvuBDDwIqM4KgH" + } + }' + python: | + from openai import OpenAI + client = OpenAI() + + run = client.beta.threads.runs.update( + thread_id="thread_BDDwIqM4KgHibXX3mqmN3Lgs", + run_id="run_5pyUEwhaPk11vCKiDneUWXXY", + metadata={"user_id": "user_zmVY6FvuBDDwIqM4KgH"}, + ) + print(run) + node.js: | + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const run = await openai.beta.threads.runs.update( + "thread_BDDwIqM4KgHibXX3mqmN3Lgs", + "run_5pyUEwhaPk11vCKiDneUWXXY", + { + metadata: { + user_id: "user_zmVY6FvuBDDwIqM4KgH", + }, + } + ); + + console.log(run); + } + + main(); + response: | + { + "id": "run_5pyUEwhaPk11vCKiDneUWXXY", + "object": "thread.run", + "created_at": 1699075072, + "assistant_id": "asst_nGl00s4xa9zmVY6Fvuvz9wwQ", + "thread_id": "thread_BDDwIqM4KgHibXX3mqmN3Lgs", + "status": "completed", + "started_at": 1699075072, + "expires_at": null, + "cancelled_at": null, + "failed_at": null, + "completed_at": 1699075073, + "last_error": null, + 
"model": "gpt-3.5-turbo", + "instructions": null, + "tools": [ + { + "type": "code_interpreter" + } + ], + "file_ids": [ + "file-9F1ex49ipEnKzyLUNnCA0Yzx", + "file-dEWwUbt2UGHp3v0e0DpCzemP" + ], + "metadata": { + "user_id": "user_zmVY6FvuBDDwIqM4KgH" + } + } + + /threads/{thread_id}/runs/{run_id}/submit_tool_outputs: + post: + operationId: submitToolOuputsToRun + tags: + - Assistants + summary: | + When a run has the `status: "requires_action"` and `required_action.type` is `submit_tool_outputs`, this endpoint can be used to submit the outputs from the tool calls once they're all completed. All outputs must be submitted in a single request. + parameters: + - in: path + name: thread_id + required: true + schema: + type: string + description: The ID of the [thread](/docs/api-reference/threads) to which this run belongs. + - in: path + name: run_id + required: true + schema: + type: string + description: The ID of the run that requires the tool output submission. + requestBody: + required: true + content: + application/json: + schema: + $ref: "#/components/schemas/SubmitToolOutputsRunRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/RunObject" + x-oaiMeta: + name: Submit tool outputs to run + beta: true + returns: The modified [run](/docs/api-reference/runs/object) object matching the specified ID. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/threads/thread_EdR8UvCDJ035LFEJZMt3AxCd/runs/run_PHLyHQYIQn4F7JrSXslEYWwh/submit_tool_outputs \ + -H 'Authorization: Bearer $OPENAI_API_KEY' \ + -H 'Content-Type: application/json' \ + -H 'OpenAI-Beta: assistants=v1' \ + -d '{ + "tool_outputs": [ + { + "tool_call_id": "call_MbELIQcB72cq35Yzo2MRw5qs", + "output": "28C" + } + ] + }' + python: | + from openai import OpenAI + client = OpenAI() + + run = client.beta.threads.runs.submit_tool_outputs( + thread_id="thread_EdR8UvCDJ035LFEJZMt3AxCd", + run_id="run_PHLyHQYIQn4F7JrSXslEYWwh", + tool_outputs=[ + { + "tool_call_id": "call_MbELIQcB72cq35Yzo2MRw5qs", + "output": "28C" + } + ] + ) + print(run) + node.js: | + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const run = await openai.beta.threads.runs.submitToolOutputs( + "thread_EdR8UvCDJ035LFEJZMt3AxCd", + "run_PHLyHQYIQn4F7JrSXslEYWwh", + { + tool_outputs: [ + { + tool_call_id: "call_MbELIQcB72cq35Yzo2MRw5qs", + output: "28C", + }, + ], + } + ); + + console.log(run); + } + + main(); + response: | + { + "id": "run_PHLyHQYIQn4F7JrSXslEYWwh", + "object": "thread.run", + "created_at": 1699075592, + "assistant_id": "asst_IgmpQTah3ZfPHCVZjTqAY8Kv", + "thread_id": "thread_EdR8UvCDJ035LFEJZMt3AxCd", + "status": "queued", + "started_at": 1699075592, + "expires_at": 1699076192, + "cancelled_at": null, + "failed_at": null, + "completed_at": null, + "last_error": null, + "model": "gpt-4", + "instructions": "You tell the weather.", + "tools": [ + { + "type": "function", + "function": { + "name": "get_weather", + "description": "Determine weather in my location", + "parameters": { + "type": "object", + "properties": { + "location": { + "type": "string", + "description": "The city and state e.g. 
San Francisco, CA" + }, + "unit": { + "type": "string", + "enum": [ + "c", + "f" + ] + } + }, + "required": [ + "location" + ] + } + } + } + ], + "file_ids": [], + "metadata": {} + } + + /threads/{thread_id}/runs/{run_id}/cancel: + post: + operationId: cancelRun + tags: + - Assistants + summary: Cancels a run that is `in_progress`. + parameters: + - in: path + name: thread_id + required: true + schema: + type: string + description: The ID of the thread to which this run belongs. + - in: path + name: run_id + required: true + schema: + type: string + description: The ID of the run to cancel. + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/RunObject" + x-oaiMeta: + name: Cancel a run + beta: true + returns: The modified [run](/docs/api-reference/runs/object) object matching the specified ID. + examples: + request: + curl: | + curl https://api.openai.com/v1/threads/thread_1cjnJPXj8MFiqTx58jU9TivC/runs/run_BeRGmpGt2wb1VI22ZRniOkrR/cancel \ + -H 'Authorization: Bearer $OPENAI_API_KEY' \ + -H 'OpenAI-Beta: assistants=v1' \ + -X POST + python: | + from openai import OpenAI + client = OpenAI() + + run = client.beta.threads.runs.cancel( + thread_id="thread_1cjnJPXj8MFiqTx58jU9TivC", + run_id="run_BeRGmpGt2wb1VI22ZRniOkrR" + ) + print(run) + node.js: | + import OpenAI from "openai"; + + const openai = new OpenAI(); + + async function main() { + const run = await openai.beta.threads.runs.cancel( + "thread_1cjnJPXj8MFiqTx58jU9TivC", + "run_BeRGmpGt2wb1VI22ZRniOkrR" + ); + + console.log(run); + } + + main(); + response: | + { + "id": "run_BeRGmpGt2wb1VI22ZRniOkrR", + "object": "thread.run", + "created_at": 1699076126, + "assistant_id": "asst_IgmpQTah3ZfPHCVZjTqAY8Kv", + "thread_id": "thread_1cjnJPXj8MFiqTx58jU9TivC", + "status": "cancelling", + "started_at": 1699076126, + "expires_at": 1699076726, + "cancelled_at": null, + "failed_at": null, + "completed_at": null, + "last_error": null, + "model": "gpt-4", + 
"instructions": "You summarize books.", + "tools": [ + { + "type": "retrieval" + } + ], + "file_ids": [], + "metadata": {} + } + + /threads/{thread_id}/runs/{run_id}/steps: + get: + operationId: listRunSteps + tags: + - Assistants + summary: Returns a list of run steps belonging to a run. + parameters: + - name: thread_id + in: path + required: true + schema: + type: string + description: The ID of the thread the run and run steps belong to. + - name: run_id + in: path + required: true + schema: + type: string + description: The ID of the run the run steps belong to. + - name: limit + in: query + description: *pagination_limit_param_description + required: false + schema: + type: integer + default: 20 + - name: order + in: query + description: *pagination_order_param_description + schema: + type: string + default: desc + enum: ["asc", "desc"] + - name: after + in: query + description: *pagination_after_param_description + schema: + type: string + - name: before + in: query + description: *pagination_before_param_description + schema: + type: string + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/ListRunStepsResponse" + x-oaiMeta: + name: List run steps + beta: true + returns: A list of [run step](/docs/api-reference/runs/step-object) objects. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/threads/thread_BDDwIqM4KgHibXX3mqmN3Lgs/runs/run_UWvV94U0FQYiT2rlbBrdEVmC/steps \ + -H 'Authorization: Bearer $OPENAI_API_KEY' \ + -H 'Content-Type: application/json' \ + -H 'OpenAI-Beta: assistants=v1' + python: | + from openai import OpenAI + client = OpenAI() + + run_steps = client.beta.threads.runs.steps.list( + thread_id="thread_BDDwIqM4KgHibXX3mqmN3Lgs", + run_id="run_UWvV94U0FQYiT2rlbBrdEVmC" + ) + print(run_steps) + node.js: | + import OpenAI from "openai"; + const openai = new OpenAI(); + + async function main() { + const runStep = await openai.beta.threads.runs.steps.list( + "thread_BDDwIqM4KgHibXX3mqmN3Lgs", + "run_UWvV94U0FQYiT2rlbBrdEVmC" + ); + console.log(runStep); + } + + main(); + response: | + { + "object": "list", + "data": [ + { + "id": "step_QyjyrsVsysd7F4K894BZHG97", + "object": "thread.run.step", + "created_at": 1699063291, + "run_id": "run_UWvV94U0FQYiT2rlbBrdEVmC", + "assistant_id": "asst_nGl00s4xa9zmVY6Fvuvz9wwQ", + "thread_id": "thread_BDDwIqM4KgHibXX3mqmN3Lgs", + "type": "message_creation", + "status": "completed", + "cancelled_at": null, + "completed_at": 1699063291, + "expired_at": null, + "failed_at": null, + "last_error": null, + "step_details": { + "type": "message_creation", + "message_creation": { + "message_id": "msg_6YmiCRmMbbE6FALYNePPHqwm" + } + } + } + ], + "first_id": "step_QyjyrsVsysd7F4K894BZHG97", + "last_id": "step_QyjyrsVsysd7F4K894BZHG97", + "has_more": false + } + + /threads/{thread_id}/runs/{run_id}/steps/{step_id}: + get: + operationId: getRunStep + tags: + - Assistants + summary: Retrieves a run step. + parameters: + - in: path + name: thread_id + required: true + schema: + type: string + description: The ID of the thread to which the run and run step belongs. + - in: path + name: run_id + required: true + schema: + type: string + description: The ID of the run to which the run step belongs. 
+ - in: path + name: step_id + required: true + schema: + type: string + description: The ID of the run step to retrieve. + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/RunStepObject" + x-oaiMeta: + name: Retrieve run step + beta: true + returns: The [run step](/docs/api-reference/runs/step-object) object matching the specified ID. + examples: + request: + curl: | + curl https://api.openai.com/v1/threads/thread_BDDwIqM4KgHibXX3mqmN3Lgs/runs/run_UWvV94U0FQYiT2rlbBrdEVmC/steps/step_QyjyrsVsysd7F4K894BZHG97 \ + -H 'Authorization: Bearer $OPENAI_API_KEY' \ + -H 'Content-Type: application/json' \ + -H 'OpenAI-Beta: assistants=v1' + python: | + from openai import OpenAI + client = OpenAI() + + run_step = client.beta.threads.runs.steps.retrieve( + thread_id="thread_BDDwIqM4KgHibXX3mqmN3Lgs", + run_id="run_UWvV94U0FQYiT2rlbBrdEVmC", + step_id="step_QyjyrsVsysd7F4K894BZHG97" + ) + print(run_step) + node.js: | + import OpenAI from "openai"; + const openai = new OpenAI(); + + async function main() { + const runStep = await openai.beta.threads.runs.steps.retrieve( + "thread_BDDwIqM4KgHibXX3mqmN3Lgs", + "run_UWvV94U0FQYiT2rlbBrdEVmC", + "step_QyjyrsVsysd7F4K894BZHG97" + ); + console.log(runStep); + } + + main(); + response: &run_step_object_example | + { + "id": "step_QyjyrsVsysd7F4K894BZHG97", + "object": "thread.run.step", + "created_at": 1699063291, + "run_id": "run_UWvV94U0FQYiT2rlbBrdEVmC", + "assistant_id": "asst_nGl00s4xa9zmVY6Fvuvz9wwQ", + "thread_id": "thread_BDDwIqM4KgHibXX3mqmN3Lgs", + "type": "message_creation", + "status": "completed", + "cancelled_at": null, + "completed_at": 1699063291, + "expired_at": null, + "failed_at": null, + "last_error": null, + "step_details": { + "type": "message_creation", + "message_creation": { + "message_id": "msg_6YmiCRmMbbE6FALYNePPHqwm" + } + } + } + + /assistants/{assistant_id}/files: + get: + operationId: listAssistantFiles + tags: + - Assistants + summary: Returns 
a list of assistant files. + parameters: + - name: assistant_id + in: path + description: The ID of the assistant the file belongs to. + required: true + schema: + type: string + - name: limit + in: query + description: *pagination_limit_param_description + required: false + schema: + type: integer + default: 20 + - name: order + in: query + description: *pagination_order_param_description + schema: + type: string + default: desc + enum: ["asc", "desc"] + - name: after + in: query + description: *pagination_after_param_description + schema: + type: string + - name: before + in: query + description: *pagination_before_param_description + schema: + type: string + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/ListAssistantFilesResponse" + x-oaiMeta: + name: List assistant files + beta: true + returns: A list of [assistant file](/docs/api-reference/assistants/file-object) objects. + examples: + request: + curl: | + curl https://api.openai.com/v1/assistants/asst_DUGk5I7sK0FpKeijvrO30z9J/files \ + -H 'Authorization: Bearer $OPENAI_API_KEY' \ + -H 'Content-Type: application/json' \ + -H 'OpenAI-Beta: assistants=v1' + python: | + from openai import OpenAI + client = OpenAI() + + assistant_files = client.beta.assistants.files.list( + assistant_id="asst_DUGk5I7sK0FpKeijvrO30z9J" + ) + print(assistant_files) + node.js: | + import OpenAI from "openai"; + const openai = new OpenAI(); + + async function main() { + const assistantFiles = await openai.beta.assistants.files.list( + "asst_FBOFvAOHhwEWMghbMGseaPGQ" + ); + console.log(assistantFiles); + } + + main(); + response: | + { + "object": "list", + "data": [ + { + "id": "file-dEWwUbt2UGHp3v0e0DpCzemP", + "object": "assistant.file", + "created_at": 1699060412, + "assistant_id": "asst_DUGk5I7sK0FpKeijvrO30z9J" + }, + { + "id": "file-9F1ex49ipEnKzyLUNnCA0Yzx", + "object": "assistant.file", + "created_at": 1699060412, + "assistant_id": "asst_DUGk5I7sK0FpKeijvrO30z9J" + 
} + ], + "first_id": "file-dEWwUbt2UGHp3v0e0DpCzemP", + "last_id": "file-9F1ex49ipEnKzyLUNnCA0Yzx", + "has_more": false + } + post: + operationId: createAssistantFile + tags: + - Assistants + summary: Create an assistant file by attaching a [File](/docs/api-reference/files) to an [assistant](/docs/api-reference/assistants). + parameters: + - in: path + name: assistant_id + required: true + schema: + type: string + example: file-AF1WoRqd3aJAHsqc9NY7iL8F + description: | + The ID of the assistant for which to create a File. + requestBody: + required: true + content: + application/json: + schema: + $ref: "#/components/schemas/CreateAssistantFileRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/AssistantFileObject" + x-oaiMeta: + name: Create assistant file + beta: true + returns: An [assistant file](/docs/api-reference/assistants/file-object) object. + examples: + request: + curl: | + curl https://api.openai.com/v1/assistants/asst_FBOFvAOHhwEWMghbMGseaPGQ/files \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H 'Content-Type: application/json' \ + -H 'OpenAI-Beta: assistants=v1' \ + -d '{ + "file_id": "file-wB6RM6wHdA49HfS2DJ9fEyrH" + }' + python: | + from openai import OpenAI + client = OpenAI() + + assistant_file = client.beta.assistants.files.create( + assistant_id="asst_FBOFvAOHhwEWMghbMGseaPGQ", + file_id="file-wB6RM6wHdA49HfS2DJ9fEyrH" + ) + print(assistant_file) + node.js: | + import OpenAI from "openai"; + const openai = new OpenAI(); + + async function main() { + const myAssistantFile = await openai.beta.assistants.files.create( + "asst_FBOFvAOHhwEWMghbMGseaPGQ", + { + file_id: "file-wB6RM6wHdA49HfS2DJ9fEyrH" + } + ); + console.log(myAssistantFile); + } + + main(); + response: &assistant_file_object | + { + "id": "file-wB6RM6wHdA49HfS2DJ9fEyrH", + "object": "assistant.file", + "created_at": 1699055364, + "assistant_id": "asst_FBOFvAOHhwEWMghbMGseaPGQ" + } + +
/assistants/{assistant_id}/files/{file_id}: + get: + operationId: getAssistantFile + tags: + - Assistants + summary: Retrieves an assistant file. + parameters: + - in: path + name: assistant_id + required: true + schema: + type: string + description: The ID of the assistant that the file belongs to. + - in: path + name: file_id + required: true + schema: + type: string + description: The ID of the file to retrieve. + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/AssistantFileObject" + x-oaiMeta: + name: Retrieve assistant file + beta: true + returns: The [assistant file](/docs/api-reference/assistants/file-object) object matching the specified ID. + examples: + request: + curl: | + curl https://api.openai.com/v1/assistants/asst_FBOFvAOHhwEWMghbMGseaPGQ/files/file-wB6RM6wHdA49HfS2DJ9fEyrH \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H 'Content-Type: application/json' \ + -H 'OpenAI-Beta: assistants=v1' + python: | + from openai import OpenAI + client = OpenAI() + + assistant_file = client.beta.assistants.files.retrieve( + assistant_id="asst_FBOFvAOHhwEWMghbMGseaPGQ", + file_id="file-wB6RM6wHdA49HfS2DJ9fEyrH" + ) + print(assistant_file) + node.js: | + import OpenAI from "openai"; + const openai = new OpenAI(); + + async function main() { + const myAssistantFile = await openai.beta.assistants.files.retrieve( + "asst_FBOFvAOHhwEWMghbMGseaPGQ", + "file-wB6RM6wHdA49HfS2DJ9fEyrH" + ); + console.log(myAssistantFile); + } + + main(); + response: *assistant_file_object + delete: + operationId: deleteAssistantFile + tags: + - Assistants + summary: Delete an assistant file. + parameters: + - in: path + name: assistant_id + required: true + schema: + type: string + description: The ID of the assistant that the file belongs to. + - in: path + name: file_id + required: true + schema: + type: string + description: The ID of the file to delete.
+ responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/DeleteAssistantFileResponse" + x-oaiMeta: + name: Delete assistant file + beta: true + returns: Deletion status + examples: + request: + curl: | + curl https://api.openai.com/v1/assistants/asst_DUGk5I7sK0FpKeijvrO30z9J/files/file-9F1ex49ipEnKzyLUNnCA0Yzx \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H 'Content-Type: application/json' \ + -H 'OpenAI-Beta: assistants=v1' \ + -X DELETE + python: | + from openai import OpenAI + client = OpenAI() + + deleted_assistant_file = client.beta.assistants.files.delete( + assistant_id="asst_DUGk5I7sK0FpKeijvrO30z9J", + file_id="file-dEWwUbt2UGHp3v0e0DpCzemP" + ) + print(deleted_assistant_file) + node.js: | + import OpenAI from "openai"; + const openai = new OpenAI(); + + async function main() { + const deletedAssistantFile = await openai.beta.assistants.files.del( + "asst_FBOFvAOHhwEWMghbMGseaPGQ", + "file-wB6RM6wHdA49HfS2DJ9fEyrH" + ); + console.log(deletedAssistantFile); + } + + main(); + response: | + { + "id": "file-BK7bzQj3FfZFXr7DbL6xJwfo", + "object": "assistant.file.deleted", + "deleted": true + } + + /threads/{thread_id}/messages/{message_id}/files: + get: + operationId: listMessageFiles + tags: + - Assistants + summary: Returns a list of message files. + parameters: + - name: thread_id + in: path + description: The ID of the thread that the message and files belong to. + required: true + schema: + type: string + - name: message_id + in: path + description: The ID of the message that the files belong to.
+ required: true + schema: + type: string + - name: limit + in: query + description: *pagination_limit_param_description + required: false + schema: + type: integer + default: 20 + - name: order + in: query + description: *pagination_order_param_description + schema: + type: string + default: desc + enum: ["asc", "desc"] + - name: after + in: query + description: *pagination_after_param_description + schema: + type: string + - name: before + in: query + description: *pagination_before_param_description + schema: + type: string + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/ListMessageFilesResponse" + x-oaiMeta: + name: List message files + beta: true + returns: A list of [message file](/docs/api-reference/messages/file-object) objects. + examples: + request: + curl: | + curl https://api.openai.com/v1/threads/thread_RGUhOuO9b2nrktrmsQ2uSR6I/messages/msg_q3XhbGmMzsqEFa81gMLBDAVU/files \ + -H 'Authorization: Bearer $OPENAI_API_KEY' \ + -H 'Content-Type: application/json' \ + -H 'OpenAI-Beta: assistants=v1' + python: | + from openai import OpenAI + client = OpenAI() + + message_files = client.beta.threads.messages.files.list( + thread_id="thread_RGUhOuO9b2nrktrmsQ2uSR6I", + message_id="msg_q3XhbGmMzsqEFa81gMLBDAVU" + ) + print(message_files) + node.js: | + import OpenAI from "openai"; + const openai = new OpenAI(); + + async function main() { + const messageFiles = await openai.beta.threads.messages.files.list( + "thread_RGUhOuO9b2nrktrmsQ2uSR6I", + "msg_q3XhbGmMzsqEFa81gMLBDAVU" + ); + console.log(messageFiles); + } + + main(); + response: | + { + "object": "list", + "data": [ + { + "id": "file-dEWwUbt2UGHp3v0e0DpCzemP", + "object": "thread.message.file", + "created_at": 1699061776, + "message_id": "msg_q3XhbGmMzsqEFa81gMLBDAVU" + }, + { + "id": "file-dEWwUbt2UGHp3v0e0DpCzemP", + "object": "thread.message.file", + "created_at": 1699061776, + "message_id": "msg_q3XhbGmMzsqEFa81gMLBDAVU" + } + ], + 
"first_id": "file-dEWwUbt2UGHp3v0e0DpCzemP", + "last_id": "file-dEWwUbt2UGHp3v0e0DpCzemP", + "has_more": false + } + + /threads/{thread_id}/messages/{message_id}/files/{file_id}: + get: + operationId: getMessageFile + tags: + - Assistants + summary: Retrieves a message file. + parameters: + - in: path + name: thread_id + required: true + schema: + type: string + example: thread_AF1WoRqd3aJAHsqc9NY7iL8F + description: The ID of the thread to which the message and File belong. + - in: path + name: message_id + required: true + schema: + type: string + example: msg_AF1WoRqd3aJAHsqc9NY7iL8F + description: The ID of the message the file belongs to. + - in: path + name: file_id + required: true + schema: + type: string + example: file-AF1WoRqd3aJAHsqc9NY7iL8F + description: The ID of the file being retrieved. + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/MessageFileObject" + x-oaiMeta: + name: Retrieve message file + beta: true + returns: The [message file](/docs/api-reference/messages/file-object) object. 
+ examples: + request: + curl: | + curl https://api.openai.com/v1/threads/thread_RGUhOuO9b2nrktrmsQ2uSR6I/messages/msg_q3XhbGmMzsqEFa81gMLBDAVU/files/file-dEWwUbt2UGHp3v0e0DpCzemP \ + -H "Authorization: Bearer $OPENAI_API_KEY" \ + -H 'Content-Type: application/json' \ + -H 'OpenAI-Beta: assistants=v1' + python: | + from openai import OpenAI + client = OpenAI() + + message_file = client.beta.threads.messages.files.retrieve( + thread_id="thread_RGUhOuO9b2nrktrmsQ2uSR6I", + message_id="msg_q3XhbGmMzsqEFa81gMLBDAVU", + file_id="file-dEWwUbt2UGHp3v0e0DpCzemP" + ) + print(message_file) + node.js: | + import OpenAI from "openai"; + const openai = new OpenAI(); + + async function main() { + const messageFile = await openai.beta.threads.messages.files.retrieve( + "thread_RGUhOuO9b2nrktrmsQ2uSR6I", + "msg_q3XhbGmMzsqEFa81gMLBDAVU", + "file-dEWwUbt2UGHp3v0e0DpCzemP" + ); + console.log(messageFile); + } + + main(); + response: | + { + "id": "file-dEWwUbt2UGHp3v0e0DpCzemP", + "object": "thread.message.file", + "created_at": 1699061776, + "message_id": "msg_q3XhbGmMzsqEFa81gMLBDAVU" + } + + /healthz: + get: + operationId: healthcheck + tags: + - Health Check + summary: Check the status of the Nitro server. + requestBody: + content: + application/json: + schema: + $ref: "#/components/schemas/HealthcheckRequest" + x-codeSamples: + - lang: "curl" + source: | + curl -X GET 'http://localhost:3928/healthz' + responses: + "200": + description: Nitro health check + content: + application/json: + schema: + $ref: "#/components/schemas/HealthcheckResponse" + + /inferences/llamacpp/loadmodel: + post: + operationId: loadmodel + tags: + - Load Model + summary: Load a model into the Nitro inference server.
+ requestBody: + content: + application/json: + schema: + $ref: "#/components/schemas/LoadModelRequest" + responses: + "200": + description: Model loaded + content: + application/json: + schema: + $ref: "#/components/schemas/LoadModelResponse" + + /inferences/llamacpp/unloadmodel: + get: + operationId: unloadmodel + tags: + - Unload Model + summary: Unload the model from the Nitro inference server. + requestBody: + content: + application/json: + schema: + $ref: "#/components/schemas/UnloadModelRequest" + x-codeSamples: + - lang: "curl" + source: | + curl -X GET 'http://localhost:3928/inferences/llamacpp/unloadmodel' + responses: + "200": + description: Model unloaded + content: + application/json: + schema: + $ref: "#/components/schemas/UnloadModelResponse" + + /inferences/llamacpp/modelstatus: + get: + operationId: modelstatus + tags: + - Status + summary: Check the status of the model on the Nitro server. + requestBody: + required: false + content: + application/json: + schema: + $ref: "#/components/schemas/StatusRequest" + x-codeSamples: + - lang: "curl" + source: | + curl -X GET 'http://localhost:3928/inferences/llamacpp/modelstatus' + responses: + "200": + description: Check status + content: + application/json: + schema: + $ref: "#/components/schemas/StatusResponse" + + /inferences/llamacpp/embedding: + post: + operationId: createEmbedding + tags: + - Embeddings + summary: Creates an embedding vector representing the input text. + requestBody: + required: true + content: + application/json: + schema: + $ref: "#/components/schemas/CreateEmbeddingRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/CreateEmbeddingResponse" + + /inferences/llamacpp/chat_completion: + post: + operationId: createChatCompletion + tags: + - Chat Completion + summary: Create a chat completion with the model.
+ requestBody: + required: true + content: + application/json: + schema: + $ref: "#/components/schemas/ChatCompletionRequest" + responses: + "200": + description: OK + content: + application/json: + schema: + $ref: "#/components/schemas/ChatCompletionResponse" + +components: + securitySchemes: + ApiKeyAuth: + type: http + scheme: "bearer" + + schemas: + + LoadModelRequest: + type: object + properties: + llama_model_path: + type: string + description: Path to your local LLM. + example: "nitro/model/zephyr-7b-beta.Q5_K_M.gguf" + ngl: + type: number + default: 100 + minimum: 0 + maximum: 100 + nullable: true + description: The number of layers to load onto the GPU for acceleration. + ctx_len: + type: number + default: 2048 + nullable: true + description: The context length for model operations. The maximum depends on the specific model used. + embedding: + default: true + type: boolean + nullable: true + description: Whether to enable embedding. + cont_batching: + type: boolean + default: false + nullable: true + description: Whether to use continuous batching. + n_parallel: + type: integer + example: 4 + nullable: true + description: The number of parallel operations. Only set this when continuous batching is enabled. Defaults to the number of Drogon threads. + pre_prompt: + type: string + default: A chat between a curious user and an artificial intelligence assistant. The assistant follows the given rules no matter what. + nullable: true + description: The prompt to use for internal configuration. + system_prompt: + type: string + default: "ASSISTANT's RULE:" + nullable: true + description: The prefix for the system prompt. + user_prompt: + type: string + default: "USER:" + nullable: true + description: The prefix for the user prompt. + ai_prompt: + type: string + default: "ASSISTANT:" + nullable: true + description: The prefix for the assistant prompt.
+ + required: + - llama_model_path + + LoadModelResponse: + type: object + properties: + message: + example: Model loaded successfully + description: A status indicator for when the model is successfully loaded. + anyOf: + - type: string + title: Success + description: The output will be "Model loaded successfully" + - type: string + title: Failed + description: The output will be "No model loaded" + code: + example: Model loaded successfully + description: A response code for Localization Support. + anyOf: + - type: string + title: Success + description: The output will be "Model loaded successfully" + - type: string + title: Failed + description: The output will be "No model loaded" + + HealthcheckRequest: + type: object + + HealthcheckResponse: + type: object + properties: + message: + example: Nitro is alive!!! + description: A status indicator for when the model is successfully loaded. + anyOf: + - type: string + title: Success + description: The output will be "Nitro is alive!!!" + - type: string + title: Failed + description: "curl: (7) Failed to connect to localhost port 3928 after 0 ms: Connection refused" + + UnloadModelRequest: + type: object + properties: + message: + example: TODO + description: TODO + + UnloadModelResponse: + type: object + properties: + message: + example: Model unloaded successfully + description: A status for successful model unloading. + anyOf: + - type: string + title: Success + description: The output will be "Model unloaded successfully" + - type: string + title: Failed + description: The output will be "No model loaded" + + StatusRequest: + type: object + properties: + message: + example: Model unloaded successfully + description: A status for successful model unloading. + + StatusResponse: + type: object + description: State of the loaded model + properties: + model_loaded: + type: boolean + example: true + nullable: true + description: A status for loading model to Nitro server. 
+ frequency_penalty: + type: number + description: Adjusts likelihood of repeating words in the output, with a higher value discouraging repetition. + default: 0 + nullable: true + maximum: 2 + minimum: 0 + grammar: + type: string + default: "" + nullable: true + description: Specifies grammar constraints to be applied, with an empty string implying no constraints. + ignore_eos: + type: boolean + default: false + nullable: true + description: Whether to ignore the end-of-sequence token and continue generating; false means generation stops at end-of-sequence. + logit_bias: + type: array + default: [] + description: An array for applying biases to certain tokens' logits to affect their selection probability. + mirostat: + type: number + default: 0 + nullable: true + description: Selects the Mirostat sampling mode for controlling output diversity (0 disables it, 1 uses Mirostat, 2 uses Mirostat 2.0). + mirostat_eta: + type: number + default: 0.1 + nullable: true + description: The Mirostat learning rate (parameter eta). + mirostat_tau: + type: number + default: 5.0 + nullable: true + description: The Mirostat target entropy (parameter tau). + model: + type: string + example: "nitro/model/zephyr-7b-beta.Q5_K_M.gguf" + nullable: true + description: This is automatically set to the model you've loaded on the Nitro server. + n_ctx: + type: number + default: 42 + nullable: true + description: Number of tokens in the model's context window. + n_keep: + type: number + default: 0 + nullable: true + description: Number of tokens to keep from the beginning of the input. + n_predict: + type: number + default: 100 + nullable: true + description: Number of tokens the model should predict, with -1 indicating no specific limit. + n_probs: + type: number + default: 0 + nullable: true + description: Controls the number of probabilities returned by the model. + penalize_nl: + type: boolean + default: true + nullable: true + description: Penalizes new lines in the output to make them less likely.
+ presence_penalty: + type: number + default: 0 + nullable: true + description: Adjusts likelihood of introducing new concepts in the output. + repeat_last_n: + type: number + default: 64 + nullable: true + description: Number of tokens to check for repetition. + repeat_penalty: + type: number + default: 1.1 + nullable: true + description: Penalizes repetitions of phrases in the last `repeat_last_n` tokens. + seed: + type: number + default: 4294967295 + nullable: true + description: Random seed for ensuring reproducibility. + stop: + type: array + default: ["hello", "USER: "] + nullable: true + description: A list of tokens that signal the model to stop generating further output. + stream: + type: boolean + default: true + nullable: true + description: Whether to stream the output as it is generated. + temp: + type: number + default: 0.7 + minimum: 0 + maximum: 1 + nullable: true + description: Controls randomness of the output. + tfs_z: + type: number + default: 1.0 + nullable: true + description: Tail-free sampling parameter z; a value of 1.0 disables it. + top_k: + type: number + default: 40 + nullable: true + description: Limits the number of highest probability tokens considered at each generation step. + top_p: + type: number + default: 0.95 + minimum: 0 + maximum: 1 + nullable: true + description: Chooses from the top tokens cumulatively making up a specified probability. + typical_p: + type: number + default: 1.0 + nullable: true + description: Controls output diversity, typically used alongside `top_p`. + + CreateEmbeddingRequest: + type: object + additionalProperties: false + properties: + context: + description: | + Input text to embed, encoded as a string or array of tokens. To embed multiple inputs in a single request, pass an array of strings or array of token arrays. + example: "hello" + + CreateEmbeddingResponse: + type: object + properties: + embedding: + type: array + example: [-0.987474262714386,0.29654932022094727,0.19979725778102875,...]
+ description: The list of embeddings generated by the model. + + ChatCompletionRequest: + type: object + properties: + messages: + type: array + description: Contains input data or prompts for the model to process. + example: | + [ + { + "content": "Hello there :wave:", + "role": "assistant" + }, + { + "content": "Can you write a long story", + "role": "user" + } + ] + stream: + type: boolean + default: true + description: Enables continuous output generation, allowing for streaming of model responses. + model: + type: string + example: "gpt-3.5-turbo" + description: Specifies the model being used for inference or processing tasks. + max_tokens: + type: number + default: 2048 + description: The maximum number of tokens the model will generate in a single response. + stop: + type: array + example: ['hello'] + description: Defines specific tokens or phrases at which the model will stop generating further output. + frequency_penalty: + type: number + default: 0 + description: Adjusts the likelihood of the model repeating words or phrases in its output. + presence_penalty: + type: number + default: 0 + description: Influences the generation of new and varied concepts in the model's output. + temperature: + type: number + default: 0.7 + minimum: 0 + maximum: 1 + description: Controls the randomness of the model's output. + + ChatCompletionResponse: + type: object + properties: + embedding: + type: string + example: Hello, I am an AI. + description: The text generated by the model.
+################################### + Error: + type: object + properties: + code: + type: string + nullable: true + message: + type: string + nullable: false + param: + type: string + nullable: true + type: + type: string + nullable: false + required: + - type + - message + - param + - code + ErrorResponse: + type: object + properties: + error: + $ref: "#/components/schemas/Error" + required: + - error + + ListModelsResponse: + type: object + properties: + object: + type: string + enum: [list] + data: + type: array + items: + $ref: "#/components/schemas/Model" + required: + - object + - data + DeleteModelResponse: + type: object + properties: + id: + type: string + deleted: + type: boolean + object: + type: string + required: + - id + - object + - deleted + + CreateCompletionRequest: + type: object + properties: + model: + description: &model_description | + ID of the model to use. You can use the [List models](/docs/api-reference/models/list) API to see all of your available models, or see our [Model overview](/docs/models/overview) for descriptions of them. + anyOf: + - type: string + - type: string + enum: + [ + "babbage-002", + "davinci-002", + "gpt-3.5-turbo-instruct", + "text-davinci-003", + "text-davinci-002", + "text-davinci-001", + "code-davinci-002", + "text-curie-001", + "text-babbage-001", + "text-ada-001", + ] + x-oaiTypeLabel: string + prompt: + description: &completions_prompt_description | + The prompt(s) to generate completions for, encoded as a string, array of strings, array of tokens, or array of token arrays. + + Note that <|endoftext|> is the document separator that the model sees during training, so if a prompt is not specified the model will generate as if from the beginning of a new document. + default: "<|endoftext|>" + nullable: true + oneOf: + - type: string + default: "" + example: "This is a test." + - type: array + items: + type: string + default: "" + example: "This is a test." 
+ - type: array + minItems: 1 + items: + type: integer + example: "[1212, 318, 257, 1332, 13]" + - type: array + minItems: 1 + items: + type: array + minItems: 1 + items: + type: integer + example: "[[1212, 318, 257, 1332, 13]]" + best_of: + type: integer + default: 1 + minimum: 0 + maximum: 20 + nullable: true + description: &completions_best_of_description | + Generates `best_of` completions server-side and returns the "best" (the one with the highest log probability per token). Results cannot be streamed. + + When used with `n`, `best_of` controls the number of candidate completions and `n` specifies how many to return – `best_of` must be greater than `n`. + + **Note:** Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for `max_tokens` and `stop`. + echo: + type: boolean + default: false + nullable: true + description: &completions_echo_description > + Echo back the prompt in addition to the completion + frequency_penalty: + type: number + default: 0 + minimum: -2 + maximum: 2 + nullable: true + description: &completions_frequency_penalty_description | + Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. + + [See more information about frequency and presence penalties.](/docs/guides/gpt/parameter-details) + logit_bias: &completions_logit_bias + type: object + x-oaiTypeLabel: map + default: null + nullable: true + additionalProperties: + type: integer + description: &completions_logit_bias_description | + Modify the likelihood of specified tokens appearing in the completion. + + Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. 
You can use this [tokenizer tool](/tokenizer?view=bpe) (which works for both GPT-2 and GPT-3) to convert text to token IDs. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. + + As an example, you can pass `{"50256": -100}` to prevent the <|endoftext|> token from being generated. + logprobs: &completions_logprobs_configuration + type: integer + minimum: 0 + maximum: 5 + default: null + nullable: true + description: &completions_logprobs_description | + Include the log probabilities on the `logprobs` most likely tokens, as well as the chosen tokens. For example, if `logprobs` is 5, the API will return a list of the 5 most likely tokens. The API will always return the `logprob` of the sampled token, so there may be up to `logprobs+1` elements in the response. + + The maximum value for `logprobs` is 5. + max_tokens: + type: integer + minimum: 0 + default: 16 + example: 16 + nullable: true + description: &completions_max_tokens_description | + The maximum number of [tokens](/tokenizer) to generate in the completion. + + The token count of your prompt plus `max_tokens` cannot exceed the model's context length. [Example Python code](https://cookbook.openai.com/examples/how_to_count_tokens_with_tiktoken) for counting tokens. + n: + type: integer + minimum: 1 + maximum: 128 + default: 1 + example: 1 + nullable: true + description: &completions_completions_description | + How many completions to generate for each prompt. + + **Note:** Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for `max_tokens` and `stop`.
+ presence_penalty: + type: number + default: 0 + minimum: -2 + maximum: 2 + nullable: true + description: &completions_presence_penalty_description | + Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. + + [See more information about frequency and presence penalties.](/docs/guides/gpt/parameter-details) + seed: &completions_seed_param + type: integer + minimum: -9223372036854775808 + maximum: 9223372036854775807 + nullable: true + description: | + If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same `seed` and parameters should return the same result. + + Determinism is not guaranteed, and you should refer to the `system_fingerprint` response parameter to monitor changes in the backend. + stop: + description: &completions_stop_description > + Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence. + default: null + nullable: true + oneOf: + - type: string + default: <|endoftext|> + example: "\n" + nullable: true + - type: array + minItems: 1 + maxItems: 4 + items: + type: string + example: '["\n"]' + stream: + description: > + Whether to stream back partial progress. If set, tokens will be sent as data-only [server-sent events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#Event_stream_format) + as they become available, with the stream terminated by a `data: [DONE]` message. [Example Python code](https://cookbook.openai.com/examples/how_to_stream_completions). + type: boolean + nullable: true + default: false + suffix: + description: The suffix that comes after a completion of inserted text. + default: null + nullable: true + type: string + example: "test." 
+ temperature: + type: number + minimum: 0 + maximum: 2 + default: 1 + example: 1 + nullable: true + description: &completions_temperature_description | + What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. + + We generally recommend altering this or `top_p` but not both. + top_p: + type: number + minimum: 0 + maximum: 1 + default: 1 + example: 1 + nullable: true + description: &completions_top_p_description | + An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. + + We generally recommend altering this or `temperature` but not both. + user: &end_user_param_configuration + type: string + example: user-1234 + description: | + A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. [Learn more](/docs/guides/safety-best-practices/end-user-ids). + required: + - model + - prompt + + CreateCompletionResponse: + type: object + description: | + Represents a completion response from the API. Note: both the streamed and non-streamed response objects share the same shape (unlike the chat endpoint). + properties: + id: + type: string + description: A unique identifier for the completion. + choices: + type: array + description: The list of completion choices the model generated for the input prompt. + items: + type: object + required: + - finish_reason + - index + - logprobs + - text + properties: + finish_reason: + type: string + description: &completion_finish_reason_description | + The reason the model stopped generating tokens. 
This will be `stop` if the model hit a natural stop point or a provided stop sequence, + `length` if the maximum number of tokens specified in the request was reached, + or `content_filter` if content was omitted due to a flag from our content filters. + enum: ["stop", "length", "content_filter"] + index: + type: integer + logprobs: + type: object + nullable: true + properties: + text_offset: + type: array + items: + type: integer + token_logprobs: + type: array + items: + type: number + tokens: + type: array + items: + type: string + top_logprobs: + type: array + items: + type: object + additionalProperties: + type: number + text: + type: string + created: + type: integer + description: The Unix timestamp (in seconds) of when the completion was created. + model: + type: string + description: The model used for completion. + system_fingerprint: + type: string + description: | + This fingerprint represents the backend configuration that the model runs with. + + Can be used in conjunction with the `seed` request parameter to understand when backend changes have been made that might impact determinism. 
+ object: + type: string + description: The object type, which is always "text_completion" + enum: [text_completion] + usage: + $ref: "#/components/schemas/CompletionUsage" + required: + - id + - object + - created + - model + - choices + x-oaiMeta: + name: The completion object + legacy: true + example: | + { + "id": "cmpl-uqkvlQyYK7bGYrRHQ0eXlWi7", + "object": "text_completion", + "created": 1589478378, + "model": "gpt-3.5-turbo", + "choices": [ + { + "text": "\n\nThis is indeed a test", + "index": 0, + "logprobs": null, + "finish_reason": "length" + } + ], + "usage": { + "prompt_tokens": 5, + "completion_tokens": 7, + "total_tokens": 12 + } + } + + ChatCompletionRequestMessageContentPart: + oneOf: + - $ref: "#/components/schemas/ChatCompletionRequestMessageContentPartText" + - $ref: "#/components/schemas/ChatCompletionRequestMessageContentPartImage" + x-oaiExpandable: true + + ChatCompletionRequestMessageContentPartImage: + type: object + title: Image content part + properties: + type: + type: string + enum: ["image_url"] + description: The type of the content part. + image_url: + type: object + properties: + url: + type: string + description: Either a URL of the image or the base64 encoded image data. + format: uri + detail: + type: string + description: Specifies the detail level of the image. + enum: ["auto", "low", "high"] + default: "auto" + required: + - url + required: + - type + - image_url + + ChatCompletionRequestMessageContentPartText: + type: object + title: Text content part + properties: + type: + type: string + enum: ["text"] + description: The type of the content part. + text: + type: string + description: The text content. 
+ required: + - type + - text + + ChatCompletionRequestMessage: + oneOf: + - $ref: "#/components/schemas/ChatCompletionRequestSystemMessage" + - $ref: "#/components/schemas/ChatCompletionRequestUserMessage" + - $ref: "#/components/schemas/ChatCompletionRequestAssistantMessage" + - $ref: "#/components/schemas/ChatCompletionRequestToolMessage" + - $ref: "#/components/schemas/ChatCompletionRequestFunctionMessage" + x-oaiExpandable: true + + ChatCompletionRequestSystemMessage: + type: object + title: System message + properties: + content: + nullable: true + description: The contents of the system message. + type: string + role: + type: string + enum: ["system"] + description: The role of the messages author, in this case `system`. + required: + - content + - role + + ChatCompletionRequestUserMessage: + type: object + title: User message + properties: + content: + nullable: true + description: | + The contents of the user message. + oneOf: + - type: string + description: The text contents of the message. + title: Text content + - type: array + description: An array of content parts with a defined type, each can be of type `text` or `image_url` when passing in images. You can pass multiple images by adding multiple `image_url` content parts. Image input is only supported when using the `gpt-4-vision-preview` model. + title: Array of content parts + items: + $ref: "#/components/schemas/ChatCompletionRequestMessageContentPart" + minItems: 1 + role: + type: string + enum: ["user"] + description: The role of the messages author, in this case `user`. + required: + - content + - role + + ChatCompletionRequestAssistantMessage: + type: object + title: Assistant message + properties: + content: + nullable: true + type: string + description: | + The contents of the assistant message. + role: + type: string + enum: ["assistant"] + description: The role of the messages author, in this case `assistant`.
+ tool_calls: + $ref: "#/components/schemas/ChatCompletionMessageToolCalls" + function_call: + type: object + deprecated: true + description: "Deprecated and replaced by `tool_calls`. The name and arguments of a function that should be called, as generated by the model." + properties: + arguments: + type: string + description: The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function. + name: + type: string + description: The name of the function to call. + required: + - arguments + - name + required: + - content + - role + + ChatCompletionRequestToolMessage: + type: object + title: Tool message + properties: + role: + type: string + enum: ["tool"] + description: The role of the messages author, in this case `tool`. + content: + nullable: true + type: string + description: The contents of the tool message. + tool_call_id: + type: string + description: Tool call that this message is responding to. + required: + - role + - content + - tool_call_id + + ChatCompletionRequestFunctionMessage: + type: object + title: Function message + deprecated: true + properties: + role: + type: string + enum: ["function"] + description: The role of the messages author, in this case `function`. + content: + type: string + nullable: true + description: The return value from the function call, to return to the model. + name: + type: string + description: The name of the function to call. + required: + - role + - name + - content + + FunctionParameters: + type: object + description: "The parameters the functions accepts, described as a JSON Schema object. 
See the [guide](/docs/guides/gpt/function-calling) for examples, and the [JSON Schema reference](https://json-schema.org/understanding-json-schema/) for documentation about the format.\n\nTo describe a function that accepts no parameters, provide the value `{\"type\": \"object\", \"properties\": {}}`." + additionalProperties: true + + ChatCompletionFunctions: + type: object + deprecated: true + properties: + description: + type: string + description: A description of what the function does, used by the model to choose when and how to call the function. + name: + type: string + description: The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64. + parameters: + $ref: "#/components/schemas/FunctionParameters" + required: + - name + - parameters + + ChatCompletionFunctionCallOption: + type: object + description: > + Specifying a particular function via `{"name": "my_function"}` forces the model to call that function. + properties: + name: + type: string + description: The name of the function to call. + required: + - name + + ChatCompletionTool: + type: object + properties: + type: + type: string + enum: ["function"] + description: The type of the tool. Currently, only `function` is supported. + function: + $ref: "#/components/schemas/FunctionObject" + required: + - type + - function + + FunctionObject: + type: object + properties: + description: + type: string + description: A description of what the function does, used by the model to choose when and how to call the function. + name: + type: string + description: The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64. + parameters: + $ref: "#/components/schemas/FunctionParameters" + required: + - name + - parameters + + ChatCompletionToolChoiceOption: + description: | + Controls which (if any) function is called by the model. 
+ `none` means the model will not call a function and instead generates a message. + `auto` means the model can pick between generating a message or calling a function. + Specifying a particular function via `{"type": "function", "function": {"name": "my_function"}}` forces the model to call that function. + + `none` is the default when no functions are present. `auto` is the default if functions are present. + oneOf: + - type: string + description: > + `none` means the model will not call a function and instead generates a message. + `auto` means the model can pick between generating a message or calling a function. + enum: [none, auto] + - $ref: "#/components/schemas/ChatCompletionNamedToolChoice" + x-oaiExpandable: true + + ChatCompletionNamedToolChoice: + type: object + description: Specifies a tool the model should use. Use to force the model to call a specific function. + properties: + type: + type: string + enum: ["function"] + description: The type of the tool. Currently, only `function` is supported. + function: + type: object + properties: + name: + type: string + description: The name of the function to call. + required: + - name + + ChatCompletionMessageToolCalls: + type: array + description: The tool calls generated by the model, such as function calls. + items: + $ref: "#/components/schemas/ChatCompletionMessageToolCall" + + ChatCompletionMessageToolCall: + type: object + properties: + # TODO: index included when streaming + id: + type: string + description: The ID of the tool call. + type: + type: string + enum: ["function"] + description: The type of the tool. Currently, only `function` is supported. + function: + type: object + description: The function that the model called. + properties: + name: + type: string + description: The name of the function to call. + arguments: + type: string + description: The arguments to call the function with, as generated by the model in JSON format.
Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function. + required: + - name + - arguments + required: + - id + - type + - function + + ChatCompletionMessageToolCallChunk: + type: object + properties: + index: + type: integer + id: + type: string + description: The ID of the tool call. + type: + type: string + enum: ["function"] + description: The type of the tool. Currently, only `function` is supported. + function: + type: object + properties: + name: + type: string + description: The name of the function to call. + arguments: + type: string + description: The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function. + required: + - index + + # Note, this isn't referenced anywhere, but is kept as a convenience to record all possible roles in one place. + ChatCompletionRole: + type: string + description: The role of the author of a message + enum: + - system + - user + - assistant + - tool + - function + + ChatCompletionResponseMessage: + type: object + description: A chat completion message generated by the model. + properties: + content: + type: string + description: The contents of the message. + nullable: true + tool_calls: + $ref: "#/components/schemas/ChatCompletionMessageToolCalls" + role: + type: string + enum: ["assistant"] + description: The role of the author of this message. + function_call: + type: object + deprecated: true + description: "Deprecated and replaced by `tool_calls`. The name and arguments of a function that should be called, as generated by the model." + properties: + arguments: + type: string + description: The arguments to call the function with, as generated by the model in JSON format. 
Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function. + name: + type: string + description: The name of the function to call. + required: + - name + - arguments + required: + - role + - content + + ChatCompletionStreamResponseDelta: + type: object + description: A chat completion delta generated by streamed model responses. + properties: + content: + type: string + description: The contents of the chunk message. + nullable: true + function_call: + deprecated: true + type: object + description: "Deprecated and replaced by `tool_calls`. The name and arguments of a function that should be called, as generated by the model." + properties: + arguments: + type: string + description: The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function. + name: + type: string + description: The name of the function to call. + tool_calls: + type: array + items: + $ref: "#/components/schemas/ChatCompletionMessageToolCallChunk" + role: + type: string + enum: ["system", "user", "assistant", "tool"] + description: The role of the author of this message. + + CreateChatCompletionRequest: + type: object + properties: + messages: + description: A list of messages comprising the conversation so far. [Example Python code](https://cookbook.openai.com/examples/how_to_format_inputs_to_chatgpt_models). + type: array + minItems: 1 + items: + $ref: "#/components/schemas/ChatCompletionRequestMessage" + model: + description: ID of the model to use. See the [model endpoint compatibility](/docs/models/model-endpoint-compatibility) table for details on which models work with the Chat API. 
+ example: "gpt-3.5-turbo" + anyOf: + - type: string + - type: string + enum: + [ + "gpt-4-1106-preview", + "gpt-4-vision-preview", + "gpt-4", + "gpt-4-0314", + "gpt-4-0613", + "gpt-4-32k", + "gpt-4-32k-0314", + "gpt-4-32k-0613", + "gpt-3.5-turbo-1106", + "gpt-3.5-turbo", + "gpt-3.5-turbo-16k", + "gpt-3.5-turbo-0301", + "gpt-3.5-turbo-0613", + "gpt-3.5-turbo-16k-0613", + ] + x-oaiTypeLabel: string + frequency_penalty: + type: number + default: 0 + minimum: -2 + maximum: 2 + nullable: true + description: *completions_frequency_penalty_description + logit_bias: + type: object + x-oaiTypeLabel: map + default: null + nullable: true + additionalProperties: + type: integer + description: | + Modify the likelihood of specified tokens appearing in the completion. + + Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. + max_tokens: + description: | + The maximum number of [tokens](/tokenizer) to generate in the chat completion. + + The total length of input tokens and generated tokens is limited by the model's context length. [Example Python code](https://cookbook.openai.com/examples/how_to_count_tokens_with_tiktoken) for counting tokens. + default: inf + type: integer + nullable: true + n: + type: integer + minimum: 1 + maximum: 128 + default: 1 + example: 1 + nullable: true + description: How many chat completion choices to generate for each input message. 
+ presence_penalty: + type: number + default: 0 + minimum: -2 + maximum: 2 + nullable: true + description: *completions_presence_penalty_description + response_format: + type: object + description: | + An object specifying the format that the model must output. + + Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees the message the model generates is valid JSON. + + **Important:** when using JSON mode, you **must** also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in increased latency and appearance of a "stuck" request. Also note that the message content may be partially cut off if `finish_reason="length"`, which indicates the generation exceeded `max_tokens` or the conversation exceeded the max context length. + properties: + type: + type: string + enum: ["text", "json_object"] + example: "json_object" + default: "text" + description: Must be one of `text` or `json_object`. + seed: + type: integer + minimum: -9223372036854775808 + maximum: 9223372036854775807 + nullable: true + description: | + This feature is in Beta. + If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same `seed` and parameters should return the same result. + Determinism is not guaranteed, and you should refer to the `system_fingerprint` response parameter to monitor changes in the backend. + x-oaiMeta: + beta: true + stop: + description: | + Up to 4 sequences where the API will stop generating further tokens. + default: null + oneOf: + - type: string + nullable: true + - type: array + minItems: 1 + maxItems: 4 + items: + type: string + stream: + description: > + If set, partial message deltas will be sent, like in ChatGPT. 
Tokens will be sent as data-only [server-sent events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#Event_stream_format) + as they become available, with the stream terminated by a `data: [DONE]` message. [Example Python code](https://cookbook.openai.com/examples/how_to_stream_completions). + type: boolean + nullable: true + default: false + temperature: + type: number + minimum: 0 + maximum: 2 + default: 1 + example: 1 + nullable: true + description: *completions_temperature_description + top_p: + type: number + minimum: 0 + maximum: 1 + default: 1 + example: 1 + nullable: true + description: *completions_top_p_description + tools: + type: array + description: > + A list of tools the model may call. Currently, only functions are supported as a tool. + Use this to provide a list of functions the model may generate JSON inputs for. + items: + $ref: "#/components/schemas/ChatCompletionTool" + tool_choice: + $ref: "#/components/schemas/ChatCompletionToolChoiceOption" + user: *end_user_param_configuration + function_call: + deprecated: true + description: | + Deprecated in favor of `tool_choice`. + + Controls which (if any) function is called by the model. + `none` means the model will not call a function and instead generates a message. + `auto` means the model can pick between generating a message or calling a function. + Specifying a particular function via `{"name": "my_function"}` forces the model to call that function. + + `none` is the default when no functions are present. `auto` is the default if functions are present. + oneOf: + - type: string + description: > + `none` means the model will not call a function and instead generates a message. + `auto` means the model can pick between generating a message or calling a function.
+ enum: [none, auto] + - $ref: "#/components/schemas/ChatCompletionFunctionCallOption" + x-oaiExpandable: true + functions: + deprecated: true + description: | + Deprecated in favor of `tools`. + + A list of functions the model may generate JSON inputs for. + type: array + minItems: 1 + maxItems: 128 + items: + $ref: "#/components/schemas/ChatCompletionFunctions" + + required: + - model + - messages + + CreateChatCompletionResponse: + type: object + description: Represents a chat completion response returned by model, based on the provided input. + properties: + id: + type: string + description: A unique identifier for the chat completion. + choices: + type: array + description: A list of chat completion choices. Can be more than one if `n` is greater than 1. + items: + type: object + required: + - finish_reason + - index + - message + properties: + finish_reason: + type: string + description: &chat_completion_finish_reason_description | + The reason the model stopped generating tokens. This will be `stop` if the model hit a natural stop point or a provided stop sequence, + `length` if the maximum number of tokens specified in the request was reached, + `content_filter` if content was omitted due to a flag from our content filters, + `tool_calls` if the model called a tool, or `function_call` (deprecated) if the model called a function. + enum: + [ + "stop", + "length", + "tool_calls", + "content_filter", + "function_call", + ] + index: + type: integer + description: The index of the choice in the list of choices. + message: + $ref: "#/components/schemas/ChatCompletionResponseMessage" + created: + type: integer + description: The Unix timestamp (in seconds) of when the chat completion was created. + model: + type: string + description: The model used for the chat completion. + system_fingerprint: + type: string + description: | + This fingerprint represents the backend configuration that the model runs with. 
+ + Can be used in conjunction with the `seed` request parameter to understand when backend changes have been made that might impact determinism. + object: + type: string + description: The object type, which is always `chat.completion`. + enum: [chat.completion] + usage: + $ref: "#/components/schemas/CompletionUsage" + required: + - choices + - created + - id + - model + - object + x-oaiMeta: + name: The chat completion object + group: chat + example: *chat_completion_example + + CreateChatCompletionFunctionResponse: + type: object + description: Represents a chat completion response returned by model, based on the provided input. + properties: + id: + type: string + description: A unique identifier for the chat completion. + choices: + type: array + description: A list of chat completion choices. Can be more than one if `n` is greater than 1. + items: + type: object + required: + - finish_reason + - index + - message + properties: + finish_reason: + type: string + description: + &chat_completion_function_finish_reason_description | + The reason the model stopped generating tokens. This will be `stop` if the model hit a natural stop point or a provided stop sequence, `length` if the maximum number of tokens specified in the request was reached, `content_filter` if content was omitted due to a flag from our content filters, or `function_call` if the model called a function. + enum: ["stop", "length", "function_call", "content_filter"] + index: + type: integer + description: The index of the choice in the list of choices. + message: + $ref: "#/components/schemas/ChatCompletionResponseMessage" + created: + type: integer + description: The Unix timestamp (in seconds) of when the chat completion was created. + model: + type: string + description: The model used for the chat completion. + system_fingerprint: + type: string + description: | + This fingerprint represents the backend configuration that the model runs with. 
+ + Can be used in conjunction with the `seed` request parameter to understand when backend changes have been made that might impact determinism. + object: + type: string + description: The object type, which is always `chat.completion`. + enum: [chat.completion] + usage: + $ref: "#/components/schemas/CompletionUsage" + required: + - choices + - created + - id + - model + - object + x-oaiMeta: + name: The chat completion object + group: chat + example: *chat_completion_function_example + + ListPaginatedFineTuningJobsResponse: + type: object + properties: + data: + type: array + items: + $ref: "#/components/schemas/FineTuningJob" + has_more: + type: boolean + object: + type: string + enum: [list] + required: + - object + - data + - has_more + + CreateChatCompletionStreamResponse: + type: object + description: Represents a streamed chunk of a chat completion response returned by model, based on the provided input. + properties: + id: + type: string + description: A unique identifier for the chat completion. Each chunk has the same ID. + choices: + type: array + description: A list of chat completion choices. Can be more than one if `n` is greater than 1. + items: + type: object + required: + - delta + - finish_reason + - index + properties: + delta: + $ref: "#/components/schemas/ChatCompletionStreamResponseDelta" + finish_reason: + type: string + description: *chat_completion_finish_reason_description + enum: + [ + "stop", + "length", + "tool_calls", + "content_filter", + "function_call", + ] + nullable: true + index: + type: integer + description: The index of the choice in the list of choices. + created: + type: integer + description: The Unix timestamp (in seconds) of when the chat completion was created. Each chunk has the same timestamp. + model: + type: string + description: The model to generate the completion. + system_fingerprint: + type: string + description: | + This fingerprint represents the backend configuration that the model runs with. 
+ Can be used in conjunction with the `seed` request parameter to understand when backend changes have been made that might impact determinism. + object: + type: string + description: The object type, which is always `chat.completion.chunk`. + enum: [chat.completion.chunk] + required: + - choices + - created + - id + - model + - object + x-oaiMeta: + name: The chat completion chunk object + group: chat + example: *chat_completion_chunk_example + + CreateChatCompletionImageResponse: + type: object + description: Represents a streamed chunk of a chat completion response returned by model, based on the provided input. + x-oaiMeta: + name: The chat completion chunk object + group: chat + example: *chat_completion_image_example + + CreateEditRequest: + type: object + properties: + instruction: + description: The instruction that tells the model how to edit the prompt. + type: string + example: "Fix the spelling mistakes." + model: + description: ID of the model to use. You can use the `text-davinci-edit-001` or `code-davinci-edit-001` model with this endpoint. + example: "text-davinci-edit-001" + anyOf: + - type: string + - type: string + enum: ["text-davinci-edit-001", "code-davinci-edit-001"] + x-oaiTypeLabel: string + input: + description: The input text to use as a starting point for the edit. + type: string + default: "" + nullable: true + example: "What day of the wek is it?" + n: + type: integer + minimum: 1 + maximum: 20 + default: 1 + example: 1 + nullable: true + description: How many edits to generate for the input and instruction. 
+ temperature: + type: number + minimum: 0 + maximum: 2 + default: 1 + example: 1 + nullable: true + description: *completions_temperature_description + top_p: + type: number + minimum: 0 + maximum: 1 + default: 1 + example: 1 + nullable: true + description: *completions_top_p_description + required: + - model + - instruction + + CreateEditResponse: + type: object + title: Edit + deprecated: true + properties: + choices: + type: array + description: A list of edit choices. Can be more than one if `n` is greater than 1. + items: + type: object + required: + - text + - index + - finish_reason + properties: + finish_reason: + type: string + description: *completion_finish_reason_description + enum: ["stop", "length"] + index: + type: integer + description: The index of the choice in the list of choices. + text: + type: string + description: The edited result. + object: + type: string + description: The object type, which is always `edit`. + enum: [edit] + created: + type: integer + description: The Unix timestamp (in seconds) of when the edit was created. + usage: + $ref: "#/components/schemas/CompletionUsage" + required: + - object + - created + - choices + - usage + x-oaiMeta: + name: The edit object + example: *edit_example + + CreateImageRequest: + type: object + properties: + prompt: + description: A text description of the desired image(s). The maximum length is 1000 characters for `dall-e-2` and 4000 characters for `dall-e-3`. + type: string + example: "A cute baby sea otter" + model: + anyOf: + - type: string + - type: string + enum: ["dall-e-2", "dall-e-3"] + x-oaiTypeLabel: string + default: "dall-e-2" + example: "dall-e-3" + nullable: true + description: The model to use for image generation. + n: &images_n + type: integer + minimum: 1 + maximum: 10 + default: 1 + example: 1 + nullable: true + description: The number of images to generate. Must be between 1 and 10. For `dall-e-3`, only `n=1` is supported. 
+ quality: + type: string + enum: ["standard", "hd"] + default: "standard" + example: "standard" + description: The quality of the image that will be generated. `hd` creates images with finer details and greater consistency across the image. This param is only supported for `dall-e-3`. + response_format: &images_response_format + type: string + enum: ["url", "b64_json"] + default: "url" + example: "url" + nullable: true + description: The format in which the generated images are returned. Must be one of `url` or `b64_json`. + size: &images_size + type: string + enum: ["256x256", "512x512", "1024x1024", "1792x1024", "1024x1792"] + default: "1024x1024" + example: "1024x1024" + nullable: true + description: The size of the generated images. Must be one of `256x256`, `512x512`, or `1024x1024` for `dall-e-2`. Must be one of `1024x1024`, `1792x1024`, or `1024x1792` for `dall-e-3` models. + style: + type: string + enum: ["vivid", "natural"] + default: "vivid" + example: "vivid" + nullable: true + description: The style of the generated images. Must be one of `vivid` or `natural`. Vivid causes the model to lean towards generating hyper-real and dramatic images. Natural causes the model to produce more natural, less hyper-real looking images. This param is only supported for `dall-e-3`. + user: *end_user_param_configuration + required: + - prompt + + ImagesResponse: + properties: + created: + type: integer + data: + type: array + items: + $ref: "#/components/schemas/Image" + required: + - created + - data + + Image: + type: object + description: Represents the url or the content of an image generated by the OpenAI API. + properties: + b64_json: + type: string + description: The base64-encoded JSON of the generated image, if `response_format` is `b64_json`. + url: + type: string + description: The URL of the generated image, if `response_format` is `url` (default). 
+ revised_prompt: + type: string + description: The prompt that was used to generate the image, if there was any revision to the prompt. + x-oaiMeta: + name: The image object + example: | + { + "url": "...", + "revised_prompt": "..." + } + + CreateImageEditRequest: + type: object + properties: + image: + description: The image to edit. Must be a valid PNG file, less than 4MB, and square. If mask is not provided, image must have transparency, which will be used as the mask. + type: string + format: binary + prompt: + description: A text description of the desired image(s). The maximum length is 1000 characters. + type: string + example: "A cute baby sea otter wearing a beret" + mask: + description: An additional image whose fully transparent areas (e.g. where alpha is zero) indicate where `image` should be edited. Must be a valid PNG file, less than 4MB, and have the same dimensions as `image`. + type: string + format: binary + model: + anyOf: + - type: string + - type: string + enum: ["dall-e-2"] + x-oaiTypeLabel: string + default: "dall-e-2" + example: "dall-e-2" + nullable: true + description: The model to use for image generation. Only `dall-e-2` is supported at this time. + n: + type: integer + minimum: 1 + maximum: 10 + default: 1 + example: 1 + nullable: true + description: The number of images to generate. Must be between 1 and 10. + size: &dalle2_images_size + type: string + enum: ["256x256", "512x512", "1024x1024"] + default: "1024x1024" + example: "1024x1024" + nullable: true + description: The size of the generated images. Must be one of `256x256`, `512x512`, or `1024x1024`. + response_format: *images_response_format + user: *end_user_param_configuration + required: + - prompt + - image + + CreateImageVariationRequest: + type: object + properties: + image: + description: The image to use as the basis for the variation(s). Must be a valid PNG file, less than 4MB, and square. 
+ type: string + format: binary + model: + anyOf: + - type: string + - type: string + enum: ["dall-e-2"] + x-oaiTypeLabel: string + default: "dall-e-2" + example: "dall-e-2" + nullable: true + description: The model to use for image generation. Only `dall-e-2` is supported at this time. + n: *images_n + response_format: *images_response_format + size: *dalle2_images_size + user: *end_user_param_configuration + required: + - image + + CreateModerationRequest: + type: object + properties: + input: + description: The input text to classify + oneOf: + - type: string + default: "" + example: "I want to kill them." + - type: array + items: + type: string + default: "" + example: "I want to kill them." + model: + description: | + Two content moderation models are available: `text-moderation-stable` and `text-moderation-latest`. + + The default is `text-moderation-latest`, which will be automatically upgraded over time. This ensures you are always using our most accurate model. If you use `text-moderation-stable`, we will provide advance notice before updating the model. Accuracy of `text-moderation-stable` may be slightly lower than for `text-moderation-latest`. + nullable: false + default: "text-moderation-latest" + example: "text-moderation-stable" + anyOf: + - type: string + - type: string + enum: ["text-moderation-latest", "text-moderation-stable"] + x-oaiTypeLabel: string + required: + - input + + CreateModerationResponse: + type: object + description: Represents a policy compliance report by OpenAI's content moderation model against a given input. + properties: + id: + type: string + description: The unique identifier for the moderation request. + model: + type: string + description: The model used to generate the moderation results. + results: + type: array + description: A list of moderation objects. + items: + type: object + properties: + flagged: + type: boolean + description: Whether the content violates [OpenAI's usage policies](/policies/usage-policies).
+ categories: + type: object + description: A list of the categories, and whether they are flagged or not. + properties: + hate: + type: boolean + description: Content that expresses, incites, or promotes hate based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste. Hateful content aimed at non-protected groups (e.g., chess players) is harassment. + hate/threatening: + type: boolean + description: Hateful content that also includes violence or serious harm towards the targeted group based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste. + harassment: + type: boolean + description: Content that expresses, incites, or promotes harassing language towards any target. + harassment/threatening: + type: boolean + description: Harassment content that also includes violence or serious harm towards any target. + self-harm: + type: boolean + description: Content that promotes, encourages, or depicts acts of self-harm, such as suicide, cutting, and eating disorders. + self-harm/intent: + type: boolean + description: Content where the speaker expresses that they are engaging or intend to engage in acts of self-harm, such as suicide, cutting, and eating disorders. + self-harm/instructions: + type: boolean + description: Content that encourages performing acts of self-harm, such as suicide, cutting, and eating disorders, or that gives instructions or advice on how to commit such acts. + sexual: + type: boolean + description: Content meant to arouse sexual excitement, such as the description of sexual activity, or that promotes sexual services (excluding sex education and wellness). + sexual/minors: + type: boolean + description: Sexual content that includes an individual who is under 18 years old. + violence: + type: boolean + description: Content that depicts death, violence, or physical injury.
+ violence/graphic: + type: boolean + description: Content that depicts death, violence, or physical injury in graphic detail. + required: + - hate + - hate/threatening + - harassment + - harassment/threatening + - self-harm + - self-harm/intent + - self-harm/instructions + - sexual + - sexual/minors + - violence + - violence/graphic + category_scores: + type: object + description: A list of the categories along with their scores as predicted by model. + properties: + hate: + type: number + description: The score for the category 'hate'. + hate/threatening: + type: number + description: The score for the category 'hate/threatening'. + harassment: + type: number + description: The score for the category 'harassment'. + harassment/threatening: + type: number + description: The score for the category 'harassment/threatening'. + self-harm: + type: number + description: The score for the category 'self-harm'. + self-harm/intent: + type: number + description: The score for the category 'self-harm/intent'. + self-harm/instructions: + type: number + description: The score for the category 'self-harm/instructions'. + sexual: + type: number + description: The score for the category 'sexual'. + sexual/minors: + type: number + description: The score for the category 'sexual/minors'. + violence: + type: number + description: The score for the category 'violence'. + violence/graphic: + type: number + description: The score for the category 'violence/graphic'. 
+ required: + - hate + - hate/threatening + - harassment + - harassment/threatening + - self-harm + - self-harm/intent + - self-harm/instructions + - sexual + - sexual/minors + - violence + - violence/graphic + required: + - flagged + - categories + - category_scores + required: + - id + - model + - results + x-oaiMeta: + name: The moderation object + example: *moderation_example + + ListFilesResponse: + type: object + properties: + data: + type: array + items: + $ref: "#/components/schemas/OpenAIFile" + object: + type: string + enum: [list] + required: + - object + - data + + CreateFileRequest: + type: object + additionalProperties: false + properties: + file: + description: | + The File object (not file name) to be uploaded. + type: string + format: binary + purpose: + description: | + The intended purpose of the uploaded file. + + Use "fine-tune" for [Fine-tuning](/docs/api-reference/fine-tuning) and "assistants" for [Assistants](/docs/api-reference/assistants) and [Messages](/docs/api-reference/messages). This allows us to validate the format of the uploaded file is correct for fine-tuning. + type: string + enum: ["fine-tune", "assistants"] + required: + - file + - purpose + + DeleteFileResponse: + type: object + properties: + id: + type: string + object: + type: string + enum: [file] + deleted: + type: boolean + required: + - id + - object + - deleted + + CreateFineTuningJobRequest: + type: object + properties: + model: + description: | + The name of the model to fine-tune. You can select one of the + [supported models](/docs/guides/fine-tuning/what-models-can-be-fine-tuned). + example: "gpt-3.5-turbo" + anyOf: + - type: string + - type: string + enum: ["babbage-002", "davinci-002", "gpt-3.5-turbo"] + x-oaiTypeLabel: string + training_file: + description: | + The ID of an uploaded file that contains training data. + + See [upload file](/docs/api-reference/files/upload) for how to upload a file. + + Your dataset must be formatted as a JSONL file. 
Additionally, you must upload your file with the purpose `fine-tune`. + + See the [fine-tuning guide](/docs/guides/fine-tuning) for more details. + type: string + example: "file-abc123" + hyperparameters: + type: object + description: The hyperparameters used for the fine-tuning job. + properties: + batch_size: + description: | + Number of examples in each batch. A larger batch size means that model parameters + are updated less frequently, but with lower variance. + oneOf: + - type: string + enum: [auto] + - type: integer + minimum: 1 + maximum: 256 + default: auto + learning_rate_multiplier: + description: | + Scaling factor for the learning rate. A smaller learning rate may be useful to avoid + overfitting. + oneOf: + - type: string + enum: [auto] + - type: number + minimum: 0 + exclusiveMinimum: true + default: auto + n_epochs: + description: | + The number of epochs to train the model for. An epoch refers to one full cycle + through the training dataset. + oneOf: + - type: string + enum: [auto] + - type: integer + minimum: 1 + maximum: 50 + default: auto + suffix: + description: | + A string of up to 18 characters that will be added to your fine-tuned model name. + + For example, a `suffix` of "custom-model-name" would produce a model name like `ft:gpt-3.5-turbo:openai:custom-model-name:7p4lURel`. + type: string + minLength: 1 + maxLength: 40 + default: null + nullable: true + validation_file: + description: | + The ID of an uploaded file that contains validation data. + + If you provide this file, the data is used to generate validation + metrics periodically during fine-tuning. These metrics can be viewed in + the fine-tuning results file. + The same data should not be present in both train and validation files. + + Your dataset must be formatted as a JSONL file. You must upload your file with the purpose `fine-tune`. + + See the [fine-tuning guide](/docs/guides/fine-tuning) for more details. 
+ type: string + nullable: true + example: "file-abc123" + required: + - model + - training_file + + ListFineTuningJobEventsResponse: + type: object + properties: + data: + type: array + items: + $ref: "#/components/schemas/FineTuningJobEvent" + object: + type: string + enum: [list] + required: + - object + - data + + CreateFineTuneRequest: + type: object + properties: + training_file: + description: | + The ID of an uploaded file that contains training data. + + See [upload file](/docs/api-reference/files/upload) for how to upload a file. + + Your dataset must be formatted as a JSONL file, where each training + example is a JSON object with the keys "prompt" and "completion". + Additionally, you must upload your file with the purpose `fine-tune`. + + See the [fine-tuning guide](/docs/guides/legacy-fine-tuning/creating-training-data) for more details. + type: string + example: "file-abc123" + batch_size: + description: | + The batch size to use for training. The batch size is the number of + training examples used to train a single forward and backward pass. + + By default, the batch size will be dynamically configured to be + ~0.2% of the number of examples in the training set, capped at 256 - + in general, we've found that larger batch sizes tend to work better + for larger datasets. + default: null + type: integer + nullable: true + classification_betas: + description: | + If this is provided, we calculate F-beta scores at the specified + beta values. The F-beta score is a generalization of F-1 score. + This is only used for binary classification. + + With a beta of 1 (i.e. the F-1 score), precision and recall are + given the same weight. A larger beta score puts more weight on + recall and less on precision. A smaller beta score puts more weight + on precision and less on recall. 
+ type: array + items: + type: number + example: [0.6, 1, 1.5, 2] + default: null + nullable: true + classification_n_classes: + description: | + The number of classes in a classification task. + + This parameter is required for multiclass classification. + type: integer + default: null + nullable: true + classification_positive_class: + description: | + The positive class in binary classification. + + This parameter is needed to generate precision, recall, and F1 + metrics when doing binary classification. + type: string + default: null + nullable: true + compute_classification_metrics: + description: | + If set, we calculate classification-specific metrics such as accuracy + and F-1 score using the validation set at the end of every epoch. + These metrics can be viewed in the [results file](/docs/guides/legacy-fine-tuning/analyzing-your-fine-tuned-model). + + In order to compute classification metrics, you must provide a + `validation_file`. Additionally, you must + specify `classification_n_classes` for multiclass classification or + `classification_positive_class` for binary classification. + type: boolean + default: false + nullable: true + hyperparameters: + type: object + description: The hyperparameters used for the fine-tuning job. + properties: + n_epochs: + description: | + The number of epochs to train the model for. An epoch refers to one + full cycle through the training dataset. + oneOf: + - type: string + enum: [auto] + - type: integer + minimum: 1 + maximum: 50 + default: auto + learning_rate_multiplier: + description: | + The learning rate multiplier to use for training. + The fine-tuning learning rate is the original learning rate used for + pretraining multiplied by this value. + + By default, the learning rate multiplier is 0.05, 0.1, or 0.2 + depending on the final `batch_size` (larger learning rates tend to + perform better with larger batch sizes).
We recommend experimenting + with values in the range 0.02 to 0.2 to see what produces the best + results. + default: null + type: number + nullable: true + model: + description: | + The name of the base model to fine-tune. You can select one of "ada", + "babbage", "curie", "davinci", or a fine-tuned model created after 2022-04-21 and before 2023-08-22. + To learn more about these models, see the + [Models](/docs/models) documentation. + default: "curie" + example: "curie" + nullable: true + anyOf: + - type: string + - type: string + enum: ["ada", "babbage", "curie", "davinci"] + x-oaiTypeLabel: string + prompt_loss_weight: + description: | + The weight to use for loss on the prompt tokens. This controls how + much the model tries to learn to generate the prompt (as compared + to the completion which always has a weight of 1.0), and can add + a stabilizing effect to training when completions are short. + + If prompts are extremely long (relative to completions), it may make + sense to reduce this weight so as to avoid over-prioritizing + learning the prompt. + default: 0.01 + type: number + nullable: true + suffix: + description: | + A string of up to 40 characters that will be added to your fine-tuned model name. + + For example, a `suffix` of "custom-model-name" would produce a model name like `ada:ft-your-org:custom-model-name-2022-02-15-04-21-04`. + type: string + minLength: 1 + maxLength: 40 + default: null + nullable: true + validation_file: + description: | + The ID of an uploaded file that contains validation data. + + If you provide this file, the data is used to generate validation + metrics periodically during fine-tuning. These metrics can be viewed in + the [fine-tuning results file](/docs/guides/legacy-fine-tuning/analyzing-your-fine-tuned-model). + Your train and validation data should be mutually exclusive. + + Your dataset must be formatted as a JSONL file, where each validation + example is a JSON object with the keys "prompt" and "completion". 
+ Additionally, you must upload your file with the purpose `fine-tune`. + + See the [fine-tuning guide](/docs/guides/legacy-fine-tuning/creating-training-data) for more details. + type: string + nullable: true + example: "file-abc123" + required: + - training_file + + ListFineTunesResponse: + type: object + properties: + data: + type: array + items: + $ref: "#/components/schemas/FineTune" + object: + type: string + enum: [list] + required: + - object + - data + + ListFineTuneEventsResponse: + type: object + properties: + data: + type: array + items: + $ref: "#/components/schemas/FineTuneEvent" + object: + type: string + enum: [list] + required: + - object + - data + + CreateTranscriptionRequest: + type: object + additionalProperties: false + properties: + file: + description: | + The audio file object (not file name) to transcribe, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm. + type: string + x-oaiTypeLabel: file + format: binary + model: + description: | + ID of the model to use. Only `whisper-1` is currently available. + example: whisper-1 + anyOf: + - type: string + - type: string + enum: ["whisper-1"] + x-oaiTypeLabel: string + language: + description: | + The language of the input audio. Supplying the input language in [ISO-639-1](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) format will improve accuracy and latency. + type: string + prompt: + description: | + An optional text to guide the model's style or continue a previous audio segment. The [prompt](/docs/guides/speech-to-text/prompting) should match the audio language. + type: string + response_format: + description: | + The format of the transcript output, in one of these options: `json`, `text`, `srt`, `verbose_json`, or `vtt`. + type: string + enum: + - json + - text + - srt + - verbose_json + - vtt + default: json + temperature: + description: | + The sampling temperature, between 0 and 1. 
Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use [log probability](https://en.wikipedia.org/wiki/Log_probability) to automatically increase the temperature until certain thresholds are hit. + type: number + default: 0 + required: + - file + - model + + # Note: This does not currently support the non-default response format types. + CreateTranscriptionResponse: + type: object + properties: + text: + type: string + required: + - text + + CreateTranslationRequest: + type: object + additionalProperties: false + properties: + file: + description: | + The audio file object (not file name) to translate, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm. + type: string + x-oaiTypeLabel: file + format: binary + model: + description: | + ID of the model to use. Only `whisper-1` is currently available. + example: whisper-1 + anyOf: + - type: string + - type: string + enum: ["whisper-1"] + x-oaiTypeLabel: string + prompt: + description: | + An optional text to guide the model's style or continue a previous audio segment. The [prompt](/docs/guides/speech-to-text/prompting) should be in English. + type: string + response_format: + description: | + The format of the transcript output, in one of these options: `json`, `text`, `srt`, `verbose_json`, or `vtt`. + type: string + default: json + temperature: + description: | + The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use [log probability](https://en.wikipedia.org/wiki/Log_probability) to automatically increase the temperature until certain thresholds are hit. + type: number + default: 0 + required: + - file + - model + + # Note: This does not currently support the non-default response format types.
+ CreateTranslationResponse: + type: object + properties: + text: + type: string + required: + - text + + CreateSpeechRequest: + type: object + additionalProperties: false + properties: + model: + description: | + One of the available [TTS models](/docs/models/tts): `tts-1` or `tts-1-hd` + anyOf: + - type: string + - type: string + enum: ["tts-1", "tts-1-hd"] + x-oaiTypeLabel: string + input: + type: string + description: The text to generate audio for. The maximum length is 4096 characters. + maxLength: 4096 + voice: + description: The voice to use when generating the audio. Supported voices are `alloy`, `echo`, `fable`, `onyx`, `nova`, and `shimmer`. + type: string + enum: ["alloy", "echo", "fable", "onyx", "nova", "shimmer"] + response_format: + description: "The format of the generated audio. Supported formats are `mp3`, `opus`, `aac`, and `flac`." + default: "mp3" + type: string + enum: ["mp3", "opus", "aac", "flac"] + speed: + description: "The speed of the generated audio. Select a value from `0.25` to `4.0`. `1.0` is the default." + type: number + default: 1.0 + minimum: 0.25 + maximum: 4.0 + required: + - model + - input + - voice + + Model: + title: Model + description: Describes an OpenAI model offering that can be used with the API. + properties: + id: + type: string + description: The model identifier, which can be referenced in the API endpoints. + created: + type: integer + description: The Unix timestamp (in seconds) when the model was created. + object: + type: string + description: The object type, which is always "model". + enum: [model] + owned_by: + type: string + description: The organization that owns the model. + required: + - id + - object + - created + - owned_by + x-oaiMeta: + name: The model object + example: *retrieve_model_response + + OpenAIFile: + title: OpenAIFile + description: The `File` object represents a document that has been uploaded to OpenAI.
+ properties: + id: + type: string + description: The file identifier, which can be referenced in the API endpoints. + bytes: + type: integer + description: The size of the file, in bytes. + created_at: + type: integer + description: The Unix timestamp (in seconds) for when the file was created. + filename: + type: string + description: The name of the file. + object: + type: string + description: The object type, which is always `file`. + enum: ["file"] + purpose: + type: string + description: The intended purpose of the file. Supported values are `fine-tune`, `fine-tune-results`, `assistants`, and `assistants_output`. + enum: + [ + "fine-tune", + "fine-tune-results", + "assistants", + "assistants_output", + ] + status: + type: string + deprecated: true + description: Deprecated. The current status of the file, which can be either `uploaded`, `processed`, or `error`. + enum: ["uploaded", "processed", "error"] + status_details: + type: string + deprecated: true + description: Deprecated. For details on why a fine-tuning training file failed validation, see the `error` field on `fine_tuning.job`. + required: + - id + - object + - bytes + - created_at + - filename + - purpose + - status + x-oaiMeta: + name: The File object + example: | + { + "id": "file-BK7bzQj3FfZFXr7DbL6xJwfo", + "object": "file", + "bytes": 120000, + "created_at": 1677610602, + "filename": "salesOverview.pdf", + "purpose": "assistants" + } + Embedding: + type: object + description: | + Represents an embedding vector returned by the embedding endpoint. + properties: + index: + type: integer + description: The index of the embedding in the list of embeddings. + embedding: + type: array + description: | + The embedding vector, which is a list of floats. The length of the vector depends on the model as listed in the [embedding guide](/docs/guides/embeddings). + items: + type: number + object: + type: string + description: The object type, which is always "embedding".
+ enum: [embedding] + required: + - index + - object + - embedding + x-oaiMeta: + name: The embedding object + example: | + { + "object": "embedding", + "embedding": [ + 0.0023064255, + -0.009327292, + .... (1536 floats total for ada-002) + -0.0028842222, + ], + "index": 0 + } + + FineTuningJob: + type: object + title: FineTuningJob + description: | + The `fine_tuning.job` object represents a fine-tuning job that has been created through the API. + properties: + id: + type: string + description: The object identifier, which can be referenced in the API endpoints. + created_at: + type: integer + description: The Unix timestamp (in seconds) for when the fine-tuning job was created. + error: + type: object + nullable: true + description: For fine-tuning jobs that have `failed`, this will contain more information on the cause of the failure. + properties: + code: + type: string + description: A machine-readable error code. + message: + type: string + description: A human-readable error message. + param: + type: string + description: The parameter that was invalid, usually `training_file` or `validation_file`. This field will be null if the failure was not parameter-specific. + nullable: true + required: + - code + - message + - param + fine_tuned_model: + type: string + nullable: true + description: The name of the fine-tuned model that is being created. The value will be null if the fine-tuning job is still running. + finished_at: + type: integer + nullable: true + description: The Unix timestamp (in seconds) for when the fine-tuning job was finished. The value will be null if the fine-tuning job is still running. + hyperparameters: + type: object + description: The hyperparameters used for the fine-tuning job. See the [fine-tuning guide](/docs/guides/fine-tuning) for more details. + properties: + n_epochs: + oneOf: + - type: string + enum: [auto] + - type: integer + minimum: 1 + maximum: 50 + default: auto + description: + The number of epochs to train the model for. 
An epoch refers to one full cycle through the training dataset. + + "auto" decides the optimal number of epochs based on the size of the dataset. If setting the number manually, we support any number between 1 and 50 epochs. + required: + - n_epochs + model: + type: string + description: The base model that is being fine-tuned. + object: + type: string + description: The object type, which is always "fine_tuning.job". + enum: [fine_tuning.job] + organization_id: + type: string + description: The organization that owns the fine-tuning job. + result_files: + type: array + description: The compiled results file ID(s) for the fine-tuning job. You can retrieve the results with the [Files API](/docs/api-reference/files/retrieve-contents). + items: + type: string + example: file-abc123 + status: + type: string + description: The current status of the fine-tuning job, which can be either `validating_files`, `queued`, `running`, `succeeded`, `failed`, or `cancelled`. + enum: + [ + "validating_files", + "queued", + "running", + "succeeded", + "failed", + "cancelled", + ] + trained_tokens: + type: integer + nullable: true + description: The total number of billable tokens processed by this fine-tuning job. The value will be null if the fine-tuning job is still running. + training_file: + type: string + description: The file ID used for training. You can retrieve the training data with the [Files API](/docs/api-reference/files/retrieve-contents). + validation_file: + type: string + nullable: true + description: The file ID used for validation. You can retrieve the validation results with the [Files API](/docs/api-reference/files/retrieve-contents). 
+ required: + - created_at + - error + - finished_at + - fine_tuned_model + - hyperparameters + - id + - model + - object + - organization_id + - result_files + - status + - trained_tokens + - training_file + - validation_file + x-oaiMeta: + name: The fine-tuning job object + example: *fine_tuning_example + + FineTuningJobEvent: + type: object + description: Fine-tuning job event object + properties: + id: + type: string + created_at: + type: integer + level: + type: string + enum: ["info", "warn", "error"] + message: + type: string + object: + type: string + enum: [fine_tuning.job.event] + required: + - id + - object + - created_at + - level + - message + x-oaiMeta: + name: The fine-tuning job event object + example: | + { + "object": "fine_tuning.job.event", + "id": "ftevent-abc123", + "created_at": 1677610602, + "level": "info", + "message": "Created fine-tuning job" + } + + FineTune: + type: object + deprecated: true + description: | + The `FineTune` object represents a legacy fine-tune job that has been created through the API. + properties: + id: + type: string + description: The object identifier, which can be referenced in the API endpoints. + created_at: + type: integer + description: The Unix timestamp (in seconds) for when the fine-tuning job was created. + events: + type: array + description: The list of events that have been observed in the lifecycle of the FineTune job. + items: + $ref: "#/components/schemas/FineTuneEvent" + fine_tuned_model: + type: string + nullable: true + description: The name of the fine-tuned model that is being created. + hyperparams: + type: object + description: The hyperparameters used for the fine-tuning job. See the [fine-tuning guide](/docs/guides/legacy-fine-tuning/hyperparameters) for more details. + properties: + batch_size: + type: integer + description: | + The batch size to use for training. The batch size is the number of + training examples used to train a single forward and backward pass.
+ classification_n_classes: + type: integer + description: | + The number of classes to use for computing classification metrics. + classification_positive_class: + type: string + description: | + The positive class to use for computing classification metrics. + compute_classification_metrics: + type: boolean + description: | + The classification metrics to compute using the validation dataset at the end of every epoch. + learning_rate_multiplier: + type: number + description: | + The learning rate multiplier to use for training. + n_epochs: + type: integer + description: | + The number of epochs to train the model for. An epoch refers to one + full cycle through the training dataset. + prompt_loss_weight: + type: number + description: | + The weight to use for loss on the prompt tokens. + required: + - batch_size + - learning_rate_multiplier + - n_epochs + - prompt_loss_weight + model: + type: string + description: The base model that is being fine-tuned. + object: + type: string + description: The object type, which is always "fine-tune". + enum: [fine-tune] + organization_id: + type: string + description: The organization that owns the fine-tuning job. + result_files: + type: array + description: The compiled results files for the fine-tuning job. + items: + $ref: "#/components/schemas/OpenAIFile" + status: + type: string + description: The current status of the fine-tuning job, which can be either `created`, `running`, `succeeded`, `failed`, or `cancelled`. + training_files: + type: array + description: The list of files used for training. + items: + $ref: "#/components/schemas/OpenAIFile" + updated_at: + type: integer + description: The Unix timestamp (in seconds) for when the fine-tuning job was last updated. + validation_files: + type: array + description: The list of files used for validation. 
+ items: + $ref: "#/components/schemas/OpenAIFile" + required: + - created_at + - fine_tuned_model + - hyperparams + - id + - model + - object + - organization_id + - result_files + - status + - training_files + - updated_at + - validation_files + x-oaiMeta: + name: The fine-tune object + example: *fine_tune_example + + FineTuneEvent: + type: object + deprecated: true + description: Fine-tune event object + properties: + created_at: + type: integer + level: + type: string + message: + type: string + object: + type: string + enum: [fine-tune-event] + required: + - object + - created_at + - level + - message + x-oaiMeta: + name: The fine-tune event object + example: | + { + "object": "fine-tune-event", + "created_at": 1677610602, + "level": "info", + "message": "Created fine-tune job" + } + + CompletionUsage: + type: object + description: Usage statistics for the completion request. + properties: + completion_tokens: + type: integer + description: Number of tokens in the generated completion. + prompt_tokens: + type: integer + description: Number of tokens in the prompt. + total_tokens: + type: integer + description: Total number of tokens used in the request (prompt + completion). + required: + - prompt_tokens + - completion_tokens + - total_tokens + + AssistantObject: + type: object + title: Assistant + description: Represents an `assistant` that can call the model and use tools. + properties: + id: + description: The identifier, which can be referenced in API endpoints. + type: string + object: + description: The object type, which is always `assistant`. + type: string + enum: [assistant] + created_at: + description: The Unix timestamp (in seconds) for when the assistant was created. + type: integer + name: + description: &assistant_name_param_description | + The name of the assistant. The maximum length is 256 characters. 
+ type: string + maxLength: 256 + nullable: true + description: + description: &assistant_description_param_description | + The description of the assistant. The maximum length is 512 characters. + type: string + maxLength: 512 + nullable: true + model: + description: *model_description + type: string + instructions: + description: &assistant_instructions_param_description | + The system instructions that the assistant uses. The maximum length is 32768 characters. + type: string + maxLength: 32768 + nullable: true + tools: + description: &assistant_tools_param_description | + A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types `code_interpreter`, `retrieval`, or `function`. + default: [] + type: array + maxItems: 128 + items: + oneOf: + - $ref: "#/components/schemas/AssistantToolsCode" + - $ref: "#/components/schemas/AssistantToolsRetrieval" + - $ref: "#/components/schemas/AssistantToolsFunction" + x-oaiExpandable: true + file_ids: + description: &assistant_file_param_description | + A list of [file](/docs/api-reference/files) IDs attached to this assistant. There can be a maximum of 20 files attached to the assistant. Files are ordered by their creation date in ascending order. + default: [] + type: array + maxItems: 20 + items: + type: string + metadata: + description: &metadata_description | + Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
+ type: object + x-oaiTypeLabel: map + nullable: true + required: + - id + - object + - created_at + - name + - description + - model + - instructions + - tools + - file_ids + - metadata + x-oaiMeta: + name: The assistant object + beta: true + example: *create_assistants_example + + CreateAssistantRequest: + type: object + additionalProperties: false + properties: + model: + description: *model_description + anyOf: + - type: string + name: + description: *assistant_name_param_description + type: string + nullable: true + maxLength: 256 + description: + description: *assistant_description_param_description + type: string + nullable: true + maxLength: 512 + instructions: + description: *assistant_instructions_param_description + type: string + nullable: true + maxLength: 32768 + tools: + description: *assistant_tools_param_description + default: [] + type: array + maxItems: 128 + items: + oneOf: + - $ref: "#/components/schemas/AssistantToolsCode" + - $ref: "#/components/schemas/AssistantToolsRetrieval" + - $ref: "#/components/schemas/AssistantToolsFunction" + x-oaiExpandable: true + file_ids: + description: *assistant_file_param_description + default: [] + maxItems: 20 + type: array + items: + type: string + metadata: + description: *metadata_description + type: object + x-oaiTypeLabel: map + nullable: true + required: + - model + + ModifyAssistantRequest: + type: object + additionalProperties: false + properties: + model: + description: *model_description + anyOf: + - type: string + name: + description: *assistant_name_param_description + type: string + nullable: true + maxLength: 256 + description: + description: *assistant_description_param_description + type: string + nullable: true + maxLength: 512 + instructions: + description: *assistant_instructions_param_description + type: string + nullable: true + maxLength: 32768 + tools: + description: *assistant_tools_param_description + default: [] + type: array + maxItems: 128 + items: + oneOf: + - $ref: 
"#/components/schemas/AssistantToolsCode" + - $ref: "#/components/schemas/AssistantToolsRetrieval" + - $ref: "#/components/schemas/AssistantToolsFunction" + x-oaiExpandable: true + file_ids: + description: | + A list of [File](/docs/api-reference/files) IDs attached to this assistant. There can be a maximum of 20 files attached to the assistant. Files are ordered by their creation date in ascending order. If a file was previously attached to the list but does not show up in the list, it will be deleted from the assistant. + default: [] + type: array + maxItems: 20 + items: + type: string + metadata: + description: *metadata_description + type: object + x-oaiTypeLabel: map + nullable: true + + DeleteAssistantResponse: + type: object + properties: + id: + type: string + deleted: + type: boolean + object: + type: string + enum: [assistant.deleted] + required: + - id + - object + - deleted + + ListAssistantsResponse: + type: object + properties: + object: + type: string + example: "list" + data: + type: array + items: + $ref: "#/components/schemas/AssistantObject" + first_id: + type: string + example: "asst_hLBK7PXBv5Lr2NQT7KLY0ag1" + last_id: + type: string + example: "asst_QLoItBbqwyAJEzlTy4y9kOMM" + has_more: + type: boolean + example: false + required: + - object + - data + - first_id + - last_id + - has_more + x-oaiMeta: + name: List assistants response object + group: chat + example: *list_assistants_example + + AssistantToolsCode: + type: object + title: Code interpreter tool + properties: + type: + type: string + description: "The type of tool being defined: `code_interpreter`" + enum: ["code_interpreter"] + required: + - type + + AssistantToolsRetrieval: + type: object + title: Retrieval tool + properties: + type: + type: string + description: "The type of tool being defined: `retrieval`" + enum: ["retrieval"] + required: + - type + + AssistantToolsFunction: + type: object + title: Function tool + properties: + type: + type: string + description: "The type of
tool being defined: `function`" + enum: ["function"] + function: + $ref: "#/components/schemas/FunctionObject" + required: + - type + - function + + RunObject: + type: object + title: A run on a thread + description: Represents an execution run on a [thread](/docs/api-reference/threads). + properties: + id: + description: The identifier, which can be referenced in API endpoints. + type: string + object: + description: The object type, which is always `thread.run`. + type: string + enum: ["thread.run"] + created_at: + description: The Unix timestamp (in seconds) for when the run was created. + type: integer + thread_id: + description: The ID of the [thread](/docs/api-reference/threads) that was executed on as a part of this run. + type: string + assistant_id: + description: The ID of the [assistant](/docs/api-reference/assistants) used for execution of this run. + type: string + status: + description: The status of the run, which can be either `queued`, `in_progress`, `requires_action`, `cancelling`, `cancelled`, `failed`, `completed`, or `expired`. + type: string + enum: + [ + "queued", + "in_progress", + "requires_action", + "cancelling", + "cancelled", + "failed", + "completed", + "expired", + ] + required_action: + type: object + description: Details on the action required to continue the run. Will be `null` if no action is required. + nullable: true + properties: + type: + description: For now, this is always `submit_tool_outputs`. + type: string + enum: ["submit_tool_outputs"] + submit_tool_outputs: + type: object + description: Details on the tool outputs needed for this run to continue. + properties: + tool_calls: + type: array + description: A list of the relevant tool calls. + items: + $ref: "#/components/schemas/RunToolCallObject" + required: + - tool_calls + required: + - type + - submit_tool_outputs + last_error: + type: object + description: The last error associated with this run. Will be `null` if there are no errors. 
+ nullable: true + properties: + code: + type: string + description: One of `server_error` or `rate_limit_exceeded`. + enum: ["server_error", "rate_limit_exceeded"] + message: + type: string + description: A human-readable description of the error. + required: + - code + - message + expires_at: + description: The Unix timestamp (in seconds) for when the run will expire. + type: integer + started_at: + description: The Unix timestamp (in seconds) for when the run was started. + type: integer + nullable: true + cancelled_at: + description: The Unix timestamp (in seconds) for when the run was cancelled. + type: integer + nullable: true + failed_at: + description: The Unix timestamp (in seconds) for when the run failed. + type: integer + nullable: true + completed_at: + description: The Unix timestamp (in seconds) for when the run was completed. + type: integer + nullable: true + model: + description: The model that the [assistant](/docs/api-reference/assistants) used for this run. + type: string + instructions: + description: The instructions that the [assistant](/docs/api-reference/assistants) used for this run. + type: string + tools: + description: The list of tools that the [assistant](/docs/api-reference/assistants) used for this run. + default: [] + type: array + maxItems: 20 + items: + oneOf: + - $ref: "#/components/schemas/AssistantToolsCode" + - $ref: "#/components/schemas/AssistantToolsRetrieval" + - $ref: "#/components/schemas/AssistantToolsFunction" + x-oaiExpandable: true + file_ids: + description: The list of [File](/docs/api-reference/files) IDs the [assistant](/docs/api-reference/assistants) used for this run. 
+ default: [] + type: array + items: + type: string + metadata: + description: *metadata_description + type: object + x-oaiTypeLabel: map + nullable: true + required: + - id + - object + - created_at + - thread_id + - assistant_id + - status + - required_action + - last_error + - expires_at + - started_at + - cancelled_at + - failed_at + - completed_at + - model + - instructions + - tools + - file_ids + - metadata + x-oaiMeta: + name: The run object + beta: true + example: | + { + "id": "run_example123", + "object": "thread.run", + "created_at": 1698107661, + "assistant_id": "asst_gZ1aOomboBuYWPcXJx4vAYB0", + "thread_id": "thread_adOpf7Jbb5Abymz0QbwxAh3c", + "status": "completed", + "started_at": 1699073476, + "expires_at": null, + "cancelled_at": null, + "failed_at": null, + "completed_at": 1699073498, + "last_error": null, + "model": "gpt-4", + "instructions": null, + "tools": [{"type": "retrieval"}, {"type": "code_interpreter"}], + "file_ids": [], + "metadata": {} + } + CreateRunRequest: + type: object + additionalProperties: false + properties: + assistant_id: + description: The ID of the [assistant](/docs/api-reference/assistants) to use to execute this run. + type: string + model: + description: The ID of the [Model](/docs/api-reference/models) to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used. + type: string + nullable: true + instructions: + description: Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis. + type: string + nullable: true + tools: + description: Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis. 
+ nullable: true + type: array + maxItems: 20 + items: + oneOf: + - $ref: "#/components/schemas/AssistantToolsCode" + - $ref: "#/components/schemas/AssistantToolsRetrieval" + - $ref: "#/components/schemas/AssistantToolsFunction" + x-oaiExpandable: true + metadata: + description: *metadata_description + type: object + x-oaiTypeLabel: map + nullable: true + required: + - thread_id + - assistant_id + ListRunsResponse: + type: object + properties: + object: + type: string + example: "list" + data: + type: array + items: + $ref: "#/components/schemas/RunObject" + first_id: + type: string + example: "run_hLBK7PXBv5Lr2NQT7KLY0ag1" + last_id: + type: string + example: "run_QLoItBbqwyAJEzlTy4y9kOMM" + has_more: + type: boolean + example: false + required: + - object + - data + - first_id + - last_id + - has_more + ModifyRunRequest: + type: object + additionalProperties: false + properties: + metadata: + description: *metadata_description + type: object + x-oaiTypeLabel: map + nullable: true + SubmitToolOutputsRunRequest: + type: object + additionalProperties: false + properties: + tool_outputs: + description: A list of tools for which the outputs are being submitted. + type: array + items: + type: object + properties: + tool_call_id: + type: string + description: The ID of the tool call in the `required_action` object within the run object the output is being submitted for. + output: + type: string + description: The output of the tool call to be submitted to continue the run. + required: + - tool_outputs + RunToolCallObject: + type: object + description: Tool call objects + properties: + id: + type: string + description: The ID of the tool call. This ID must be referenced when you submit the tool outputs using the [Submit tool outputs to run](/docs/api-reference/runs/submitToolOutputs) endpoint. + type: + type: string + description: The type of tool call the output is required for. For now, this is always `function`.
+ enum: ["function"] + function: + type: object + description: The function definition. + properties: + name: + type: string + description: The name of the function. + arguments: + type: string + description: The arguments that the model expects you to pass to the function. + required: + - name + - arguments + required: + - id + - type + - function + CreateThreadAndRunRequest: + type: object + additionalProperties: false + properties: + assistant_id: + description: The ID of the [assistant](/docs/api-reference/assistants) to use to execute this run. + type: string + thread: + $ref: "#/components/schemas/CreateThreadRequest" + description: If no thread is provided, an empty thread will be created. + model: + description: The ID of the [Model](/docs/api-reference/models) to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used. + type: string + nullable: true + instructions: + description: Override the default system message of the assistant. This is useful for modifying the behavior on a per-run basis. + type: string + nullable: true + tools: + description: Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis. + nullable: true + type: array + maxItems: 20 + items: + oneOf: + - $ref: "#/components/schemas/AssistantToolsCode" + - $ref: "#/components/schemas/AssistantToolsRetrieval" + - $ref: "#/components/schemas/AssistantToolsFunction" + metadata: + description: *metadata_description + type: object + x-oaiTypeLabel: map + nullable: true + required: + - thread_id + - assistant_id + + ThreadObject: + type: object + title: Thread + description: Represents a thread that contains [messages](/docs/api-reference/messages). + properties: + id: + description: The identifier, which can be referenced in API endpoints. 
+ type: string + object: + description: The object type, which is always `thread`. + type: string + enum: ["thread"] + created_at: + description: The Unix timestamp (in seconds) for when the thread was created. + type: integer + metadata: + description: *metadata_description + type: object + x-oaiTypeLabel: map + nullable: true + required: + - id + - object + - created_at + - metadata + x-oaiMeta: + name: The thread object + beta: true + example: | + { + "id": "thread_abc123", + "object": "thread", + "created_at": 1698107661, + "metadata": {} + } + + CreateThreadRequest: + type: object + additionalProperties: false + properties: + messages: + description: A list of [messages](/docs/api-reference/messages) to start the thread with. + type: array + items: + $ref: "#/components/schemas/CreateMessageRequest" + metadata: + description: *metadata_description + type: object + x-oaiTypeLabel: map + nullable: true + + ModifyThreadRequest: + type: object + additionalProperties: false + properties: + metadata: + description: *metadata_description + type: object + x-oaiTypeLabel: map + nullable: true + + DeleteThreadResponse: + type: object + properties: + id: + type: string + deleted: + type: boolean + object: + type: string + enum: [thread.deleted] + required: + - id + - object + - deleted + + ListThreadsResponse: + properties: + object: + type: string + example: "list" + data: + type: array + items: + $ref: "#/components/schemas/ThreadObject" + first_id: + type: string + example: "asst_hLBK7PXBv5Lr2NQT7KLY0ag1" + last_id: + type: string + example: "asst_QLoItBbqwyAJEzlTy4y9kOMM" + has_more: + type: boolean + example: false + required: + - object + - data + - first_id + - last_id + - has_more + + MessageObject: + type: object + title: The message object + description: Represents a message within a [thread](/docs/api-reference/threads). + properties: + id: + description: The identifier, which can be referenced in API endpoints. 
+ type: string + object: + description: The object type, which is always `thread.message`. + type: string + enum: ["thread.message"] + created_at: + description: The Unix timestamp (in seconds) for when the message was created. + type: integer + thread_id: + description: The [thread](/docs/api-reference/threads) ID that this message belongs to. + type: string + role: + description: The entity that produced the message. One of `user` or `assistant`. + type: string + enum: ["user", "assistant"] + content: + description: The content of the message in array of text and/or images. + type: array + items: + oneOf: + - $ref: "#/components/schemas/MessageContentImageFileObject" + - $ref: "#/components/schemas/MessageContentTextObject" + x-oaiExpandable: true + assistant_id: + description: If applicable, the ID of the [assistant](/docs/api-reference/assistants) that authored this message. + type: string + nullable: true + run_id: + description: If applicable, the ID of the [run](/docs/api-reference/runs) associated with the authoring of this message. + type: string + nullable: true + file_ids: + description: A list of [file](/docs/api-reference/files) IDs that the assistant should use. Useful for tools like retrieval and code_interpreter that can access files. A maximum of 10 files can be attached to a message. + default: [] + maxItems: 10 + type: array + items: + type: string + metadata: + description: *metadata_description + type: object + x-oaiTypeLabel: map + nullable: true + required: + - id + - object + - created_at + - thread_id + - role + - content + - assistant_id + - run_id + - file_ids + - metadata + x-oaiMeta: + name: The message object + beta: true + example: | + { + "id": "msg_dKYDWyQvtjDBi3tudL1yWKDa", + "object": "thread.message", + "created_at": 1698983503, + "thread_id": "thread_RGUhOuO9b2nrktrmsQ2uSR6I", + "role": "assistant", + "content": [ + { + "type": "text", + "text": { + "value": "Hi! 
How can I help you today?", + "annotations": [] + } + } + ], + "file_ids": [], + "assistant_id": "asst_ToSF7Gb04YMj8AMMm50ZLLtY", + "run_id": "run_BjylUJgDqYK9bOhy4yjAiMrn", + "metadata": {} + } + + CreateMessageRequest: + type: object + additionalProperties: false + required: + - role + - content + properties: + role: + type: string + enum: ["user"] + description: The role of the entity that is creating the message. Currently only `user` is supported. + content: + type: string + minLength: 1 + maxLength: 32768 + description: The content of the message. + file_ids: + description: A list of [File](/docs/api-reference/files) IDs that the message should use. There can be a maximum of 10 files attached to a message. Useful for tools like `retrieval` and `code_interpreter` that can access and use files. + default: [] + type: array + minItems: 1 + maxItems: 10 + items: + type: string + metadata: + description: *metadata_description + type: object + x-oaiTypeLabel: map + nullable: true + + ModifyMessageRequest: + type: object + additionalProperties: false + properties: + metadata: + description: *metadata_description + type: object + x-oaiTypeLabel: map + nullable: true + + DeleteMessageResponse: + type: object + properties: + id: + type: string + deleted: + type: boolean + object: + type: string + enum: [thread.message.deleted] + required: + - id + - object + - deleted + + ListMessagesResponse: + properties: + object: + type: string + example: "list" + data: + type: array + items: + $ref: "#/components/schemas/MessageObject" + first_id: + type: string + example: "msg_hLBK7PXBv5Lr2NQT7KLY0ag1" + last_id: + type: string + example: "msg_QLoItBbqwyAJEzlTy4y9kOMM" + has_more: + type: boolean + example: false + required: + - object + - data + - first_id + - last_id + - has_more + + MessageContentImageFileObject: + title: Image file + type: object + description: References an image [File](/docs/api-reference/files) in the content of a message. 
+ properties: + type: + description: Always `image_file`. + type: string + enum: ["image_file"] + image_file: + type: object + properties: + file_id: + description: The [File](/docs/api-reference/files) ID of the image in the message content. + type: string + required: + - file_id + required: + - type + - image_file + + MessageContentTextObject: + title: Text + type: object + description: The text content that is part of a message. + properties: + type: + description: Always `text`. + type: string + enum: ["text"] + text: + type: object + properties: + value: + description: The data that makes up the text. + type: string + annotations: + type: array + items: + oneOf: + - $ref: "#/components/schemas/MessageContentTextAnnotationsFileCitationObject" + - $ref: "#/components/schemas/MessageContentTextAnnotationsFilePathObject" + x-oaiExpandable: true + required: + - value + - annotations + required: + - type + - text + + MessageContentTextAnnotationsFileCitationObject: + title: File citation + type: object + description: A citation within the message that points to a specific quote from a specific File associated with the assistant or the message. Generated when the assistant uses the "retrieval" tool to search files. + properties: + type: + description: Always `file_citation`. + type: string + enum: ["file_citation"] + text: + description: The text in the message content that needs to be replaced. + type: string + file_citation: + type: object + properties: + file_id: + description: The ID of the specific File the citation is from. + type: string + quote: + description: The specific quote in the file. 
+ type: string + required: + - file_id + - quote + start_index: + type: integer + minimum: 0 + end_index: + type: integer + minimum: 0 + required: + - type + - text + - file_citation + - start_index + - end_index + + MessageContentTextAnnotationsFilePathObject: + title: File path + type: object + description: A URL for the file that's generated when the assistant uses the `code_interpreter` tool to generate a file. + properties: + type: + description: Always `file_path`. + type: string + enum: ["file_path"] + text: + description: The text in the message content that needs to be replaced. + type: string + file_path: + type: object + properties: + file_id: + description: The ID of the file that was generated. + type: string + required: + - file_id + start_index: + type: integer + minimum: 0 + end_index: + type: integer + minimum: 0 + required: + - type + - text + - file_path + - start_index + - end_index + + RunStepObject: + type: object + title: Run steps + description: | + Represents a step in the execution of a run. + properties: + id: + description: The identifier of the run step, which can be referenced in API endpoints. + type: string + object: + description: The object type, which is always `thread.run.step`. + type: string + enum: ["thread.run.step"] + created_at: + description: The Unix timestamp (in seconds) for when the run step was created. + type: integer + assistant_id: + description: The ID of the [assistant](/docs/api-reference/assistants) associated with the run step. + type: string + thread_id: + description: The ID of the [thread](/docs/api-reference/threads) that was run. + type: string + run_id: + description: The ID of the [run](/docs/api-reference/runs) that this run step is a part of. + type: string + type: + description: The type of run step, which can be either `message_creation` or `tool_calls`.
+ type: string + enum: ["message_creation", "tool_calls"] + status: + description: The status of the run step, which can be either `in_progress`, `cancelled`, `failed`, `completed`, or `expired`. + type: string + enum: ["in_progress", "cancelled", "failed", "completed", "expired"] + step_details: + type: object + description: The details of the run step. + oneOf: + - $ref: "#/components/schemas/RunStepDetailsMessageCreationObject" + - $ref: "#/components/schemas/RunStepDetailsToolCallsObject" + x-oaiExpandable: true + last_error: + type: object + description: The last error associated with this run step. Will be `null` if there are no errors. + nullable: true + properties: + code: + type: string + description: One of `server_error` or `rate_limit_exceeded`. + enum: ["server_error", "rate_limit_exceeded"] + message: + type: string + description: A human-readable description of the error. + required: + - code + - message + expired_at: + description: The Unix timestamp (in seconds) for when the run step expired. A step is considered expired if the parent run is expired. + type: integer + nullable: true + cancelled_at: + description: The Unix timestamp (in seconds) for when the run step was cancelled. + type: integer + nullable: true + failed_at: + description: The Unix timestamp (in seconds) for when the run step failed. + type: integer + nullable: true + completed_at: + description: The Unix timestamp (in seconds) for when the run step completed. 
+ type: integer + nullable: true + metadata: + description: *metadata_description + type: object + x-oaiTypeLabel: map + nullable: true + required: + - id + - object + - created_at + - assistant_id + - thread_id + - run_id + - type + - status + - step_details + - last_error + - expired_at + - cancelled_at + - failed_at + - completed_at + - metadata + x-oaiMeta: + name: The run step object + beta: true + example: *run_step_object_example + + ListRunStepsResponse: + properties: + object: + type: string + example: "list" + data: + type: array + items: + $ref: "#/components/schemas/RunStepObject" + first_id: + type: string + example: "step_hLBK7PXBv5Lr2NQT7KLY0ag1" + last_id: + type: string + example: "step_QLoItBbqwyAJEzlTy4y9kOMM" + has_more: + type: boolean + example: false + required: + - object + - data + - first_id + - last_id + - has_more + + RunStepDetailsMessageCreationObject: + title: Message creation + type: object + description: Details of the message creation by the run step. + properties: + type: + description: Always `message_creation`. + type: string + enum: ["message_creation"] + message_creation: + type: object + properties: + message_id: + type: string + description: The ID of the message that was created by this run step. + required: + - message_id + required: + - type + - message_creation + + RunStepDetailsToolCallsObject: + title: Tool calls + type: object + description: Details of the tool call. + properties: + type: + description: Always `tool_calls`. + type: string + enum: ["tool_calls"] + tool_calls: + type: array + description: | + An array of tool calls the run step was involved in. These can be associated with one of three types of tools: `code_interpreter`, `retrieval`, or `function`.
+ items: + type: object + oneOf: + - $ref: "#/components/schemas/RunStepDetailsToolCallsCodeObject" + - $ref: "#/components/schemas/RunStepDetailsToolCallsRetrievalObject" + - $ref: "#/components/schemas/RunStepDetailsToolCallsFunctionObject" + x-oaiExpandable: true + required: + - type + - tool_calls + + RunStepDetailsToolCallsCodeObject: + title: Code interpreter tool call + type: object + description: Details of the Code Interpreter tool call the run step was involved in. + properties: + id: + type: string + description: The ID of the tool call. + type: + type: string + description: The type of tool call. This is always going to be `code_interpreter` for this type of tool call. + enum: ["code_interpreter"] + code_interpreter: + type: object + description: The Code Interpreter tool call definition. + required: + - input + - outputs + properties: + input: + type: string + description: The input to the Code Interpreter tool call. + outputs: + type: array + description: The outputs from the Code Interpreter tool call. Code Interpreter can output one or more items, including text (`logs`) or images (`image`). Each of these is represented by a different object type. + items: + type: object + oneOf: + - $ref: "#/components/schemas/RunStepDetailsToolCallsCodeOutputLogsObject" + - $ref: "#/components/schemas/RunStepDetailsToolCallsCodeOutputImageObject" + x-oaiExpandable: true + required: + - id + - type + - code_interpreter + + RunStepDetailsToolCallsCodeOutputLogsObject: + title: Code interpreter log output + type: object + description: Text output from the Code Interpreter tool call as part of a run step. + properties: + type: + description: Always `logs`. + type: string + enum: ["logs"] + logs: + type: string + description: The text output from the Code Interpreter tool call. + required: + - type + - logs + + RunStepDetailsToolCallsCodeOutputImageObject: + title: Code interpreter image output + type: object + properties: + type: + description: Always `image`.
+ type: string + enum: ["image"] + image: + type: object + properties: + file_id: + description: The [file](/docs/api-reference/files) ID of the image. + type: string + required: + - file_id + required: + - type + - image + + RunStepDetailsToolCallsRetrievalObject: + title: Retrieval tool call + type: object + properties: + id: + type: string + description: The ID of the tool call object. + type: + type: string + description: The type of tool call. This is always going to be `retrieval` for this type of tool call. + enum: ["retrieval"] + retrieval: + type: object + description: For now, this is always going to be an empty object. + x-oaiTypeLabel: map + required: + - id + - type + - retrieval + + RunStepDetailsToolCallsFunctionObject: + type: object + title: Function tool call + properties: + id: + type: string + description: The ID of the tool call object. + type: + type: string + description: The type of tool call. This is always going to be `function` for this type of tool call. + enum: ["function"] + function: + type: object + description: The definition of the function that was called. + properties: + name: + type: string + description: The name of the function. + arguments: + type: string + description: The arguments passed to the function. + output: + type: string + description: The output of the function. This will be `null` if the outputs have not been [submitted](/docs/api-reference/runs/submitToolOutputs) yet. + nullable: true + required: + - name + - arguments + - output + required: + - id + - type + - function + + AssistantFileObject: + type: object + title: Assistant files + description: A list of [Files](/docs/api-reference/files) attached to an `assistant`. + properties: + id: + description: The identifier, which can be referenced in API endpoints. + type: string + object: + description: The object type, which is always `assistant.file`. 
+ type: string + enum: [assistant.file] + created_at: + description: The Unix timestamp (in seconds) for when the assistant file was created. + type: integer + assistant_id: + description: The assistant ID that the file is attached to. + type: string + required: + - id + - object + - created_at + - assistant_id + x-oaiMeta: + name: The assistant file object + beta: true + example: | + { + "id": "file-wB6RM6wHdA49HfS2DJ9fEyrH", + "object": "assistant.file", + "created_at": 1699055364, + "assistant_id": "asst_FBOFvAOHhwEWMghbMGseaPGQ" + } + + CreateAssistantFileRequest: + type: object + additionalProperties: false + properties: + file_id: + description: A [File](/docs/api-reference/files) ID (with `purpose="assistants"`) that the assistant should use. Useful for tools like `retrieval` and `code_interpreter` that can access files. + type: string + required: + - file_id + + DeleteAssistantFileResponse: + type: object + description: Deletes the association between the assistant and the file, but does not delete the [File](/docs/api-reference/files) object itself. + properties: + id: + type: string + deleted: + type: boolean + object: + type: string + enum: [assistant.file.deleted] + required: + - id + - object + - deleted + ListAssistantFilesResponse: + properties: + object: + type: string + example: "list" + data: + type: array + items: + $ref: "#/components/schemas/AssistantFileObject" + first_id: + type: string + example: "file-hLBK7PXBv5Lr2NQT7KLY0ag1" + last_id: + type: string + example: "file-QLoItBbqwyAJEzlTy4y9kOMM" + has_more: + type: boolean + example: false + required: + - object + - data + - items + - first_id + - last_id + - has_more + + MessageFileObject: + type: object + title: Message files + description: A list of files attached to a `message`. + properties: + id: + description: The identifier, which can be referenced in API endpoints. + type: string + object: + description: The object type, which is always `thread.message.file`. 
+ type: string + enum: ["thread.message.file"] + created_at: + description: The Unix timestamp (in seconds) for when the message file was created. + type: integer + message_id: + description: The ID of the [message](/docs/api-reference/messages) that the [File](/docs/api-reference/files) is attached to. + type: string + required: + - id + - object + - created_at + - message_id + x-oaiMeta: + name: The message file object + beta: true + example: | + { + "id": "file-BK7bzQj3FfZFXr7DbL6xJwfo", + "object": "thread.message.file", + "created_at": 1698107661, + "message_id": "message_QLoItBbqwyAJEzlTy4y9kOMM", + "file_id": "file-BK7bzQj3FfZFXr7DbL6xJwfo" + } + + ListMessageFilesResponse: + properties: + object: + type: string + example: "list" + data: + type: array + items: + $ref: "#/components/schemas/MessageFileObject" + first_id: + type: string + example: "file-hLBK7PXBv5Lr2NQT7KLY0ag1" + last_id: + type: string + example: "file-QLoItBbqwyAJEzlTy4y9kOMM" + has_more: + type: boolean + example: false + required: + - object + - data + - items + - first_id + - last_id + - has_more + +security: + - ApiKeyAuth: [] +x-oaiMeta: + groups: + # > General Notes + # The `groups` section is used to generate the API reference pages and navigation, in the same + # order listed below. Additionally, each `group` can have a list of `sections`, each of which + # will become a navigation subroute and subsection under the group. Each section has: + # - `type`: Currently, either an `endpoint` or `object`, depending on how the section needs to + # be rendered + # - `key`: The reference key that can be used to lookup the section definition + # - `path`: The path (url) of the section, which is used to generate the navigation link. 
+ # + # > The `object` section maps to a schema component and the following fields are read for rendering: + # - `x-oaiMeta.name`: The name of the object, which will become the section title + # - `x-oaiMeta.example`: The example object, which will be used to generate the example sample (always JSON) + # - `description`: The description of the object, which will be used to generate the section description + # + # > The `endpoint` section maps to an operation path and the following fields are read for rendering: + # - `x-oaiMeta.name`: The name of the endpoint, which will become the section title + # - `x-oaiMeta.examples`: The endpoint examples, which can be an object (meaning a single variation, as for most + # endpoints) or an array of objects (meaning multiple variations, e.g. the + # chat completion and completion endpoints, with streamed and non-streamed examples). + # - `x-oaiMeta.returns`: Text describing what the endpoint returns. + # - `summary`: The summary of the endpoint, which will be used to generate the section description + - id: audio + title: Audio + description: | + Learn how to turn audio into text or text into audio. + + Related guide: [Speech to text](/docs/guides/speech-to-text) + sections: + - type: endpoint + key: createSpeech + path: createSpeech + - type: endpoint + key: createTranscription + path: createTranscription + - type: endpoint + key: createTranslation + path: createTranslation + - id: chat + title: Chat + description: | + Given a list of messages comprising a conversation, the model will return a response.
+ + Related guide: [Chat Completions](/docs/guides/gpt) + sections: + - type: object + key: CreateChatCompletionResponse + path: object + - type: object + key: CreateChatCompletionStreamResponse + path: streaming + - type: endpoint + key: createChatCompletion + path: create + - id: completions + title: Completions + legacy: true + description: | + Given a prompt, the model will return one or more predicted completions, and can also return the probabilities of alternative tokens at each position. We recommend most users use our Chat Completions API. [Learn more](/docs/deprecations/2023-07-06-gpt-and-embeddings) + + Related guide: [Legacy Completions](/docs/guides/gpt/completions-api) + sections: + - type: object + key: CreateCompletionResponse + path: object + - type: endpoint + key: createCompletion + path: create + - id: embeddings + title: Embeddings + description: | + Get a vector representation of a given input that can be easily consumed by machine learning models and algorithms. + + Related guide: [Embeddings](/docs/guides/embeddings) + sections: + - type: object + key: Embedding + path: object + - type: endpoint + key: createEmbedding + path: create + - id: fine-tuning + title: Fine-tuning + description: | + Manage fine-tuning jobs to tailor a model to your specific training data. 
+ + Related guide: [Fine-tune models](/docs/guides/fine-tuning) + sections: + - type: object + key: FineTuningJob + path: object + - type: endpoint + key: createFineTuningJob + path: create + - type: endpoint + key: listPaginatedFineTuningJobs + path: list + - type: endpoint + key: retrieveFineTuningJob + path: retrieve + - type: endpoint + key: cancelFineTuningJob + path: cancel + - type: object + key: FineTuningJobEvent + path: event-object + - type: endpoint + key: listFineTuningEvents + path: list-events + - id: files + title: Files + description: | + Files are used to upload documents that can be used with features like [Assistants](/docs/api-reference/assistants) and [Fine-tuning](/docs/api-reference/fine-tuning). + sections: + - type: object + key: OpenAIFile + path: object + - type: endpoint + key: listFiles + path: list + - type: endpoint + key: createFile + path: create + - type: endpoint + key: deleteFile + path: delete + - type: endpoint + key: retrieveFile + path: retrieve + - type: endpoint + key: downloadFile + path: retrieve-contents + - id: images + title: Images + description: | + Given a prompt and/or an input image, the model will generate a new image. + + Related guide: [Image generation](/docs/guides/images) + sections: + - type: object + key: Image + path: object + - type: endpoint + key: createImage + path: create + - type: endpoint + key: createImageEdit + path: createEdit + - type: endpoint + key: createImageVariation + path: createVariation + - id: models + title: Models + description: | + List and describe the various models available in the API. You can refer to the [Models](/docs/models) documentation to understand what models are available and the differences between them. 
+ sections: + - type: object + key: Model + path: object + - type: endpoint + key: listModels + path: list + - type: endpoint + key: retrieveModel + path: retrieve + - type: endpoint + key: deleteModel + path: delete + - id: moderations + title: Moderations + description: | + Given an input text, outputs whether the model classifies it as violating OpenAI's content policy. + + Related guide: [Moderations](/docs/guides/moderation) + sections: + - type: object + key: CreateModerationResponse + path: object + - type: endpoint + key: createModeration + path: create + - id: assistants + title: Assistants + beta: true + description: | + Build assistants that can call models and use tools to perform tasks. + + [Get started with the Assistants API](/docs/assistants) + sections: + - type: object + key: AssistantObject + path: object + - type: endpoint + key: createAssistant + path: createAssistant + - type: endpoint + key: getAssistant + path: getAssistant + - type: endpoint + key: modifyAssistant + path: modifyAssistant + - type: endpoint + key: deleteAssistant + path: deleteAssistant + - type: endpoint + key: listAssistants + path: listAssistants + - type: object + key: AssistantFileObject + path: file-object + - type: endpoint + key: createAssistantFile + path: createAssistantFile + - type: endpoint + key: getAssistantFile + path: getAssistantFile + - type: endpoint + key: deleteAssistantFile + path: deleteAssistantFile + - type: endpoint + key: listAssistantFiles + path: listAssistantFiles + - id: threads + title: Threads + beta: true + description: | + Create threads that assistants can interact with.
+ + Related guide: [Assistants](/docs/assistants/overview) + sections: + - type: object + key: ThreadObject + path: object + - type: endpoint + key: createThread + path: createThread + - type: endpoint + key: getThread + path: getThread + - type: endpoint + key: modifyThread + path: modifyThread + - type: endpoint + key: deleteThread + path: deleteThread + - id: messages + title: Messages + beta: true + description: | + Create messages within threads + + Related guide: [Assistants](/docs/assistants/overview) + sections: + - type: object + key: MessageObject + path: object + - type: endpoint + key: createMessage + path: createMessage + - type: endpoint + key: getMessage + path: getMessage + - type: endpoint + key: modifyMessage + path: modifyMessage + - type: endpoint + key: listMessages + path: listMessages + - type: object + key: MessageFileObject + path: file-object + - type: endpoint + key: getMessageFile + path: getMessageFile + - type: endpoint + key: listMessageFiles + path: listMessageFiles + - id: runs + title: Runs + beta: true + description: | + Represents an execution run on a thread. + + Related guide: [Assistants](/docs/assistants/overview) + sections: + - type: object + key: RunObject + path: object + - type: endpoint + key: createRun + path: createRun + - type: endpoint + key: getRun + path: getRun + - type: endpoint + key: modifyRun + path: modifyRun + - type: endpoint + key: listRuns + path: listRuns + - type: endpoint + key: submitToolOuputsToRun + path: submitToolOutputs + - type: endpoint + key: cancelRun + path: cancelRun + - type: endpoint + key: createThreadAndRun + path: createThreadAndRun + - type: object + key: RunStepObject + path: step-object + - type: endpoint + key: getRunStep + path: getRunStep + - type: endpoint + key: listRunSteps + path: listRunSteps + - id: fine-tunes + title: Fine-tunes + deprecated: true + description: | + Manage legacy fine-tuning jobs to tailor a model to your specific training data. 
+ + We recommend transitioning to the updated [fine-tuning API](/docs/guides/fine-tuning). + sections: + - type: object + key: FineTune + path: object + - type: endpoint + key: createFineTune + path: create + - type: endpoint + key: listFineTunes + path: list + - type: endpoint + key: retrieveFineTune + path: retrieve + - type: endpoint + key: cancelFineTune + path: cancel + - type: object + key: FineTuneEvent + path: event-object + - type: endpoint + key: listFineTuneEvents + path: list-events + - id: edits + title: Edits + deprecated: true + description: | + Given a prompt and an instruction, the model will return an edited version of the prompt. + sections: + - type: object + key: CreateEditResponse + path: object + - type: endpoint + key: createEdit + path: create \ No newline at end of file diff --git a/docs/openapi/OpenAPISpec.json b/docs/openapi/OpenAPISpec.json deleted file mode 100644 index 9c18ec9d1..000000000 --- a/docs/openapi/OpenAPISpec.json +++ /dev/null @@ -1,74 +0,0 @@ -{ - "openapi": "3.0.0", - "info": { - "description": "Jan.ai api reference documentation.", - "title": "Rest Endpoints", - "version": "" - }, - "paths": { - "/api/rest/myquery": { - "get": { - "summary": "MyQuery", - "description": "***\nThe GraphQl query for this endpoint is:\n``` graphql\nquery MyQuery {\n collections {\n id\n name\n slug\n }\n}\n```", - "parameters": [ - { - "description": "Your x-hasura-admin-secret will be used for authentication of the API request.", - "in": "header", - "name": "x-hasura-admin-secret", - "schema": { - "type": "string" - } - } - ], - "responses": { - "200": { - "content": { - "application/json": { - "schema": { - "properties": { - "collections": { - "items": { - "description": "columns and relationships of \"collections\"", - "nullable": false, - "properties": { - "id": { - "$ref": "#/components/schemas/uuid!"
- }, - "name": { - "nullable": false, - "title": "String", - "type": "string" - }, - "slug": { - "nullable": false, - "title": "String", - "type": "string" - } - }, - "title": "collections", - "type": "object" - }, - "nullable": false, - "type": "array" - } - } - } - } - }, - "description": "Responses for GET /api/rest/myquery" - } - } - } - } - }, - "components": { - "schemas": { - "uuid!": { - "nullable": false, - "pattern": "[a-f0-9]{8}-[a-f0-9]{4}-4[a-f0-9]{3}-[89aAbB][a-f0-9]{3}-[a-f0-9]{12}", - "title": "uuid", - "type": "string" - } - } - } -} \ No newline at end of file diff --git a/docs/package.json b/docs/package.json index 9a122d340..4d7465f9c 100644 --- a/docs/package.json +++ b/docs/package.json @@ -24,8 +24,9 @@ "autoprefixer": "^10.4.16", "axios": "^1.5.1", "clsx": "^1.2.1", + "docusaurus-plugin-redoc": "^2.0.0", "docusaurus-plugin-sass": "^0.2.5", - "dotenv": "^16.3.1", + "docusaurus-theme-redoc": "^2.0.0", "js-yaml": "^4.1.0", "postcss": "^8.4.30", "posthog-docusaurus": "^2.0.0", @@ -33,12 +34,13 @@ "react": "^17.0.2", "react-dom": "^17.0.2", "react-icons": "^4.11.0", - "redocusaurus": "^1.6.3", + "redocusaurus": "^2.0.0", "sass": "^1.69.3", "tailwindcss": "^3.3.3" }, "devDependencies": { "@docusaurus/module-type-aliases": "2.4.1", + "dotenv": "^16.3.1", "tailwindcss-animate": "^1.0.7" }, "browserslist": { @@ -56,4 +58,4 @@ "engines": { "node": ">=16.14" } -} +} \ No newline at end of file diff --git a/docs/sidebars.js b/docs/sidebars.js index 3e6e280c0..07de6696d 100644 --- a/docs/sidebars.js +++ b/docs/sidebars.js @@ -13,47 +13,64 @@ /** @type {import('@docusaurus/plugin-content-docs').SidebarsConfig} */ const sidebars = { - blogSidebar: [ - "guides/overview" - ], + // blogSidebar: [ + // "guides/overview" + // ], docsSidebar: [ - "nitro/overview", + { + type: "category", + label: "Introduction", + collapsible: false, + collapsed: false, + items: [ + { type: "doc", id: "new/about", label: "About Nitro" }, + { type: "doc", id: "new/quickstart", 
label: "Quickstart" }, + { type: "doc", id: "new/install", label: "Installation" }, + "new/build-source" + ], + }, { type: 'category', - label: 'Getting Started', + label: 'Features', + link: { type: "doc", id: "features/feat"}, items: [ - "nitro/key-concepts", - "nitro/architecture", - "nitro/installation", - "nitro/using-nitro", + "features/chat", + "features/embed", + "features/multi-thread", + "features/cont-batch", + "features/load-unload", + "features/warmup", + "features/prompt", ], }, - "guides/troubleshooting", - ], - - apiSidebar: [ - "api/overview", { type: "category", - label: "Endpoints", - collapsible: true, + label: "Guides", + collapsible: false, collapsed: false, - items: [ - { - type: "autogenerated", - dirName: "api", - }, - ], + items: ["examples/llm"], }, + // { + // type: "category", + // label: "Specification", + // collapsible: false, + // collapsed: false, + // items: [{ type: "doc", id: "new/architecture", label: "Architecture" }], + // }, ], - communitySidebar: [ - "community/support", - "community/contribuiting", - "community/coc", - "community/changelog" - ] + apiSidebar: [ + "api" + ], + + // communitySidebar: [ + // "community/support", + // "community/contribuiting", + // "community/coc", + // "community/changelog" + // ] }; + module.exports = sidebars; diff --git a/docs/static/img/favicon.ico b/docs/static/img/favicon.ico index ca006fa7d..dfde26184 100644 Binary files a/docs/static/img/favicon.ico and b/docs/static/img/favicon.ico differ diff --git a/docs/yarn.lock b/docs/yarn.lock index 3bea109c7..72ec28f66 100644 --- a/docs/yarn.lock +++ b/docs/yarn.lock @@ -304,7 +304,7 @@ dependencies: "@babel/types" "^7.23.0" -"@babel/helper-module-imports@^7.0.0", "@babel/helper-module-imports@^7.22.15", "@babel/helper-module-imports@^7.22.5": +"@babel/helper-module-imports@^7.22.15": version "7.22.15" resolved 
"https://registry.yarnpkg.com/@babel/helper-module-imports/-/helper-module-imports-7.22.15.tgz#16146307acdc40cc00c3b2c647713076464bdbf0" integrity sha512-0pYVBnDKZO2fnSPCrgM/6WMc7eS20Fbok+0r88fp+YtWVLZrp4CkafFGIp+W0VKw4a22sgebPT99y+FDNMdP4w== @@ -1208,7 +1208,7 @@ "@babel/parser" "^7.22.15" "@babel/types" "^7.22.15" -"@babel/traverse@^7.12.9", "@babel/traverse@^7.18.8", "@babel/traverse@^7.22.8", "@babel/traverse@^7.23.2", "@babel/traverse@^7.23.3", "@babel/traverse@^7.4.5": +"@babel/traverse@^7.12.9", "@babel/traverse@^7.18.8", "@babel/traverse@^7.22.8", "@babel/traverse@^7.23.2", "@babel/traverse@^7.23.3": version "7.23.3" resolved "https://registry.yarnpkg.com/@babel/traverse/-/traverse-7.23.3.tgz#26ee5f252e725aa7aca3474aa5b324eaf7908b5b" integrity sha512-+K0yF1/9yR0oHdE0StHuEj3uTPzwwbrLGfNOndVJVV2TqA5+j3oljJUb4nmB954FLGjNem976+B+eDuLIjesiQ== @@ -2004,7 +2004,7 @@ url-loader "^4.1.1" webpack "^5.88.1" -"@emotion/is-prop-valid@^1.1.0": +"@emotion/is-prop-valid@^1.2.1": version "1.2.1" resolved "https://registry.yarnpkg.com/@emotion/is-prop-valid/-/is-prop-valid-1.2.1.tgz#23116cf1ed18bfeac910ec6436561ecb1a3885cc" integrity sha512-61Mf7Ufx4aDxx1xlDeOm8aFFigGHE4z+0sKCa+IHCeZKiyP9RLD0Mmx7m8b9/Cf37f7NAvQOOJAbQQGVr5uERw== @@ -2016,15 +2016,10 @@ resolved "https://registry.yarnpkg.com/@emotion/memoize/-/memoize-0.8.1.tgz#c1ddb040429c6d21d38cc945fe75c818cfb68e17" integrity sha512-W2P2c/VRW1/1tLox0mVUalvnWXxavmv/Oum2aPsRcoDJuob75FC3Y8FbpfLwUegRcxINtGUMPq0tFCvYNTBXNA== -"@emotion/stylis@^0.8.4": - version "0.8.5" - resolved "https://registry.yarnpkg.com/@emotion/stylis/-/stylis-0.8.5.tgz#deacb389bd6ee77d1e7fcaccce9e16c5c7e78e04" - integrity sha512-h6KtPihKFn3T9fuIrwvXXUOwlx3rfUvfZIcP5a6rh8Y7zjE3O06hT5Ss4S/YI1AYhuZ1kjaE/5EaOOI2NqSylQ== - -"@emotion/unitless@^0.7.4": - version "0.7.5" - resolved "https://registry.yarnpkg.com/@emotion/unitless/-/unitless-0.7.5.tgz#77211291c1900a700b8a78cfafda3160d76949ed" - integrity 
sha512-OWORNpfjMsSSUBVrRBVGECkhWcULOAJz9ZW8uK9qgxD+87M7jHRcvh/A96XXNhXTLmKcoYSQtBEX7lHMO7YRwg== +"@emotion/unitless@^0.8.0": + version "0.8.1" + resolved "https://registry.yarnpkg.com/@emotion/unitless/-/unitless-0.8.1.tgz#182b5a4704ef8ad91bde93f7a860a88fd92c79a3" + integrity sha512-KOEGMu6dmJZtpadb476IsZBclKvILjopjUii3V+7MnXIQCYh8W3NgNcgwo21n9LXZX6EDIKvqfjYxXebDwxKmQ== "@exodus/schemasafe@^1.0.0-rc.2": version "1.3.0" @@ -2255,23 +2250,7 @@ require-from-string "^2.0.2" uri-js "^4.2.2" -"@redocly/openapi-core@1.0.0-beta.123": - version "1.0.0-beta.123" - resolved "https://registry.yarnpkg.com/@redocly/openapi-core/-/openapi-core-1.0.0-beta.123.tgz#0c29ae9fabe5f143f571caf608a7d025f41125db" - integrity sha512-W6MbUWpb/VaV+Kf0c3jmMIJw3WwwF7iK5nAfcOS+ZwrlbxtIl37+1hEydFlJ209vCR9HL12PaMwdh2Vpihj6Jw== - dependencies: - "@redocly/ajv" "^8.11.0" - "@types/node" "^14.11.8" - colorette "^1.2.0" - js-levenshtein "^1.1.6" - js-yaml "^4.1.0" - lodash.isequal "^4.5.0" - minimatch "^5.0.1" - node-fetch "^2.6.1" - pluralize "^8.0.0" - yaml-ast-parser "0.0.43" - -"@redocly/openapi-core@^1.0.0-beta.104": +"@redocly/openapi-core@1.4.0", "@redocly/openapi-core@^1.0.0-rc.2": version "1.4.0" resolved "https://registry.yarnpkg.com/@redocly/openapi-core/-/openapi-core-1.4.0.tgz#d1ce8e391b32452082f754315c8eb265690b784f" integrity sha512-M4f0H3XExPvJ0dwbEou7YKLzkpz2ZMS9JoNvrbEECO7WCwjGZ4AjbiUjp2p0ZzFMNIiNgTVUJJmkxGxsXW471Q== @@ -2812,6 +2791,11 @@ dependencies: "@types/node" "*" +"@types/stylis@^4.0.2": + version "4.2.3" + resolved "https://registry.yarnpkg.com/@types/stylis/-/stylis-4.2.3.tgz#0dff504fc23487a02a29209b162249070e83a0da" + integrity sha512-86XLCVEmWagiUEbr2AjSbeY4qHN9jMm3pgM3PuBYfLIbT0MpDSnA3GA/4W7KoH/C/eeK77kNaeIxZzjhKYIBgw== + "@types/unist@*", "@types/unist@^3.0.0": version "3.0.2" resolved "https://registry.yarnpkg.com/@types/unist/-/unist-3.0.2.tgz#6dd61e43ef60b34086287f83683a5c1b2dc53d20" @@ -3224,18 +3208,10 @@ axios@^0.25.0: dependencies: follow-redirects "^1.14.7" 
-axios@^0.27.2: - version "0.27.2" - resolved "https://registry.yarnpkg.com/axios/-/axios-0.27.2.tgz#207658cc8621606e586c85db4b41a750e756d972" - integrity sha512-t+yRIyySRTp/wua5xEr+z1q60QmLq8ABsS5O9Me1AsE5dfKqgnCFzwiCZZ/cGNd1lq4/7akDWMxdhVlucjmnOQ== - dependencies: - follow-redirects "^1.14.9" - form-data "^4.0.0" - -axios@^1.5.1: - version "1.6.1" - resolved "https://registry.yarnpkg.com/axios/-/axios-1.6.1.tgz#76550d644bf0a2d469a01f9244db6753208397d7" - integrity sha512-vfBmhDpKafglh0EldBEbVuoe7DyAavGSLWhuSm5ZSEKQnHhBf0xAAwybbNH1IkrJNGnS/VG4I5yxig1pCEXE4g== +axios@^1.5.1, axios@^1.6.1: + version "1.6.2" + resolved "https://registry.yarnpkg.com/axios/-/axios-1.6.2.tgz#de67d42c755b571d3e698df1b6504cde9b0ee9f2" + integrity sha512-7i24Ri4pmDRfJTR7LDBhsOTtcm+9kjX5WiY1X3wIisx6G9So3pfMkEiU7emUBe46oceVImccTEM3k6C5dbVW8A== dependencies: follow-redirects "^1.15.0" form-data "^4.0.0" @@ -3305,17 +3281,6 @@ babel-plugin-polyfill-regenerator@^0.5.3: dependencies: "@babel/helper-define-polyfill-provider" "^0.4.3" -"babel-plugin-styled-components@>= 1.12.0": - version "2.1.4" - resolved "https://registry.yarnpkg.com/babel-plugin-styled-components/-/babel-plugin-styled-components-2.1.4.tgz#9a1f37c7f32ef927b4b008b529feb4a2c82b1092" - integrity sha512-Xgp9g+A/cG47sUyRwwYxGM4bR/jDRg5N6it/8+HxCnbT5XNKSKDT9xm4oag/osgqjC2It/vH0yXsomOG6k558g== - dependencies: - "@babel/helper-annotate-as-pure" "^7.22.5" - "@babel/helper-module-imports" "^7.22.5" - "@babel/plugin-syntax-jsx" "^7.22.5" - lodash "^4.17.21" - picomatch "^2.3.1" - bail@^1.0.0: version "1.0.5" resolved "https://registry.yarnpkg.com/bail/-/bail-1.0.5.tgz#b6fa133404a392cbc1f8c4bf63f5953351e7a776" @@ -3574,9 +3539,9 @@ caniuse-api@^3.0.0: lodash.uniq "^4.5.0" caniuse-lite@^1.0.0, caniuse-lite@^1.0.30001538, caniuse-lite@^1.0.30001541: - version "1.0.30001561" - resolved "https://registry.yarnpkg.com/caniuse-lite/-/caniuse-lite-1.0.30001561.tgz#752f21f56f96f1b1a52e97aae98c57c562d5d9da" - integrity 
sha512-NTt0DNoKe958Q0BE0j0c1V9jbUzhBxHIEJy7asmGrpE0yG63KTV7PLHPnK2E1O9RsQrQ081I3NLuXGS6zht3cw== + version "1.0.30001562" + resolved "https://registry.yarnpkg.com/caniuse-lite/-/caniuse-lite-1.0.30001562.tgz#9d16c5fd7e9c592c4cd5e304bc0f75b0008b2759" + integrity sha512-kfte3Hym//51EdX4239i+Rmp20EsLIYGdPkERegTgU19hQWCRhsRFGKHTliUlsry53tv17K7n077Kqa0WJU4ng== ccount@^1.0.0: version "1.1.0" @@ -4198,7 +4163,7 @@ css-select@^5.1.0: domutils "^3.0.1" nth-check "^2.0.1" -css-to-react-native@^3.0.0: +css-to-react-native@^3.2.0: version "3.2.0" resolved "https://registry.yarnpkg.com/css-to-react-native/-/css-to-react-native-3.2.0.tgz#cdd8099f71024e149e4f6fe17a7d46ecd55f1e32" integrity sha512-e8RKaLXMOFii+02mOlqwjbD00KSEKqblnpO9e++1aXS1fPQOpS1YoqdVHBqPjHNoxeF2mimzVqawm2KCbEdtHQ== @@ -4293,7 +4258,7 @@ csso@^4.2.0: dependencies: css-tree "^1.1.2" -csstype@^3.0.2: +csstype@^3.0.2, csstype@^3.1.2: version "3.1.2" resolved "https://registry.yarnpkg.com/csstype/-/csstype-3.1.2.tgz#1d4bf9d572f11c14031f0436e1c10bc1f571f50b" integrity sha512-I7K1Uu0MBPzaFKg4nI5Q7Vs2t+3gWWW648spaF+Rg7pI9ds18Ugn+lvg4SHczUdKlHI5LWBXyqfS8+DufyBsgQ== @@ -4604,6 +4569,11 @@ dayjs@^1.11.7: resolved "https://registry.yarnpkg.com/dayjs/-/dayjs-1.11.10.tgz#68acea85317a6e164457d6d6947564029a6a16a0" integrity sha512-vjAczensTgRcqDERK0SR2XMwsF/tSvnvlv6VcF2GIhg6Sx4yOIt/irsr1RDJsKiIyBzJDpCoXiWWq28MqH2cnQ== +debounce@^1.2.1: + version "1.2.1" + resolved "https://registry.yarnpkg.com/debounce/-/debounce-1.2.1.tgz#38881d8f4166a5c5848020c11827b834bcb3e0a5" + integrity sha512-XRRe6Glud4rd/ZGQfiV1ruXSfbvfJedlV9Y6zOlP+2K04vBYiJEte6stfFkCP03aMnY5tsipamumUjL14fofug== + debug@2.6.9, debug@^2.6.0: version "2.6.9" resolved "https://registry.yarnpkg.com/debug/-/debug-2.6.9.tgz#5d128515df134ff327e90a4c93f4e077a536341f" @@ -4809,13 +4779,13 @@ dns-packet@^5.2.2: dependencies: "@leichtgewicht/ip-codec" "^2.0.1" -docusaurus-plugin-redoc@1.6.0: - version "1.6.0" - resolved 
"https://registry.yarnpkg.com/docusaurus-plugin-redoc/-/docusaurus-plugin-redoc-1.6.0.tgz#a3d07bb10a99e9195aab3e2ae4296f49a1530e4a" - integrity sha512-bvOmVcJ9Lo6ymyaHCoXTjN6Ck7/Dog1KRsJgZilB6ukHQ7d6nJrAwAEoDF1rXto8tOvIUqVb6Zzy7qDPvBQA1Q== +docusaurus-plugin-redoc@2.0.0, docusaurus-plugin-redoc@^2.0.0: + version "2.0.0" + resolved "https://registry.yarnpkg.com/docusaurus-plugin-redoc/-/docusaurus-plugin-redoc-2.0.0.tgz#2f7b2ee9fd4beb86cdc2d88efd9ba87b76752484" + integrity sha512-+cUy/wnQVQmuygMxP0gAWODzo502QruhyUTHShxMEBhkL57dOx0COMgd8Iu4BlqiW9RGzN3hEZEpLzGTaGFOtQ== dependencies: - "@redocly/openapi-core" "1.0.0-beta.123" - redoc "2.0.0" + "@redocly/openapi-core" "1.4.0" + redoc "2.1.3" docusaurus-plugin-sass@^0.2.5: version "0.2.5" @@ -4824,18 +4794,18 @@ docusaurus-plugin-sass@^0.2.5: dependencies: sass-loader "^10.1.1" -docusaurus-theme-redoc@1.6.4: - version "1.6.4" - resolved "https://registry.yarnpkg.com/docusaurus-theme-redoc/-/docusaurus-theme-redoc-1.6.4.tgz#29736f5590c0b04f3538087ab6e17d3d06d7e099" - integrity sha512-dEKh/HYWGqGG2Qoy2CgXon28Z32Z/LdNzZvreAQqeYtiXb7Ey9gZFwSstpU4jEcoUa347NCYseLPn8bkxlemCw== +docusaurus-theme-redoc@2.0.0, docusaurus-theme-redoc@^2.0.0: + version "2.0.0" + resolved "https://registry.yarnpkg.com/docusaurus-theme-redoc/-/docusaurus-theme-redoc-2.0.0.tgz#2cbae0f51f1c1f9527069e54173cfdb184d4a995" + integrity sha512-BOew0bVJvc8LV+zMMURx/2pWkk8VQNY2Wow2AFVSCGCkHi4UMwpq50VFL42t0MF6EnoSY9hqArqNfofpUFiiOw== dependencies: - "@redocly/openapi-core" "1.0.0-beta.123" + "@redocly/openapi-core" "1.4.0" clsx "^1.2.1" copyfiles "^2.4.1" lodash "^4.17.21" - mobx "^6.8.0" - redoc "2.0.0" - styled-components "^5.3.6" + mobx "^6.10.2" + redoc "2.1.3" + styled-components "^6.1.0" dom-converter@^0.2.0: version "0.2.0" @@ -4965,9 +4935,9 @@ ee-first@1.1.1: integrity sha512-WMwm9LhRUo+WUaRN+vRuETqG89IgZphVSNkdFgeb6sS/E4OrDIN7t48CAewSHXc6C8lefD8KKfr5vY61brQlow== electron-to-chromium@^1.4.535: - version "1.4.581" - resolved 
"https://registry.yarnpkg.com/electron-to-chromium/-/electron-to-chromium-1.4.581.tgz#23b684c67bf56d4284e95598c05a5d266653b6d8" - integrity sha512-6uhqWBIapTJUxgPTCHH9sqdbxIMPt7oXl0VcAL1kOtlU6aECdcMncCrX5Z7sHQ/invtrC9jUQUef7+HhO8vVFw== + version "1.4.583" + resolved "https://registry.yarnpkg.com/electron-to-chromium/-/electron-to-chromium-1.4.583.tgz#7b0ac4f36388da4b5485788adb92cd7dd0abffc4" + integrity sha512-93y1gcONABZ7uqYe/JWDVQP/Pj/sQSunF0HVAPdlg/pfBnOyBMLlQUxWvkqcljJg1+W6cjvPuYD+r1Th9Tn8mA== elkjs@^0.8.2: version "0.8.2" @@ -5447,7 +5417,7 @@ flux@^4.0.1: fbemitter "^3.0.0" fbjs "^3.0.1" -follow-redirects@^1.0.0, follow-redirects@^1.14.7, follow-redirects@^1.14.9, follow-redirects@^1.15.0: +follow-redirects@^1.0.0, follow-redirects@^1.14.7, follow-redirects@^1.15.0: version "1.15.3" resolved "https://registry.yarnpkg.com/follow-redirects/-/follow-redirects-1.15.3.tgz#fe2f3ef2690afce7e82ed0b44db08165b207123a" integrity sha512-1VzOtuEM8pC9SFU1E+8KfTjZyMztRsgEfwQl44z8A25uy13jSzTj6dyK2Df52iV0vgHCfBwLhDWevLn95w5v6Q== @@ -6016,7 +5986,7 @@ history@^4.9.0: tiny-warning "^1.0.0" value-equal "^1.0.1" -hoist-non-react-statics@^3.0.0, hoist-non-react-statics@^3.1.0: +hoist-non-react-statics@^3.1.0: version "3.3.2" resolved "https://registry.yarnpkg.com/hoist-non-react-statics/-/hoist-non-react-statics-3.3.2.tgz#ece0acaf71d62c2969c2ec59feff42a4b1a85b45" integrity sha512-/gGivxi8JPKWNm/W0jSmzcMPpfpPLc3dY/6GxhX2hQ9iGj3aDfklV4ET7NjKpSinLpJ5vafa9iiGIEZg10SfBw== @@ -6038,6 +6008,11 @@ html-entities@^2.3.2: resolved "https://registry.yarnpkg.com/html-entities/-/html-entities-2.4.0.tgz#edd0cee70402584c8c76cc2c0556db09d1f45061" integrity sha512-igBTJcNNNhvZFRtm8uA6xMY6xYleeDwn3PeBCkDz7tHttv4F2hsDI2aPgNERWzvRcNYHNT3ymRaQzllmXj4YsQ== +html-escaper@^2.0.2: + version "2.0.2" + resolved "https://registry.yarnpkg.com/html-escaper/-/html-escaper-2.0.2.tgz#dfd60027da36a36dfcbe236262c00a5822681453" + integrity 
sha512-H2iMtd0I4Mt5eYiapRdIDjp+XzelXQ0tFE4JS7YFwFevXXMmOp9myNrUvCg0D6ws8iqkRPBfKHgbwig1SmlLfg== + html-minifier-terser@^6.0.2, html-minifier-terser@^6.1.0: version "6.1.0" resolved "https://registry.yarnpkg.com/html-minifier-terser/-/html-minifier-terser-6.1.0.tgz#bfc818934cc07918f6b3669f5774ecdfd48f32ab" @@ -6855,26 +6830,11 @@ lodash.debounce@^4.0.8: resolved "https://registry.yarnpkg.com/lodash.debounce/-/lodash.debounce-4.0.8.tgz#82d79bff30a67c4005ffd5e2515300ad9ca4d7af" integrity sha512-FT1yDzDYEoYWhnSGnpE/4Kj1fLZkDFyqRb7fNt6FdYOSxlUWAtp42Eh6Wb0rGIv/m9Bgo7x4GhQbm5Ys4SG5ow== -lodash.escape@^4.0.1: - version "4.0.1" - resolved "https://registry.yarnpkg.com/lodash.escape/-/lodash.escape-4.0.1.tgz#c9044690c21e04294beaa517712fded1fa88de98" - integrity sha512-nXEOnb/jK9g0DYMr1/Xvq6l5xMD7GDG55+GSYIYmS0G4tBk/hURD4JR9WCavs04t33WmJx9kCyp9vJ+mr4BOUw== - -lodash.flatten@^4.4.0: - version "4.4.0" - resolved "https://registry.yarnpkg.com/lodash.flatten/-/lodash.flatten-4.4.0.tgz#f31c22225a9632d2bbf8e4addbef240aa765a61f" - integrity sha512-C5N2Z3DgnnKr0LOpv/hKCgKdb7ZZwafIrsesve6lmzvZIRZRGaZ/l6Q8+2W7NaT+ZwO3fFlSCzCzrDCFdJfZ4g== - lodash.flow@^3.3.0: version "3.5.0" resolved "https://registry.yarnpkg.com/lodash.flow/-/lodash.flow-3.5.0.tgz#87bf40292b8cf83e4e8ce1a3ae4209e20071675a" integrity sha512-ff3BX/tSioo+XojX4MOsOMhJw0nZoUEF011LX8g8d3gvjVbxd89cCio4BCXronjxcTUIJUoqKEUA+n4CqvvRPw== -lodash.invokemap@^4.6.0: - version "4.6.0" - resolved "https://registry.yarnpkg.com/lodash.invokemap/-/lodash.invokemap-4.6.0.tgz#1748cda5d8b0ef8369c4eb3ec54c21feba1f2d62" - integrity sha512-CfkycNtMqgUlfjfdh2BhKO/ZXrP8ePOX5lEU/g0R3ItJcnuxWDwokMGKx1hWcfOikmyOVx6X9IwWnDGlgKl61w== - lodash.isequal@^4.5.0: version "4.5.0" resolved "https://registry.yarnpkg.com/lodash.isequal/-/lodash.isequal-4.5.0.tgz#415c4478f2bcc30120c22ce10ed3226f7d3e18e0" @@ -6885,21 +6845,11 @@ lodash.memoize@^4.1.2: resolved 
"https://registry.yarnpkg.com/lodash.memoize/-/lodash.memoize-4.1.2.tgz#bcc6c49a42a2840ed997f323eada5ecd182e0bfe" integrity sha512-t7j+NzmgnQzTAYXcsHYLgimltOV1MXHtlOWf6GjL9Kj8GK5FInw5JotxvbOs+IvV1/Dzo04/fCGfLVs7aXb4Ag== -lodash.pullall@^4.2.0: - version "4.2.0" - resolved "https://registry.yarnpkg.com/lodash.pullall/-/lodash.pullall-4.2.0.tgz#9d98b8518b7c965b0fae4099bd9fb7df8bbf38ba" - integrity sha512-VhqxBKH0ZxPpLhiu68YD1KnHmbhQJQctcipvmFnqIBDYzcIHzf3Zpu0tpeOKtR4x76p9yohc506eGdOjTmyIBg== - lodash.uniq@4.5.0, lodash.uniq@^4.5.0: version "4.5.0" resolved "https://registry.yarnpkg.com/lodash.uniq/-/lodash.uniq-4.5.0.tgz#d0225373aeb652adc1bc82e4945339a842754773" integrity sha512-xfBaXQd9ryd9dlSDvnvI0lvxfLJlYAZzXomUYzLKtUeOQvOP5piqAWuGtrhWeqaXK9hhoM/iyJc5AV+XfsX3HQ== -lodash.uniqby@^4.7.0: - version "4.7.0" - resolved "https://registry.yarnpkg.com/lodash.uniqby/-/lodash.uniqby-4.7.0.tgz#d99c07a669e9e6d24e1362dfe266c67616af1302" - integrity sha512-e/zcLx6CSbmaEgFHCA7BnoQKyCtKMxnuWrJygbwPs/AIn+IMKl66L8/s+wBUn5LRw2pZx3bUHibiV1b6aTWIww== - lodash@^4.17.19, lodash@^4.17.20, lodash@^4.17.21: version "4.17.21" resolved "https://registry.yarnpkg.com/lodash/-/lodash-4.17.21.tgz#679591c564c3bffaae8454cf0b3df370c3d6911c" @@ -8028,7 +7978,7 @@ mobx-react@^7.2.0: dependencies: mobx-react-lite "^3.4.0" -mobx@^6.8.0: +mobx@^6.10.2: version "6.11.0" resolved "https://registry.yarnpkg.com/mobx/-/mobx-6.11.0.tgz#8a748b18c140892d1d0f28b71315f1f639180006" integrity sha512-qngYCmr0WJiFRSAtYe82DB7SbzvbhehkJjONs8ydynUwoazzUQHZdAlaJqUfks5j4HarhWsZrMRhV7HtSO9HOQ== @@ -8319,7 +8269,7 @@ open@^8.0.9, open@^8.4.0: is-docker "^2.1.1" is-wsl "^2.2.0" -openapi-sampler@^1.3.0: +openapi-sampler@^1.3.1: version "1.3.1" resolved "https://registry.yarnpkg.com/openapi-sampler/-/openapi-sampler-1.3.1.tgz#eebb2a1048f830cc277398bc8022b415f887e859" integrity sha512-Ert9mvc2tLPmmInwSyGZS+v4Ogu9/YoZuq9oP3EdUklg2cad6+IGndP9yqJJwbgdXwZibiq5fpv6vYujchdJFg== @@ -8966,7 +8916,7 @@ postcss-zindex@^5.1.0: resolved 
"https://registry.yarnpkg.com/postcss-zindex/-/postcss-zindex-5.1.0.tgz#4a5c7e5ff1050bd4c01d95b1847dfdcc58a496ff" integrity sha512-fgFMf0OtVSBR1va1JNHYgMxYk73yhn/qb4uQDq1DLGYolz8gHCyr/sesEuGUaYs58E3ZJRcpoGuPVoB7Meiq9A== -postcss@^8.3.11, postcss@^8.4.14, postcss@^8.4.17, postcss@^8.4.21, postcss@^8.4.23, postcss@^8.4.26, postcss@^8.4.30: +postcss@^8.3.11, postcss@^8.4.14, postcss@^8.4.17, postcss@^8.4.21, postcss@^8.4.23, postcss@^8.4.26, postcss@^8.4.30, postcss@^8.4.31: version "8.4.31" resolved "https://registry.yarnpkg.com/postcss/-/postcss-8.4.31.tgz#92b451050a9f914da6755af352bdc0192508656d" integrity sha512-PS08Iboia9mts/2ygV3eLpY5ghnUcfLV/EXTOW1E2qYxJKGGBUtNjN76FYHnMs36RmARn41bC0AZmn+rR0OVpQ== @@ -9244,9 +9194,9 @@ react-helmet-async@*, react-helmet-async@^1.3.0: shallowequal "^1.1.0" react-icons@^4.11.0: - version "4.11.0" - resolved "https://registry.yarnpkg.com/react-icons/-/react-icons-4.11.0.tgz#4b0e31c9bfc919608095cc429c4f1846f4d66c65" - integrity sha512-V+4khzYcE5EBk/BvcuYRq6V/osf11ODUM2J8hg2FDSswRrGvqiYUYPRy4OdrWaQOBj4NcpJfmHZLNaD+VH0TyA== + version "4.12.0" + resolved "https://registry.yarnpkg.com/react-icons/-/react-icons-4.12.0.tgz#54806159a966961bfd5cdb26e492f4dafd6a8d78" + integrity sha512-IBaDuHiShdZqmfc/TwHu6+d6k2ltNCf3AszxNmjJc1KUfXdEeRJOKyNvLmAHaarhzGmTSVygNdyu8/opXv2gaw== react-is@^16.13.1, react-is@^16.6.0, react-is@^16.7.0: version "16.13.1" @@ -9328,10 +9278,10 @@ react-simple-code-editor@^0.10.0: resolved "https://registry.yarnpkg.com/react-simple-code-editor/-/react-simple-code-editor-0.10.0.tgz#73e7ac550a928069715482aeb33ccba36efe2373" integrity sha512-bL5W5mAxSW6+cLwqqVWY47Silqgy2DKDTR4hDBrLrUqC5BXc29YVx17l2IZk5v36VcDEq1Bszu2oHm1qBwKqBA== -react-tabs@^3.2.2: - version "3.2.3" - resolved "https://registry.yarnpkg.com/react-tabs/-/react-tabs-3.2.3.tgz#ccbb3e1241ad3f601047305c75db661239977f2f" - integrity sha512-jx325RhRVnS9DdFbeF511z0T0WEqEoMl1uCE3LoZ6VaZZm7ytatxbum0B8bCTmaiV0KsU+4TtLGTGevCic7SWg== +react-tabs@^4.3.0: + version "4.3.0" 
+ resolved "https://registry.yarnpkg.com/react-tabs/-/react-tabs-4.3.0.tgz#9f4db0fd209ba4ab2c1e78993ff964435f84af62" + integrity sha512-2GfoG+f41kiBIIyd3gF+/GRCCYtamC8/2zlAcD8cqQmqI9Q+YVz7fJLHMmU9pXDVYYHpJeCgUSBJju85vu5q8Q== dependencies: clsx "^1.1.0" prop-types "^15.5.0" @@ -9418,12 +9368,12 @@ recursive-readdir@^2.2.2: dependencies: minimatch "^3.0.5" -redoc@2.0.0: - version "2.0.0" - resolved "https://registry.yarnpkg.com/redoc/-/redoc-2.0.0.tgz#8b3047ca75b84d31558c6c92da7f84affef35c3e" - integrity sha512-rU8iLdAkT89ywOkYk66Mr+IofqaMASlRvTew0dJvopCORMIPUcPMxjlJbJNC6wsn2vvMnpUFLQ/0ISDWn9BWag== +redoc@2.1.3: + version "2.1.3" + resolved "https://registry.yarnpkg.com/redoc/-/redoc-2.1.3.tgz#612c9fed744993d5fc99cbf39fe9056bd1034fa5" + integrity sha512-d7F9qLLxaiFW4GC03VkwlX9wuRIpx9aiIIf3o6mzMnqPfhxrn2IRKGndrkJeVdItgCfmg9jXZiFEowm60f1meQ== dependencies: - "@redocly/openapi-core" "^1.0.0-beta.104" + "@redocly/openapi-core" "^1.0.0-rc.2" classnames "^2.3.1" decko "^1.2.0" dompurify "^2.2.8" @@ -9433,26 +9383,25 @@ redoc@2.0.0: mark.js "^8.11.1" marked "^4.0.15" mobx-react "^7.2.0" - openapi-sampler "^1.3.0" + openapi-sampler "^1.3.1" path-browserify "^1.0.1" perfect-scrollbar "^1.5.5" polished "^4.1.3" prismjs "^1.27.0" prop-types "^15.7.2" - react-tabs "^3.2.2" + react-tabs "^4.3.0" slugify "~1.4.7" stickyfill "^1.1.1" - style-loader "^3.3.1" swagger2openapi "^7.0.6" url-template "^2.0.8" -redocusaurus@^1.6.3: - version "1.6.4" - resolved "https://registry.yarnpkg.com/redocusaurus/-/redocusaurus-1.6.4.tgz#0aaa49cf68056a958b4fac5f93259c85b4ae0f75" - integrity sha512-0o7bDrs5eLOiMR7BLjdZ6nYEQBNvle/MrUJsvfaKShkZHvbelAJPmH7muoiL+JWcxGCiI8vuh9EKTDDqqRkE9A== +redocusaurus@^2.0.0: + version "2.0.0" + resolved "https://registry.yarnpkg.com/redocusaurus/-/redocusaurus-2.0.0.tgz#83481ff4f5c6f2a00df901e359850bef3a7c43c6" + integrity sha512-wRSpkY+PwkqAj98RD+1ec6U8KDKySH6GT0jahWY+dPlpckyHj7D5i3ipXdTiJ6jXXCyM2qUwimX5PZJEdooDhA== dependencies: - docusaurus-plugin-redoc "1.6.0" - 
docusaurus-theme-redoc "1.6.4" + docusaurus-plugin-redoc "2.0.0" + docusaurus-theme-redoc "2.0.0" reftools@^1.1.9: version "1.1.9" @@ -10328,9 +10277,9 @@ statuses@2.0.1: integrity sha512-OpZ3zP+jT1PI7I8nemJX4AKmAX070ZkYPVWV/AaKTJl+tXCTGyVdC1a4SL8RUQYEwk/f34ZX8UTykN68FwrqAA== std-env@^3.0.1: - version "3.4.3" - resolved "https://registry.yarnpkg.com/std-env/-/std-env-3.4.3.tgz#326f11db518db751c83fd58574f449b7c3060910" - integrity sha512-f9aPhy8fYBuMN+sNfakZV18U39PbalgjXG3lLB9WkaYTxijru61wb57V9wxxNthXM5Sd88ETBWi29qLAsHO52Q== + version "3.5.0" + resolved "https://registry.yarnpkg.com/std-env/-/std-env-3.5.0.tgz#83010c9e29bd99bf6f605df87c19012d82d63b97" + integrity sha512-JGUEaALvL0Mf6JCfYnJOTcobY+Nc7sG/TemDRBqCA0wEr4DER7zDchaaixTlmOxAjG1uRJmX82EQcxwTQTkqVA== stickyfill@^1.1.1: version "1.1.1" @@ -10425,11 +10374,6 @@ strip-json-comments@~2.0.1: resolved "https://registry.yarnpkg.com/strip-json-comments/-/strip-json-comments-2.0.1.tgz#3c531942e908c2697c0ec344858c286c7ca0a60a" integrity sha512-4gB8na07fecVVkOI6Rs4e7T6NOTki5EmL7TUduTs6bu3EdnSycntVJ4re8kgZA+wx9IueI2Y11bfbgwtzuE0KQ== -style-loader@^3.3.1: - version "3.3.3" - resolved "https://registry.yarnpkg.com/style-loader/-/style-loader-3.3.3.tgz#bba8daac19930169c0c9c96706749a597ae3acff" - integrity sha512-53BiGLXAcll9maCYtZi2RCQZKa8NQQai5C4horqKyRmHj9H7QmcUyucrH+4KW/gBQbXM2AsB0axoEcFZPlfPcw== - style-to-object@0.3.0, style-to-object@^0.3.0: version "0.3.0" resolved "https://registry.yarnpkg.com/style-to-object/-/style-to-object-0.3.0.tgz#b1b790d205991cc783801967214979ee19a76e46" @@ -10444,21 +10388,20 @@ style-to-object@^0.4.0: dependencies: inline-style-parser "0.1.1" -styled-components@^5.3.6: - version "5.3.11" - resolved "https://registry.yarnpkg.com/styled-components/-/styled-components-5.3.11.tgz#9fda7bf1108e39bf3f3e612fcc18170dedcd57a8" - integrity sha512-uuzIIfnVkagcVHv9nE0VPlHPSCmXIUGKfJ42LNjxCCTDTL5sgnJ8Z7GZBq0EnLYGln77tPpEpExt2+qa+cZqSw== - dependencies: - "@babel/helper-module-imports" "^7.0.0" - 
"@babel/traverse" "^7.4.5" - "@emotion/is-prop-valid" "^1.1.0" - "@emotion/stylis" "^0.8.4" - "@emotion/unitless" "^0.7.4" - babel-plugin-styled-components ">= 1.12.0" - css-to-react-native "^3.0.0" - hoist-non-react-statics "^3.0.0" +styled-components@^6.1.0: + version "6.1.1" + resolved "https://registry.yarnpkg.com/styled-components/-/styled-components-6.1.1.tgz#a5414ada07fb1c17b96a26a05369daa4e2ad55e5" + integrity sha512-cpZZP5RrKRIClBW5Eby4JM1wElLVP4NQrJbJ0h10TidTyJf4SIIwa3zLXOoPb4gJi8MsJ8mjq5mu2IrEhZIAcQ== + dependencies: + "@emotion/is-prop-valid" "^1.2.1" + "@emotion/unitless" "^0.8.0" + "@types/stylis" "^4.0.2" + css-to-react-native "^3.2.0" + csstype "^3.1.2" + postcss "^8.4.31" shallowequal "^1.1.0" - supports-color "^5.5.0" + stylis "^4.3.0" + tslib "^2.5.0" stylehacks@^5.1.1: version "5.1.1" @@ -10468,7 +10411,7 @@ stylehacks@^5.1.1: browserslist "^4.21.4" postcss-selector-parser "^6.0.4" -stylis@^4.1.3: +stylis@^4.1.3, stylis@^4.3.0: version "4.3.0" resolved "https://registry.yarnpkg.com/stylis/-/stylis-4.3.0.tgz#abe305a669fc3d8777e10eefcfc73ad861c5588c" integrity sha512-E87pIogpwUsUwXw7dNyU4QDjdgVMy52m+XEOPEKUn161cCzWjjhPSQhByfd1CcNvrOLnXQ6OnnZDwnJrz/Z4YQ== @@ -10486,7 +10429,7 @@ sucrase@^3.32.0: pirates "^4.0.1" ts-interface-checker "^0.1.9" -supports-color@^5.3.0, supports-color@^5.5.0: +supports-color@^5.3.0: version "5.5.0" resolved "https://registry.yarnpkg.com/supports-color/-/supports-color-5.5.0.tgz#e2e69a44ac8772f78a1ec0b35b689df6530efc8f" integrity sha512-QjVjwdXIt408MIiAqCX4oUKsgU2EqAGzs2Ppkm4aQYbjm+ZEWEcW4SfFNTr4uMNZma0ey4f5lgLrkB0aX0QMow== @@ -10720,7 +10663,7 @@ ts-interface-checker@^0.1.9: resolved "https://registry.yarnpkg.com/ts-interface-checker/-/ts-interface-checker-0.1.13.tgz#784fd3d679722bc103b1b4b8030bcddb5db2a699" integrity sha512-Y/arvbn+rrz3JCKl9C4kVNfTfSm2/mEp5FSz5EsZSANGPSlQrpRI5M4PKF+mJnE52jOO90PnPSc3Ur3bTQw0gA== -tslib@^2.0.3, tslib@^2.1.0, tslib@^2.4.0, tslib@^2.6.0: +tslib@^2.0.3, tslib@^2.1.0, tslib@^2.4.0, 
tslib@^2.5.0, tslib@^2.6.0: version "2.6.2" resolved "https://registry.yarnpkg.com/tslib/-/tslib-2.6.2.tgz#703ac29425e7b37cd6fd456e92404d46d1f3e4ae" integrity sha512-AEYxH93jGFPn/a2iVAwW87VuUIkR1FVUKB77NwMF7nBTDkDrrT/Hpt/IrCJ0QXhW27jTBDcf5ZY7w6RiqTMw2Q== @@ -11205,11 +11148,11 @@ wait-on@^6.0.1: rxjs "^7.5.4" wait-on@^7.0.1: - version "7.1.0" - resolved "https://registry.yarnpkg.com/wait-on/-/wait-on-7.1.0.tgz#3184ccfff7eb8a4d62ef3dfa6a4ff3675617ff60" - integrity sha512-U7TF/OYYzAg+OoiT/B8opvN48UHt0QYMi4aD3PjRFpybQ+o6czQF8Ig3SKCCMJdxpBrCalIJ4O00FBof27Fu9Q== + version "7.2.0" + resolved "https://registry.yarnpkg.com/wait-on/-/wait-on-7.2.0.tgz#d76b20ed3fc1e2bebc051fae5c1ff93be7892928" + integrity sha512-wCQcHkRazgjG5XoAq9jbTMLpNIjoSlZslrJ2+N9MxDsGEv1HnFoVjOCexL0ESva7Y9cu350j+DWADdk54s4AFQ== dependencies: - axios "^0.27.2" + axios "^1.6.1" joi "^17.11.0" lodash "^4.17.21" minimist "^1.2.8" @@ -11251,23 +11194,19 @@ webidl-conversions@^3.0.0: integrity sha512-2JAn3z8AR6rjK8Sm8orRC0h/bcl/DqL7tRPdGZ4I1CjdF+EaMLmYxBHyXuKL849eucPFhvBoxMsflfOb8kxaeQ== webpack-bundle-analyzer@^4.5.0, webpack-bundle-analyzer@^4.9.0: - version "4.9.1" - resolved "https://registry.yarnpkg.com/webpack-bundle-analyzer/-/webpack-bundle-analyzer-4.9.1.tgz#d00bbf3f17500c10985084f22f1a2bf45cb2f09d" - integrity sha512-jnd6EoYrf9yMxCyYDPj8eutJvtjQNp8PHmni/e/ulydHBWhT5J3menXt3HEkScsu9YqMAcG4CfFjs3rj5pVU1w== + version "4.10.0" + resolved "https://registry.yarnpkg.com/webpack-bundle-analyzer/-/webpack-bundle-analyzer-4.10.0.tgz#eecb0ade9bd1944d3d2e38262ec9793da6f13e69" + integrity sha512-j+apH0Cs+FY8IOIwxLbkgEJnbQgEPEG8uqLVnRb9tAoGbyKNxQA1u9wNDrTQHK3PinO4Pckew7AE7pnX/RS3wA== dependencies: "@discoveryjs/json-ext" "0.5.7" acorn "^8.0.4" acorn-walk "^8.0.0" commander "^7.2.0" + debounce "^1.2.1" escape-string-regexp "^4.0.0" gzip-size "^6.0.0" + html-escaper "^2.0.2" is-plain-object "^5.0.0" - lodash.debounce "^4.0.8" - lodash.escape "^4.0.1" - lodash.flatten "^4.4.0" - lodash.invokemap "^4.6.0" - 
lodash.pullall "^4.2.0" - lodash.uniqby "^4.7.0" opener "^1.5.2" picocolors "^1.0.0" sirv "^2.0.3" @@ -11578,4 +11517,4 @@ zwitch@^1.0.0: zwitch@^2.0.0: version "2.0.4" resolved "https://registry.yarnpkg.com/zwitch/-/zwitch-2.0.4.tgz#c827d4b0acb76fc3e685a4c6ec2902d51070e9d7" - integrity sha512-bXE4cR/kVZhKZX/RjPEflHaKVhUVl85noU3v6b8apfQEc1x4A+zBxjZ4lN8LqGd6WZ3dl98pY4o717VFmoPp+A== + integrity sha512-bXE4cR/kVZhKZX/RjPEflHaKVhUVl85noU3v6b8apfQEc1x4A+zBxjZ4lN8LqGd6WZ3dl98pY4o717VFmoPp+A== \ No newline at end of file diff --git a/examples/interface/README.md b/examples/interface/README.md new file mode 100644 index 000000000..6d4df5aae --- /dev/null +++ b/examples/interface/README.md @@ -0,0 +1,49 @@ +This guide provides instructions to create a chatbot powered by Nitro using the GGUF model. + +## Step 1: Download the Model + +First, you'll need to download the chatbot model. + +1. **Navigate to the Models Folder** + - Open your project directory. + - Locate and open the `models` folder within the directory. + +2. **Select a GGUF Model** + - Visit the Hugging Face repository at [TheBloke's Models](https://huggingface.co/TheBloke). + - Browse through the available models. + - Choose the model that best fits your needs. + +3. **Download the Model** + - Once you've selected a model, download it using a command like the one below. Replace `` with the path of your chosen model. + + ```zsh title="This is an example of downloading Zephyr 7B Q5" + wget https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/resolve/main/zephyr-7b-beta.Q5_K_M.gguf?download=true + ``` + +## Step 2: Load model +Now, you'll set up the model in your application. + +1. **Open `app.py` File** + + - In your project directory, find and open the app.py file. + +2. **Configure the Model Path** + + - Modify the model path in app.py to point to your downloaded model. + - Update the configuration parameters as necessary. 
+ + ```zsh title="Example Configuration" {2} + dat = { + "llama_model_path": "nitro/interface/models/zephyr-7b-beta.Q5_K_M.gguf", + "ctx_len": 2048, + "ngl": 100, + "embedding": True, + "n_parallel": 4, + "pre_prompt": "A chat between a curious user and an artificial intelligence", + "user_prompt": "USER: ", + "ai_prompt": "ASSISTANT: "} + ``` + +Congratulations! Your Nitro chatbot is now set up. Feel free to experiment with different configuration parameters to tailor the chatbot to your needs. + +For more information on parameter settings and their effects, please refer to Run Nitro(using-nitro) for a comprehensive parameters table. \ No newline at end of file diff --git a/examples/interface/app.py b/examples/interface/app.py new file mode 100644 index 000000000..3b9e32b05 --- /dev/null +++ b/examples/interface/app.py @@ -0,0 +1,92 @@ +import gradio as gr +import os +import time +import json +import requests + +# URLs for model loading and chat completion +load_model_url = "http://localhost:3928/inferences/llamacpp/loadmodel" +chat_completion_url = "http://localhost:3928/inferences/llamacpp/chat_completion" + +headers = { + 'Content-Type': 'application/json' +} + +# Function to load the model +def load_model(): + load_data = { + "llama_model_path": "nitro/model/llama-2-7b-chat.Q5_K_M.gguf?download=true" + # Add other necessary parameters if required + } + response = requests.post(load_model_url, headers=headers, data=json.dumps(load_data)) + if response.status_code != 200: + print("Error in loading model: ", response.status_code, response.text) + else: + print("Model loaded successfully.") + +# Define the vote function for like/dislike functionality +def vote(data: gr.LikeData): + if data.liked: + print("You upvoted this response: " + data.value) + else: + print("You downvoted this response: " + data.value) + +# Function to handle text input +def add_text(history, text): + return history + [(text, None)] + +# Function to handle file input +def add_file(history, 
file): + return history + [((file.name,), None)] + +# Bot response function with markdown support and like/dislike feature +def bot(history): + last_message = history[-1][0] if history else "" + dat = { + "llama_model_path": "nitro/model/llama-2-7b-chat.Q5_K_M.gguf?download=true", + "messages": [ + { + "role": "user", + "content": last_message + }, + ] + } + + response = requests.post(chat_completion_url, headers=headers, data=json.dumps(dat)) + + if response.status_code == 200: + response_text = response.text + output = json.loads(response_text) + final_response = output['response'] + else: + print("Error: ", response.status_code, response.text) + + history[-1][1] = final_response + yield history + +# Load the model at the start +load_model() + +# Setup the chatbot and input components +with gr.Blocks() as demo: + chatbot = gr.Chatbot( + [], + elem_id="chatbot", + bubble_full_width=False, + avatar_images=(None, (os.path.join(os.path.dirname(__file__), "nitro/example/avatar.png"))), + ) + + with gr.Row(): + txt = gr.Textbox(scale=4, show_label=False, placeholder="Enter text and press enter, or upload an image", container=False) + btn = gr.UploadButton("📁", file_types=["image", "video", "audio"]) + + txt_msg = txt.submit(add_text, [chatbot, txt], [chatbot], queue=False).then(bot, chatbot, chatbot, api_name="bot_response") + file_msg = btn.upload(add_file, [chatbot, btn], [chatbot], queue=False).then(bot, chatbot, chatbot) + + # Attach the like/dislike functionality to the chatbot + chatbot.like(vote, None, None) + +# Launch the application +if __name__ == "__main__": + demo.queue() + demo.launch(allowed_paths=["nitro/example/avatar.png"]) \ No newline at end of file diff --git a/examples/interface/avatar.png b/examples/interface/avatar.png new file mode 100644 index 000000000..983b48097 Binary files /dev/null and b/examples/interface/avatar.png differ
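Note on `examples/interface/app.py`: in `bot`, `final_response` is only assigned when the request returns HTTP 200, so a failed call would raise a `NameError` at `history[-1][1] = final_response`. A minimal sketch of one way to guard against this is shown below. The helper names `build_chat_payload` and `extract_reply` and the `FALLBACK_REPLY` text are hypothetical and not part of the patch. The `"response"` key is taken from the patch's own parsing code, and the sketch covers only the `messages` field of the request body.

```python
import json

# Hypothetical fallback text; not part of the original patch.
FALLBACK_REPLY = "Sorry, the model server returned an error."


def build_chat_payload(history):
    """Build the request body from the latest user message in the history."""
    last_message = history[-1][0] if history else ""
    return {"messages": [{"role": "user", "content": last_message}]}


def extract_reply(status_code, body_text):
    """Return the model's reply, or a fallback string when the call failed."""
    if status_code != 200:
        return FALLBACK_REPLY
    try:
        # app.py reads the reply from the "response" key of the JSON body.
        return json.loads(body_text)["response"]
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, a missing key, or an unexpected body shape.
        return FALLBACK_REPLY
```

Inside `bot`, the result of `requests.post(...)` could then be passed through as `extract_reply(response.status_code, response.text)`, so that `final_response` is defined on every code path.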