Merge pull request #1696 from janhq/dev
Chore: Sync dev to main
hiento09 authored Nov 18, 2024
2 parents 49f72aa + 6892823 commit 8d60c95
Showing 702 changed files with 67,141 additions and 42,482 deletions.
173 changes: 173 additions & 0 deletions .github/ISSUE_TEMPLATE/QA_checklist.md
@@ -0,0 +1,173 @@
---
name: QA Checklist
about: QA Checklist
title: 'QA: [VERSION]'
labels: 'type: QA checklist'
assignees: ''
---
**QA details:**

Version: `v1.0.x-xxx`

OS (select one)
- [ ] Windows 11 (online & offline)
- [ ] Ubuntu 24, 22 (online & offline)
- [ ] Mac Silicon OS 14/15 (online & offline)
- [ ] Mac Intel (online & offline)

--------

# 1. Manual QA (CLI)
## Installation
- [ ] it should install with local installer (default; no internet required during installation, all dependencies bundled)
- [ ] it should install with network installer
- [ ] it should install 2 binaries (cortex and cortex-server) [mac: binaries in `/usr/local/bin`]
- [ ] it should install with correct folder permissions
- [ ] it should install with folders: /engines /logs (no /models folder until model pull)
- [ ] It should install with Docker image https://cortex.so/docs/installation/docker/
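A quick post-install spot check for the items above might look like this (a sketch assuming a macOS/Linux install and the default `~/cortex` data folder; paths differ on Windows):
```
# both binaries should be installed (mac: /usr/local/bin)
ls -l /usr/local/bin/cortex /usr/local/bin/cortex-server

# data folder should contain engines/ and logs/, but no models/ until a model is pulled
ls ~/cortex
```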

## Data/Folder structures
- [ ] cortex.so models are stored in `cortex.so/model_name/variants/`, with .gguf and model.yml file
- [ ] huggingface models are stored `huggingface.co/author/model_name` with .gguf and model.yml file
- [ ] downloaded models are saved in cortex.db with the right fields: `model`, `author_repo_id`, `branch_name`, `path_to_model_yaml` (view via SQL)
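One way to eyeball the cortex.db records above (a sketch; the database path and table name are assumptions):
```
# inspect downloaded-model records; table name "models" is assumed
sqlite3 ~/cortex/cortex.db \
  "SELECT model, author_repo_id, branch_name, path_to_model_yaml FROM models;"
```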

## Cortex Update
- [ ] `cortex -v` should output the current version and check for updates
- [ ] `cortex update` replaces the app, installer, uninstaller and binary file (without installing cortex.llamacpp)
- [ ] `cortex update` should update from ~3-5 versions ago to latest (+3 to 5 bump)
- [ ] `cortex update` should update from the previous version to latest (+1 bump)
- [ ] `cortex update -v 1.x.x-xxx` should update from the previous version to specified version
- [ ] `cortex update` should update from previous stable version to latest
- [ ] it should gracefully update when server is actively running

## Overall / App Shell
- [ ] cortex returns helpful text in a timely way (< 5s)
- [ ] `cortex` or `cortex -h` displays help commands
- [ ] CLI commands should start the API server if it is not running [except
- [ ] it should correctly log to cortex-cli.log and cortex.log
- [ ] There should be no stdout from inactive shell sessions

## Engines
- [ ] llama.cpp should be installed by default
- [ ] it should run gguf models on llamacpp
- [ ] it should list engines
- [ ] it should get engines
- [ ] it should install engines (latest version if not specified)
- [ ] it should install engines (with specified variant and version)
- [ ] it should get default engine
- [ ] it should set default engine (with specified variant/version)
- [ ] it should load engine
- [ ] it should unload engine
- [ ] it should update engine (to latest version)
- [ ] it should update engine (to specified version)
- [ ] it should uninstall engines
- [ ] it should gracefully continue engine installation if interrupted halfway (partial download)
- [ ] it should gracefully handle when users try to CRUD incompatible engines (No variant found for xxx)
- [ ] it should run trtllm models on trt-llm [WIP, not tested]
- [ ] it should handle engine variants [WIP, not tested]
- [ ] it should update engine versions [WIP, not tested]
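A rough pass over the engine commands above (a sketch; exact subcommand/flag spellings and the `llama-cpp` engine name may differ from the shipped CLI):
```
cortex engines list                             # list engines
cortex engines install llama-cpp                # latest version if none specified
cortex engines install llama-cpp -v <version>   # specific variant/version (flag name assumed)
cortex engines load llama-cpp
cortex engines unload llama-cpp
cortex engines uninstall llama-cpp
```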

## Server
- [ ] `cortex start` should start server and output localhost URL & port number
- [ ] users can access API Swagger documentation page at localhost URL & port number
- [ ] `cortex start` can be configured with parameters (port, [logLevel [WIP]](https://github.com/janhq/cortex.cpp/pull/1636)) https://cortex.so/docs/cli/start/
- [ ] it should correctly log to cortex logs (logs/cortex.log, logs/cortex-cli.log)
- [ ] `cortex ps` should return server status and running models (or no model loaded)
- [ ] `cortex stop` should stop server
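A minimal server smoke test for the items above (the port flag and value are assumptions; see https://cortex.so/docs/cli/start/ for the supported parameters):
```
cortex start -p 39281                  # flag name assumed
curl http://127.0.0.1:39281/healthz    # server should respond once started
cortex ps                              # server status + running models
cortex stop
```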

## Model Pulling
- [ ] Pulling a model should pull .gguf and model.yml file
- [ ] Model download progress should appear as download bars for each file
- [ ] Model download progress should be accurate (%, total time, download size, speed)
### cortex.so
- [ ] it should pull by built-in model_id
- [ ] pull by model_id should recommend the default variant at the top (set in HF model.yml)
- [ ] it should pull by built-in model_id:variant
### huggingface.co
- [ ] it should pull by HF repo/model ID
- [ ] it should pull by full HF url (ending in .gguf)
### Interrupted Download
- [ ] it should allow the user to interrupt / stop a download
- [ ] pulling again after an interruption should accurately calculate the remaining model file size to be downloaded (`Found unfinished download! Additional XGB needs to be downloaded`)
- [ ] it should allow the user to continue downloading the remainder after an interruption
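Example pulls covering the cases above (model IDs are placeholders, not a required test set):
```
cortex pull tinyllama                                   # built-in cortex.so model_id
cortex pull tinyllama:1b-gguf                           # model_id:variant (variant name assumed)
cortex pull bartowski/Meta-Llama-3.1-8B-Instruct-GGUF   # HF repo ID
# interrupt with Ctrl+C, then re-run the same pull to verify the resume behaviour
```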

## Model Management
- [ ] it should list downloaded models
- [ ] it should get a local model
- [ ] it should update model parameters in model.yaml
- [ ] it should delete a model
- [ ] it should import models with model_id and model_path
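A hedged sketch of the management commands above (import flag names are assumptions):
```
cortex models list
cortex models get <model_id>
cortex models update <model_id> ...       # edits model.yaml parameters
cortex models delete <model_id>
cortex models import --model_id <model_id> --model_path /path/to/model.gguf
```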

## Model Running
- [ ] `cortex run <cortexso model>` - if no local models detected, shows `pull` model menu
- [ ] `cortex run` - if local model detected, runs the local model
- [ ] `cortex run` - if multiple local models detected, shows list of local models (from multiple model sources eg cortexso, HF authors) for users to select (via regex search)
- [ ] `cortex run <invalid model id>` should gracefully return `Model not found!`
- [ ] run should autostart server
- [ ] `cortex run <model>` starts interactive chat (by default)
- [ ] `cortex run <model> -d` runs in detached mode
- [ ] `cortex models start <model>`
- [ ] terminating StdIn or `exit()` should exit interactive chat

## Hardware Detection / Acceleration [WIP, no need to QA]
- [ ] it should auto offload max ngl
- [ ] it should correctly detect available GPUs
- [ ] it should gracefully detect missing dependencies/drivers
  - CPU extensions (e.g. AVX2, noAVX, AVX-512)
  - GPU acceleration (e.g. CUDA 11, CUDA 12, Vulkan, SYCL, etc.)

## Uninstallation / Reinstallation
- [ ] it should uninstall 2 binaries (cortex and cortex-server)
- [ ] it should uninstall with the option to delete or keep the data folder
- [ ] it should gracefully uninstall when server is still running
- [ ] uninstalling should not leave any dangling files
- [ ] uninstalling should not leave any dangling processes
- [ ] it should reinstall without having conflict issues with existing cortex data folders

--------
# 2. API QA

## Checklist for each endpoint
- [ ] Upon `cortex start`, the API documentation (Swagger) page is displayed at the localhost:port endpoint
- [ ] Endpoints should support the parameters stated in API reference (towards OpenAI Compatibility)
- [ ] https://cortex.so/api-reference is updated

## Endpoints
### Chat Completions
- [ ] POST `/v1/chat/completions`
- [ ] Cortex supports Function Calling #295
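A minimal request shape for the endpoint above, following the OpenAI-compatible format (port and model ID are placeholders):
```
curl http://127.0.0.1:39281/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tinyllama",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": false
  }'
```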

### Engines
- [ ] List engines: GET `/v1/engines`
- [ ] Get engine: GET `/v1/engines/{name}`
- [ ] Install engine: POST `/v1/engines/install/{name}`
- [ ] Get default engine variant/version: GET `/v1/engines/{name}/default`
- [ ] Set default engine variant/version: POST `/v1/engines/{name}/default`
- [ ] Load engine: POST `/v1/engines/{name}/load`
- [ ] Unload engine: DELETE `/v1/engines/{name}/load`
- [ ] Update engine: POST `/v1/engines/{name}/update`
- [ ] Uninstall engine: DELETE `/v1/engines/install/{name}`
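Spot checks for the engine endpoints above (port and the `llama-cpp` engine name are assumptions):
```
curl http://127.0.0.1:39281/v1/engines                               # list engines
curl -X POST http://127.0.0.1:39281/v1/engines/install/llama-cpp     # install
curl http://127.0.0.1:39281/v1/engines/llama-cpp/default             # get default variant/version
curl -X DELETE http://127.0.0.1:39281/v1/engines/install/llama-cpp   # uninstall
```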

### Pulling Models
- [ ] Pull model: POST `/v1/models/pull` starts download (websockets)
- [ ] Pull model: `websockets /events` emitted
- [ ] Stop model download: DELETE `/v1/models/pull` (websockets)
- [ ] Stop model download: `websockets /events` stopped
- [ ] Import model: POST `/v1/models/import`
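A sketch for the pull endpoints above (request body fields are assumptions; progress should be emitted on the `/events` websocket):
```
curl -X POST http://127.0.0.1:39281/v1/models/pull \
  -H "Content-Type: application/json" -d '{"model": "tinyllama"}'    # start download

curl -X DELETE http://127.0.0.1:39281/v1/models/pull \
  -H "Content-Type: application/json" -d '{"model": "tinyllama"}'    # stop download
```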

### Running Models
- [ ] List models: GET `/v1/models`
- [ ] Start model: POST `/v1/models/start`
- [ ] Stop model: POST `/v1/models/stop`
- [ ] Get model: GET `/v1/models/{id}`
- [ ] Delete model: DELETE `/v1/models/{id}`
- [ ] Update model: PATCH `/v1/models/{model}` updates model.yaml params
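Spot checks for the model endpoints above (port, model ID, and body fields are assumptions):
```
curl http://127.0.0.1:39281/v1/models                                 # list models
curl -X POST http://127.0.0.1:39281/v1/models/start \
  -H "Content-Type: application/json" -d '{"model": "tinyllama"}'
curl -X POST http://127.0.0.1:39281/v1/models/stop \
  -H "Content-Type: application/json" -d '{"model": "tinyllama"}'
curl -X DELETE http://127.0.0.1:39281/v1/models/tinyllama             # delete by id
```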

## Server
- [ ] CORS [WIP]
- [ ] health: GET `/healthz`
- [ ] terminate server: DELETE `/processManager/destroy`
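Quick checks for the two endpoints above (port is a placeholder):
```
curl http://127.0.0.1:39281/healthz                              # server health
curl -X DELETE http://127.0.0.1:39281/processManager/destroy     # terminate server
```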
--------
Test list for reference:
- #1357 e2e tests for APIs in CI
- #1147, #1225 for starting QA list
38 changes: 0 additions & 38 deletions .github/ISSUE_TEMPLATE/bug_report.md

This file was deleted.

58 changes: 58 additions & 0 deletions .github/ISSUE_TEMPLATE/bug_report.yml
@@ -0,0 +1,58 @@
name: "\U0001F41B Bug Report"
description: "If something isn't working as expected \U0001F914"
labels: [ "type: bug" ]
title: 'bug: [DESCRIPTION]'

body:
- type: input
validations:
required: true
attributes:
label: "Cortex version"
description: "**Tip:** `cortex -v` outputs the version number"

- type: textarea
validations:
required: true
attributes:
label: "Describe the issue and expected behaviour"
description: "A clear & concise description of the issue encountered"

- type: textarea
attributes:
label: "Steps to Reproduce"
description: |
Please list out steps to reproduce the issue
placeholder: |
1. Go to '...'
2. Click on '...'
- type: textarea
attributes:
label: "Screenshots / Logs"
description: |
Please include cortex-cli.log and cortex.log files in: ~/cortex/logs/
- type: checkboxes
attributes:
label: "What is your OS?"
options:
- label: Windows
- label: Mac Silicon
- label: Mac Intel
- label: Linux / Ubuntu

- type: checkboxes
attributes:
label: "What engine are you running?"
options:
- label: cortex.llamacpp (default)
- label: cortex.tensorrt-llm (Nvidia GPUs)
- label: cortex.onnx (NPUs, DirectML)

- type: input
attributes:
label: "Hardware Specs eg OS version, GPU"
description:


7 changes: 7 additions & 0 deletions .github/ISSUE_TEMPLATE/config.yml
@@ -0,0 +1,7 @@
## To encourage contributors to use issue templates, we don't allow blank issues
blank_issues_enabled: false

contact_links:
  - name: "\U0001F4AC Cortex Discussions"
    url: "https://github.com/orgs/janhq/discussions/categories/q-a"
    about: "Get help, discuss features & roadmap, and share your projects"
14 changes: 0 additions & 14 deletions .github/ISSUE_TEMPLATE/discussion-thread.md

This file was deleted.

57 changes: 49 additions & 8 deletions .github/ISSUE_TEMPLATE/epic-request.md
@@ -6,15 +6,56 @@ labels: 'type: epic'
assignees: ''

---
## Goal

**Problem**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

**Success Criteria**
A clear and concise description of what you want to happen.
## Success Criteria

**Sub Issues**
-

**Additional context**
Add any other context or screenshots about the epic request here.
## Tasklist
- [ ]

## API / CLI Documentation
### API
#### 1. Feature
```
GET /v1/endpoint
```

Body:
```json
{
  "key": "value"
}
```
**Response**
```json
200
{
}
Error
{
}
```

### CLI
#### 1. Feature
```
GET /v1/endpoint
```
Response:
```
```
#### Help Command
```
❯ cortex ...
Usage:
  cortex [options] [subcommand]
Options:
  -h,--help                   Print this help message and exit
  ...                         ...
Subcommands:
  start                       Start a model by ID
  ...                         ...
```
17 changes: 0 additions & 17 deletions .github/ISSUE_TEMPLATE/feature_request.md

This file was deleted.
