feat: update README.md (#6)

* feat: update README.md

* fix: remove cortex-cpp

---------

Co-authored-by: vansangpfiev <[email protected]>
vansangpfiev and sangjanai authored Jun 14, 2024
1 parent c90a3a9 commit 0f7b3f0
Showing 1 changed file with 102 additions and 1 deletion.
README.md
@@ -1 +1,102 @@
# cortex.onnx
cortex.onnx is a high-efficiency C++ inference engine for edge computing, targeting Windows and using DirectML for GPU acceleration.

It is a dynamic library that can be loaded by any server at runtime.

# Repo Structure
```
.
├── base -> Engine interface
├── examples -> Server example to integrate engine
├── onnxruntime-genai -> Upstream onnxruntime-genai
├── src -> Engine implementation
└── third-party -> Dependencies of the cortex.onnx project
```

## Build from source

This guide provides step-by-step instructions for building cortex.onnx from source on Windows systems.

## Clone the Repository

First, you need to clone the cortex.onnx repository:

```bash
git clone --recurse-submodules https://github.com/janhq/cortex.onnx.git
```
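
The `--recurse-submodules` flag pulls in the vendored dependencies (the `onnxruntime-genai` and `third-party` directories appear to be tracked this way). If you cloned without it, the standard Git command below fetches them after the fact:

```bash
# Fetch any submodules that were skipped during the initial clone
git submodule update --init --recursive
```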

If you don't have Git installed, you can download the source code as an archive from the [cortex.onnx GitHub page](https://github.com/janhq/cortex.onnx). Note that archive downloads do not include the submodules.

## Build library with server example
- **On Windows**

  Install CMake and MSBuild (available with Visual Studio Build Tools), then run:

```bash
# Build dependencies
./build_cortex_onnx.bat
# Build engine
mkdir build
cd build
cmake ..
cmake --build . --config Release -j4
# Build server example (run these commands from the repository root)
mkdir -p examples/server/build
cd examples/server/build
cmake ..
cmake --build . --config Release -j4
```
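
If both builds succeed, the artifacts used in the Quickstart below should exist; a quick sanity check from the repository root:

```bash
# These paths match the ones the Quickstart copies from
ls build/Release/engine.dll
ls examples/server/build/Release/server.exe
```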

# Quickstart
**Step 1: Download a model**

Clone a model from https://huggingface.co/cortexhub and check out its `dml` branch.
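
A minimal sketch, assuming a hypothetical `llama3` repository (substitute the model you actually want). Hugging Face model repositories store their weights with Git LFS, and the `model/llama3` destination matches the `model_path` used in Step 3, relative to the directory where `server.exe` runs:

```bash
# llama3 is an illustrative repository name; browse https://huggingface.co/cortexhub for real ones
git lfs install
git clone --branch dml https://huggingface.co/cortexhub/llama3 model/llama3
```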

**Step 2: Start the server**
- **On Windows**

```bash
cd examples/server/build/Release
# Copy the engine library and the onnxruntime-genai DLLs next to the server
mkdir -p engines/cortex.onnx/
cp ../../../../build/Release/engine.dll engines/cortex.onnx/
cp ../../../../onnxruntime-genai/build/Release/*.dll ./
./server.exe
```

**Step 3: Load the model**

With the server running in the foreground, open a second terminal in the same directory and load the model:
```bash title="Load model"
curl http://localhost:3928/loadmodel \
-H 'Content-Type: application/json' \
-d '{
"model_path": "./model/llama3",
"model_alias": "llama3",
"system_prompt": "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n",
"user_prompt": "<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n",
"ai_prompt": "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
}'
```
**Step 4: Make an inference**

```bash title="cortex.onnx Inference"
curl http://localhost:3928/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Who won the world series in 2020?"
}
],
"model": "llama3"
}'
```
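
The endpoint follows the OpenAI chat-completions shape, so common optional fields may work as well; the `stream` and `max_tokens` fields below are assumptions based on that convention, not something this README confirms:

```bash
# stream and max_tokens are assumed OpenAI-style options; verify against the server implementation
curl http://localhost:3928/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Summarize DirectML in one sentence."}
    ],
    "model": "llama3",
    "stream": true,
    "max_tokens": 128
  }'
```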

Table of parameters for the `/loadmodel` request (the prompt-template fields correspond to the markers shown in the Step 3 example):

| Parameter       | Type   | Description                                                                        |
|-----------------|--------|------------------------------------------------------------------------------------|
| `model_path`    | String | The file path to the ONNX model.                                                   |
| `model_alias`   | String | The alias used to refer to the model in later requests (the `model` field above).  |
| `system_prompt` | String | Chat-template prefix for system messages.                                          |
| `user_prompt`   | String | Chat-template prefix for user messages.                                            |
| `ai_prompt`     | String | Chat-template prefix for assistant responses.                                      |
