Avalonia-ai-assistant is a sample AI chat assistant built with the Avalonia UI framework and the LLamaSharp library. It lets you run the most popular AI models locally on a consumer GPU (or on the CPU alone).
Run `dotnet build -c Release` to build the CPU version, or `dotnet build -c Release_Cuda` if you have a CUDA 12 compatible GPU.
Download any LLM model in GGUF format. For example:
| Model | Link |
|---|---|
| LLama2 7B | https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF |
| LLama2 8B Pro Inst | https://huggingface.co/TheBloke/LLaMA-Pro-8B-Instruct-GGUF |
| CodeLLama 7B Inst | https://huggingface.co/TheBloke/CodeLlama-7B-Instruct-GGUF |
| LLama3 8B Inst | https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF |
| Phi 3 mini Inst | https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf |
| Phi 3 medium Inst | https://huggingface.co/QuantFactory/Phi-3-medium-4k-instruct-GGUF |
| Codestral 22B | https://huggingface.co/bartowski/Codestral-22B-v0.1-GGUF |
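Once a model file is downloaded, LLamaSharp is what actually loads it. The following is a minimal sketch of that step (the path and parameter values here are illustrative; in the application they come from `appsettings.json`, described next):

```csharp
using LLama;
using LLama.Common;

// Illustrative path and values; substitute the model file you downloaded.
var parameters = new ModelParams(@"C:\ML\Models\LLaMA\llama-2-7b-chat.Q4_K_M.gguf")
{
    ContextSize = 4096,  // must not exceed the model's trained context window
    GpuLayerCount = 33   // layers offloaded to the GPU (0 = CPU only)
};

// Load the weights once, then create an inference context over them.
using var model = LLamaWeights.LoadFromFile(parameters);
using var context = model.CreateContext(parameters);
```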
Edit the `appsettings.json` file to set the default path to the downloaded model, and set correct values for the model's max context size and other parameters. For example, for LLama 2 it can be:
"ModelParams": {
"FileName": "llama-2-7b-chat.Q4_K_M.gguf",
"Path": "C:\\ML\\Models\\LLaMA\\",
"GpuLayerCount": "33",
"TotalLayerCount": "33",
"ContextSize": "4096",
"CustomHistoryTransformer": "Llama2"
}
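How the application binds this section is internal to the project, but reading it with Microsoft.Extensions.Configuration and mapping it onto LLamaSharp's `ModelParams` could look roughly like this (a sketch; it assumes the Microsoft.Extensions.Configuration.Json package and the key names shown above):

```csharp
using Microsoft.Extensions.Configuration;
using LLama.Common;

// Read the "ModelParams" section from appsettings.json.
var config = new ConfigurationBuilder()
    .AddJsonFile("appsettings.json")
    .Build();
var section = config.GetSection("ModelParams");

// Combine Path + FileName into the full model path and map the numeric
// values. TotalLayerCount and CustomHistoryTransformer look like
// app-level settings, so they are not mapped to LLamaSharp here.
var modelParams = new ModelParams(section["Path"] + section["FileName"])
{
    ContextSize = uint.Parse(section["ContextSize"] ?? "4096"),
    GpuLayerCount = int.Parse(section["GpuLayerCount"] ?? "0")
};
```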
Note: do not forget to add the `"CustomHistoryTransformer": "Llama2"` attribute for any LLama 2 or Mistral family model (remove it for other model types).
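The reason is that LLama 2 and Mistral chat models were trained on prompts wrapped in `[INST]` tags rather than on a plain role-prefixed transcript, so the chat history has to be rendered accordingly. As a rough illustration only (the app's built-in `Llama2` transformer may differ in detail, for example in system-prompt handling):

```csharp
using System.Text;
using LLama.Common;

// Illustrative helper: renders a ChatHistory in the [INST] ... [/INST]
// format expected by LLama 2 / Mistral chat models.
static string RenderLlama2Prompt(ChatHistory history)
{
    var sb = new StringBuilder();
    foreach (var message in history.Messages)
    {
        if (message.AuthorRole == AuthorRole.User)
            sb.Append($"[INST] {message.Content} [/INST]");
        else if (message.AuthorRole == AuthorRole.Assistant)
            sb.Append($" {message.Content} ");
    }
    return sb.ToString();
}
```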
After starting the Avalonia-ai-assistant application, configure the Options and Settings tabs in the UI.
Start a new chat session and enjoy!
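Internally, a chat session on top of LLamaSharp boils down to roughly the following (a sketch continuing the loading code above; the application's actual pipeline, including its history transformer, may differ):

```csharp
using System;
using LLama;
using LLama.Common;

// Drive an interactive session over the previously created context.
var executor = new InteractiveExecutor(context);
var session = new ChatSession(executor);

// Stream the model's reply token by token.
await foreach (var token in session.ChatAsync(
    new ChatHistory.Message(AuthorRole.User, "Hello! What can you do?"),
    new InferenceParams { MaxTokens = 256 }))
{
    Console.Write(token);
}
```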
Avalonia-ai-assistant is released under the MIT License. See the LICENSE file for details.