A simple Gradio WebUI for loading/unloading models and loras for tabbyAPI. This provides demo functionality for accessing tabbyAPI's extensive feature base via API, and can be run remotely on a separate system.
This repo is meant to serve as a demo of the API's features and to provide an accessible way to change models without editing the config and restarting the instance. It supports speculative decoding and loading multiple loras with custom scaling.
This WebUI does not provide an LLM inference frontend - use any OAI-compatible inference frontend of your choosing.
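Under the hood, the model and lora management this WebUI wraps is plain HTTP against tabbyAPI's admin endpoints. The sketch below shows roughly what loading a model with a draft model (speculative decoding) and loading loras with custom scaling looks like; the endpoint paths, the `x-admin-key` header, and the payload fields are assumptions based on tabbyAPI's admin API and may differ between versions, so check your instance's docs.

```python
# Sketch of the tabbyAPI admin calls this WebUI wraps. Endpoint paths,
# the x-admin-key header, and payload fields are assumptions; verify
# against your tabbyAPI version before relying on them.
import requests

ENDPOINT = "http://localhost:5000"            # your tabbyAPI instance
HEADERS = {"x-admin-key": "YOUR_ADMIN_KEY"}   # admin key placeholder

# Load a model, optionally with a draft model for speculative decoding
requests.post(
    f"{ENDPOINT}/v1/model/load",
    headers=HEADERS,
    json={
        "name": "MyModel-exl2",                           # hypothetical model folder name
        "draft": {"draft_model_name": "MyDraftModel-exl2"},
    },
)

# Load multiple loras, each with its own scaling factor
requests.post(
    f"{ENDPOINT}/v1/lora/load",
    headers=HEADERS,
    json={
        "loras": [
            {"name": "lora-a", "scaling": 1.0},
            {"name": "lora-b", "scaling": 0.5},
        ]
    },
)
```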
To get started, make sure you have the following installed on your system:
- Python 3.8+ (preferably 3.11) with pip
- Clone this repository to your machine:
```
git clone https://github.com/theroyallab/tabbyAPI-gradio-loader
```
- Navigate to the project directory:
```
cd tabbyAPI-gradio-loader
```
- Create a python virtual environment:
```
python -m venv venv
```
- Activate the virtual environment:
- On Windows (using PowerShell or Windows Terminal):

```
.\venv\Scripts\activate
```

- On Linux:

```
source venv/bin/activate
```
- Install the requirements file:
```
pip install -r requirements.txt
```
- Make sure you are in the project directory with the virtual environment activated
- Run the WebUI application:
```
python webui.py
```
- Input your tabbyAPI endpoint URL and admin key, then press Connect!
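If you want to confirm your endpoint URL and admin key are valid before connecting (roughly what the Connect button does), you can query the currently loaded model. A minimal sketch, assuming tabbyAPI exposes `GET /v1/model` and accepts the admin key via an `x-admin-key` header:

```python
# Connectivity check; endpoint path and header name are assumptions
# about tabbyAPI's API, so verify them against your instance.
import requests

resp = requests.get(
    "http://localhost:5000/v1/model",
    headers={"x-admin-key": "YOUR_ADMIN_KEY"},
    timeout=5,
)
resp.raise_for_status()
print(resp.json())  # info about the currently loaded model, if any
```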
| Argument | Description |
|---|---|
| `-h` or `--help` | Show this help message and exit |
| `-p` or `--port` | Specify the port to host the WebUI on (default: 7860) |
| `-l` or `--listen` | Share the WebUI link via LAN |
| `-s` or `--share` | Share the WebUI link remotely via Gradio's built-in tunnel |
| `-a` or `--autolaunch` | Launch a browser after starting the WebUI |
| `-e` or `--endpoint_url` | TabbyAPI endpoint URL (default: http://localhost:5000) |
| `-k` or `--admin_key` | TabbyAPI admin key; connects automatically on launch |
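For example, to host the WebUI on your LAN on a custom port and connect to a tabbyAPI instance automatically at launch (the URL and key below are placeholders):

```
python webui.py --listen --port 7861 --endpoint_url http://localhost:5000 --admin_key YOUR_ADMIN_KEY
```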