feat: track the cost associated with each user #198
-
Related: #20, we should try to add this feature to the rest of our proxy servers as well.
-
This would also solve my feature request: open-webui/open-webui#1320
-
@changchiyou do you want to create a key per user? LiteLLM also allows for tracking cost by the `user` param:

```bash
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer sk-zi5onDRdHGD24v0Zdn7VBA' \
--data '{
    "model": "azure-gpt-3.5",
    "user": "ishaan3", # 👈 TRACKING COST FOR THIS USER
    "messages": [
        {
            "role": "user",
            "content": "what time is it"
        }
    ]
}'
```
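For comparison, here's a minimal sketch of the same request sent through the OpenAI Python SDK pointed at the LiteLLM proxy (the base URL, key, and model name are just the placeholders from the curl example above):

```python
from openai import OpenAI

# Point the client at the LiteLLM proxy instead of api.openai.com
client = OpenAI(
    base_url="http://0.0.0.0:4000",
    api_key="sk-zi5onDRdHGD24v0Zdn7VBA",  # proxy key from the curl example
)

response = client.chat.completions.create(
    model="azure-gpt-3.5",
    user="ishaan3",  # 👈 cost is attributed to this user in LiteLLM's spend tracking
    messages=[{"role": "user", "content": "what time is it"}],
)
print(response.choices[0].message.content)
```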
-
@krrishdholakia With the method you provide, which uses the `user` param, do the logs in LiteLLM show the cost associated with each user?
-
Same problem here! We are unable to track costs per user.
-
If it helps, a dirty hack in `backend/apps/[openai|litellm]/main.py`, in the proxy function, starting from `_tmp = json.loads(body)`. And now I can see the right user at Langfuse.
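The exact snippet isn't preserved above, but a hack along those lines might look roughly like this (a sketch only; the surrounding proxy function, the `user` object, and the metadata keys are my assumptions, not the actual open-webui code):

```python
import json

# Hypothetical sketch of the "dirty hack" inside the proxy function:
# decode the request body, attach the logged-in user before forwarding to
# LiteLLM, and re-encode it so the Langfuse callback can attribute the call.
_tmp = json.loads(body)
_tmp["user"] = user.email  # assumed: `user` is the Open WebUI user object
_tmp["metadata"] = {"trace_user_id": user.email}  # assumed key read by LiteLLM's Langfuse callback
body = json.dumps(_tmp).encode("utf-8")
```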
-
We've added LiteLLM support to our pipelines examples (see https://github.com/open-webui/pipelines/blob/main/pipelines/examples/litellm_manifold_pipeline.py and https://github.com/open-webui/pipelines/blob/main/pipelines/examples/litellm_subprocess_manifold_pipeline.py), enabling per-user cost tracking. This will be included with the v0.2.0 release, currently in the dev branch. (We'll be leveraging OpenAI's endpoint to create plugins. Full docs will be available soon!)
-
Hello everyone, I've successfully implemented this feature by integrating Open WebUI, LiteLLM, and Langfuse, utilizing the filter and pipe pipelines below.

filter:

```python
"""
title: Chat Info Filter Pipeline
author: changchiyou
date: 2024-08-02
version: 1.0
license: MIT
description: A filter pipeline that preprocesses form data before requesting chat completion with LiteLLM.
requirements:
"""
from typing import List, Optional
from pydantic import BaseModel
class Pipeline:
class Valves(BaseModel):
# List target pipeline ids (models) that this filter will be connected to.
# If you want to connect this filter to all pipelines, you can set pipelines to ["*"]
# e.g. ["llama3:latest", "gpt-3.5-turbo"]
pipelines: List[str] = []
# Assign a priority level to the filter pipeline.
# The priority level determines the order in which the filter pipelines are executed.
# The lower the number, the higher the priority.
priority: int = 0
def __init__(self):
# Pipeline filters are only compatible with Open WebUI
# You can think of filter pipeline as a middleware that can be used to edit the form data before it is sent to the OpenAI API.
self.type = "filter"
# Optionally, you can set the id and name of the pipeline.
# Best practice is to not specify the id so that it can be automatically inferred from the filename, so that users can install multiple versions of the same pipeline.
# The identifier must be unique across all pipelines.
# The identifier must be an alphanumeric string that can include underscores or hyphens. It cannot contain spaces, special characters, slashes, or backslashes.
self.name = "Chat Info Filter"
# Initialize
self.valves = self.Valves(
**{
"pipelines": ["*"], # Connect to all pipelines
"priority": 0
}
)
self.chat_generations = {}
pass
async def on_startup(self):
# This function is called when the server is started.
print(f"on_startup:{__name__}")
pass
async def on_shutdown(self):
# This function is called when the server is stopped.
print(f"on_shutdown:{__name__}")
pass
async def on_valves_updated(self):
# This function is called when the valves are updated.
pass
async def inlet(self, body: dict, user: Optional[dict] = None) -> dict:
print(f"inlet:{__name__}")
body["custom_metadata"] = {"session_id": body['chat_id']}
if user := user if user else body.get("user"):
body["custom_metadata"]["trace_user_id"] = f'{user["name"]} / {user["email"]}'
else:
print(f"Error: user & body[\"user\"] are both None")
return body
```

pipe:

```python
"""
title: LiteLLM Manifold Pipeline
author: open-webui
date: 2024-05-30
version: 1.0.1
license: MIT
description: A manifold pipeline that uses LiteLLM.
"""
from typing import List, Union, Generator, Iterator
from schemas import OpenAIChatMessage
from pydantic import BaseModel
import requests
import os
class Pipeline:
class Valves(BaseModel):
LITELLM_BASE_URL: str = ""
LITELLM_API_KEY: str = ""
LITELLM_PIPELINE_DEBUG: bool = False
def __init__(self):
# You can also set the pipelines that are available in this pipeline.
# Set manifold to True if you want to use this pipeline as a manifold.
# Manifold pipelines can have multiple pipelines.
self.type = "manifold"
# Optionally, you can set the id and name of the pipeline.
# Best practice is to not specify the id so that it can be automatically inferred from the filename, so that users can install multiple versions of the same pipeline.
# The identifier must be unique across all pipelines.
# The identifier must be an alphanumeric string that can include underscores or hyphens. It cannot contain spaces, special characters, slashes, or backslashes.
# self.id = "litellm_manifold"
# Optionally, you can set the name of the manifold pipeline.
# self.name = "LiteLLM: "
self.name = ""
# Initialize rate limits
self.valves = self.Valves(
**{
"LITELLM_BASE_URL": os.getenv(
"LITELLM_BASE_URL", "http://localhost:4001"
),
"LITELLM_API_KEY": os.getenv("LITELLM_API_KEY", "your-api-key-here"),
"LITELLM_PIPELINE_DEBUG": os.getenv("LITELLM_PIPELINE_DEBUG", True),
}
)
# Get models on initialization
self.pipelines = self.get_litellm_models()
pass
async def on_startup(self):
# This function is called when the server is started.
print(f"on_startup:{__name__}")
# Get models on startup
self.pipelines = self.get_litellm_models()
pass
async def on_shutdown(self):
# This function is called when the server is stopped.
print(f"on_shutdown:{__name__}")
pass
async def on_valves_updated(self):
# This function is called when the valves are updated.
self.pipelines = self.get_litellm_models()
pass
def get_litellm_models(self):
headers = {}
if self.valves.LITELLM_API_KEY:
headers["Authorization"] = f"Bearer {self.valves.LITELLM_API_KEY}"
if self.valves.LITELLM_BASE_URL:
try:
r = requests.get(
f"{self.valves.LITELLM_BASE_URL}/v1/models", headers=headers
)
models = r.json()
return [
{
"id": model["id"],
"name": model["name"] if "name" in model else model["id"],
}
for model in models["data"]
]
except Exception as e:
print(f"Error fetching models from LiteLLM: {e}")
return [
{
"id": "error",
"name": "Could not fetch models from LiteLLM, please update the URL in the valves.",
},
]
else:
print("LITELLM_BASE_URL not set. Please configure it in the valves.")
return []
def pipe(
self, user_message: str, model_id: str, messages: List[dict], body: dict
) -> Union[str, Generator, Iterator]:
if "user" in body:
print("######################################")
print(f'# User: {body["user"]["name"]} / {body["user"]["email"]}')
print(f"# Message: {user_message}")
print("######################################")
headers = {}
if self.valves.LITELLM_API_KEY:
headers["Authorization"] = f"Bearer {self.valves.LITELLM_API_KEY}"
try:
payload = {**body, "model": model_id}
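# Drop Open WebUI-specific fields that the OpenAI-compatible /chat/completions endpoint does not expect.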
payload.pop("chat_id", None)
payload.pop("user", None)
payload.pop("title", None)
payload.pop("custom_metadata", None)
if body.get('custom_metadata'):
payload["metadata"] = body["custom_metadata"]
r = requests.post(
url=f"{self.valves.LITELLM_BASE_URL}/v1/chat/completions",
json=payload,
headers=headers,
stream=True,
)
r.raise_for_status()
if body["stream"]:
return r.iter_lines()
else:
return r.json()
except Exception as e:
return f"Error: {e}" graph TB
subgraph My Solution
direction TB
subgraph "User 👤"
Client[Client 🌐]
end
note["`2️⃣. Automaticly remove metadata by Open WebUI's OpenAI interface, insert **session_id** & **trace_user_id** into custom column **custom_metadata** for LiteLLM
4️⃣. Insert **custom_metadat** into payload as **metadata** and remove from original body
8️⃣. Success/Failure callback to Langfuse`"]
subgraph FTC GPT
direction TB
OW[Open WebUI]
AO[Azure OpenAI]
Langfuse
LiteLLM
subgraph Pipelines
direction LR
filter
pipe
end
end
end
Client -->|1| OW
OW -->|2| filter
filter -.->|3| OW
OW -.->|4| pipe
pipe -->|5| LiteLLM
LiteLLM -->|6| AO
AO -->|7| LiteLLM
LiteLLM -->|8| Langfuse
LiteLLM -->|9| pipe
pipe -->|10| OW
OW -.->|11| Client
linkStyle 1 color:#6a83a8
linkStyle 3 color:#6a83a8
linkStyle 7 color:#6a83a8
style note text-align: left
```
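If it helps anyone adapting this, here is a small sanity check of the filter's `inlet` above, run outside Open WebUI (the module name and the user values are hypothetical; the expected output follows from the code as posted):

```python
import asyncio

# Hypothetical module name for the filter pipeline posted above.
from chat_info_filter_pipeline import Pipeline

# A request body shaped roughly like what Open WebUI hands to the filter.
body = {
    "model": "azure-gpt-3.5",
    "chat_id": "c2f1e6c0-demo",
    "messages": [{"role": "user", "content": "what time is it"}],
    "stream": True,
}
user = {"name": "changchiyou", "email": "changchiyou@example.com"}

result = asyncio.run(Pipeline().inlet(body, user))
print(result["custom_metadata"])
# {'session_id': 'c2f1e6c0-demo', 'trace_user_id': 'changchiyou / changchiyou@example.com'}
```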
-
**Is your feature request related to a problem? Please describe.**
Currently, it's challenging to track the cost associated with each user.

**Describe the solution you'd like**
Create a dedicated key for each user via the `<litellm>/key/generate` API, so users access LiteLLM models using their unique API keys and spend can be attributed per user (see the sketch below).

**Describe alternatives you've considered**
Have the open-webui (hosted web) account for each user retrieve a key from the `<litellm>/key/generate` API using the email address.

**Additional context**
Reference:
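For illustration, a minimal sketch of what generating a per-user key against the LiteLLM proxy could look like (the base URL, master key, and field choices are assumptions based on my reading of the LiteLLM proxy's `/key/generate` endpoint):

```python
import requests

LITELLM_BASE_URL = "http://localhost:4000"  # placeholder proxy URL
LITELLM_MASTER_KEY = "sk-1234"              # placeholder master key

def create_user_key(email: str) -> str:
    """Ask the LiteLLM proxy for a key tied to this user so spend is tracked per user."""
    r = requests.post(
        f"{LITELLM_BASE_URL}/key/generate",
        headers={"Authorization": f"Bearer {LITELLM_MASTER_KEY}"},
        json={
            "user_id": email,  # spend is aggregated under this user id
            "metadata": {"source": "open-webui"},
        },
    )
    r.raise_for_status()
    return r.json()["key"]

print(create_user_key("changchiyou@example.com"))
```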