Users adding more OpenAI-API-compatible models to providers.py, "modding settings" from self-written external packages for example #397
Comments
Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! 🤗
The ability to add OpenAI-compatible endpoints for Chat & Embeddings would be an awesome feature.
Sample code (using LangChain):

import json
import time
import uuid
from typing import Any, AsyncIterable, Dict, List, Literal, Optional, Sequence

from fastapi import APIRouter
from fastapi.responses import StreamingResponse
from pydantic import BaseModel, Field

from langchain_community.adapters.openai import (
    _convert_message_chunk,  # private helper in LangChain's OpenAI adapter; see the issue linked below
    convert_message_to_dict,
    convert_openai_messages,
)
from langchain_core.messages import AIMessage, BaseMessage, BaseMessageChunk
from langchain_openai import ChatOpenAI

router = APIRouter()


class ChatCompletionRequest(BaseModel):
    model: str
    messages: List[Dict[str, Any]]
    max_tokens: Optional[int] = Field(default=2048, description="Optional max tokens, default is 2048")
    temperature: Optional[float] = Field(default=0.1, description="Optional temperature, default is 0.1")
    stream: Optional[bool] = False
    tools: Optional[Sequence[Dict[str, Any]]] = Field(default=None)
    tool_choice: Optional[Literal["none", "auto", "required"]] = Field(default=None)


@router.post("/chat/completions")
async def chat_completions_create(openai_request: ChatCompletionRequest):
    """
    Endpoint to create chat completions.

    Args:
        openai_request (ChatCompletionRequest): The request payload containing model, messages,
            max_tokens, temperature, and stream flag.

    Returns:
        JSON response with chat completions or a stream of message chunks.
    """
    # The request only carries the model name; resolve it to a LangChain chat model
    # instance before invoke/bind_tools can be called (ChatOpenAI is used here as an example).
    llm_model = ChatOpenAI(model=openai_request.model)
    if openai_request.tools:
        llm_model = llm_model.bind_tools(tools=openai_request.tools,
                                         tool_choice=openai_request.tool_choice)

    # Convert OpenAI messages to the required format
    converted_messages = convert_openai_messages(openai_request.messages)

    # If streaming is not requested, invoke the model and return the result
    if not openai_request.stream:
        result = llm_model.invoke(converted_messages)
        return {
            "id": str(uuid.uuid4()),
            "object": "chat.completion",
            "created": int(time.time()),
            "model": openai_request.model,
            "choices": [
                {
                    "message": convert_message_to_dict_response(result)
                }
            ]
        }
    else:
        generator = streaming_invoke(openai_request, llm_model, converted_messages)
        return StreamingResponse(generator, media_type="text/event-stream")


async def streaming_invoke(openai_request, llm_model, converted_messages) -> AsyncIterable[str]:
    i = 0
    async for chunk in llm_model.astream(converted_messages):
        result = {
            "id": str(uuid.uuid4()),
            "object": "chat.completion.chunk",
            "created": int(time.time()),
            "model": openai_request.model,
            **convert_message_chunk_to_delta(chunk, i)
        }
        yield f"data: {json.dumps(result)}\n\n"
        i += 1
    yield "data: [DONE]\n\n"


# https://github.com/langchain-ai/langchain/issues/25436
def convert_message_chunk_to_delta(chunk: BaseMessageChunk, i: int) -> Dict[str, Any]:
    _dict = _convert_message_chunk(chunk, i)
    if "tool_calls" in chunk.additional_kwargs:
        _dict["tool_calls"] = chunk.additional_kwargs["tool_calls"]
    return {"choices": [{"delta": _dict}]}


def convert_message_to_dict_response(message: BaseMessage) -> dict:
    message_to_dict = convert_message_to_dict(message)
    if isinstance(message, AIMessage):
        if message.tool_calls and "tool_calls" not in message_to_dict:
            message_to_dict["tool_calls"] = message.tool_calls
    return message_to_dict
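For context, a minimal sketch of how such an endpoint could be exercised from any OpenAI-style client. The host, port, route prefix, and model name below are assumptions, not part of the sample above:

import requests

# Assumes the router above is mounted under /v1 on a locally running FastAPI app
# (http://localhost:8000) and that "my-internal-model" is a model the backend serves.
resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "my-internal-model",
        "messages": [{"role": "user", "content": "Hello"}],
        "stream": False,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])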
@rhlarora84 Related to this documentation: https://jupyter-ai.readthedocs.io/en/latest/developers/index.html#custom-model-providers.
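For illustration, the mechanism described in that documentation looks roughly like the sketch below: a provider class in an external package, registered under the jupyter_ai.model_providers entry point group so jupyter-ai discovers it at startup. Package, class, and model names here are hypothetical, and the details should be checked against the linked docs:

# my_provider.py -- hedged sketch of a custom provider in an external package
from jupyter_ai_magics import BaseProvider
from langchain_openai import ChatOpenAI


class MyOpenAICompatibleProvider(BaseProvider, ChatOpenAI):
    id = "my-provider"
    name = "My Provider"
    model_id_key = "model_name"
    models = ["my-model-a", "my-model-b"]  # the model ids shown in the settings GUI


# setup.py (or the equivalent pyproject.toml section) of the add-on package,
# registering the provider under the entry point group jupyter-ai scans:
from setuptools import setup

setup(
    name="my-provider-package",
    entry_points={
        "jupyter_ai.model_providers": [
            "my-provider = my_provider:MyOpenAICompatibleProvider",
        ],
    },
)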
Thanks, this helps. I will look into it.
Problem
Getting new models that we serve from our own OpenAI-compatible API into the settings GUI.
Proposed Solution
The possibility of hooking into jupyter_ai_magics/providers.py and adding to the models list somehow (from our own add-on package, for example) before the GUI is rendered.
jupyter-ai/packages/jupyter-ai-magics/jupyter_ai_magics/providers.py, line 453 (at commit 8f3bfe0)
We would still pass in a different URL (our own) via the OPENAI_API_BASE environment variable.
Alternatively, make the settings GUI accept free-text input so we can add the models directly in the GUI. (This would force each of our users to manually type in the correct model names...)
Additional context
We are trying to deploy open-source models with the FastChat framework on our internal Kubernetes clusters. Once these are running, we would like to point the jupyter-ai extension at a different URL and at the models of our choosing. FastChat provides an OpenAI-compatible API, which we would like to offer to our JupyterHub users on the same clusters. We have been able to reliably deploy the models of our choosing, with wildly different naming. We want to host our own models and API because of the sensitive nature of our code and data. The main use will be generating Python code and "transpiling" existing SAS code bases into Python.
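To make the proposed solution concrete, a hedged sketch of what such an add-on package might register for this setup: an OpenAI-compatible provider pointed at the internal FastChat URL and exposing our own model names. The base URL, model names, and class name below are placeholders, not real values:

# Hedged sketch of a provider for an internal FastChat deployment (all names are placeholders).
from jupyter_ai_magics import BaseProvider
from langchain_openai import ChatOpenAI


class FastChatProvider(BaseProvider, ChatOpenAI):
    id = "fastchat"
    name = "Internal FastChat"
    model_id_key = "model_name"
    # The models deployed on the internal cluster, with whatever naming we chose:
    models = ["sas-transpiler-13b", "codegen-34b"]

    def __init__(self, **kwargs):
        # Point the OpenAI-compatible client at the internal endpoint instead of
        # api.openai.com; setting OPENAI_API_BASE would serve the same purpose.
        kwargs.setdefault("openai_api_base", "https://fastchat.internal.example/v1")
        super().__init__(**kwargs)

Registered via the same jupyter_ai.model_providers entry point shown earlier, this would make the internal models appear in the settings GUI without users typing anything in manually.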