Support for OpenAI Structured Output #295

mnicstruwig · 2024-08-07T07:31:17Z

OpenAI has announced structured outputs as an officially-supported part of the API.

Structured outputs can be used in two ways:

1. In function calling, by simply passing through a new strict argument. I think this should be an "easy win" for magentic given how we currently implement structured outputs.
2. Via the response_format parameter, which now accepts a JSON schema and, combined with the strict argument, will guarantee (?) that outputs match the schema.
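For reference, here is a rough sketch of what the two request shapes look like as plain Python dicts. The tool name, schema name, and the example schema are made up for illustration; nothing here calls the API or reflects how magentic builds its requests.

```python
import json

# JSON Schema for the desired output. In strict mode every property must be
# listed in "required" and "additionalProperties" must be false.
schema = {
    "type": "object",
    "properties": {"value": {"type": "integer"}},
    "required": ["value"],
    "additionalProperties": False,
}

# Way 1: function calling, with strict set on the individual tool.
tools = [
    {
        "type": "function",
        "function": {
            "name": "return_test",  # hypothetical tool name
            "description": "Return a Test object.",
            "strict": True,
            "parameters": schema,
        },
    }
]

# Way 2: response_format carrying the JSON schema directly.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "test",  # hypothetical schema name
        "strict": True,
        "schema": schema,
    },
}

# Show both payload fragments side by side.
print(json.dumps({"tools": tools, "response_format": response_format}, indent=2))
```

Note that strict sits on each tool in the first shape and on the json_schema object in the second, so it is not a single request-level switch.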

There are a bunch of other API additions (e.g. a refusal string in the API output) that could also be used to give better responses when the model refuses to produce structured output.
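A minimal sketch of how the refusal string could be surfaced, assuming the assistant message is available as a plain dict with "content" and "refusal" keys. ModelRefusal and parse_structured_message are hypothetical names for illustration, not part of magentic or the openai package.

```python
import json
from typing import Any


class ModelRefusal(Exception):
    """Hypothetical error raised when the model declines to answer."""


def parse_structured_message(message: dict[str, Any]) -> Any:
    """Return the parsed JSON content, or raise if the model refused."""
    refusal = message.get("refusal")
    if refusal:
        # Surface the model's own explanation instead of a JSON parsing error.
        raise ModelRefusal(refusal)
    return json.loads(message["content"])


# Normal case: content holds JSON matching the schema.
ok = parse_structured_message({"content": '{"value": 1}', "refusal": None})

# Refusal case: content is empty and refusal explains why.
try:
    parse_structured_message({"content": None, "refusal": "I can't help with that."})
except ModelRefusal as err:
    refusal_text = str(err)
```

Checking the refusal first means callers get a clear, typed error rather than a confusing failure from trying to parse empty content.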

This would be an amazing addition for reliability.

jackmpcollins · 2024-08-08T02:17:51Z

We should absolutely enable turning on strict mode for function calling. I think it should follow the OpenAI default of off, and allow turning it on by setting strict=True in OpenaiChatModel.

I would also like to switch to using the response_format for the structured outputs in magentic. This will take a bit more work to figure out how it works in conjunction with function calls, streaming, etc. It also only works with the newest models so it might make sense to wait a little while before switching to this.

This page from Simon Willison has a thorough overview of the new feature https://simonwillison.net/2024/Aug/6/openai-structured-outputs/

jackmpcollins · 2024-08-08T06:58:09Z

On closer look, strict is set on each individual function/tool, so it might be better to allow it to be set on individual tools in magentic as well.

For functions this could be done by decorating the functions to add this additional metadata:

# Note: `tool` is the decorator being proposed here (hypothetical).
from magentic import FunctionCall, prompt, tool

@tool(strict=True)
def my_func(a: int) -> str:
    return "hello"

@prompt(
    ...
    functions=[my_func],
)
def my_prompt_function() -> FunctionCall[str]: ...

Return types could be done similarly using Annotated:

from typing import Annotated

@prompt("Return an integer")
def make_int() -> Annotated[int, tool(strict=True)]: ...

jackmpcollins · 2024-08-11T02:23:22Z

Another option here for return types that might be neater than Annotated is to extend pydantic's ConfigDict to add an openai_strict field so that this setting gets tracked on the pydantic model.

Within magentic:

from pydantic import ConfigDict as _ConfigDict

class ConfigDict(_ConfigDict, total=False):
    openai_strict: bool

Usage:

from magentic import ConfigDict, prompt
from pydantic import BaseModel

class Test(BaseModel):
    model_config = ConfigDict(openai_strict=True)
    value: int

@prompt("Make a Test")
def foo() -> Test: ...
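To illustrate why extending the config type is cheap: a TypedDict is an ordinary dict at runtime, so an extra key just rides along and can be read back with .get. This sketch uses a stdlib stand-in (BaseConfig) in place of pydantic's ConfigDict so it runs without pydantic, and is_openai_strict is a hypothetical helper, not magentic's actual code.

```python
from collections.abc import Mapping
from typing import TypedDict


class BaseConfig(TypedDict, total=False):
    """Stand-in for pydantic's ConfigDict (assumption: only one key shown)."""

    title: str


class ConfigDict(BaseConfig, total=False):
    """Extended config that adds the openai_strict key."""

    openai_strict: bool


def is_openai_strict(config: Mapping[str, object]) -> bool:
    """Read the flag, defaulting to off to match the OpenAI default."""
    return bool(config.get("openai_strict", False))


strict_config = ConfigDict(title="Test", openai_strict=True)
default_config = ConfigDict(title="Other")

# The "instances" are plain dicts, so no special handling is needed to read them.
print(type(strict_config).__name__, is_openai_strict(strict_config))
print(type(default_config).__name__, is_openai_strict(default_config))
```

Because total=False makes every key optional, existing configs that never mention openai_strict keep working unchanged.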

CiANSfi · 2024-08-12T16:47:11Z

When do you think this will be rolled out? No rush on my end, just curious. I also thought the project might benefit from updating the README/homepage with brief reasons why one should use this library even though OpenAI now has structured outputs.

jackmpcollins · 2024-08-18T09:19:41Z

@mnicstruwig @CiANSfi

Support added for "strict" in https://github.com/jackmpcollins/magentic/releases/tag/v0.32.0 using an extended ConfigDict. See the release note for examples, and docs page for more info https://magentic.dev/structured-outputs/#configdict

magentic still uses tools for structured output, as a bigger refactor is needed to switch to using JSON output / response_format. So this issue can be left open until that switch has been made.

mnicstruwig · 2024-08-21T16:00:20Z

@jackmpcollins All right, gotcha. So this will be targeted specifically for structured output generation until we can refactor the function calls.

I'm really looking forward to the function calls!

The only nitpick I have with the ConfigDict approach vs. something like Annotated is for dictionaries. It would've been nice to be able to write e.g. -> Annotated[dict, tool(strict=True)], but this can still easily be overcome by using a pydantic model with a dict field (and it's neater than having to write a tool annotation).

jackmpcollins · 2024-08-21T20:42:06Z

For third party types you can modify the type as shown in the pydantic docs for Strict Mode, but for python builtins you will have to subclass and add this attribute. (For dict specifically there is a small change needed to DictFunctionSchema to respect the config on serialization).

with_config(ConfigDict(openai_strict=True))(MyClass)
# OR
MyClass.__pydantic_config__ = ConfigDict(openai_strict=True)
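A small sketch of why builtins need the subclass route: Python does not allow setting attributes on builtin types like dict, while a subclass accepts them. StrictDict and try_set_config are hypothetical names, and a plain dict stands in for ConfigDict(openai_strict=True). Whether magentic then honours the attribute for dict depends on the DictFunctionSchema change mentioned above.

```python
def try_set_config(cls: type, config: dict) -> bool:
    """Try to attach __pydantic_config__ to cls and report whether it worked."""
    try:
        cls.__pydantic_config__ = config
    except TypeError:
        # Builtin types such as dict are immutable and reject new attributes.
        return False
    return True


class StrictDict(dict):
    """Hypothetical dict subclass that can carry the config attribute."""


# Plain dict standing in for ConfigDict(openai_strict=True).
strict_config = {"openai_strict": True}

set_on_builtin = try_set_config(dict, strict_config)
set_on_subclass = try_set_config(StrictDict, strict_config)

print("dict:", set_on_builtin)
print("StrictDict:", set_on_subclass)
```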

pydantic supports Annotated[..., Strict()] for their strict setting, so maybe I should also support Annotated[..., OpenaiStrict()] or similar, though this would not make sense on individual fields. I left this for the moment as I'm not sure there's much demand for non-pydantic types (but thinking now, Iterable is a great example of where I myself would want this).

RootModel might be a way to support strict mode for arbitrary types that is closer to the existing approach.

Function calls can be strict through the same mechanism!

from typing import Annotated, Literal

from magentic import ConfigDict, with_config
from pydantic import Field


@with_config(ConfigDict(openai_strict=True))
def activate_oven(
    temperature: Annotated[int, Field(description="Temp in Fahrenheit", lt=500)],
    mode: Literal["broil", "bake", "roast"],
) -> str:
    """Turn the oven on with the provided settings."""
    return f"Preheating to {temperature} F with mode {mode}"

jackmpcollins mentioned this issue Aug 18, 2024

Add support for OpenAI structured outputs #305

Merged
