docu: WIP: structured outputs
langchain4j committed Nov 22, 2024
1 parent 2cfa56d commit 5f78c9a
Showing 2 changed files with 237 additions and 3 deletions.
9 changes: 7 additions & 2 deletions docs/docs/tutorials/5-ai-services.md
@@ -100,7 +100,7 @@ greatly simplifies using AI Services in Quarkus applications.
More information can be found [here](https://docs.quarkiverse.io/quarkus-langchain4j/dev/ai-services.html).

## AI Services in Spring Boot Application
[LangChain4j Spring Boot starter](/tutorials/spring-boot-integration/#spring-boot-starter-for-declarative-ai-services)
greatly simplifies using AI Services in Spring Boot applications.

## @SystemMessage
@@ -251,6 +251,11 @@ AI services currently do not support multimodality,
please use the [low-level API](/tutorials/chat-and-language-models#multimodality) for this.

## Structured Outputs

:::note
More info on Structured Outputs can be found [here](/tutorials/structured-outputs).
:::

If you want to receive a structured output from the LLM,
you can change the return type of your AI Service method from `String` to something else.
Currently, AI Services support the following return types:
@@ -272,7 +277,7 @@ Before the method returns, the AI Service will parse the output of the LLM into
You can observe appended instructions by [enabling logging](/tutorials/logging).

:::note
Some LLMs support the [Structured Outputs](https://openai.com/index/introducing-structured-outputs-in-the-api/) feature,
where the LLM API has an option to specify a JSON schema for the desired output. If such a feature is supported and enabled,
instructions will not be appended to the end of the `UserMessage`. In this case, the JSON schema will be automatically
created from your POJO and passed to the LLM. This will guarantee that the LLM adheres to this JSON schema.
231 changes: 230 additions & 1 deletion docs/docs/tutorials/structured-outputs.md
@@ -4,4 +4,233 @@ sidebar_position: 11

# Structured Outputs

Many LLMs and LLM providers support generating outputs in a structured format, typically JSON.
These outputs can be easily mapped to Java objects and integrated into other parts of your application.

For instance, let’s assume we have a `Person` class:
```java
record Person(String name, int age, double height, boolean married) {
}
```
We aim to extract a `Person` object from unstructured text like this:
```
John is 42 years old and lives an independent life.
He stands 1.75 meters tall and carries himself with confidence.
Currently unmarried, he enjoys the freedom to focus on his personal goals and interests.
```

Currently, depending on the LLM and the LLM provider, there are four ways to achieve this
(from most to least reliable):
- [Structured Outputs](/tutorials/structured-outputs#structured-outputs)
- [Tools (Function Calling)](/tutorials/structured-outputs#tools-function-calling)
- [Prompting + JSON Mode](/tutorials/structured-outputs#prompting-json-mode)
- [Prompting](/tutorials/structured-outputs#prompting)


## Structured Outputs
Some LLM providers (currently only [OpenAI](https://platform.openai.com/docs/guides/structured-outputs)
and [Google Gemini](https://ai.google.dev/gemini-api/docs/structured-output)) support a specialized
"Structured Outputs" API that allows specifying a JSON schema for the desired output.
You can view all supported LLM providers [here](/integrations/language-models) in the "Structured Outputs" column.

When a JSON schema is specified in the request, the LLM is expected to generate an output that adheres to this schema.
Please note that the JSON schema is specified in a separate attribute in the request to the LLM provider's API
and does not require additional free-form instructions to be included in the prompt (e.g., in system or user messages).

LangChain4j supports the Structured Outputs feature in both the low-level `ChatLanguageModel` API
and the high-level AI Service API.

### Low Level Structured Outputs API

In the low-level `ChatLanguageModel` API, JSON schema can be specified
using `JsonSchema` and `ResponseFormat` when creating a `ChatRequest`:
```java
ChatLanguageModel chatModel = OpenAiChatModel.builder()
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .modelName("gpt-4o-mini")
        .responseFormat("json_schema") // see [1] below
        .strictJsonSchema(true) // see [1] below
        .logRequests(true)
        .logResponses(true)
        .build();
// OR
ChatLanguageModel chatModel = GoogleAiGeminiChatModel.builder()
        .apiKey(System.getenv("GOOGLE_AI_GEMINI_API_KEY"))
        .modelName("gemini-1.5-flash")
        .responseFormat(ResponseFormat.JSON) // see [2] below
        .temperature(0.0)
        .logRequestsAndResponses(true)
        .build();

UserMessage userMessage = UserMessage.from("""
        John is 42 years old and lives an independent life.
        He stands 1.75 meters tall and carries himself with confidence.
        Currently unmarried, he enjoys the freedom to focus on his personal goals and interests.
        """);

ResponseFormat responseFormat = ResponseFormat.builder()
        .type(JSON) // see [3] below
        .jsonSchema(JsonSchema.builder()
                .name("Person") // see [4] below
                .rootElement(JsonObjectSchema.builder() // see [5] below
                        .addStringProperty("name")
                        .addIntegerProperty("age")
                        .addNumberProperty("height")
                        .addBooleanProperty("married")
                        .required("name", "age", "height", "married")
                        .build())
                .build())
        .build();

ChatRequest chatRequest = ChatRequest.builder()
        .messages(userMessage)
        .responseFormat(responseFormat)
        .build();

ChatResponse chatResponse = chatModel.chat(chatRequest);

String output = chatResponse.aiMessage().text();
System.out.println(output); // {"name":"John","age":42,"height":1.75,"married":false}

Person person = new ObjectMapper().readValue(output, Person.class);
System.out.println(person); // Person[name=John, age=42, height=1.75, married=false]
```
Notes:
- [1] - This is required to activate the Structured Outputs feature for OpenAI, see more details [here](/integrations/language-models/open-ai#structured-outputs-for-json-mode).
- [2] - This is required to activate the Structured Outputs feature for [Google AI Gemini](/integrations/language-models/google-ai-gemini).
- [3] - Response format type can be either `TEXT` (default) or `JSON`.
- [4] - OpenAI requires specifying the name for the schema.
- [5] - In most cases, the root element must be of `JsonObjectSchema` type;
however, Gemini also allows `JsonEnumSchema` and `JsonArraySchema`.

:::note
Make sure to explicitly enable the Structured Outputs feature when configuring the `ChatLanguageModel`,
as it is disabled by default.
:::
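For illustration, the `ResponseFormat` built above corresponds roughly to the following `response_format` attribute in an OpenAI API request body. This fragment follows OpenAI's published request format; the exact payload LangChain4j produces may differ in minor details:

```json
{
  "type": "json_schema",
  "json_schema": {
    "name": "Person",
    "strict": true,
    "schema": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "age": { "type": "integer" },
        "height": { "type": "number" },
        "married": { "type": "boolean" }
      },
      "required": ["name", "age", "height", "married"],
      "additionalProperties": false
    }
  }
}
```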

To support an LLM-provider-agnostic way of defining a JSON schema,
LangChain4j offers the `ResponseFormat` and `JsonSchema` types.

The structure of the schema is defined using the `JsonSchemaElement` interface,
which has the following subtypes:
- `JsonStringSchema` - to support `String`, `char`/`Character`, etc.
- `JsonIntegerSchema` - to support `int`/`Integer`, `long`/`Long`, `BigInteger`, etc.
- `JsonNumberSchema` - to support `float`/`Float`, `double`/`Double`, `BigDecimal`, etc.
- `JsonBooleanSchema` - to support `boolean`/`Boolean` types.
- `JsonEnumSchema` - to support `enum`s.
- `JsonArraySchema` - to support arrays and other collection types.
- `JsonObjectSchema` - to support object types.
- `JsonReferenceSchema` - to support recursion (e.g., `Person` has a `Set<Person> children` field).

See more information in the Javadoc of these types.
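As a rough illustration of how a POJO maps onto these subtypes, the following JDK-only sketch (not LangChain4j code; the mapping logic is a simplified assumption, not the library's actual implementation) walks a record's components and names the `JsonSchemaElement` subtype each one would correspond to:

```java
import java.lang.reflect.RecordComponent;
import java.util.List;

public class SchemaMappingSketch {

    record Person(String name, int age, double height, boolean married) {}

    // Simplified, illustrative mapping from a Java type to the name of the
    // JsonSchemaElement subtype it would correspond to.
    static String schemaElementFor(Class<?> type) {
        if (type == String.class || type == char.class || type == Character.class) {
            return "JsonStringSchema";
        }
        if (type == int.class || type == Integer.class
                || type == long.class || type == Long.class) {
            return "JsonIntegerSchema";
        }
        if (type == double.class || type == Double.class
                || type == float.class || type == Float.class) {
            return "JsonNumberSchema";
        }
        if (type == boolean.class || type == Boolean.class) {
            return "JsonBooleanSchema";
        }
        if (type.isEnum()) {
            return "JsonEnumSchema";
        }
        if (type.isArray() || List.class.isAssignableFrom(type)) {
            return "JsonArraySchema";
        }
        return "JsonObjectSchema"; // nested POJOs
    }

    public static void main(String[] args) {
        for (RecordComponent component : Person.class.getRecordComponents()) {
            System.out.println(component.getName() + " -> " + schemaElementFor(component.getType()));
        }
    }
}
```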

### High Level Structured Outputs API

When using [AI Services](/tutorials/ai-services), the same can be achieved much more easily and with less code:
```java
interface PersonExtractor {

    Person extractPersonFrom(String text);
}

ChatLanguageModel chatModel = OpenAiChatModel.builder() // see [1] below
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .modelName("gpt-4o-mini")
        .responseFormat("json_schema") // see [2] below
        .strictJsonSchema(true) // see [2] below
        .logRequests(true)
        .logResponses(true)
        .build();
// OR
ChatLanguageModel chatModel = GoogleAiGeminiChatModel.builder() // see [1] below
        .apiKey(System.getenv("GOOGLE_AI_GEMINI_API_KEY"))
        .modelName("gemini-1.5-flash")
        .responseFormat(ResponseFormat.JSON) // see [3] below
        .temperature(0.0)
        .logRequestsAndResponses(true)
        .build();

PersonExtractor personExtractor = AiServices.create(PersonExtractor.class, chatModel); // see [1] below

String text = """
        John is 42 years old and lives an independent life.
        He stands 1.75 meters tall and carries himself with confidence.
        Currently unmarried, he enjoys the freedom to focus on his personal goals and interests.
        """;

Person person = personExtractor.extractPersonFrom(text);

System.out.println(person); // Person[name=John, age=42, height=1.75, married=false]
```
Notes:
- [1] - In a Quarkus or a Spring Boot application, there is no need to explicitly create the `ChatLanguageModel` and the AI Service,
as these beans are created automatically. More info on this:
[for Quarkus](https://docs.quarkiverse.io/quarkus-langchain4j/dev/ai-services.html),
[for Spring Boot](https://docs.langchain4j.dev/tutorials/spring-boot-integration#spring-boot-starter-for-declarative-ai-services).
- [2] - This is required to activate the Structured Outputs feature for OpenAI, see more details [here](/integrations/language-models/open-ai#structured-outputs-for-json-mode).
- [3] - This is required to activate the Structured Outputs feature for [Google AI Gemini](/integrations/language-models/google-ai-gemini).

When an AI Service method returns a POJO **and** the configured `ChatLanguageModel` supports and has enabled the Structured Outputs feature,
a `JsonSchema`/`ResponseFormat` will be generated automatically from the specified return type.
:::note
Make sure to explicitly enable the Structured Outputs feature when configuring the `ChatLanguageModel`,
as it is disabled by default.
:::
:::note
The `name` of the generated `JsonSchema` is the simple name of the return type, in this case: "Person".
:::

Once the LLM responds, the output is parsed into an object and returned to the caller.
:::note
While we are gradually migrating to Jackson, Gson is still used for parsing the outputs,
so Jackson annotations on your POJOs will have no effect.
:::

### Limitations
When using Structured Outputs with AI Services, there are some limitations:
- It works only with supported OpenAI and Gemini models.
- Support for Structured Outputs needs to be enabled explicitly when configuring `ChatLanguageModel`.
- It does not work in the [streaming mode](/tutorials/ai-services#streaming).
- Currently, it works only when the return type is a (single) POJO or a `Result<POJO>`.
If you need other types (e.g., `List<POJO>`, `enum`, etc.), please wrap these into a POJO.
We are [working](https://github.com/langchain4j/langchain4j/pull/1938) on supporting more return types soon.
- POJOs can contain:
- Scalar/simple types (e.g., `String`, `int`/`Integer`, `double`/`Double`, `boolean`/`Boolean`, etc.)
- `enum`s
- Nested POJOs
- `List<T>`, `Set<T>` and `T[]`, where `T` is a scalar, an enum or a POJO
- All fields and sub-fields in the generated `JsonSchema` are marked as `required`; there is currently no way to make them optional.
- Classes and fields can be annotated with `@Description` to guide the LLM, for example:
```java
@Description("a person")
record Person(@Description("person's name") String name,
@Description("person's age") int age,
@Description("person's height") double height,
@Description("is person married or not") boolean married) {
}
```
- When the LLM does not support the Structured Outputs feature, the feature is not enabled, or the return type is not a POJO,
the AI Service will fall back to [prompting](/tutorials/structured-outputs#prompting).
- Recursion is currently supported only by OpenAI.
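The single-POJO limitation above can be worked around by wrapping the desired collection into a POJO. A minimal sketch (the `PersonList` wrapper is a made-up name for illustration, not a LangChain4j type):

```java
import java.util.List;

public class WrapperExample {

    record Person(String name, int age) {}

    // Hypothetical wrapper: an AI Service method can return PersonList
    // instead of the currently unsupported List<Person>.
    record PersonList(List<Person> people) {}

    public static void main(String[] args) {
        // Simulates the result of an extraction over a text mentioning two people.
        PersonList extracted = new PersonList(List.of(
                new Person("John", 42),
                new Person("Jane", 40)));
        System.out.println(extracted.people().size());
    }
}
```

An AI Service interface would then declare something like `PersonList extractPeopleFrom(String text);` and the caller would unwrap the list from the returned object.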


## Tools (Function Calling)
More info is coming soon.
In the meantime, please read [this section](/tutorials/tools)
and [this article](https://glaforge.dev/posts/2024/11/18/data-extraction-the-many-ways-to-get-llms-to-spit-json-content/).


## Prompting + JSON Mode
More info is coming soon.
In the meantime, please read [this section](/tutorials/ai-services#json-mode)
and [this article](https://glaforge.dev/posts/2024/11/18/data-extraction-the-many-ways-to-get-llms-to-spit-json-content/).


## Prompting
More info is coming soon.
In the meantime, please read [this section](/tutorials/ai-services#structured-outputs)
and [this article](https://glaforge.dev/posts/2024/11/18/data-extraction-the-many-ways-to-get-llms-to-spit-json-content/).
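Until this section is filled in, the key point is that with plain prompting the format instructions live in the prompt text itself, and the caller parses JSON out of the raw model output. The following JDK-only sketch shows that parsing step with a hardcoded sample response instead of a real LLM call; the regex extraction is illustrative only, not production-grade JSON parsing:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PromptingSketch {

    record Person(String name, int age, double height, boolean married) {}

    // Extracts a single string/number/boolean value for the given key
    // from a flat JSON object (illustrative; use a real JSON parser in practice).
    static String extract(String json, String key) {
        Matcher m = Pattern
                .compile("\"" + key + "\"\\s*:\\s*\"?([^\",}]+)\"?")
                .matcher(json);
        if (!m.find()) {
            throw new IllegalArgumentException("missing key: " + key);
        }
        return m.group(1).trim();
    }

    public static void main(String[] args) {
        // In reality this string would be the LLM's reply to a prompt such as:
        // "Extract a person from the text below. Answer strictly in JSON with
        //  fields: name, age, height, married."
        String llmOutput = "{\"name\":\"John\",\"age\":42,\"height\":1.75,\"married\":false}";

        Person person = new Person(
                extract(llmOutput, "name"),
                Integer.parseInt(extract(llmOutput, "age")),
                Double.parseDouble(extract(llmOutput, "height")),
                Boolean.parseBoolean(extract(llmOutput, "married")));

        System.out.println(person);
    }
}
```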


## Related Tutorials
- [Data extraction: The many ways to get LLMs to spit JSON content](https://glaforge.dev/posts/2024/11/18/data-extraction-the-many-ways-to-get-llms-to-spit-json-content/) by [Guillaume Laforge](https://glaforge.dev/about/)
