docs: restructure navigation and fix code formatting (#1191)
Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Ivan Leo <[email protected]>
devin-ai-integration[bot] and ivanleomk authored Nov 19, 2024
1 parent 3f371ab commit 2490702
Showing 49 changed files with 3,192 additions and 873 deletions.
9 changes: 4 additions & 5 deletions docs/blog/index.md
@@ -47,14 +47,13 @@ If you want to get updates on new features and tips on how to use Instructor, yo

## Integrations and Tools

- [Ollama Integration](../hub/ollama.md)
- [llama-cpp-python Integration](../hub/llama-cpp-python.md)
- [Anyscale Integration](../hub/anyscale.md)
- [Together Compute Integration](../hub/together.md)
- [Ollama Integration](../integrations/ollama.md)
- [llama-cpp-python Integration](../integrations/llama-cpp-python.md)
- [Together Compute Integration](../integrations/together.md)
- [Extracting Data into Pandas DataFrame using GPT-3.5 Turbo](../hub/pandas_df.md)
- [Implementing Streaming Partial Responses with Field-Level Streaming](../hub/partial_streaming.md)

## Media and Resources

- [Course: Structured Outputs with Instructor](https://www.wandb.courses/courses/steering-language-models?x=1)
- [Keynote: Pydantic is All You Need](posts/aisummit-2023.md)
12 changes: 6 additions & 6 deletions docs/blog/posts/best_framework.md
@@ -32,7 +32,7 @@ from pydantic import BaseModel
import instructor

class User(BaseModel):
name: str
name: str
age: int

client = instructor.from_openai(openai.OpenAI())
@@ -42,7 +42,7 @@ user = client.chat.completions.create(
response_model=User, # (1)!
messages=[
{
"role": "user",
"role": "user",
"content": "Extract the user's name and age from this: John is 25 years old"
}
]
@@ -63,14 +63,14 @@ Other features of instructor, in and out of the library, are:
2. Ability to use [Pydantic's validation context](../../concepts/reask_validation.md)
3. [Parallel Tool Calling](../../concepts/parallel.md) with correct types
4. Streaming [Partial](../../concepts/partial.md) and [Iterable](../../concepts/iterable.md) data.
5. Returning [Primitive](../../concepts/types.md) Types and [Unions](../../concepts/unions.md) as well!
6. Lots, and Lots of [Cookbooks](../../examples/index.md), [Tutorials](../../tutorials/1-introduction.ipynb), Documentation and even [instructor hub](../../hub/index.md)
5. Returning [Primitive](../../concepts/types.md) Types and [Unions](../../concepts/unions.md) as well!
6. Lots, and Lots of [Cookbooks](../../examples/index.md), [Tutorials](../../tutorials/1-introduction.ipynb), Documentation and even [instructor hub](../../integrations/index.md)

## Instructor's Broad Applicability

One of the key strengths of Instructor is that it's designed as a lightweight patch over the official OpenAI Python SDK. This means it can be easily integrated not just with OpenAI's hosted API service, but with any provider or platform that exposes an interface compatible with the OpenAI SDK.

For example, providers like [Anyscale](../../hub/anyscale.md), [Together](../../hub/together.md), [Ollama](../../hub/ollama.md), [Groq](../../hub/groq.md), and [llama-cpp-python](../../hub/llama-cpp-python.md) all either use or mimic the OpenAI Python SDK under the hood. With Instructor's zero-overhead patching approach, teams can immediately start deriving structured data outputs from any of these providers. There's no need for custom integration work.
For example, providers like [Together](../../integrations/together.md), [Ollama](../../integrations/ollama.md), [Groq](../../integrations/groq.md), and [llama-cpp-python](../../integrations/llama-cpp-python.md) all either use or mimic the OpenAI Python SDK under the hood. With Instructor's zero-overhead patching approach, teams can immediately start deriving structured data outputs from any of these providers. There's no need for custom integration work.

## Direct access to the messages array

@@ -84,4 +84,4 @@ This incremental, zero-overhead adoption path makes Instructor perfect for sprin

And if you decide Instructor isn't a good fit after all, removing it is as simple as not applying the patch! The familiarity and flexibility of working directly with the OpenAI SDK is a core strength.

Instructor solves the "string hell" of unstructured LLM outputs. It allows teams to easily realize the full potential of tools like GPTs by mapping their text to type-safe, validated data structures. If you're looking to get more structured value out of LLMs, give Instructor a try!
9 changes: 5 additions & 4 deletions docs/blog/posts/introducing-structured-outputs.md
@@ -41,7 +41,7 @@ In this article, we'll show how `instructor` addresses many of these challenges

### Limited Validation and Retry Logic

Validation is crucial for building reliable and effective applications. We want to catch errors in real time using `Pydantic` [validators](/concepts/reask_validation/) in order to allow our LLM to correct its responses on the fly.
Validation is crucial for building reliable and effective applications. We want to catch errors in real time using `Pydantic` [validators](../../concepts/reask_validation.md) in order to allow our LLM to correct its responses on the fly.

Let's see an example of a simple validator below which ensures user names are always in uppercase.

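The example itself is collapsed in this diff view; a sketch of such a validator (class and field names are illustrative) might look like:

```python
# A field validator that rejects non-uppercase names. When used with
# instructor's retry logic, the ValueError message is fed back to the
# LLM so it can correct its response on the next attempt.
from pydantic import BaseModel, field_validator


class UserDetail(BaseModel):
    name: str
    age: int

    @field_validator("name")
    @classmethod
    def name_must_be_uppercase(cls, v: str) -> str:
        if v != v.upper():
            raise ValueError(f"name must be uppercase, got {v!r}")
        return v


print(UserDetail(name="JOHN", age=25))
#> name='JOHN' age=25
```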
@@ -192,12 +192,13 @@ This built-in retry logic allows for targeted correction to the generated respo

### Real-time Streaming Validation

A common use-case is to define a single schema and extract multiple instances of it. With `instructor`, doing this is relatively straightforward by using [our `create_iterable` method](/concepts/lists/).
A common use-case is to define a single schema and extract multiple instances of it. With `instructor`, doing this is relatively straightforward by using [our `create_iterable` method](../../concepts/lists.md).

```python
import instructor
import openai
from pydantic import BaseModel
```

client = instructor.from_openai(openai.OpenAI(), mode=instructor.Mode.TOOLS_STRICT)

@@ -228,7 +229,7 @@ for user in users:
#> name='John' age=10
```
Other times, we might also want to stream out information as it's dynamically generated into some sort of frontend component. With `instructor`, you'll be able to do just that [using the `create_partial` method](/concepts/partial/).
Other times, we might also want to stream out information as it's dynamically generated into some sort of frontend component. With `instructor`, you'll be able to do just that [using the `create_partial` method](../../concepts/partial.md).
```python
import instructor
@@ -375,4 +376,4 @@ While OpenAI's Structured Outputs shows promise, it has key limitations. The sys

If you're interested in Structured Outputs, `instructor` addresses these critical issues. It provides automatic retries, real-time input validation, and multi-provider integration, allowing developers to more effectively implement Structured Outputs in their AI projects.

If you haven't given `instructor` a shot, try it today!
91 changes: 30 additions & 61 deletions docs/blog/posts/open_source.md
@@ -17,11 +17,11 @@ tags:
- API Integration
---

# Structured Output for Open Source and Local LLMs

Instructor has expanded its capabilities for language models. It started with API interactions via the OpenAI SDK, using [Pydantic](https://pydantic-docs.helpmanual.io/) for structured data validation. Now, Instructor supports multiple models and platforms.

The integration of [JSON mode](../../concepts/patching.md#json-mode) improved adaptability to vision models and open source alternatives. This allows support for models from [GPT](https://openai.com/api/) and [Mistral](https://mistral.ai) to models on [Ollama](https://ollama.ai) and [Hugging Face](https://huggingface.co/models), using [llama-cpp-python](../../hub/llama-cpp-python.md).
The integration of [JSON mode](../../concepts/patching.md#json-mode) improved adaptability to vision models and open source alternatives. This allows support for models from [GPT](https://openai.com/api/) and [Mistral](https://mistral.ai) to models on [Ollama](https://ollama.ai) and [Hugging Face](https://huggingface.co/models), using [llama-cpp-python](../../integrations/llama-cpp-python.md).

Instructor now works with cloud-based APIs and local models for structured data extraction. Developers can refer to our guide on [Patching](../../concepts/patching.md) for information on using JSON mode with different models.

@@ -40,7 +40,7 @@ OpenAI clients offer functionalities for different needs. We explore clients int

### Ollama: A New Frontier for Local Models

Ollama enables structured outputs with local models using JSON schema. See our [Ollama documentation](../../hub/ollama.md) for details.
Ollama enables structured outputs with local models using JSON schema. See our [Ollama documentation](../../integrations/ollama.md) for details.

For setup and features, refer to the documentation. The [Ollama website](https://ollama.ai/download) provides resources, models, and support.

@@ -68,6 +68,7 @@ client = instructor.from_openai(
mode=instructor.Mode.JSON,
)


user = client.chat.completions.create(
model="llama2",
messages=[
@@ -93,7 +94,6 @@ Example of using llama-cpp-python for structured outputs:
```python
import llama_cpp
import instructor

from llama_cpp.llama_speculative import LlamaPromptLookupDecoding
from pydantic import BaseModel

@@ -111,9 +111,10 @@ llama = llama_cpp.Llama(

create = instructor.patch(
create=llama.create_chat_completion_openai_v1,
mode=instructor.Mode.JSON_SCHEMA,
mode=instructor.Mode.JSON_SCHEMA,
)


class UserDetail(BaseModel):
name: str
age: int
@@ -131,72 +132,32 @@ user = create(

print(user)
#> name='Jason' age=30
"""
```

## Alternative Providers

### Anyscale
Anyscale's Mistral model, as detailed in our [Anyscale documentation](../../hub/anyscale.md) and on [Anyscale's official documentation](https://docs.anyscale.com/), introduces the ability to obtain structured outputs using JSON schema.
```bash
export ANYSCALE_API_KEY="your-api-key"
```
```python
import os
from openai import OpenAI
from pydantic import BaseModel
import instructor
class UserDetails(BaseModel):
name: str
age: int
# enables `response_model` in create call
client = instructor.from_openai(
OpenAI(
base_url="https://api.endpoints.anyscale.com/v1",
api_key=os.environ["ANYSCALE_API_KEY"],
),
# This uses Anyscale's json schema output mode
mode=instructor.Mode.JSON_SCHEMA,
)
resp = client.chat.completions.create(
model="mistralai/Mixtral-8x7B-Instruct-v0.1",
messages=[
{"role": "system", "content": "You are a world class extractor"},
{"role": "user", "content": 'Extract the following entities: "Jason is 20"'},
],
response_model=UserDetails,
)
print(resp)
#> name='Jason' age=20
```
### Groq

Groq's platform, detailed further in our [Groq documentation](../../hub/groq.md) and on [Groq's official documentation](https://groq.com/), offers a unique approach to processing with its tensor architecture. This innovation significantly enhances the performance of structured output processing.
Groq's platform, detailed further in our [Groq documentation](../../integrations/groq.md) and on [Groq's official documentation](https://groq.com/), offers a unique approach to processing with its tensor architecture. This innovation significantly enhances the performance of structured output processing.

```bash
export GROQ_API_KEY="your-api-key"
```

```python
import os
import instructor
import groq
from pydantic import BaseModel

client = qrog.Groq(
import groq
import instructor


client = groq.Groq(
api_key=os.environ.get("GROQ_API_KEY"),
)

# By default, the patch function will patch the ChatCompletion.create and ChatCompletion.create methods to support the response_model parameter
# By default, the patch function will patch the ChatCompletion.create and ChatCompletion.create methods
# to support the response_model parameter
client = instructor.from_openai(client, mode=instructor.Mode.MD_JSON)


@@ -216,24 +177,26 @@ user: UserExtract = client.chat.completions.create(
)

assert isinstance(user, UserExtract), "Should be instance of UserExtract"

print(user)
#> name='jason' age=25
"""
```

### Together AI

Together AI, when combined with Instructor, offers a seamless experience for developers looking to leverage structured outputs in their applications. For more details, refer to our [Together AI documentation](../hub/together.md) and explore the [patching guide](../concepts/patching.md) to enhance your applications.
Together AI, when combined with Instructor, offers a seamless experience for developers looking to leverage structured outputs in their applications. For more details, refer to our [Together AI documentation](../../integrations/together.md) and explore the [patching guide](../../concepts/patching.md) to enhance your applications.

```bash
export TOGETHER_API_KEY="your-api-key"
```

```python
import os
import openai
from pydantic import BaseModel

import instructor
import openai


client = openai.OpenAI(
base_url="https://api.together.xyz/v1",
@@ -242,6 +205,7 @@ client = openai.OpenAI(

client = instructor.from_openai(client, mode=instructor.Mode.TOOLS)


class UserExtract(BaseModel):
name: str
age: int
@@ -256,29 +220,33 @@ user: UserExtract = client.chat.completions.create(
)

assert isinstance(user, UserExtract), "Should be instance of UserExtract"
print(user)

print(user)
#> name='jason' age=25
```

### Mistral

For those interested in exploring the capabilities of Mistral Large with Instructor, we highly recommend checking out our comprehensive guide on [Mistral Large](../../hub/mistral.md).
For those interested in exploring the capabilities of Mistral Large with Instructor, we highly recommend checking out our comprehensive guide on [Mistral Large](../../integrations/mistral.md).

```python
import instructor

from pydantic import BaseModel
from mistralai.client import MistralClient


client = MistralClient()

patched_chat = instructor.from_openai(create=client.chat, mode=instructor.Mode.MISTRAL_TOOLS)
patched_chat = instructor.from_openai(
create=client.chat, mode=instructor.Mode.MISTRAL_TOOLS
)


class UserDetails(BaseModel):
name: str
age: int


resp = patched_chat(
model="mistral-large-latest",
response_model=UserDetails,
@@ -289,6 +257,7 @@ resp = patched_chat(
},
],
)

print(resp)
#> name='Jason' age=20
```
7 changes: 3 additions & 4 deletions docs/blog/posts/pairwise-llm-judge.md
@@ -64,7 +64,7 @@ Next, we'll create a function that uses our LLM to judge the relevance between a
```python
def judge_relevance(question: str, text: str) -> Judgment:
return client.chat.create(
model="gpt-4o-mini",
model="gpt-4",
messages=[
{
"role": "system",
@@ -102,8 +102,7 @@ def judge_relevance(question: str, text: str) -> Judgment:
{{text}}
</text>
"""
},
},
}
],
response_model=Judgment,
context={"question": question, "text": text},
@@ -134,7 +133,7 @@ if __name__ == "__main__":
score += 1

print(f"Score: {score}/{len(test_pairs)}")
# > Score 9/10
#> Score 9/10
```

This test loop runs the judge on each pair and compares the result to a predetermined similarity value, calculating an overall score.
6 changes: 3 additions & 3 deletions docs/blog/posts/pydantic-is-still-all-you-need.md
@@ -39,7 +39,7 @@ Pydantic, combined with function calling, offers a superior alternative for stru
- Validators to improve system reliability
- Cleaner, more maintainable code

For more details on how Pydantic enhances data validation, check out our [Data Validation with Pydantic](../concepts/models.md) guide.
For more details on how Pydantic enhances data validation, check out our [Data Validation with Pydantic](../../concepts/models.md) guide.

And here's the kicker: nothing's really changed in the past year. The core API is still just:

@@ -63,7 +63,7 @@ Since last year:
- Built a version in Rust
- Seen 40% month-over-month growth in the Python library

We now support [Ollama](../../hub/ollama.md), [llama-cpp-python](../../hub/llama-cpp-python.md), [Anthropic](../../hub/anthropic.md), [Cohere](../../hub/cohere.md), [Google](../../hub/google.md), [Vertex AI](../../hub/vertexai.md), and more. As long as language models support function calling capabilities, this API will remain standard.
We now support [Ollama](../../integrations/ollama.md), [llama-cpp-python](../../integrations/llama-cpp-python.md), [Anthropic](../../integrations/anthropic.md), [Cohere](../../integrations/cohere.md), [Google](../../integrations/google.md), [Vertex AI](../../integrations/vertex.md), and more. As long as language models support function calling capabilities, this API will remain standard.

## Key Features

@@ -123,4 +123,4 @@ Pydantic is still all you need for effective structured outputs with LLMs. It's

As we continue to refine AI language models, keeping these principles in mind will lead to more robust, maintainable, and powerful applications. The future of AI isn't just about what the models can do, but how seamlessly we can integrate them into our existing software ecosystems.

For more advanced use cases and integrations, check out our [examples](../../examples/index.md) section, which covers various LLM providers and specialized implementations.
