Merge branch 'main' into hf/1104
ivanbelenky authored Oct 28, 2024
2 parents df07c39 + 651465b commit 821afdf
Showing 13 changed files with 632 additions and 330 deletions.
20 changes: 20 additions & 0 deletions .github/workflows/ai-label.yml
@@ -0,0 +1,20 @@
name: AI Labeler

on:
  issues:
    types: [opened, reopened]
  pull_request:
    types: [opened, reopened]

jobs:
  ai-labeler:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      issues: write
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
      - uses: jlowin/[email protected]
        with:
          openai-api-key: ${{ secrets.OPENAI_API_KEY }}
20 changes: 10 additions & 10 deletions README.md
@@ -8,7 +8,7 @@ Instructor is the most popular Python library for working with structured output

## Want your logo on our website?

- If your company use instructor a lot, we'd love to have your logo on our website! Please fill out [this form](https://q7gjsgfstrp.typeform.com/to/wluQlVVQ)
+ If your company uses Instructor a lot, we'd love to have your logo on our website! Please fill out [this form](https://q7gjsgfstrp.typeform.com/to/wluQlVVQ)

## Key Features

@@ -46,7 +46,7 @@ client = instructor.from_openai(OpenAI())

# Extract structured data from natural language
user_info = client.chat.completions.create(
- model="gpt-3.5-turbo",
+ model="gpt-4o-mini",
response_model=UserInfo,
messages=[{"role": "user", "content": "John Doe is 30 years old."}],
)
@@ -84,7 +84,7 @@ client.on("completion:kwargs", log_kwargs)
client.on("completion:error", log_exception)

user_info = client.chat.completions.create(
- model="gpt-3.5-turbo",
+ model="gpt-4o-mini",
response_model=UserInfo,
messages=[{"role": "user", "content": "Extract the user name: 'John is 20 years old'"}],
)
@@ -99,7 +99,7 @@ user_info = client.chat.completions.create(
'content': "Extract the user name: 'John is 20 years old'",
}
],
- 'model': 'gpt-3.5-turbo',
+ 'model': 'gpt-4o-mini',
'tools': [
{
'type': 'function',
@@ -235,7 +235,7 @@ client = instructor.from_gemini(
)
```

- Alternatively, you can [call Gemini from the OpenAI client](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-gemini-using-openai-library#python).You'll have to setup [`gcloud`](https://cloud.google.com/docs/authentication/provide-credentials-adc#local-dev), get setup on Vertex AI, and install the Google Auth library.
+ Alternatively, you can [call Gemini from the OpenAI client](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-gemini-using-openai-library#python). You'll have to setup [`gcloud`](https://cloud.google.com/docs/authentication/provide-credentials-adc#local-dev), get setup on Vertex AI, and install the Google Auth library.

```sh
pip install google-auth
@@ -321,7 +321,7 @@ assert resp.age == 25

## Types are inferred correctly

- This was the dream of instructor but due to the patching of openai, it wasnt possible for me to get typing to work well. Now, with the new client, we can get typing to work well! We've also added a few `create_*` methods to make it easier to create iterables and partials, and to access the original completion.
+ This was the dream of Instructor but due to the patching of OpenAI, it wasn't possible for me to get typing to work well. Now, with the new client, we can get typing to work well! We've also added a few `create_*` methods to make it easier to create iterables and partials, and to access the original completion.

### Calling `create`

@@ -500,21 +500,21 @@ for user in users:

## [Evals](https://github.com/jxnl/instructor/tree/main/tests/llm/test_openai/evals#how-to-contribute-writing-and-running-evaluation-tests)

- We invite you to contribute to evals in `pytest` as a way to monitor the quality of the OpenAI models and the `instructor` library. To get started check out the evals for [anthropic](https://github.com/jxnl/instructor/blob/main/tests/llm/test_anthropic/evals/test_simple.py) and [OpenAI](https://github.com/jxnl/instructor/tree/main/tests/llm/test_openai/evals#how-to-contribute-writing-and-running-evaluation-tests) and contribute your own evals in the form of pytest tests. These evals will be run once a week and the results will be posted.
+ We invite you to contribute to evals in `pytest` as a way to monitor the quality of the OpenAI models and the `instructor` library. To get started check out the evals for [Anthropic](https://github.com/jxnl/instructor/blob/main/tests/llm/test_anthropic/evals/test_simple.py) and [OpenAI](https://github.com/jxnl/instructor/tree/main/tests/llm/test_openai/evals#how-to-contribute-writing-and-running-evaluation-tests) and contribute your own evals in the form of pytest tests. These evals will be run once a week and the results will be posted.

## Contributing

If you want to help, checkout some of the issues marked as `good-first-issue` or `help-wanted` found [here](https://github.com/jxnl/instructor/labels/good%20first%20issue). They could be anything from code improvements, a guest blog post, or a new cookbook.

## CLI

- We also provide some added CLI functionality for easy convinience:
+ We also provide some added CLI functionality for easy convenience:

- - `instructor jobs` : This helps with the creation of fine-tuning jobs with OpenAI. Simple use `instructor jobs create-from-file --help` to get started creating your first fine-tuned GPT3.5 model
+ - `instructor jobs` : This helps with the creation of fine-tuning jobs with OpenAI. Simple use `instructor jobs create-from-file --help` to get started creating your first fine-tuned GPT-3.5 model

- `instructor files` : Manage your uploaded files with ease. You'll be able to create, delete and upload files all from the command line

- - `instructor usage` : Instead of heading to the OpenAI site each time, you can monitor your usage from the cli and filter by date and time period. Note that usage often takes ~5-10 minutes to update from OpenAI's side
+ - `instructor usage` : Instead of heading to the OpenAI site each time, you can monitor your usage from the CLI and filter by date and time period. Note that usage often takes ~5-10 minutes to update from OpenAI's side

## License

187 changes: 187 additions & 0 deletions docs/blog/posts/llm-as-reranker.md
@@ -0,0 +1,187 @@
---
authors:
- jxnl
categories:
- LLM
- Pydantic
comments: true
date: 2024-10-23
description: Learn how to use Instructor and Pydantic to create an LLM-based reranker for improving search results relevance.
draft: false
tags:
- LLM
- Pydantic
- Instructor
- Search Relevance
- Reranking
---

# Building an LLM-based Reranker for your RAG pipeline

Are you struggling with irrelevant search results in your Retrieval-Augmented Generation (RAG) pipeline?

Imagine having a powerful tool that can intelligently reassess and reorder your search results, significantly improving their relevance to user queries.

In this blog post, we'll show you how to create an LLM-based reranker using Instructor and Pydantic. This approach will:

- Enhance the accuracy of your search results
- Leverage the power of large language models (LLMs)
- Utilize structured outputs for precise information retrieval

By the end of this tutorial, you'll be able to implement an LLM reranker to label your synthetic data for fine-tuning a traditional reranker, or to build out an evaluation pipeline for your RAG system. Let's dive in!

## Setting Up the Environment

First, let's set up our environment with the necessary imports:

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field, field_validator

client = instructor.from_openai(OpenAI())
```

We're using the `instructor` library, which integrates seamlessly with OpenAI's API and Pydantic for structured outputs.

## Defining the Reranking Models

We'll use Pydantic to define our `Label` and `RerankedResults` models that structure the output of our LLM:

Notice that not only do we reference the `chunk_id` in the `Label` class, we also ask the language model to use chain of thought. This is very useful with models like GPT-4o Mini or Claude, but not strictly necessary if we plan to use the `o1-mini` or `o1-preview` models.

```python
class Label(BaseModel):
    chunk_id: int = Field(description="The unique identifier of the text chunk")
    chain_of_thought: str = Field(description="The reasoning process used to evaluate the relevance")
    relevancy: int = Field(
        description="Relevancy score from 0 to 10, where 10 is most relevant",
        ge=0,
        le=10,
    )


class RerankedResults(BaseModel):
    labels: list[Label] = Field(description="List of labeled and ranked chunks")

    @field_validator("labels")
    @classmethod
    def sort_labels(cls, v: list[Label]) -> list[Label]:
        # Sort labels by relevancy score, highest first.
        # (Avoid naming this `model_validate`, which would shadow Pydantic's built-in method.)
        return sorted(v, key=lambda x: x.relevancy, reverse=True)
```

These models ensure that our LLM's output is structured and includes a list of labeled chunks with their relevancy scores. The `RerankedResults` model includes a validator that automatically sorts the labels by relevancy in descending order.
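
As a quick sanity check, here is a minimal sketch that constructs `RerankedResults` directly (no LLM call) to show the automatic sorting; the labels and scores are made up purely for illustration:

```python
results = RerankedResults(
    labels=[
        Label(chunk_id=0, chain_of_thought="Only loosely related to the query.", relevancy=3),
        Label(chunk_id=1, chain_of_thought="Directly answers the query.", relevancy=9),
    ]
)

print([label.chunk_id for label in results.labels])
#> [1, 0]  # highest relevancy first
```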

## Creating the Reranker Function

Next, we'll create a function that uses our LLM to rerank a list of text chunks based on their relevance to a query:

```python
def rerank_results(query: str, chunks: list[dict]) -> RerankedResults:
    return client.chat.completions.create(
        model="gpt-4o-mini",
        response_model=RerankedResults,
        messages=[
            {
                "role": "system",
                "content": """
                You are an expert search result ranker. Your task is to evaluate the relevance of each text chunk to the given query and assign a relevancy score.
                For each chunk:
                1. Analyze its content in relation to the query.
                2. Provide a chain of thought explaining your reasoning.
                3. Assign a relevancy score from 0 to 10, where 10 is most relevant.
                Be objective and consistent in your evaluations.
                """,
            },
            {
                "role": "user",
                "content": """
                <query>{{ query }}</query>
                <chunks_to_rank>
                {% for chunk in chunks %}
                <chunk id="{{ chunk.id }}">
                {{ chunk.text }}
                </chunk>
                {% endfor %}
                </chunks_to_rank>
                Please provide a RerankedResults object with a Label for each chunk.
                """,
            },
        ],
        context={"query": query, "chunks": chunks},
    )
```

This function takes a query and a list of text chunks as input, sends them to the LLM with a predefined prompt, and returns a structured `RerankedResults` object. Thanks to Instructor, we can use Jinja templating to inject the query and chunks into the prompt by passing in the `context` parameter.
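
To make the templating concrete, here is a rough sketch of what the user message renders to for a couple of chunks. It uses the `jinja2` package directly, purely for illustration; when you pass `context`, Instructor performs an equivalent rendering for you:

```python
from jinja2 import Template

user_template = Template(
    """
<query>{{ query }}</query>
<chunks_to_rank>
{% for chunk in chunks %}
<chunk id="{{ chunk.id }}">
{{ chunk.text }}
</chunk>
{% endfor %}
</chunks_to_rank>
"""
)

print(
    user_template.render(
        query="What are the health benefits of regular exercise?",
        chunks=[
            {"id": 0, "text": "Regular exercise improves cardiovascular health."},
            {"id": 1, "text": "Gym memberships vary in price."},
        ],
    )
)
# The placeholders are replaced with the query text and one <chunk> block per chunk.
```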

## Testing the Reranker

To test our LLM-based reranker, we can create a sample query and a list of text chunks. Here's an example of how to use the reranker:

```python
def main():
    query = "What are the health benefits of regular exercise?"
    chunks = [
        {
            "id": 0,
            "text": "Regular exercise can improve cardiovascular health and reduce the risk of heart disease.",
        },
        {
            "id": 1,
            "text": "The price of gym memberships varies widely depending on location and facilities.",
        },
        {
            "id": 2,
            "text": "Exercise has been shown to boost mood and reduce symptoms of depression and anxiety.",
        },
        {
            "id": 3,
            "text": "Proper nutrition is essential for maintaining a healthy lifestyle.",
        },
        {
            "id": 4,
            "text": "Strength training can increase muscle mass and improve bone density, especially important as we age.",
        },
    ]

    results = rerank_results(query, chunks)

    print("Reranked results:")
    for label in results.labels:
        print(f"Chunk {label.chunk_id} (Relevancy: {label.relevancy}):")
        print(f"Text: {chunks[label.chunk_id]['text']}")
        print(f"Reasoning: {label.chain_of_thought}")
        print()


if __name__ == "__main__":
    main()
```

This test demonstrates how the reranker evaluates and sorts the chunks based on their relevance to the query. The full implementation can be found in the `examples/reranker/run.py` file.

If you want to extend this example, you could use the `rerank_results` function to label synthetic data for fine-tuning a traditional reranker, or to build out an evaluation pipeline for your RAG system.

Moreover, we could also add a validator to the `Label.chunk_id` field to ensure that the `chunk_id` is present in the `chunks` list. This is useful if the identifiers are UUIDs or complex strings and we want to be sure every returned `chunk_id` actually refers to one of the chunks we passed in.

Here's an example:

```python
from pydantic import ValidationInfo


class Label(BaseModel):
    chunk_id: int = Field(description="The unique identifier of the text chunk")
    ...

    @field_validator("chunk_id")
    @classmethod
    def validate_chunk_id(cls, v: int, info: ValidationInfo) -> int:
        # The validation context is the same dict passed to `rerank_results` via `context=`
        context = info.context
        chunks = context["chunks"]
        if v not in [chunk["id"] for chunk in chunks]:
            raise ValueError(
                f"Chunk with id {v} not found, must be one of {[chunk['id'] for chunk in chunks]}"
            )
        return v
```

This will automatically check that the `chunk_id` is present in the `chunks` list and raise a `ValueError` if it is not. Here, `info.context` is the same context dictionary that we passed into the `rerank_results` function.
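
To see the validator in action outside of an LLM call, here is a minimal sketch (assuming the full `Label` definition from earlier plus the `chunk_id` validator above) that validates labels directly with Pydantic's `context` argument, which receives the same dictionary Instructor forwards from `rerank_results`:

```python
from pydantic import ValidationError

chunks = [
    {"id": 0, "text": "Regular exercise improves cardiovascular health."},
    {"id": 1, "text": "Gym membership prices vary by location."},
]

# A chunk_id that exists in the context validates cleanly
label = Label.model_validate(
    {"chunk_id": 1, "chain_of_thought": "Pricing is unrelated to health benefits.", "relevancy": 2},
    context={"chunks": chunks},
)

# An unknown chunk_id is rejected by the validator
try:
    Label.model_validate(
        {"chunk_id": 99, "chain_of_thought": "No such chunk.", "relevancy": 5},
        context={"chunks": chunks},
    )
except ValidationError as exc:
    print(exc)  # Chunk with id 99 not found, must be one of [0, 1]
```
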
6 changes: 3 additions & 3 deletions docs/blog/posts/openai-multimodal.md
@@ -5,7 +5,7 @@ categories:
- OpenAI
- Audio
comments: true
- date: 2025-10-17
+ date: 2024-10-17
description: Explore the new audio capabilities in OpenAI's Chat Completions API using the gpt-4o-audio-preview model.
draft: false
tags:
@@ -33,7 +33,7 @@ The new audio support in the Chat Completions API offers several compelling feat

To demonstrate how to use this new functionality, let's look at a simple example using the `instructor` library:

- """python
+ ```python
from openai import OpenAI
from pydantic import BaseModel
import instructor
@@ -64,7 +64,7 @@ resp = client.chat.completions.create(

print(resp)
# Expected output: Person(name='Jason', age=20)
- """
+ ```

In this example, we're using the `gpt-4o-audio-preview` model to extract information from an audio file. The API processes the audio input and returns structured data (a Person object with name and age) based on the content of the audio.

2 changes: 1 addition & 1 deletion docs/blog/posts/youtube-flashcards.md
@@ -23,7 +23,7 @@ Flashcards help break down complex topics and learn anything from biology to a n
language or lines for a play. This blog will show how to use LLMs to generate
flashcards and kickstart your learning!

- **Instructor** lets us get structured outputs from LLMs reliably, and **Burr** helps
+ **Instructor** lets us get structured outputs from LLMs reliably, and [Burr](https://github.com/dagworks-inc/burr) helps
create an LLM application that's easy to understand and debug. It comes with **Burr UI**,
a free, open-source, and local-first tool for observability, annotations, and more!

