From 27e67cc41051c19c291ad4bca8802773b1f11c98 Mon Sep 17 00:00:00 2001 From: Jason Liu Date: Sat, 23 Nov 2024 08:52:16 -0500 Subject: [PATCH] bump --- docs/index.md | 913 +++++++++++++++++++++++++------------------------- 1 file changed, 452 insertions(+), 461 deletions(-) diff --git a/docs/index.md b/docs/index.md index b5b18b19f..f66b56a88 100644 --- a/docs/index.md +++ b/docs/index.md @@ -18,6 +18,27 @@ It stands out for its simplicity, transparency, and user-centric design, built o [:material-star: Star the Repo](https://github.com/jxnl/instructor){: .md-button .md-button--primary } [:material-book-open-variant: Cookbooks](./examples/index.md){: .md-button } [:material-lightbulb: Prompting Guide](./prompting/index.md){: .md-button } +=== "pip" + ```bash + pip install instructor + ``` + +=== "uv" + ```bash + uv pip install instructor + ``` + +=== "poetry" + ```bash + poetry add instructor + ``` + +If you ever get stuck, you can always run `instructor docs` to open the documentation in your browser. It even supports searching for specific topics. + +```bash +instructor docs [QUERY] +``` + ## Newsletter If you want to be notified of tips, new blog posts, and research, subscribe to our newsletter. Here's what you can expect: @@ -32,627 +53,542 @@ Subscribe to our newsletter for updates on AI development. We provide content to -## Why use Instructor? - -
- -- :material-code-tags: **Simple API with Full Prompt Control** - - Instructor provides a straightforward API that gives you complete ownership and control over your prompts. This allows for fine-tuned customization and optimization of your LLM interactions. - - [:octicons-arrow-right-16: Explore Concepts](./concepts/models.md) - -- :material-translate: **Multi-Language Support** - - Simplify structured data extraction from LLMs with type hints and validation. - - [:simple-python: Python](https://python.useinstructor.com) · [:simple-typescript: TypeScript](https://js.useinstructor.com) · [:simple-ruby: Ruby](https://ruby.useinstructor.com) · [:simple-go: Go](https://go.useinstructor.com) · [:simple-elixir: Elixir](https://hex.pm/packages/instructor) · [:simple-rust: Rust](https://rust.useinstructor.com) - -- :material-refresh: **Reasking and Validation** - - Automatically reask the model when validation fails, ensuring high-quality outputs. Leverage Pydantic's validation for robust error handling. - - [:octicons-arrow-right-16: Learn about Reasking](./concepts/reask_validation.md) - -- :material-repeat-variant: **Streaming Support** - - Stream partial results and iterables with ease, allowing for real-time processing and improved responsiveness in your applications. - - [:octicons-arrow-right-16: Learn about Streaming](./concepts/partial.md) - -- :material-code-braces: **Powered by Type Hints** - - Leverage Pydantic for schema validation, prompting control, less code, and IDE integration. - - [:octicons-arrow-right-16: Learn more](https://docs.pydantic.dev/) - -- :material-lightning-bolt: **Simplified LLM Interactions** - - Support for [OpenAI](./integrations/openai.md), [Anthropic](./integrations/anthropic.md), [Google](./integrations/google.md), [Vertex AI](./integrations/vertex.md), [Mistral/Mixtral](./integrations/together.md), [Ollama](./integrations/ollama.md), [llama-cpp-python](./integrations/llama-cpp-python.md), [Cohere](./integrations/cohere.md), [LiteLLM](./integrations/litellm.md). - - [:octicons-arrow-right-16: See Hub](./integrations/index.md) +## Getting Started -
+If you want to see all the integrations, check out the [integrations guide](./integrations/index.md). -## Getting Started +=== "OpenAI" + ```bash + pip install instructor + ``` -``` -pip install -U instructor -``` + !!! info "Using OpenAI's Structured Output Response" -If you ever get stuck, you can always run `instructor docs` to open the documentation in your browser. It even supports searching for specific topics. + You can now use OpenAI's structured output response with Instructor. This feature combines the strengths of Instructor with OpenAI's precise sampling. -``` -instructor docs [QUERY] -``` + ```python + client = instructor.from_openai(OpenAI(), mode=Mode.TOOLS_STRICT) + ``` -You can also check out our [cookbooks](./examples/index.md) and [concepts](./concepts/models.md) to learn more about how to use Instructor. + ```python + import instructor + from pydantic import BaseModel + from openai import OpenAI -??? info "Make sure you've installed the dependencies for your specific client" + # Define your desired output structure + class ExtractUser(BaseModel): + name: str + age: int - To keep the bundle size small, `instructor` only ships with the OpenAI client. Before using the other clients and their respective `from_xx` method, make sure you've installed the dependencies following the instructions below. + # Patch the OpenAI client + client = instructor.from_openai(OpenAI()) - 1. Anthropic : `pip install "instructor[anthropic]"` - 2. Google Generative AI: `pip install "instructor[google-generativeai]"` - 3. Vertex AI: `pip install "instructor[vertexai]"` - 4. Cohere: `pip install "instructor[cohere]"` - 5. Litellm: `pip install "instructor[litellm]"` - 6. Mistral: `pip install "instructor[mistralai]"` + # Extract structured data from natural language + res = client.chat.completions.create( + model="gpt-4o-mini", + response_model=ExtractUser, + messages=[{"role": "user", "content": "John Doe is 30 years old."}], + ) -Now, let's see Instructor in action with a simple example: + assert res.name == "John Doe" + assert res.age == 30 + ``` -### Using OpenAI + [See more :material-arrow-right:](./integrations/openai.md){: .md-button } -??? info "Want to use OpenAI's Structured Output Response?" +=== "Ollama" - We've added support for OpenAI's structured output response. With this, you'll get all the benefits of instructor you like with the constrained sampling from OpenAI. 
+ ```bash + pip install "instructor[ollama]" + ``` ```python from openai import OpenAI - from instructor import from_openai, Mode - from pydantic import BaseModel - - client = from_openai(OpenAI(), mode=Mode.TOOLS_STRICT) - + from pydantic import BaseModel, Field + from typing import List + import instructor - class User(BaseModel): + class ExtractUser(BaseModel): name: str age: int + client = instructor.from_openai( + OpenAI( + base_url="http://localhost:11434/v1", + api_key="ollama", + ), + mode=instructor.Mode.JSON, + ) resp = client.chat.completions.create( - response_model=User, + model="llama3", messages=[ { "role": "user", "content": "Extract Jason is 25 years old.", } ], - model="gpt-4o", + response_model=ExtractUser, ) + assert resp.name == "Jason" + assert resp.age == 25 ``` -```python -import instructor -from pydantic import BaseModel -from openai import OpenAI - - -# Define your desired output structure -class UserInfo(BaseModel): - name: str - age: int - - -# Patch the OpenAI client -client = instructor.from_openai(OpenAI()) - -# Extract structured data from natural language -user_info = client.chat.completions.create( - model="gpt-3.5-turbo", - response_model=UserInfo, - messages=[{"role": "user", "content": "John Doe is 30 years old."}], -) - -print(user_info.name) -#> John Doe -print(user_info.age) -#> 30 -``` + [See more :material-arrow-right:](./integrations/ollama.md){: .md-button } +=== "llama-cpp-python" + ```bash + pip install "instructor[llama-cpp-python]" + ``` -### Using Hooks + ```python + import llama_cpp + import instructor + from llama_cpp.llama_speculative import LlamaPromptLookupDecoding + from pydantic import BaseModel -Instructor provides a powerful hooks system that allows you to intercept and log various stages of the LLM interaction process. 
Here's a simple example demonstrating how to use hooks: + llama = llama_cpp.Llama( + model_path="../../models/OpenHermes-2.5-Mistral-7B-GGUF/openhermes-2.5-mistral-7b.Q4_K_M.gguf", + n_gpu_layers=-1, + chat_format="chatml", + n_ctx=2048, + draft_model=LlamaPromptLookupDecoding(num_pred_tokens=2), + logits_all=True, + verbose=False, + ) -```python -import instructor -from openai import OpenAI -from pydantic import BaseModel + create = instructor.patch( + create=llama.create_chat_completion_openai_v1, + mode=instructor.Mode.JSON_SCHEMA, + ) -class UserInfo(BaseModel): - name: str - age: int + class ExtractUser(BaseModel): + name: str + age: int -# Initialize the OpenAI client with Instructor -client = instructor.from_openai(OpenAI()) + user = create( + messages=[ + { + "role": "user", + "content": "Extract `Jason is 30 years old`", + } + ], + response_model=ExtractUser, + ) -# Define hook functions -def log_kwargs(**kwargs): - print(f"Function called with kwargs: {kwargs}") + assert user.name == "Jason" + assert user.age == 30 + ``` -def log_exception(exception: Exception): - print(f"An exception occurred: {str(exception)}") + [See more :material-arrow-right:](./integrations/llama-cpp-python.md){: .md-button } -client.on("completion:kwargs", log_kwargs) -client.on("completion:error", log_exception) +=== "Anthropic" + ```bash + pip install "instructor[anthropic]" + ``` -user_info = client.chat.completions.create( - model="gpt-3.5-turbo", - response_model=UserInfo, - messages=[{"role": "user", "content": "Extract the user name: 'John is 20 years old'"}], -) + ```python + import instructor + from anthropic import Anthropic + from pydantic import BaseModel -""" -{ - 'args': (), - 'kwargs': { - 'messages': [ - { - 'role': 'user', - 'content': "Extract the user name: 'John is 20 years old'", - } - ], - 'model': 'gpt-3.5-turbo', - 'tools': [ - { - 'type': 'function', - 'function': { - 'name': 'UserInfo', - 'description': 'Correctly extracted `UserInfo` with all the required parameters with correct types', - 'parameters': { - 'properties': { - 'name': {'title': 'Name', 'type': 'string'}, - 'age': {'title': 'Age', 'type': 'integer'}, - }, - 'required': ['age', 'name'], - 'type': 'object', - }, - }, - } - ], - 'tool_choice': {'type': 'function', 'function': {'name': 'UserInfo'}}, - }, - } -""" + class ExtractUser(BaseModel): + name: str + age: int -print(f"Name: {user_info.name}, Age: {user_info.age}") -#> Name: John, Age: 20 -``` + client = instructor.from_anthropic(Anthropic()) -This example demonstrates: -1. A pre-execution hook that logs all kwargs passed to the function. -2. An exception hook that logs any exceptions that occur during execution. + # note that client.chat.completions.create will also work + resp = client.messages.create( + model="claude-3-5-sonnet-20240620", + max_tokens=1024, + messages=[ + { + "role": "user", + "content": "Extract Jason is 25 years old.", + } + ], + response_model=ExtractUser, + ) -The hooks provide valuable insights into the function's inputs and any errors, -enhancing debugging and monitoring capabilities. 
+ assert isinstance(resp, ExtractUser) + assert resp.name == "Jason" + assert resp.age == 25 + ``` -### Using Anthropic + [See more :material-arrow-right:](./integrations/anthropic.md){: .md-button } -```python -import instructor -from anthropic import Anthropic -from pydantic import BaseModel +=== "Gemini" + ```bash + pip install "instructor[google-generativeai]" + ``` + ```python + import instructor + import google.generativeai as genai + from pydantic import BaseModel -class User(BaseModel): - name: str - age: int + class ExtractUser(BaseModel): + name: str + age: int + client = instructor.from_gemini( + client=genai.GenerativeModel( + model_name="models/gemini-1.5-flash-latest", + ), + mode=instructor.Mode.GEMINI_JSON, + ) -client = instructor.from_anthropic(Anthropic()) + # note that client.chat.completions.create will also work + resp = client.messages.create( + messages=[ + { + "role": "user", + "content": "Extract Jason is 25 years old.", + } + ], + response_model=ExtractUser, + ) -# note that client.chat.completions.create will also work -resp = client.messages.create( - model="claude-3-opus-20240229", - max_tokens=1024, - messages=[ - { - "role": "user", - "content": "Extract Jason is 25 years old.", - } - ], - response_model=User, -) + assert isinstance(resp, ExtractUser) + assert resp.name == "Jason" + assert resp.age == 25 + ``` -assert isinstance(resp, User) -assert resp.name == "Jason" -assert resp.age == 25 -``` + [See more :material-arrow-right:](./integrations/google.md){: .md-button } -### Using Gemini +=== "Vertex AI" + ```bash + pip install "instructor[vertexai]" + ``` -The Vertex AI and Gemini Clients have different APIs. When using instructor with these clients, make sure to read the documentation for the specific client you're using to make sure you're using the correct methods. + ```python + import instructor + import vertexai # type: ignore + from vertexai.generative_models import GenerativeModel # type: ignore + from pydantic import BaseModel -**Note**: Gemini Tool Calling is still in preview, and there are some limitations. You can learn more about them in the [Vertex AI examples notebook](./integrations/vertex.md). As of now, you cannot use tool calling with Gemini when you have multi-modal inputs (Eg. Images, Audio, Video), you must use the `JSON` mode equivalent for that client. 
+ vertexai.init() -#### Google AI + class ExtractUser(BaseModel): + name: str + age: int -```python -import instructor -import google.generativeai as genai -from pydantic import BaseModel + client = instructor.from_vertexai( + client=GenerativeModel("gemini-1.5-pro-preview-0409"), + mode=instructor.Mode.VERTEXAI_TOOLS, + ) + # note that client.chat.completions.create will also work + resp = client.create( + messages=[ + { + "role": "user", + "content": "Extract Jason is 25 years old.", + } + ], + response_model=ExtractUser, + ) -class User(BaseModel): - name: str - age: int + assert isinstance(resp, ExtractUser) + assert resp.name == "Jason" + assert resp.age == 25 + ``` + [See more :material-arrow-right:](./integrations/vertex.md){: .md-button } -client = instructor.from_gemini( - client=genai.GenerativeModel( - model_name="models/gemini-1.5-flash-latest", - ), - mode=instructor.Mode.GEMINI_JSON, -) +=== "Groq" + ```bash + pip install "instructor[groq]" + ``` -# note that client.chat.completions.create will also work -resp = client.messages.create( - messages=[ - { - "role": "user", - "content": "Extract Jason is 25 years old.", - } - ], - response_model=User, -) + ```python + import instructor + from groq import Groq + from pydantic import BaseModel -assert isinstance(resp, User) -assert resp.name == "Jason" -assert resp.age == 25 -``` + client = instructor.from_groq(Groq()) -??? info "Using Gemini's multi-modal capabilities with `google-generativeai`" + class ExtractUser(BaseModel): + name: str + age: int - The `google.generativeai` library has a different API than the `vertexai` library. But, using `instructor`, working with multi-modal data is easy. + resp = client.chat.completions.create( + model="llama3-70b-8192", + response_model=ExtractUser, + messages=[{"role": "user", "content": "Extract Jason is 25 years old."}], + ) - Here's a quick example of how to use an Audio file with `google-generativeai`. We've used this [recording](https://storage.googleapis.com/generativeai-downloads/data/State_of_the_Union_Address_30_January_1961.mp3) that's taken from the [Google Generative AI cookbook](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Audio.ipynb) + assert resp.name == "Jason" + assert resp.age == 25 + ``` - For a more in-depth example, you can check out our guide to working with Gemini using the `google-generativeai` package [here](./examples/multi_modal_gemini.md). + [See more :material-arrow-right:](./integrations/groq.md){: .md-button } +=== "Litellm" + ```bash + pip install "instructor[litellm]" + ``` ```python import instructor - import google.generativeai as genai + from litellm import completion from pydantic import BaseModel + class ExtractUser(BaseModel): + name: str + age: int - client = instructor.from_gemini( - client=genai.GenerativeModel( - model_name="models/gemini-1.5-flash-latest", - ), - mode=instructor.Mode.GEMINI_JSON, # (1)! - ) - - mp3_file = genai.upload_file("./sample.mp3") # (2)! - - - class Description(BaseModel): - description: str - + client = instructor.from_litellm(completion) - resp = client.create( - response_model=Description, + resp = client.chat.completions.create( + model="claude-3-opus-20240229", + max_tokens=1024, messages=[ { "role": "user", - "content": "Summarize what's happening in this audio file and who the main speaker is", - }, - { - "role": "user", - "content": mp3_file, # (3)! 
- }, + "content": "Extract Jason is 25 years old.", + } ], + response_model=ExtractUser, ) - print(resp) - #> description="The main speaker is President John F. Kennedy, and he's giving a - #> State of the Union address to a joint session of Congress. He begins by - #> acknowledging his fondness for the House of Representatives and his long - #> history with it. He then goes on to discuss the state of the economy, - #> highlighting the difficulties faced by Americans, such as unemployment and - #> low farm incomes. He also touches on the Cold War and the international - #> balance of payments. He speaks of the need to strengthen the US military, - #> and he also discusses the importance of international cooperation and the - #> need to address global issues like hunger and illiteracy. He ends by urging - #> his audience to work together to face the challenges that lie ahead." + assert isinstance(resp, ExtractUser) + assert resp.name == "Jason" + assert resp.age == 25 ``` - 1. Make sure to set the mode to `GEMINI_JSON`, this is important because Tool Calling doesn't work with multi-modal inputs. - 2. Use `genai.upload_file` to upload your file. If you've already uploaded the file, you can get it by using `genai.get_file` - 3. Pass in the file object as any normal user message - -#### Vertex AI - -```python -import instructor -import vertexai # type: ignore -from vertexai.generative_models import GenerativeModel # type: ignore -from pydantic import BaseModel - -vertexai.init() - - -class User(BaseModel): - name: str - age: int + [See more :material-arrow-right:](./integrations/litellm.md){: .md-button } +=== "Cohere" + ```bash + pip install "instructor[cohere]" + ``` -client = instructor.from_vertexai( - client=GenerativeModel("gemini-1.5-pro-preview-0409"), - mode=instructor.Mode.VERTEXAI_TOOLS, -) + ```python + import instructor + from pydantic import BaseModel + from cohere import Client -# note that client.chat.completions.create will also work -resp = client.create( - messages=[ - { - "role": "user", - "content": "Extract Jason is 25 years old.", - } - ], - response_model=User, -) + class ExtractUser(BaseModel): + name: str + age: int -assert isinstance(resp, User) -assert resp.name == "Jason" -assert resp.age == 25 -``` + client = instructor.from_cohere(Client()) -??? info "Using Gemini's multi-modal capabilities with VertexAI" + resp = client.chat.completions.create( + response_model=ExtractUser, + messages=[ + { + "role": "user", + "content": "Extract Jason is 25 years old.", + } + ], + ) - We've most recently added support for multi-part file formats using google's `gm.Part` objects. This allows you to pass in additional information to the LLM about the data you'd like to see. + assert resp.name == "Jason" + assert resp.age == 25 + ``` - Here are two examples of how to use multi-part formats with Instructor. + [See more :material-arrow-right:](./integrations/cohere.md){: .md-button } - We can combine multiple `gm.Part` objects into a single list and combine them into a single message to be sent to the LLM. Under the hood, we'll convert them into the appropriate format for Gemini. 
+=== "Cerebras" + ```bash + pip install "instructor[cerebras]" + ``` ```python + from cerebras.cloud.sdk import Cerebras import instructor - import vertexai.generative_models as gm # type: ignore - from pydantic import BaseModel, Field - - client = instructor.from_vertexai(gm.GenerativeModel("gemini-1.5-pro-001")) - content = [ - "Order Details:", - gm.Part.from_text("Customer: Alice"), - gm.Part.from_text("Items:"), - "Name: Laptop, Price: 999.99", - "Name: Mouse, Price: 29.99", - ] + from pydantic import BaseModel + import os + client = Cerebras( + api_key=os.environ.get("CEREBRAS_API_KEY"), + ) + client = instructor.from_cerebras(client) - class Item(BaseModel): + class ExtractUser(BaseModel): name: str - price: float - - - class Order(BaseModel): - items: list[Item] = Field(..., default_factory=list) - customer: str - + age: int - resp = client.create( - response_model=Order, + resp = client.chat.completions.create( + model="llama3.1-70b", + response_model=ExtractUser, messages=[ { "role": "user", - "content": content, - }, + "content": "Extract Jason is 25 years old.", + } ], ) - print(resp) - #> items=[Item(name='Laptop', price=999.99), Item(name='Mouse', price=29.99)] customer='Alice' + assert resp.name == "Jason" + assert resp.age == 25 ``` - This is also the same for multi-modal responses when we want to work with images. In this example, we'll ask the LLM to describe an image and pass in the image as a `gm.Part` object. + [See more :material-arrow-right:](./integrations/cerebras.md){: .md-button } + +=== "Fireworks" + ```bash + pip install "instructor[fireworks]" + ``` ```python + from fireworks.client import Fireworks import instructor - import vertexai.generative_models as gm # type: ignore from pydantic import BaseModel - import requests + import os - client = instructor.from_vertexai( - gm.GenerativeModel("gemini-1.5-pro-001"), mode=instructor.Mode.VERTEXAI_JSON + client = Fireworks( + api_key=os.environ.get("FIREWORKS_API_KEY"), ) - content = [ - gm.Part.from_text("Count the number of objects in the image."), - gm.Part.from_data( - bytes( - requests.get( - "https://img.taste.com.au/Oq97xT-Q/taste/2016/11/blueberry-scones-75492-1.jpeg" - ).content - ), - "image/jpeg", - ), - ] - - - class Description(BaseModel): - description: str + client = instructor.from_fireworks(client) + class ExtractUser(BaseModel): + name: str + age: int - resp = client.create( - response_model=Description, + resp = client.chat.completions.create( + model="accounts/fireworks/models/llama-v3p2-1b-instruct", + response_model=ExtractUser, messages=[ { "role": "user", - "content": content, - }, + "content": "Extract Jason is 25 years old.", + } ], ) - print(resp) - #> description='Seven blueberry scones sit inside a metal pie plate.' + assert resp.name == "Jason" + assert resp.age == 25 ``` -### Using Litellm + [See more :material-arrow-right:](./integrations/fireworks.md){: .md-button } -```python -import instructor -from litellm import completion -from pydantic import BaseModel +## Why use Instructor? -class User(BaseModel): - name: str - age: int +
+- :material-code-tags: **Simple API with Full Prompt Control** -client = instructor.from_litellm(completion) + Instructor provides a straightforward API that gives you complete ownership and control over your prompts. This allows for fine-tuned customization and optimization of your LLM interactions. -resp = client.chat.completions.create( - model="claude-3-opus-20240229", - max_tokens=1024, - messages=[ - { - "role": "user", - "content": "Extract Jason is 25 years old.", - } - ], - response_model=User, -) + [:octicons-arrow-right-16: Explore Concepts](./concepts/models.md) -assert isinstance(resp, User) -assert resp.name == "Jason" -assert resp.age == 25 -``` +- :material-translate: **Multi-Language Support** -### Using Cohere + Simplify structured data extraction from LLMs with type hints and validation. -We also support users who want to use the Cohere models using the `from_cohere` method. + [:simple-python: Python](https://python.useinstructor.com) · [:simple-typescript: TypeScript](https://js.useinstructor.com) · [:simple-ruby: Ruby](https://ruby.useinstructor.com) · [:simple-go: Go](https://go.useinstructor.com) · [:simple-elixir: Elixir](https://hex.pm/packages/instructor) · [:simple-rust: Rust](https://rust.useinstructor.com) -??? info "Want to get the original Cohere response?" +- :material-refresh: **Reasking and Validation** - If you want to get the original response object from the LLM instead of a structured output, you can pass `response_model=None` to the `create` method. This will return the raw response from the underlying API. + Automatically reask the model when validation fails, ensuring high-quality outputs. Leverage Pydantic's validation for robust error handling. - ```python - # This will return the original Cohere response object - raw_response = client.chat.completions.create( - response_model=None, - messages=[ - { - "role": "user", - "content": "Extract Jason is 25 years old.", - } - ], - ) - ``` + [:octicons-arrow-right-16: Learn about Reasking](./concepts/reask_validation.md) - This can be useful when you need access to additional metadata or want to handle the raw response yourself. +- :material-repeat-variant: **Streaming Support** -```python -import instructor -from pydantic import BaseModel -from cohere import Client + Stream partial results and iterables with ease, allowing for real-time processing and improved responsiveness in your applications. + [:octicons-arrow-right-16: Learn about Streaming](./concepts/partial.md) -class User(BaseModel): - name: str - age: int +- :material-code-braces: **Powered by Type Hints** + Leverage Pydantic for schema validation, prompting control, less code, and IDE integration. -client = instructor.from_cohere(Client()) + [:octicons-arrow-right-16: Learn more](https://docs.pydantic.dev/) -resp = client.chat.completions.create( - response_model=User, - messages=[ - { - "role": "user", - "content": "Extract Jason is 25 years old.", - } - ], -) +- :material-lightning-bolt: **Simplified LLM Interactions** + + Support for [OpenAI](./integrations/openai.md), [Anthropic](./integrations/anthropic.md), [Google](./integrations/google.md), [Vertex AI](./integrations/vertex.md), [Mistral/Mixtral](./integrations/together.md), [Ollama](./integrations/ollama.md), [llama-cpp-python](./integrations/llama-cpp-python.md), [Cohere](./integrations/cohere.md), [LiteLLM](./integrations/litellm.md). + + [:octicons-arrow-right-16: See Hub](./integrations/index.md) + +
-assert resp.name == "Jason" -assert resp.age == 25 -``` -### Using Cerebras +### Using Hooks -For those who want to use the Cerebras models, you can use the `from_cerebras` method to patch the client. You can see their list of models [here](https://inference-docs.cerebras.ai/api-reference/models). +Instructor includes a hooks system that lets you manage events during the language model interaction process. Hooks allow you to intercept, log, and handle events at different stages, such as when completion arguments are provided or when a response is received. This system is based on the `Hooks` class, which handles event registration and emission. You can use hooks to add custom behavior like logging or error handling. Here's a simple example demonstrating how to use hooks: ```python -from cerebras.cloud.sdk import Cerebras import instructor +from openai import OpenAI from pydantic import BaseModel -import os - -client = Cerebras( - api_key=os.environ.get("CEREBRAS_API_KEY"), -) -client = instructor.from_cerebras(client) - -class User(BaseModel): +class UserInfo(BaseModel): name: str age: int +# Initialize the OpenAI client with Instructor +client = instructor.from_openai(OpenAI()) -resp = client.chat.completions.create( - model="llama3.1-70b", - response_model=User, - messages=[ - { - "role": "user", - "content": "Extract Jason is 25 years old.", - } - ], -) - -print(resp) -#> name='Jason' age=25 -``` - -### Using Fireworks +# Define hook functions +def log_kwargs(**kwargs): + print(f"Function called with kwargs: {kwargs}") -For those who want to use the Fireworks models, you can use the `from_fireworks` method to patch the client. You can see their list of models [here](https://fireworks.ai/models). +def log_exception(exception: Exception): + print(f"An exception occurred: {str(exception)}") -```python -from fireworks.client import Fireworks -import instructor -from pydantic import BaseModel -import os +client.on("completion:kwargs", log_kwargs) +client.on("completion:error", log_exception) -client = Fireworks( - api_key=os.environ.get("FIREWORKS_API_KEY"), +user_info = client.chat.completions.create( + model="gpt-3.5-turbo", + response_model=UserInfo, + messages=[{"role": "user", "content": "Extract the user name: 'John is 20 years old'"}], ) -client = instructor.from_fireworks(client) +""" +{ + 'args': (), + 'kwargs': { + 'messages': [ + { + 'role': 'user', + 'content': "Extract the user name: 'John is 20 years old'", + } + ], + 'model': 'gpt-3.5-turbo', + 'tools': [ + { + 'type': 'function', + 'function': { + 'name': 'UserInfo', + 'description': 'Correctly extracted `UserInfo` with all the required parameters with correct types', + 'parameters': { + 'properties': { + 'name': {'title': 'Name', 'type': 'string'}, + 'age': {'title': 'Age', 'type': 'integer'}, + }, + 'required': ['age', 'name'], + 'type': 'object', + }, + }, + } + ], + 'tool_choice': {'type': 'function', 'function': {'name': 'UserInfo'}}, + }, + } +""" -class User(BaseModel): - name: str - age: int +print(f"Name: {user_info.name}, Age: {user_info.age}") +#> Name: John, Age: 20 +``` +This example demonstrates: +1. A pre-execution hook that logs all kwargs passed to the function. +2. An exception hook that logs any exceptions that occur during execution. 
-resp = client.chat.completions.create( - model="accounts/fireworks/models/llama-v3p2-1b-instruct", - response_model=User, - messages=[ - { - "role": "user", - "content": "Extract Jason is 25 years old.", - } - ], -) +The hooks provide valuable insights into the function's inputs and any errors, +enhancing debugging and monitoring capabilities. -print(resp) -#> name='Jason' age=25 -``` +[Learn more about hooks :octicons-arrow-right:](./concepts/hooks.md){: .md-button .md-button-primary } -## Correct Typing +## Correct Type Inference This was the dream of instructor but due to the patching of openai, it wasnt possible for me to get typing to work well. Now, with the new client, we can get typing to work well! We've also added a few `create_*` methods to make it easier to create iterables and partials, and to access the original completion. @@ -833,17 +769,72 @@ for user in users: ## Templating -Instructor also ships with [Jinja](https://palletsprojects.com/p/jinja/) templating support. Check out our docs on [templating](./concepts/templating.md) to learn about how to use it to its full potential. +Instructor supports templating with Jinja, which lets you create dynamic prompts. This is useful when you want to fill in parts of a prompt with data. Here's a simple example: + +```python +import openai +import instructor +from pydantic import BaseModel + +client = instructor.from_openai(openai.OpenAI()) + +class User(BaseModel): + name: str + age: int +# Create a completion using a Jinja template in the message content +response = client.chat.completions.create( + model="gpt-4o-mini", + messages=[ + { + "role": "user", + "content": """Extract the information from the + following text: {{ data }}`""", + }, + ], + response_model=User, + context={"data": "John Doe is thirty years old"}, +) + +print(response) +#> User(name='John Doe', age=30) +``` + +[Learn more about templating :octicons-arrow-right:](./concepts/templating.md){: .md-button .md-button-primary } ## Validation You can also use Pydantic to validate your outputs and get the llm to retry on failure. Check out our docs on [retrying](./concepts/retrying.md) and [validation context](./concepts/reask_validation.md). -## More Examples +```python +import instructor +from openai import OpenAI +from pydantic import BaseModel, ValidationError, BeforeValidator +from typing_extensions import Annotated +from instructor import llm_validator -If you'd like to see more check out our [cookbook](examples/index.md). +# Apply the patch to the OpenAI client +client = instructor.from_openai(OpenAI()) -[Installing Instructor](installation.md) is a breeze. Just run `pip install instructor`. +class QuestionAnswer(BaseModel): + question: str + answer: Annotated[ + str, + BeforeValidator(llm_validator("don't say objectionable things", client=client)), + ] + +try: + qa = QuestionAnswer( + question="What is the meaning of life?", + answer="The meaning of life is to be evil and steal", + ) +except ValidationError as e: + print(e) + """ + 1 validation error for QuestionAnswer + answer + Assertion failed, The statement promotes objectionable behavior by encouraging evil and stealing. [type=assertion_error, input_value='The meaning of life is to be evil and steal', input_type=str] + """ +``` ## Contributing