
Tokens details are null when using create_with_completion #1104

Open
2 of 8 tasks
arcaputo3 opened this issue Oct 21, 2024 · 1 comment
Comments

@arcaputo3
Contributor

  • This is actually a bug report.
  • I am not getting good LLM Results
  • I have tried asking for help in the community on discord or discussions and have not received a response.
  • I have tried searching the documentation and have not found an answer.

What Model are you using?

  • gpt-3.5-turbo
  • gpt-4-turbo
  • gpt-4
  • Other (gpt-4o / gpt-4o-mini)

Describe the bug
When using create_with_completion, CompletionUsage.completion_tokens_details and CompletionUsage.prompt_tokens_details are both None.

To Reproduce

import instructor
from openai import OpenAI
from pydantic import BaseModel


BOOK = "A long book..."  # Example: https://python.useinstructor.com/concepts/prompt_caching/#prompt-caching-in-anthropic


class Story(BaseModel):
    summary: str
    characters: list[str]


client = instructor.from_openai(OpenAI())

story, chat_completion = client.chat.completions.create_with_completion(
    response_model=Story,
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user", 
            "content": f"Summarize this book: {BOOK}"
        }
    ]
)

print(chat_completion.usage.model_dump_json(indent=2))
{
  "completion_tokens": 204,
  "prompt_tokens": 2327,
  "total_tokens": 2531,
  "completion_tokens_details": null,
  "prompt_tokens_details": null
}

Expected behavior
Using the OpenAI client directly, both fields are populated:

from openai import OpenAI

BOOK = "A long book..."  # Example: https://python.useinstructor.com/concepts/prompt_caching/#prompt-caching-in-anthropic

openai_client = OpenAI()

raw_response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user", 
            "content": f"Summarize this book: {BOOK}"
        }
    ]
)

print(raw_response.usage.model_dump_json(indent=2))  # cache hit expected
{
  "completion_tokens": 300,
  "prompt_tokens": 2269,
  "total_tokens": 2569,
  "completion_tokens_details": {
    "audio_tokens": null,
    "reasoning_tokens": 0
  },
  "prompt_tokens_details": {
    "audio_tokens": null,
    "cached_tokens": 2048
  }
}
@ivanbelenky
Contributor

ivanbelenky commented Oct 21, 2024

Hey @arcaputo3, thanks for pointing this out; this is something that may have gone unnoticed. Retry behavior implies that there can be many calls to the client API. These calls add up to a total_usage, and that total usage is not being updated correctly on each call. I just finished a fix for it and am waiting on approval.

Always remember that the usage information corresponds to the compounded usage from the request and any subsequent retries.
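To illustrate the accumulation described above, here is a minimal, hypothetical sketch (not instructor's actual implementation) of merging per-call usage dicts across retries so that the nested `*_tokens_details` fields are summed rather than dropped. The `merge_usage` helper and the sample numbers are assumptions for illustration only:

```python
def merge_usage(total: dict, call: dict) -> dict:
    """Recursively sum numeric usage fields, preserving nested detail dicts.

    Non-numeric values (e.g. None for audio_tokens) are kept from the
    first call that reported them.
    """
    for key, value in call.items():
        if isinstance(value, dict):
            total[key] = merge_usage(total.get(key) or {}, value)
        elif isinstance(value, (int, float)):
            total[key] = total.get(key, 0) + value
        else:
            total.setdefault(key, value)
    return total


# Hypothetical usage payloads from an initial call and one retry.
first_call = {
    "prompt_tokens": 2327,
    "completion_tokens": 204,
    "total_tokens": 2531,
    "prompt_tokens_details": {"audio_tokens": None, "cached_tokens": 2048},
}
retry_call = {
    "prompt_tokens": 2327,
    "completion_tokens": 198,
    "total_tokens": 2525,
    "prompt_tokens_details": {"audio_tokens": None, "cached_tokens": 2304},
}

total = merge_usage({}, first_call)
total = merge_usage(total, retry_call)
print(total["prompt_tokens"])                           # 4654
print(total["prompt_tokens_details"]["cached_tokens"])  # 4352
```

The point is that a naive assignment like `total["prompt_tokens_details"] = None` on any retry would wipe the details, which is consistent with the null fields reported in this issue.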
