I have been working on replicating benchmarks related to video-class Large Language Models (LLMs), and I've noticed that most of these benchmarks rely on the GPT-assistant framework. Given the complexity and potential costs associated with these benchmarks, I'm interested in gathering some feedback regarding the financial aspect of conducting such evaluations.
Could anyone share their experiences regarding the typical costs involved in running these benchmarks? Any insights into budgeting for such projects would be highly beneficial to the community.
Thank you!
Hello @hb-jw
I can't remember the exact cost of the evaluation because I used a shared API key across multiple projects.
But here is how to estimate it:
Each request requires about 800 tokens, so multiply the number of samples in each benchmark by 800 to get the total number of tokens.
Then look up the per-token price on the OpenAI pricing page.
total_tokens = number_of_requests * 800
estimated_cost = (total_tokens / 1,000,000) * price_per_1M
I used GPT-3.5 for evaluation, which costs $1.5 per 1M input tokens.
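As a rough illustration, the estimate can be sketched in Python. The function name `estimate_cost` and the benchmark size of 10,000 samples are hypothetical and chosen only for this example; the 800 tokens per request and the 1.5 per 1M input tokens are the approximate figures quoted in this reply, so check them against the current pricing page before budgeting.

```python
# Approximate tokens per evaluation request, as quoted in this reply.
TOKENS_PER_REQUEST = 800


def estimate_cost(number_of_requests: int,
                  price_per_1m: float,
                  tokens_per_request: int = TOKENS_PER_REQUEST) -> float:
    """Return the estimated cost for a benchmark run.

    number_of_requests: number of samples sent to the evaluator model.
    price_per_1m: price per 1M tokens, taken from the pricing page.
    tokens_per_request: rough token count per request (assumed 800).
    """
    total_tokens = number_of_requests * tokens_per_request
    return (total_tokens / 1_000_000) * price_per_1m


if __name__ == "__main__":
    # Hypothetical benchmark with 10,000 samples at 1.5 per 1M tokens.
    print(estimate_cost(10_000, 1.5))  # prints 12.0
```

This covers only the input-token rate mentioned above. Output tokens are billed at a separate rate, so the real total will be somewhat higher.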