Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compare with proprietary models #2

Open
IonMich opened this issue Nov 27, 2024 · 0 comments
Open

Compare with proprietary models #2

IonMich opened this issue Nov 27, 2024 · 0 comments

Comments

@IonMich
Copy link
Owner

IonMich commented Nov 27, 2024

We should add an inference endpoint that calls OpenAI's 4o and 4o1 variants, to get an idea of what is the state of the art in DocVQA. Similarly for Anthropic's Claude API and Google's Gemini.

Notably, these models should be able to extract all relevant information from the page in one-shot, including problem statement and handwritten attempted solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant