Add Code Action agent/state [feature] #24

Open
JensRoland opened this issue Jul 9, 2024 · 1 comment

Comments

@JensRoland
Contributor

JensRoland commented Jul 9, 2024

Interesting read from the Hugging Face team -- they just set a new state of the art on the GAIA benchmark, largely by using Code Actions. They beat Microsoft's AutoGen, which is no small feat.

Sure, GAIA is focused on reasoning, and the idea of introducing an agent that writes code as an intermediate action rather than as the final output seems a little turtles-all-the-way-down. Still, there could be something to gain in terms of planning.

Not sure, only a thought.

Blog: https://huggingface.co/blog/beating-gaia

Code: https://github.com/aymeric-roucher/GAIA

@aorwall
Owner

aorwall commented Jul 10, 2024

If I understood the Code Action concept correctly, the essential thing is to let the LLM respond in code instead of a structured format like JSON? It could be interesting to try this out in the search-identify steps: instruct the LLM to write code that helps it find the right parts of the code base. We would provide a set of functions it can use and a clear way for it to ask for more information, for example by writing a print() statement to see the output, or we could just instruct it as if it's using a Jupyter notebook, like Code Interpreter or Open Interpreter. A first step would be to just provide the current search functions, the identify function, and a way for it to show that it found the relevant code. Then we can see if it gets creative and uses other ways to find or filter out the right code.
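A rough sketch of what that first step could look like, just to make the idea concrete. Everything here is a hypothetical stand-in and not the project's actual API: `search`, `identify` and `finish` are toy versions of the functions we would expose, `CODE_BASE` is a fake in-memory repo, and `EXAMPLE_ACTION` is a hard-coded string standing in for what the LLM would write. Whatever the code prints is captured so it can be fed back to the model in the next turn.

```python
import contextlib
import io

# Hypothetical in-memory "code base": file path -> source text.
CODE_BASE = {
    "app/models.py": "class User:\n    def save(self):\n        pass\n",
    "app/views.py": "def login(request):\n    return User().save()\n",
}


def run_code_action(code: str, code_base: dict[str, str]) -> dict:
    """Execute one LLM-written snippet; return its printed output and what it identified."""
    identified: list[str] = []
    state = {"finished": False}

    def search(keyword: str) -> list[str]:
        # Toy search: paths of files whose text contains the keyword.
        return sorted(path for path, src in code_base.items() if keyword in src)

    def identify(path: str) -> None:
        # Mark a file as relevant (unknown paths and duplicates are ignored).
        if path in code_base and path not in identified:
            identified.append(path)

    def finish() -> None:
        # Lets the model signal that it found the relevant code.
        state["finished"] = True

    # Only these names are handed to the snippet explicitly.
    namespace = {"search": search, "identify": identify, "finish": finish}

    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        # NOTE: exec() is not a sandbox; real use would need proper isolation.
        exec(code, namespace)

    return {
        "output": buffer.getvalue(),  # print() output to show the model next turn
        "identified": identified,
        "finished": state["finished"],
    }


# Hard-coded stand-in for a Code Action written by the LLM.
EXAMPLE_ACTION = """
hits = search("class User")
print(hits)
for path in hits:
    identify(path)
finish()
"""

result = run_code_action(EXAMPLE_ACTION, CODE_BASE)
```

In a real loop, `result["output"]` would go back into the prompt until the model calls `finish()`. Since the model writes plain Python, it could also filter or combine search results in ways we did not define up front, which is the part worth testing.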
