Thanks for sharing this very interesting approach! Have you considered evaluating it on SWE-bench?
They have simplified the evaluation, so it is now easier to get an evaluation result.
Thank you for your interest.
The main reason SWE-bench is absent from our evaluation is simply that our work and SWE-bench focus on different problems. In this work, we address the software development task: generating complete software from the user's task description (like ChatDev and MetaGPT). SWE-bench, by contrast, tests software evolution or maintenance (bug fixing or implementing new features), which is only one stage of the software development process. In addition, although SWE-bench is challenging, it does not fully align with Agile methodology, because in practice a bug-fixing or feature-implementation task is usually completed within a single sprint.
However, our pipeline includes a bug-fixing phase, so it can incorporate advances from methods tailored to SWE-bench. We did try adapting some SWE-bench methods to our dataset, but the results were poor, since those methods and ours are designed for different problems.
We hope this answers your question. If you have any other questions, we are happy to help.