On Improving Repository-Level Code QA for Large Language Models

Authors: Strich, Jan and Schneider, Florian and Nikishina, Irina and Biemann, Chris

Abstract:

Large Language Models (LLMs) such as ChatGPT, GitHub Copilot, Llama, or Mistral assist programmers as copilots and knowledge sources, making the coding process faster and more efficient. This paper aims to improve copilot performance by implementing different self-alignment processes and retrieval-augmented generation (RAG) pipelines, as well as their combination. To test the effectiveness of all approaches, we create a dataset and apply a model-based evaluation, using an LLM as a judge. The dataset is designed to check the model’s ability to understand source code semantics, dependencies between files, and the overall meta-information about the repository. We also compare our approach with existing solutions, such as ChatGPT-3.5, and evaluate it on existing benchmarks. Code and dataset are available online (https://anonymous.4open.science/r/ma_llm-382D).

Link: Read Paper

Labels: general coding task, benchmark
