SYSTEM_PROMPT = """\
You are an expert in Python software engineering and code review. Your responsibility is to review patches generated by language models to fix issues, and to provide feedback on the quality of their code.
"""
USER_PROMPT_TEMPLATE = """\
You are given the text of an issue from a GitHub repository (wrapped with <issue_description></issue_description>), along with an overview of the whole repository (wrapped with <repomap></repomap>).
You are also given a candidate patch (wrapped with <patch></patch>) that tries to resolve the target issue.
The repomap overview lists the paths of some important files in the repository; after each path come the signatures of some important classes and functions.
Please help me evaluate this candidate patch. Give me an integer score (ranging from 0 to 10) indicating the correctness of the given patch, where a higher score means better quality.
<issue_description>
{issue_text}
</issue_description>
<repomap>
{repo_map}
</repomap>
<patch>
{model_patch}
</patch>
Please first explain what the patch is doing and why it may or may not solve the issue, inside <explanation></explanation> tags.
If the patch seems invalid to you, or there is something wrong with it, give a score of -1.
Then give me your score based on your explanation, wrapped in <score></score> tags.
"""
USER_PROMPT_TEMPLATE_BEFORE_AFTER = """\
You are given the text of an issue from a GitHub repository (wrapped with <issue_description></issue_description>), along with an overview of the whole repository (wrapped with <repomap></repomap>).
You are also given the changes made by a candidate patch that tries to resolve the target issue.
For your convenience, you are given the hunks of the original code and the code after applying the patch.
The code before the patch is wrapped with <before_patch></before_patch> and the code after the patch is wrapped with <after_patch></after_patch>.
Note that the file names in before_patch start with 'a/' and the file names in after_patch start with 'b/'.
Also, to give you a fuller view of the patch, the hunks are shown with a context of 20 lines before and after the patched lines.
The repomap overview lists the paths of some important files in the repository; after each path come the signatures of some important classes and functions.
Please help me evaluate this candidate patch. Give me an integer score (ranging from 0 to 10) indicating the correctness of the given patch.
<issue_description>
{issue_text}
</issue_description>
<repomap>
{repo_map}
</repomap>
<before_patch>
{before_patch}
</before_patch>
<after_patch>
{after_patch}
</after_patch>
Please first explain what the patch is doing and why it may or may not solve the issue, inside <explanation></explanation> tags; make sure your explanation addresses whether the patch is fixing the correct function.
If the patch seems invalid to you, or there is something wrong with it, give a score of -1.
Then give me your score based on your explanation, wrapped in <score></score> tags.
"""
PAIRWISE_COMPARISON_WITH_IDENTIFIED_SPANS_TEMPLATE = """\
I want you to compare two LLM-generated candidate patches that try to resolve an issue in a codebase.
To assist you in this task, you are provided with the following information:
- You are given the text of an issue from a GitHub repository (wrapped with <issue_description></issue_description>).
- You are also given some identified code spans that are relevant to the issue.
Each code span is wrapped with <code_span file_path=FILE_PATH span_id=SPAN_ID></code_span> tags, where FILE_PATH is the path to the file containing the code span, and SPAN_ID is the unique identifier for the code span.
Each code span also comes with the line numbers for you to better understand the context.
- You are given two candidate patches that try to resolve the target issue.
The first candidate patch is wrapped with <patch1></patch1> tags and the second candidate patch is wrapped with <patch2></patch2> tags.
Within each patch, you are given the hunks of original code and the code after applying the patch.
The code before the patch is wrapped with <before_patch1></before_patch1> and <before_patch2></before_patch2> tags, and the code after the patch is wrapped with <after_patch1></after_patch1> and <after_patch2></after_patch2> tags.
Note that the file names in before_patch start with 'a/' and the file names in after_patch start with 'b/'.
- At least one of the two patches is correct.
Here's what you want to do:
1. Understand the issue. Explain in your own words what the issue is about. Output your explanation in <issue_exp></issue_exp> tags.
2. Understand the identified code spans. First provide a list of the span ids. Then explain how each identified code span is relevant to the issue. Output your explanation in <code_span_exp></code_span_exp> tags.
3. Understand the candidate patches. First curate a list of modified hunks. For each modified hunk, explain what it's doing. Output your explanations in the <patch_exp_1></patch_exp_1> and <patch_exp_2></patch_exp_2> fields.
4. Check if the patches introduce any new issues, especially whether they contradict any of the identified code spans. Output your explanations in the <new_issues_exp1></new_issues_exp1> and <new_issues_exp2></new_issues_exp2> fields.
5. Check if the patches can fix the issue. Compare each generated patch against the common mistakes made by LLMs and see if it falls into any of those categories. Output your explanation in the <fix_issue_exp></fix_issue_exp> field.
6. Point out the differences between the two patches and how these may affect their correctness. Explicitly state how big the difference is. Refer back to your <patch_exp> explanations when pointing out the differences. Output your explanation in the <diff></diff> field.
7. Explain your choice of the better patch based on your analysis in the previous steps. Make sure to first repeat the issue description in your own words when explaining. Output your explanation in the <better_patch_exp></better_patch_exp> field.
8. Finally, give me your choice of the better patch. Wrap your choice in <better_patch></better_patch> tags. Your choice should be 0, 1, or 2, where 0 means you cannot pick a better one, 1 means the first patch is better, and 2 means the second patch is better.
Here are your inputs:
<issue_description>
{issue_text}
</issue_description>
{code_spans}
<patch1>
<before_patch1>
{before_patch1}
</before_patch1>
<after_patch1>
{after_patch1}
</after_patch1>
</patch1>
<patch2>
<before_patch2>
{before_patch2}
</before_patch2>
<after_patch2>
{after_patch2}
</after_patch2>
</patch2>
Again, make sure your output ends with <better_patch></better_patch> tags containing only 0, 1, or 2, indicating your choice of the better patch.
For example, if you think the first patch is better, the final part of output should look like this:
<better_patch>1</better_patch>
It should not contain any other information or characters.
Do not use ``` or ### or anything else to wrap your verdict.
"""
SINGLE_SCORING_WITH_IDENTIFIED_SPANS_TEMPLATE = """\
I want you to evaluate an LLM-generated candidate patch that tries to resolve an issue in a codebase.
To assist you in this task, you are provided with the following information:
- You are given the text of an issue from a GitHub repository (wrapped with <issue_description></issue_description>).
- You are also given some identified code spans that are relevant to the issue.
Each code span is wrapped with <code_span file_path=FILE_PATH span_id=SPAN_ID></code_span> tags, where FILE_PATH is the path to the file containing the code span, and SPAN_ID is the unique identifier for the code span.
Each code span also comes with the line numbers for you to better understand the context.
- You are given the candidate patch that tries to resolve the target issue.
For your convenience, you are given the hunks of original code and the code after applying the patch.
The code before the patch is wrapped with <before_patch></before_patch> and the code after the patch is wrapped with <after_patch></after_patch>.
Note that the file names in before_patch start with 'a/' and the file names in after_patch start with 'b/'.
Here's what you want to do:
1. Understand the issue. Explain in your own words what the issue is about. Output your explanation in <issue_exp></issue_exp> tags.
2. Understand the identified code spans. First provide a list of the span ids. Then explain how each identified code span is relevant to the issue. Output your explanation in <code_span_exp></code_span_exp> tags.
3. Understand the candidate patch. First curate a list of modified hunks. For each modified hunk, explain what it's doing. Output your explanation in the <patch_exp></patch_exp> field.
4. Check whether the patch is fixing the correct function. Output your explanation in the <correct_location_exp></correct_location_exp> field.
5. Check if the patch introduces any new issues, especially whether it contradicts any of the identified code spans. Output your explanation in the <new_issues_exp></new_issues_exp> field.
6. Check if the patch can fix the issue. Compare the generated patch against the common mistakes made by LLMs and see if it falls into any of those categories. Be ruthless in pointing out any potential mistakes. Output your explanation in the <fix_issue_exp></fix_issue_exp> field.
7. Finally, give me your score. Wrap your score in <score></score> tags. Make sure to include in these tags only an integer, nothing else.
Here's the scoring rubric:
Your score should be an integer between 0 and 10, where higher scores indicate better quality.
You should give a score of -1 if you think the patch is invalid or there is something wrong with it.
For every contradiction between the identified code spans and the patch, you should deduct 1 point from the score.
If you think the patch is not fixing the correct function, you should give a 0.
If you think the patch is introducing new issues, you should deduct 2 points from the score.
Your scoring should only be about the correctness of the patch, not about its quality or style.
<issue_description>
{issue_text}
</issue_description>
<before_patch>
{before_patch}
</before_patch>
<after_patch>
{after_patch}
</after_patch>
{code_spans}
Again, make sure your output ends with <score></score> tags containing only an integer.
For example, if your score is 8, the final part of output should look like this:
<score>8</score>
It should not contain any other information or characters.
Do not use ``` or ### or anything else to wrap your score.
"""
EXPLANATION_PROMPT = """\
You are given the text of an issue from a GitHub repository (wrapped with <issue_description></issue_description>), along with an overview of the whole repository (wrapped with <repomap></repomap>).
The repomap overview lists the paths of some important files in the repository; after each path come the signatures of some important classes and functions.
You are also given a candidate patch (wrapped with <patch></patch>) that tries to resolve the target issue.
For your convenience, you are also given the hunks of code before and after the patch. The code before the patch is wrapped with <before_patch></before_patch> and the code after the patch is wrapped with <after_patch></after_patch>.
Please help me explain what this candidate patch does, and point out some of the key differences between the code before and after the patch.
Please also point out any potential problems, if any. Note that the patch could be written by a rookie coder and contain many mistakes.
Your explanation, differences, and potential problems should be based solely on the correctness of the patch, not on its quality or style.
Make sure the patch solves and only solves the issue, and does not introduce any new issues.
Make sure the patch is not redundant and does not contain any unnecessary changes.
In conclusion, your response should be a function call to the `explain` function with the following arguments:
{{
"explanation": "Your explanation for the patch",
"differences": "Your explanation for the difference between the code before and after the patch",
"problems": "Any potential problems with the patch, empty if none",
}}
<issue_description>
{issue_text}
</issue_description>
<repomap>
{repo_map}
</repomap>
<patch>
{model_patch}
</patch>
<before_patch>
{before_patch}
</before_patch>
<after_patch>
{after_patch}
</after_patch>
"""
PAIRWISE_USER_PROMPT = """\
You are given the text of an issue from a GitHub repository (wrapped with <issue_description></issue_description>), along with an overview of the whole repository (wrapped with <repomap></repomap>).
The repomap overview lists the paths of some important files in the repository; after each path come the signatures of some important classes and functions.
Please help me determine which one of the two following patches (wrapped in <patch1></patch1> and <patch2></patch2> tags) is able to resolve the target issue.
Your focus should be solely on the correctness of the patches, not on their quality or style.
Make sure the patch solves and only solves the issue, and does not introduce any new issues.
Make sure the patch is not redundant and does not contain any unnecessary changes.
For your convenience, you are also given the hunks of code before and after the patches.
The code before the first patch is wrapped with <before_patch1></before_patch1> and the code after it is wrapped with <after_patch1></after_patch1>.
The code before the second patch is wrapped with <before_patch2></before_patch2> and the code after it is wrapped with <after_patch2></after_patch2>.
Here's what you want to do:
1. Read the issue description and the repomap overview.
2. Explain what you think the first patch is doing. Output your explanation in the `exp1` field.
3. Explain what you think the second patch is doing. Output your explanation in the `exp2` field.
4. Explain what you think their difference is. Output your explanation in the `diff` field.
5. Finally, give me your verdict on which patch is better. Wrap your scores for the two patches in `score1` and `score2`. Each score must be either 1 or -1, where 1 means good and -1 means bad.
In conclusion, you should call the `compare` function with the following arguments:
{{
"exp1": "Your explanation for the first patch",
"exp2": "Your explanation for the second patch",
"diff": "Your explanation for the difference between the two patches",
"score1": 1 or -1,
"score2": 1 or -1,
}}
ALWAYS respond with values for all parameters in this tool. If you do not have an opinion on a particular parameter, please provide an empty string.
Some tips to help you evaluate the patches:
If they are very different, their scores should be different.
If they are similar, their scores should be the same.
If both are good, give them both a score of 1. If both are bad (or invalid or empty), give them both a score of -1.
<issue_description>
{issue_text}
</issue_description>
<repomap>
{repo_map}
</repomap>
<patch1>
{patch1}
</patch1>
<before_patch1>
{before_patch1}
</before_patch1>
<after_patch1>
{after_patch1}
</after_patch1>
<patch2>
{patch2}
</patch2>
<before_patch2>
{before_patch2}
</before_patch2>
<after_patch2>
{after_patch2}
</after_patch2>
"""
PAIRWISE_USER_PROMPT_GIVEN_EXP = """\
You are given the text of an issue from a GitHub repository (wrapped with <issue_description></issue_description>), along with an overview of the whole repository (wrapped with <repomap></repomap>).
The repomap overview lists the paths of some important files in the repository; after each path come the signatures of some important classes and functions.
Please help me determine which one of the two following patches (wrapped in <patch1></patch1> and <patch2></patch2> tags) is able to resolve the target issue.
Along with the patches, you are given some auxiliary information about the patches:
- The explanations for the two patches are given in <exp1></exp1> and <exp2></exp2> tags.
- The potential problems with the two patches are given in <problems1></problems1> and <problems2></problems2> tags.
Your focus should be solely on the correctness of the patches, not on their quality or style.
Make sure the patch solves and only solves the issue, and does not introduce any new issues.
Make sure the patch is not redundant and does not contain any unnecessary changes.
For your convenience, you are also given the hunks of code before and after the patches.
The code before the first patch is wrapped with <before_patch1></before_patch1> and the code after it is wrapped with <after_patch1></after_patch1>.
The code before the second patch is wrapped with <before_patch2></before_patch2> and the code after it is wrapped with <after_patch2></after_patch2>.
Here's what you want to do:
1. Read the issue description and the repomap overview.
2. Read the explanations and potential problems for the patches. Reason about the correctness of the patches and how they compare to each other; output your comparison in <compare></compare> tags.
3. Finally, give me your verdict on which patch is better. Wrap your scores for the two patches in `score1` and `score2`. Each score must be either 1 or -1, where 1 means good and -1 means bad.
In conclusion, you should call the `compare` function with the following arguments:
{{
"compare": "Your comparison of the two patches",
"score1": 1 or -1,
"score2": 1 or -1,
}}
ALWAYS respond with values for all parameters in this tool. If you do not have an opinion on a particular parameter, please provide an empty string.
Some tips to help you evaluate the patches:
If they are very different, their scores should be different.
If they are similar, their scores should be the same.
If both are good, give them both a score of 1. If both are bad (or invalid or empty), give them both a score of -1.
<issue_description>
{issue_text}
</issue_description>
<repomap>
{repo_map}
</repomap>
<patch1>
{patch1}
</patch1>
<before_patch1>
{before_patch1}
</before_patch1>
<after_patch1>
{after_patch1}
</after_patch1>
<exp1>
{exp1}
</exp1>
<problems1>
{problems1}
</problems1>
<patch2>
{patch2}
</patch2>
<before_patch2>
{before_patch2}
</before_patch2>
<after_patch2>
{after_patch2}
</after_patch2>
<exp2>
{exp2}
</exp2>
<problems2>
{problems2}
</problems2>
"""