Pull Precheck #45

0x4007 · 2024-09-14T21:15:06Z

We have waves of contributors that open pulls sometimes keeping our review team busy.

We can save valuable man hours by having the bot preemptively check if the pull achieves the specification.

Pull Flow

Assuming that the contributor follows directions, the pull must be initially opened up as a draft.
When it is ready for review, they should turn it into a finalized pull request.
The bot should consume the issue specification and then the pull diff into its context.
The bot should return actionable feedback for what is missing from the specification. It should leave the review state as requested changes. It should also convert the pull back into a draft. However if it passes, it should leave as just "commented" (not approval).
If a collaborator (reviewer on the team) converts the pull back from a draft into a finalized pull, then the bot should back off and stop intervening. Basically the inspection should only occur on pull.created and when it's converted from draft to finalized by the pull author.

Bonus but maybe can be handled in other tasks:

Ensure CI is passing. I didn't include this here because sometimes there are problems out of the control of the pull author.
Limits: some novice contributors might abuse the code review feature and burn out credits by continuously requesting reviews every time they make a tiny change that doesn't achieve the specification. We have seen folks do hundreds of commits for tasks that required a small amount of lines of code. We can limit the reviews to one per day for the LLM. However it isn't clear to me how we can automatically have it check again the next day if they request a review. Maybe that can be handled by some later task.

We should use Claude 3.5 Sonnet.

ubadineke · 2024-09-18T20:54:18Z

/start

ubiquity-os · 2024-09-18T20:54:27Z

! Please set your wallet address with the /wallet command first and try again.

ubiquity-os · 2024-09-18T20:54:27Z

! No wallet address found

ubadineke · 2024-09-18T21:33:32Z

/wallet 0x3Ea855E4D6440D937117c776501e7653a770b759

ubiquity-os · 2024-09-18T21:33:40Z

+ Successfully registered wallet address

ubadineke · 2024-09-18T21:41:09Z

/start

ubiquity-os · 2024-09-18T21:41:16Z

Deadline Wed, Oct 2, 9:41 PM UTC Beneficiary 0x3Ea855E4D6440D937117c776501e7653a770b759

Tips:

Use /wallet 0x0000...0000 if you want to update your registered payment wallet address.
Be sure to open a draft pull request as soon as possible to communicate updates on your progress.
Be sure to provide timely updates to us when requested, or you will be automatically unassigned from the task.

ubadineke · 2024-09-19T09:03:39Z

@Keyrxng should I implement this feature in a new file? and also what should happen if the pull is not initially opened as a draft?

Keyrxng · 2024-09-19T12:20:08Z

@0x4007 I had gpt re-rewrite for brevity but it removed what I'd call context but this still makes sense.

Discussion:

Spec consideration > Embedding effectiveness > Bot intent & current model

https://chatgpt.com/share/66ec086f-7d70-8000-a914-991b77a819b0

Conclusion:
The current model (embedding-based) struggles to provide detailed, code-specific feedback. While embeddings are useful for high-level semantic comparisons, they lack the precision needed for effective PR reviews. You'll need additional tools to meet your goal.

Task Understanding:

Capture PR diffs (optimize by excluding files like yarn.lock)
Compare diff embedding with task spec embedding to identify issues.
If PR fails, convert it to draft and provide actionable feedback.
Stop intervention when a reviewer finalizes the draft.

This is very similar to /review, and separately, /gpt currently has the ability to process full PR diffs, spec, and conversation per query across multiple issues/repos. However, embeddings aren't ideal for this. Raw text analysis (task spec + diff) is more effective since we need precise feedback, not high-level similarity or semantic meaning.

The plugin's effectiveness must be high to avoid becoming annoying or useless. Therefore, it makes sense to re-spec or close this idea since embeddings aren’t suitable.

To be clear, the intention of this spec was to automate the initial review process and is feasible but better done without embeddings in my opinion. If the /review approach is being replaced by this then I think the following workflow works best:

Suggested Workflow:

Plugin triggers on pull_request.created (can run a review here or not and on whether it's opened as a draft or ready_for_review)
Plugin triggers on pull_request.ready_for_review.
Use the task spec and sanitized PR diff to prompt GPT for code review.
Post review comment and convert PR to draft if GPT identifies issues; add commented status otherwise.
Define a consistent comment format for GPT reviews, since all GPT tasks I've handled had pretty specific input/output formats/prompt structures introduced.
If a collaborator (not the assignee) finalizes the PR, stop intervening.
Limit bot reviews to once every 24 hours by detecting if a previous AI review comment exists or a timeline review event does.

Additionally the task is way overpriced, ubiquity/ubiquibot#746 is the /review command I authored and was priced at $150 and the /gpt command is priced at $200 lmaoo, and I spent days/weeks on these. $400 max, we are simply feeding two bodies of text to GPT with a custom prompt to have it perform a review with a couple of conditionals afterwards to add a comment and update a review status.

should I implement this feature in a new file? and also what should happen if the pull is not initially opened as a draft?

@ubadineke I'm unsure personally at this point as I did not write this task specification. I have left comments and once @0x4007 replies I'll have a much better idea of how this task should be implemented.

Yes this feature will span multiple new files, try to keep things clean and tidy.
Personally I'd account for this and just run a review on pull_request.created which will allow it through or kick it back into draft just like 0x4007 said. Future reviews would be tied to pull_request.ready_for_review distanced by at least 24hrs between each using the timeline or review comments.

ubadineke · 2024-09-19T22:18:28Z

@Keyrxng @0x4007 What if the PR is created and the related issue is not immediately mentioned in the body. We just move it back to draft right and tell the user to edit the PR and specify issue?

The existing event payload style doesn't have a field for PR Number, maybe it was just designed for comments. Can modifications be made?
If the issue must be specified from the payload then the first paragraph in my comment is redundant.

Keyrxng · 2024-09-19T22:56:10Z

I guess that's a good idea as we like to strongly enforce the linked issue although it's not a guarantee each PR has or needs this such as quick fixes for example.

I'd say don't do anything, our pull request template enforces this format and legit cases exist where it is not used.

The existing event payload style doesn't have a field for PR Number, maybe it was just designed for comments. Can modifications be made?
If the issue must be specified from the payload then the first paragraph in my comment is redundant.

I'm not sure I understand but issues and PRs are both considered as an issue by the GitHub api, there is no PR number only issue.number. You can get linked issues via the GraphQL API most reliably.

If a PR does not have an associated task specification then without having codebase embeddings to compare the diff against we don't have a lot of context other than what they have changed via the diff. Which can still be reviewed by GPT on it's own for logic errors and optimizations it'll just be more general review without the guard rail of the spec.

0x4007 · 2024-09-20T02:47:09Z

I think it would be more appropriate for you to extend your /gpt command logic. You can handle it in a separate pull. Over on your old pull I said to merge and let's test in production. From there you can extend the logic as part of this task.

This task makes it more seamlessly integrated (but looks like your previous work set a lot of relevant ground work)

This requires more deeper integration into GitHub review workflow such as by setting review states and switching between draft and non draft etc.

I have extra time because the prompt engineering alone I'm sure will take a couple days to get the results we want.

Yes let's keep this plaintext no embeddings needed.

ubiquity-os · 2024-09-20T02:47:27Z

@Keyrxng the deadline is at Fri, Oct 4, 2:47 AM UTC

Keyrxng · 2024-11-25T14:53:27Z

@Keyrxng is there absolutely no way to test with an openai api key? Mocking endpoints by using some other llm model

It's impossible to build this AI feature without interacting with an OpenAI style API that has tools/functions logic as that's what this feature depends on. If another provider like OpenRouter is what you mean then yes you can use those.

henalolp · 2024-11-30T13:56:08Z

/help

ishowvel · 2024-11-30T14:10:36Z

/start

ishowvel · 2024-11-30T14:14:19Z

/help

ishowvel · 2024-11-30T14:15:37Z

/start

henalolp · 2024-12-02T10:13:51Z

/help

henalolp · 2024-12-02T10:18:47Z

/rewrite

ishowvel · 2024-12-03T12:34:27Z

/start

ubiquity-os-beta · 2024-12-03T12:34:41Z

Warning! This task was created over 79 days ago. Please confirm that this issue specification is accurate before starting. Deadline Tue, Dec 10, 12:34 PM UTC Beneficiary 0x340D8d2bd82dEb4f485623453c9F6ad307e6B027

Tip

Use /wallet 0x0000...0000 if you want to update your registered payment wallet address.
Be sure to open a draft pull request as soon as possible to communicate updates on your progress.
Be sure to provide timely updates to us when requested, or you will be automatically unassigned from the task.

ubiquity-os-beta · 2024-12-05T21:22:12Z

@ishowvel, this task has been idle for a while. Please provide an update.

ishowvel · 2024-12-05T23:45:41Z

Still working on the pr almost done

ishowvel · 2024-12-06T14:19:02Z

can anyone please explain this error, i was setting up my local commits for testing

[
  {
    type: 45,
    schema: { anyOf: [Array], [Symbol(TypeBox.Kind)]: 'Union' },
    path: '/command',
    value: undefined,
    message: 'Expected required property',
    errors: []
  },
  {
    type: 62,
    schema: { anyOf: [Array], [Symbol(TypeBox.Kind)]: 'Union' },
    path: '/command',
    value: undefined,
    message: 'Expected union value',
    errors: [ [ValueErrorIterator], [ValueErrorIterator] ]
  }
] { depth: null }

0x4007 · 2024-12-06T16:42:59Z

This looks like it's related to the manifest but @whilefoo and @gentlementlegen may know best.

I also want to highlight that I changed the spec. We should use Claude 3.5 Sonnet instead of ChatGPT

gentlementlegen · 2024-12-06T17:03:39Z

You need to give a command: null at least in your payload to make the network call correct (I do not know how you're testing it).

ishowvel · 2024-12-07T06:17:05Z

i am running the worker locally to test

from the worker
[wrangler:inf] POST / 400 Bad Request (13ms)

from the kernel

✘ [ERROR] An error occurred while processing the plugin chain, will skip plugin "http://localhost:4003" SyntaxError: Unexpected token 'I', "Invalid signature" is not valid JSON

      at async handleEvent
  (file:///home/aj/ubiquity-os-kernel/src/github/handlers/index.ts:104:9)
      at null.<anonymous> (async
  file:///home/aj/ubiquity-os-kernel/.wrangler/tmp/dev-uFF4dq/cloudflare-worker.js:38383:7)
      at async Promise.all (index 2)

gentlementlegen · 2024-12-07T09:13:12Z

You need to either provide the good set of public and private keys or disable the signature checks on your local setup.

ishowvel · 2024-12-07T11:54:15Z

Done

subscrining to pull request events
parsing pr to get issue data
consuming pull diff and issue spec
prompting ai

TODO

Use result to make review and convert to draft if necessary

ishowvel · 2024-12-07T11:55:19Z

@gentlementlegen

may you know why the events arent being delievered?

gentlementlegen · 2024-12-07T19:01:25Z

Sorry it's a big of a vague question. What kind of event? Not delivered you mean the kernel not sending them or the plugin not picking them up?

ishowvel · 2024-12-08T05:35:14Z

what should be done for when claude returns a invalid json output, ie "lorem ipsum" or '{"lorem":1}'
the expected output should have been '{"confidenceThreshold":1}'

for now ill just throw an error which says Invalid json object found, Aborting

gentlementlegen · 2024-12-08T06:33:13Z

That is the current behavior we have when gpt doesn't return proper json.

ishowvel · 2024-12-08T10:30:04Z

heres the qa
ishowvel-org/.ubiquity-os#52
ishowvel-org/.ubiquity-os#49

ubiquity-os-beta · 2024-12-10T22:35:52Z

@ishowvel, this task has been idle for a while. Please provide an update.

gentlementlegen · 2024-12-11T01:23:23Z

@ishowvel you got a comment on your PR.

ishowvel · 2024-12-11T01:27:26Z

Yeah I just realised 😓

ishowvel · 2024-12-13T12:21:29Z

@0x4007 how do i submit a whole repository for review?
the repository is https://github.com/ishowvel/pull-precheck
here is the status of the tests https://github.com/ishowvel/pull-precheck/actions/runs/12315354584
here is the qa
ishowvel-org/.ubiquity-os#57
ishowvel-org/.ubiquity-os#54

0x4007 · 2024-12-13T12:28:07Z

Open a pull against https://github.com/ubiquity-os-marketplace/daemon-pull-review and resolve this issue, thank you.

0x4007 added Priority: 3 (High) Time: <2 Weeks good first issue Good for newcomers labels Sep 14, 2024

ubiquity-os bot added the Price: 1800 USD label Sep 14, 2024

0x4007 removed good first issue Good for newcomers Price: 1800 USD labels Sep 14, 2024

ubiquity-os bot added the Price: 1800 USD label Sep 14, 2024

devpool-directory-superintendent bot mentioned this issue Sep 14, 2024

Pull Precheck ubiquity/devpool-directory#1504

Closed

ubiquity-os bot assigned ubadineke Sep 18, 2024

0x4007 assigned Keyrxng and unassigned ubadineke Sep 20, 2024

0x4007 added Time: <1 Week and removed Price: 1800 USD Time: <2 Weeks labels Sep 20, 2024

ubiquity-os bot added the Price: 1200 USD label Sep 20, 2024

0x4007 transferred this issue from ubiquity-os-marketplace/text-vector-embeddings Sep 20, 2024

devpool-directory-superintendent bot mentioned this issue Sep 20, 2024

Pull Precheck ubiquity/devpool-directory#1517

Open

ubiquity-os-beta bot assigned ishowvel Dec 3, 2024

ishowvel mentioned this issue Dec 8, 2024

Command ask ubiquity-os-marketplace/command-ask#45

Closed

ishowvel linked a pull request Dec 13, 2024 that will close this issue

Addition of pull-precheck logic ubiquity-os-marketplace/daemon-pull-review#1

Open

Pull Precheck #45

Pull Precheck #45

Comments

0x4007 commented Sep 14, 2024 • edited Loading

Pull Flow

ubadineke commented Sep 18, 2024

ubiquity-os bot commented Sep 18, 2024

ubiquity-os bot commented Sep 18, 2024

ubadineke commented Sep 18, 2024

ubiquity-os bot commented Sep 18, 2024

ubadineke commented Sep 18, 2024

ubiquity-os bot commented Sep 18, 2024

Tips:

ubadineke commented Sep 19, 2024

Keyrxng commented Sep 19, 2024 • edited Loading

ubadineke commented Sep 19, 2024 • edited Loading

Keyrxng commented Sep 19, 2024

0x4007 commented Sep 20, 2024 • edited Loading

ubiquity-os bot commented Sep 20, 2024

Keyrxng commented Nov 25, 2024

henalolp commented Nov 30, 2024

ishowvel commented Nov 30, 2024

ishowvel commented Nov 30, 2024

ishowvel commented Nov 30, 2024

henalolp commented Dec 2, 2024

henalolp commented Dec 2, 2024

ishowvel commented Dec 3, 2024

ubiquity-os-beta bot commented Dec 3, 2024

ubiquity-os-beta bot commented Dec 5, 2024

ishowvel commented Dec 5, 2024

ishowvel commented Dec 6, 2024

0x4007 commented Dec 6, 2024

gentlementlegen commented Dec 6, 2024

ishowvel commented Dec 7, 2024

gentlementlegen commented Dec 7, 2024

ishowvel commented Dec 7, 2024

ishowvel commented Dec 7, 2024

gentlementlegen commented Dec 7, 2024

ishowvel commented Dec 8, 2024 • edited Loading

gentlementlegen commented Dec 8, 2024

ishowvel commented Dec 8, 2024

ubiquity-os-beta bot commented Dec 10, 2024

gentlementlegen commented Dec 11, 2024

ishowvel commented Dec 11, 2024

ishowvel commented Dec 13, 2024

0x4007 commented Dec 13, 2024

0x4007 commented Sep 14, 2024 •

edited

Loading

Keyrxng commented Sep 19, 2024 •

edited

Loading

ubadineke commented Sep 19, 2024 •

edited

Loading

0x4007 commented Sep 20, 2024 •

edited

Loading

ishowvel commented Dec 8, 2024 •

edited

Loading