fixes for a couple of things in leaderboard cog #51

Merged (1 commit) on Dec 13, 2024

@b9r5 b9r5 commented Dec 13, 2024

Description

siro requested a change in leaderboard_cog.py: use interaction.response.send_message for the initial validations, and interaction.followup.send thereafter.
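The requested pattern can be sketched roughly as follows. This is a minimal illustration, not the code in leaderboard_cog.py: the function name `submit`, the `KNOWN_LEADERBOARDS` set, and the message texts are hypothetical, and the call to `interaction.response.defer()` is an assumption about how the interaction is acknowledged before follow-ups are sent.

```python
# Hypothetical stand-in for a database lookup of existing leaderboards.
KNOWN_LEADERBOARDS = {"softmax"}


async def submit(interaction, leaderboard_name: str) -> None:
    """Validate first, then acknowledge, then report via follow-up messages.

    `interaction` is expected to behave like a discord.py Interaction.
    """
    # Initial validation: the interaction has not been answered yet, so the
    # error is sent with interaction.response.send_message.
    if leaderboard_name not in KNOWN_LEADERBOARDS:
        await interaction.response.send_message(
            f"Leaderboard '{leaderboard_name}' does not exist.", ephemeral=True
        )
        return

    # Assumption: acknowledge the interaction with defer() so that later
    # messages can go through interaction.followup.send.
    await interaction.response.defer()

    # ... long-running work (run the script, store the result) goes here ...

    await interaction.followup.send(
        f"Submission to '{leaderboard_name}' received."
    )
```

An interaction accepts only one initial response, which is why everything after the validation step uses the follow-up webhook.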

I noticed a problem where the submission content was not being decoded. As a result, we get rows in the submission table like this (note the binary-looking data in the code column):

clusterdev=# select * from leaderboard.submission where id=8;
-[ RECORD 1 ]---
id              | 8
problem_id      | 1
name            | train.py
user_id         | 704502776683692072
code            | \x696d706f727420746f7263680a0a64656620637573746f6d5f736f66746d617828783a20746f7263682e54656e736f722c2064696d3a20696e74203d202d3129202d3e20746f7263682e54656e736f723a0a2020202072657475726e20746f7263682e6e6e2e66756e6374696f6e616c2e736f66746d617828782c2064696d3d64696d290a
submission_time | 2024-12-12 18:24:07.769627-08
score           | 0.012549400329589844
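The `\x...` value is a hex rendering of the submitted file's raw bytes. As a quick check (a sketch, not part of the PR), the first 13 bytes of the code column above decode to the start of the script:

```python
# First 13 bytes of the code column shown above, without the leading "\x".
hex_prefix = "696d706f727420746f7263680a"

raw = bytes.fromhex(hex_prefix)  # hex string -> bytes
text = raw.decode("utf-8")       # bytes -> str
print(repr(text))                # prints 'import torch\n'
```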

After this change, we get rows in the submission table like this (note that the code column is legible after the change):

clusterdev=# select * from leaderboard.submission where id=9;
 id | problem_id |   name   |      user_id       |                                code                                 |        submission_time        |        score         
----+------------+----------+--------------------+---------------------------------------------------------------------+-------------------------------+----------------------
  9 |          1 | train.py | 704502776683692072 | import torch                                                       +| 2024-12-12 18:40:29.929962-08 | 0.011950969696044922
    |            |          |                    |                                                                    +|                               | 
    |            |          |                    | def custom_softmax(x: torch.Tensor, dim: int = -1) -> torch.Tensor:+|                               | 
    |            |          |                    |     return torch.nn.functional.softmax(x, dim=dim)                 +|                               | 
    |            |          |                    |                                                                     |                               | 
(1 row)

A couple of screenshots of tests:

  1. Submitting to a leaderboard that does not exist:

image

  2. Submitting a file that is not UTF-8 encoded:

image
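The decoding step and the second test case can be sketched as below. This is an illustration under assumptions, not the PR's code: the helper name `decode_submission` and the error text are hypothetical.

```python
from typing import Optional, Tuple


def decode_submission(raw: bytes) -> Tuple[Optional[str], Optional[str]]:
    """Return (text, None) on success, or (None, error_message) on failure."""
    try:
        # Decode the uploaded bytes so the code is stored as readable text.
        return raw.decode("utf-8"), None
    except UnicodeDecodeError:
        # Hypothetical message; the bot's actual wording may differ.
        return None, "Could not decode your file. Is it UTF-8?"


# A valid UTF-8 script decodes cleanly.
ok_text, ok_err = decode_submission(b"import torch\n")

# 0xff can never appear in valid UTF-8, so this input is rejected.
bad_text, bad_err = decode_submission(b"\xff\xfe\x00")
```

Returning the error as a value lets the caller report it with interaction.response.send_message before any long-running work starts.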

Checklist

Before submitting this PR, ensure the following steps have been completed:

I manually verified that /run modal works, and that /run github works for Nvidia.
For some reason, I am not able to execute /run github for AMD at the moment.

  • Run the slash command /verifyruns on your own server.
    • Run the cluster bot on your server:
      python discord-bot.py
    • Start training runs with the slash command /verifyruns.
    • Verify that the bot eventually responds with:
      ✅ All runs completed successfully!
      
      (It may take a few minutes for all runs to finish. In particular, the GitHub
      runs may take a little longer. The Modal run is typically quick.)
      For more information on running a cluster bot on your own server, see
      README.md.

@msaroufim msaroufim self-requested a review December 13, 2024 05:56
@msaroufim msaroufim merged commit 8e59226 into main Dec 13, 2024
1 check failed
@b9r5 b9r5 deleted the benh/leaderboard-fixes branch December 16, 2024 02:33