-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
QoL changes: Add GPU types view, remove dtype and shapes info. No DB or functionality. #52
base: main
Are you sure you want to change the base?
Conversation
src/discord-cluster-manager/utils.py
Outdated
try: | ||
await interaction.response.send_message(msg) | ||
|
||
except Exception: | ||
await interaction.followup.send(msg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this would be better written as the following?
try: | |
await interaction.response.send_message(msg) | |
except Exception: | |
await interaction.followup.send(msg) | |
if interaction.response.is_done(): | |
await interaction.followup.send(msg) | |
else: | |
await interaction.response.send_message(msg) |
We do call interaction.response.defer()
when we intend to use followup. And that call to defer()
ends up making is_done()
return True.
What makes me uncomfortable about except Exception
is that we don't really know what's gone wrong.
Honestly, I think it would work either way, but I think I prefer my suggestion if it works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed this, I think this makes sense. Will verify that this works locally before I push though. Also renamed more generally to send_discord_message
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also changed the logic for use_followup=True in the followup commit.
self.connection.rollback() # Ensure rollback if error occurs | ||
return f"Error during leaderboard creation: {e}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm nervous about returning the value with the stringified exception in it. The reason is that I'm not sure what's going to be in e
. I wouldn't expect it to include sensitive information (like the database URL), but it could. What I would suggest is to log the full string (f"Error during leaderboard creation: {e}"
) but only return "Error during leaderboard creation"
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, for now I won't log it (we can change this later) but will remove error message inside.
if "duplicate key" in err: | ||
await interaction.followup.send( | ||
f'Error: Tried to create a leaderboard "{leaderboard_name}" that already exists.', | ||
ephemeral=True, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like in the within the create_leaderboard
implementation in leaderboard_db.py, we should examine the message in e
and sanitize it. We could return a message containing "duplicate key"
if we found that string, and we could check for other strings that we've found to occur. I just want to be careful not to leak sensitive information by returning the full string of e
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100% sure what you mean by this, do you mean we sanitize the "err" string even if we never print it out?
}) | ||
|
||
await interaction.response.send_message( | ||
if err: | ||
print(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
print(err) |
...or log the error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed these, this should not have been in the commit. I was debugging at the time, thanks for catching!
) | ||
else: | ||
# Handle any other errors | ||
await interaction.followup.send( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like a good idea to log the error in this case, so that we can look at the logs and then add more informative error messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this is fair, I've added logging information for the specific error here, but the Discord bot produces a fixed string.
if "already exists" in str(e): | ||
await interaction.followup.send( | ||
str(e), | ||
ephemeral=True, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How could we get here? Shouldn't this be handled in or near the `"duplicate key" area above?
(I know I'm sounding like a broken record but) we shouldn't send str(e)
to the Discord user. We should log a message containing e
and send a generic message to the user like "Leaderboard already exists"
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this has to do with why I ended up re-factoring create_leaderboard to return an error string, because it wasn't being captured in the try except. I've removed this, but also made sure to remove the print errors on Discord and add them to a logger instead.
# Ask the user to select GPUs | ||
view = GPUSelectionView([gpu.name for gpu in GitHubGPU]) | ||
|
||
await interaction.response.send_message( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be interaction.followup.send(...)
? If there are going to be more messages in this interaction, then interaction.response.send_message(...)
will cause problems.
I think the easiest thing to do in this function is to try to only send one response, and use interrupt.response.send_message(...)
. And return immediately after the message is sent. Using followup can definitely be confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've wrapped this around our send_discord_message
util to avoid this issue.
try: | ||
date_value = datetime.strptime(deadline, "%Y-%m-%d") | ||
except ValueError: | ||
raise ValueError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should do interaction.response.send_message("Invalid date format. Please use...")
and then return here. It's a simpler control flow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated this, there was also a mistake in here regarding bad implementations I forgot to delete. It's been simplified.
… to loggers). Remove use_followup with wrapper code to avoid giving user access to flags
Hey Alex, thanks for the work on this PR. I've taken another look at the changes and wanted to share my thoughts. First, generally, from my experience I've found that it's easy to unintentionally break things in this discord.py implementation. I've been trying to stick to small, incremental changes that are quicker to review, and easier to test (and rollback if needed). While it's early days and manual testing is still necessary, I've found this approach helps reduce the testing burden and ensure quicker write/deploy/test cycles. Given that, my recommendation is to break this PR into smaller pieces: a. worflow.yml changes. Removing the processing here is a good improvement and ties into getting b. send_discord_message. I'm not sure about the approach and think it would be helpful to get Siro's input before deciding on the best path forward. c. Whitespace and string style differences. The changes here make me wonder if we should standardize on formatting. Has this been discussed yet? I'm open to a discussion on this to avoid inconsistency (and any poor style choices on my part 🙂) without spending too much time on formatting reviews. d. GPU selection view changes. These look quite cool. |
On styling lemme just use ruff and do a one time lint of the repo. Better to pull the Bandaid now |
A bunch of quality of life changes, as well as a change to GPU types.
Adds:
Removes:
dtype
andshape
information in/leaderboard submit ...
.dtype
andshape
information in/leaderboard create ...
.Leftover TODOs:
/leaderboard list
to also include all available GPU types for the particular problem.Checklist
Before submitting this PR, ensure the following steps have been completed:
/verifyruns
on your own server./verifyruns
.runs may take a little longer. The Modal run is typically quick.)
For more information on running a cluster bot on your own server, see
README.md.