Latency cost in eval #1468

Merged: 31 commits merged into main from latency-cost-in-eval on Mar 31, 2024

Conversation

@aakrem (Collaborator) commented Mar 28, 2024

No description provided.

@mmabrouk (Member) left a comment:

Thanks for the PR @aakrem, great work! Please see my minor comments.

I think the code changes are minimal, and we can revert them without any issue if we later decide to track cost/latency in the observability logic rather than from the app's output. So we should move forward with this.

Can you please also add the same logic to the comparison view? Thanks!

agenta-backend/agenta_backend/services/llm_apps_service.py (review comments resolved, outdated)
agenta-backend/agenta_backend/tasks/evaluations.py (review comments resolved, outdated)
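
As a rough illustration of the approach discussed in the review above (reading cost from the app's own output while measuring latency around the call), here is a minimal Python sketch; `invoke_app` and the field names are hypothetical placeholders for illustration, not the actual agenta code.

```python
import time
from typing import Callable


def invoke_app_with_metrics(invoke_app: Callable[[dict], dict], inputs: dict) -> dict:
    """Call an LLM app and attach cost/latency to the result.

    `invoke_app` is a hypothetical callable standing in for the real service
    call; its output is assumed to carry a `cost` field reported by the app.
    """
    start = time.perf_counter()
    output = invoke_app(inputs)  # e.g. {"message": "...", "cost": 0.0021}
    latency = time.perf_counter() - start

    return {
        "output": output.get("message"),
        "cost": output.get("cost"),  # cost as reported in the app's output
        "latency": latency,          # wall-clock latency measured around the call
    }
```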
@mmabrouk (Member) left a comment:

@bekossy @aakrem FE tests are still failing. It seems to be a bug in the code:

  1) Evaluation Comparison Test
       Executing Evaluation Comparison Workflow
         Should verify that there are completed evaluations in the list:
     TypeError: The following error originated from your application code, not from Cypress.

  > Cannot read properties of null (reading 'getColId')

@aakrem I think we should show the total (sum) cost in the main view rather than the average. I think it's more interesting, since it allows the user to understand the cost of their experiments. For latency, it makes sense to show the average; however, I would label it (avg latency).
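
To make the suggested aggregation concrete, here is a small Python sketch, assuming each evaluation row carries `cost` and `latency` fields (the names are illustrative, not the actual agenta schema):

```python
def aggregate_metrics(results: list[dict]) -> dict:
    """Aggregate per-invocation metrics for the main evaluation view.

    Cost is summed (total spend of the experiment); latency is averaged
    and would be labeled "avg latency" in the UI.
    """
    costs = [r["cost"] for r in results if r.get("cost") is not None]
    latencies = [r["latency"] for r in results if r.get("latency") is not None]

    return {
        "total_cost": sum(costs),
        "avg_latency": sum(latencies) / len(latencies) if latencies else None,
    }


# Example:
# aggregate_metrics([{"cost": 0.002, "latency": 1.4}, {"cost": 0.003, "latency": 0.9}])
# -> {"total_cost": 0.005, "avg_latency": 1.15}
```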

@aakrem (Collaborator, Author) commented Mar 31, 2024

@mmabrouk btw we don't display "average/avg" for the evaluators' average results either. Displaying "average" here might cause confusion about what kind of number the other columns show.

@aakrem merged commit 6cd4a0a into main on Mar 31, 2024
8 checks passed
@aakrem deleted the latency-cost-in-eval branch on March 31, 2024 at 14:06
4 participants