Expanding model support for MyxMatch #22

salma-remyx · 2024-11-22T16:14:24Z

This PR expands the set of models that can be evaluated through MyxMatch. It also introduces additional validators that confirm model availability offline, before job submission.

Test

To test this change, run a new MyxMatch evaluation:

from remyxai.client.myxboard import MyxBoard
from remyxai.api.evaluations import (
    EvaluationTask,
    BenchmarkTask,
)
from remyxai.client.remyx_client import RemyxAPI

myx_board_name = "small-llm-comparison-media_example"
model_ids = ["microsoft/Phi-3-mini-4k-instruct", "Qwen/Qwen2-1.5B"]
myx_board = MyxBoard(model_repo_ids=model_ids, name=myx_board_name)

tasks = [EvaluationTask.MYXMATCH]
prompt = "You are a media analyst. Objective: Analyze the media coverage. Phase 1: Begin analysis."

remyx_api = RemyxAPI()
remyx_api.evaluate(myx_board, tasks, prompt=prompt)
>> Starting evaluation...

>> Evaluations are done! You can now view the results.
results = myx_board.get_results()

results
>>
{'myxmatch': [{'model': 'microsoft/Phi-3-mini-4k-instruct',
   'rank': 1,
   'prompt': 'You are a media analyst. Objective: Analyze the media coverage. Phase 1: Begin analysis.'},
  {'model': 'Qwen/Qwen2-1.5B',
   'rank': 2,
   'prompt': 'You are a media analyst. Objective: Analyze the media coverage. Phase 1: Begin analysis.'}]}
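A minimal sketch of how the returned structure could be consumed, assuming the shape shown in the sample output above (a `myxmatch` list of entries with `model`, `rank`, and `prompt` keys) and assuming that a lower `rank` means a better match. The dictionary below is copied from that sample rather than fetched from the API.

```python
# Sample result copied from the PR description; a real run would obtain it
# from myx_board.get_results().
PROMPT = (
    "You are a media analyst. Objective: Analyze the media coverage. "
    "Phase 1: Begin analysis."
)
results = {
    "myxmatch": [
        {"model": "Qwen/Qwen2-1.5B", "rank": 2, "prompt": PROMPT},
        {"model": "microsoft/Phi-3-mini-4k-instruct", "rank": 1, "prompt": PROMPT},
    ]
}

# Sort explicitly instead of relying on the order of the returned list.
# Assumption: rank 1 is the best match.
ranked = sorted(results["myxmatch"], key=lambda entry: entry["rank"])
best_model = ranked[0]["model"]

for entry in ranked:
    print(f"{entry['rank']}. {entry['model']}")
```

Sorting on `rank` keeps the selection correct even if the entries come back in a different order.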

smellslikeml · 2024-11-22T21:07:09Z

remyxai/utils/validators.py

+    size_str, _, unit = match.groups()
+    size = float(size_str)
+
+    if unit.upper() == "M":


Parsing the size this way will fail for many model IDs; we need something more robust:
https://stackoverflow.com/questions/68086929/how-to-get-the-size-of-a-hugging-face-pretrained-model

smellslikeml

model size validation needs something more reliable
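To illustrate the concern, here is a rough sketch under stated assumptions. The regex is a hypothetical reconstruction: the diff excerpt only shows that the match yields three groups and a unit of `M` or another letter, so the actual pattern in `remyxai/utils/validators.py` may differ. `size_from_model_id` and `count_parameters` are illustrative names, not functions from the PR. The second function follows the approach in the linked Stack Overflow question, summing `numel()` over a loaded model's parameters.

```python
import re
from typing import Optional

# Hypothetical pattern: a number, an optional fractional part, then M or B.
_SIZE_RE = re.compile(r"(\d+(\.\d+)?)([MmBb])")


def size_from_model_id(model_id: str) -> Optional[float]:
    """Guess the parameter count from the model ID; None if no size is found."""
    match = _SIZE_RE.search(model_id)
    if match is None:
        return None
    size_str, _, unit = match.groups()
    size = float(size_str)
    if unit.upper() == "M":
        return size * 1e6
    return size * 1e9


def count_parameters(model) -> int:
    """Count parameters of a loaded model.

    Works with any object exposing parameters() whose items have numel(),
    such as a torch.nn.Module.
    """
    return sum(p.numel() for p in model.parameters())


for model_id in (
    "Qwen/Qwen2-1.5B",                   # size is in the name
    "microsoft/Phi-3-mini-4k-instruct",  # no size in the name
    "mistralai/Mixtral-8x7B-v0.1",       # "7B" is per expert, not the total
):
    print(model_id, size_from_model_id(model_id))
```

The name-based guess returns `None` for the Phi-3 ID and reports only the per-expert figure for the Mixtral ID, which is the kind of failure the review points at. Counting parameters needs the weights loaded, so it trades that cost for reliability.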

salma-remyx · 2024-11-22T22:58:43Z

model size validation needs something more reliable

noted in the remyx repo

salma-remyx added 3 commits November 22, 2024 16:11

expanding models by arch, adding validators

182e363

fixed valdation errors

d325489

fix parsing results

b6a8209

salma-remyx requested a review from smellslikeml November 22, 2024 20:58

salma-remyx self-assigned this Nov 22, 2024

smellslikeml reviewed Nov 22, 2024


salma-remyx marked this pull request as ready for review November 22, 2024 22:58
