Expanding model support for MyxMatch #22

Open
wants to merge 3 commits into
base: main
from

@salma-remyx (Collaborator) commented Nov 22, 2024

This PR expands the set of models that can be evaluated through MyxMatch. It adds validators that confirm a model's availability offline, before the job is submitted.

Test

To test this change, run a new MyxMatch evaluation:

from remyxai.client.myxboard import MyxBoard
from remyxai.api.evaluations import (
    EvaluationTask,
    BenchmarkTask,
)
from remyxai.client.remyx_client import RemyxAPI

myx_board_name = "small-llm-comparison-media_example"
model_ids = ["microsoft/Phi-3-mini-4k-instruct", "Qwen/Qwen2-1.5B"]
myx_board = MyxBoard(model_repo_ids=model_ids, name=myx_board_name)

tasks = [EvaluationTask.MYXMATCH]
prompt = "You are a media analyst. Objective: Analyze the media coverage. Phase 1: Begin analysis."

remyx_api = RemyxAPI()
remyx_api.evaluate(myx_board, tasks, prompt=prompt)
# >> Starting evaluation...
# >> Evaluations are done! You can now view the results.

results = myx_board.get_results()

results
# >>
# {'myxmatch': [{'model': 'microsoft/Phi-3-mini-4k-instruct',
#    'rank': 1,
#    'prompt': 'You are a media analyst. Objective: Analyze the media coverage. Phase 1: Begin analysis.'},
#   {'model': 'Qwen/Qwen2-1.5B',
#    'rank': 2,
#    'prompt': 'You are a media analyst. Objective: Analyze the media coverage. Phase 1: Begin analysis.'}]}

size_str, _, unit = match.groups()
size = float(size_str)

if unit.upper() == "M":
Member

Parsing the size this way will fail for many model IDs. We need something more robust:
https://stackoverflow.com/questions/68086929/how-to-get-the-size-of-a-hugging-face-pretrained-model
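
To make the concern concrete, here is a minimal sketch, not the code under review. `SIZE_PATTERN` is a hypothetical regex chosen only because it yields the same three groups as `size_str, _, unit = match.groups()`. `resolve_size` and its `lookup` argument are likewise assumptions: `lookup` stands in for any authoritative size source (for example, parameter counts read from Hub metadata), with name parsing kept only as a fallback.

```python
import re
from typing import Callable, Optional

# Hypothetical pattern, assumed to resemble the parser under review:
# three groups (number, optional decimal part, unit).
SIZE_PATTERN = re.compile(r"(\d+(\.\d+)?)([MB])\b", re.IGNORECASE)


def size_from_name(model_id: str) -> Optional[float]:
    """Guess the parameter count from the repo name; None if no size token."""
    match = SIZE_PATTERN.search(model_id)
    if match is None:
        return None
    size_str, _, unit = match.groups()
    size = float(size_str)
    return size * (1e6 if unit.upper() == "M" else 1e9)


def resolve_size(
    model_id: str,
    lookup: Callable[[str], Optional[float]],
) -> Optional[float]:
    """Prefer an authoritative lookup; fall back to parsing the name."""
    size = lookup(model_id)
    return size if size is not None else size_from_name(model_id)


# Illustrative stand-in for a metadata source; a real validator would
# query the model's published metadata instead of a hard-coded dict.
known_sizes = {"microsoft/Phi-3-mini-4k-instruct": 3.8e9}

for model_id in ["microsoft/Phi-3-mini-4k-instruct", "Qwen/Qwen2-1.5B"]:
    print(model_id, size_from_name(model_id), resolve_size(model_id, known_sizes.get))
```

Of the two models in the PR description, only `Qwen/Qwen2-1.5B` carries a size token in its name; `microsoft/Phi-3-mini-4k-instruct` does not, so name parsing alone returns nothing for it. A mixture-of-experts name such as `mistralai/Mixtral-8x7B-v0.1` would parse as 7B under this pattern, which understates the total parameter count.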

Member

@smellslikeml smellslikeml left a comment

model size validation needs something more reliable

@salma-remyx salma-remyx marked this pull request as ready for review November 22, 2024 22:58
@salma-remyx
Collaborator Author

model size validation needs something more reliable

noted in the remyx repo


2 participants