
List of appropriate metrics to test #1

Open
klh5 opened this issue Nov 8, 2024 · 3 comments

klh5 (Collaborator) commented Nov 8, 2024

No description provided.

jack89roberts (Collaborator) commented Nov 13, 2024

Starting pitch:

  • BLEU (baseline)
  • ROUGE-S / any other skip-gram or conventional metric that may improve on BLEU (e.g. METEOR, CHRF)
  • BLASER 2.0
  • CometKiwi (or the reference-based COMET variant) / another model-based, translation-oriented metric of choice (e.g. critical error detection?)

and

  • the above with different variants of pre-processing etc.
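To make the baseline concrete, here is a minimal, stdlib-only sketch of sentence-level BLEU. It is illustrative only: it assumes whitespace tokenization, a single reference, and no smoothing, all of which an established implementation such as sacrebleu handles properly.

```python
# Minimal sentence-level BLEU sketch (illustrative; use sacrebleu in practice).
# Assumes whitespace tokenization, one reference, and no smoothing.
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams, ref_ngrams = ngrams(cand, n), ngrams(ref, n)
        # Clipped n-gram overlap between candidate and reference.
        overlap = sum((cand_ngrams & ref_ngrams).values())
        total = max(sum(cand_ngrams.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0  # unsmoothed: any zero precision zeroes the score
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty discourages overly short candidates.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * geo_mean

print(bleu("the cat sat on the mat", "the cat sat on the mat"))  # identical → 1.0
```

An identical candidate/reference pair scores 1.0, and a candidate sharing no unigrams scores 0.0, which is why smoothing matters for short sentences in real implementations.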

klh5 (Collaborator, Author) commented Nov 13, 2024

It doesn't include BLASER, as it's from 2023, but this paper also has a nice taxonomy of metrics which we could re-use.

jack89roberts (Collaborator) commented

Also sentence similarity metrics e.g. these models: https://huggingface.co/models?pipeline_tag=sentence-similarity&sort=trending

SPICE is using one of these as part of: https://arxiv.org/pdf/2405.13845
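As a rough illustration of the sentence-similarity idea, a hedged stdlib-only sketch: cosine similarity over bag-of-words counts. This is a crude stand-in for the embedding-based models linked above (which compare dense sentence vectors rather than word counts), but the scoring mechanics are the same.

```python
# Hedged sketch: cosine similarity over bag-of-words counts, a crude stand-in
# for embedding-based sentence similarity (real use: sentence-transformers models).
import math
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

print(cosine_similarity("the cat sat on the mat", "a dog lay on the rug"))
```

An embedding model would replace the `Counter` vectors with learned dense vectors, letting paraphrases with no word overlap still score highly, which is exactly what bag-of-words similarity cannot do.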
