Long-Range-Arena Evaluation #49
**Labels**
- `downstream`: Changes code wrapping the core model
- `engineering`: Software-engineering problems that don't require ML expertise
- `ML`: Requires machine-learning knowledge (can be built up on the fly)
Currently, we only know that our model is better than the baseline because it reaches a lower loss in less training time. Running benchmarks such as LRA would show how well our long-context model performs in a more realistic scenario. While LRA doesn't exercise our capabilities ideally (unlike, for example, #5 and #9), it would still give us preliminary evaluation results on a well-known benchmark.

This issue tracks the progress of integrating our model into LRA, even though the integration itself should live in a separate codebase. A rough sketch of what such a wrapper might look like is below.
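To make the scope concrete, here is a minimal sketch of an evaluation wrapper for an LRA-style classification task. It assumes a PyTorch core model that returns per-token hidden states; the `core` module, `d_model`, the pooling choice, the class count, and the data loader are all placeholders, not the actual codebase's API:

```python
# Hypothetical sketch: wrapping the core long-context model with a
# classification head for an LRA-style task (e.g. ListOps). All names
# here are placeholders for whatever our codebase actually exposes.
import torch
import torch.nn as nn

NUM_CLASSES = 10  # e.g. ListOps has 10 output classes


class LRAClassifier(nn.Module):
    """Classification head on top of the (assumed) core model."""

    def __init__(self, core: nn.Module, d_model: int):
        super().__init__()
        self.core = core  # assumed to map tokens -> (batch, seq, d_model)
        self.head = nn.Linear(d_model, NUM_CLASSES)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        hidden = self.core(tokens)   # (batch, seq, d_model)
        pooled = hidden.mean(dim=1)  # mean-pool over the sequence
        return self.head(pooled)     # (batch, NUM_CLASSES)


@torch.no_grad()
def evaluate(model: LRAClassifier, loader) -> float:
    """Plain accuracy over a data loader yielding (tokens, labels)."""
    model.eval()
    correct, total = 0, 0
    for tokens, labels in loader:
        preds = model(tokens).argmax(dim=-1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / total  # LRA tasks report accuracy
```

The details (pooling strategy, tokenization, per-task heads) would follow whatever the separate LRA codebase settles on; this only illustrates the shape of the integration.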