Commit

Update latest.csv
kuzcotopiallm authored Mar 23, 2024
1 parent d13d275 commit 3cec6d5
Showing 1 changed file with 87 additions and 3 deletions.
90 changes: 87 additions & 3 deletions latest.csv
@@ -1,3 +1,87 @@
Rank,Model,Size,Format,Quant,Context,Prompt,1st Score,2nd Score,OK,+/-
1 🆕,[claude-3-opus-20240229](https://www.anthropic.com/claude),Claude 3 Opus,API,,,,18/18 ✓,18/18 ✓,✗,✓

Rank,Model,Size,Format,Quant,Context,Prompt,1st Score,2nd Score,OK,+/-
1 🆕,[claude-3-opus-20240229](https://www.anthropic.com/claude),Claude 3 Opus,API,,,,18/18 ✓,18/18 ✓,✗,✓,
1,[GPT-4](https://www.reddit.com/r/LocalLLaMA/comments/18yp9u4/llm_comparisontest_api_edition_gpt4_vs_gemini_vs/),GPT-4,API,,,,18/18 ✓,18/18 ✓,✓,✓,
1 🆕,[mistral-large-2402](https://mistral.ai/),Mistral,API,,,,18/18 ✓,18/18 ✓,✗,✗,
1,[miquliz-120b-v2.0](https://www.reddit.com/r/LocalLLaMA/comments/1b5vp2e/llm_comparisontest_17_new_models_64_total_ranked/),120B,EXL2,3.0bpw,~~32K~~ 4K-12K,Mistral,18/18 ✓,18/18 ✓,✓,✓,
1,[goliath-120b-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/185ff51/big_llm_comparisontest_3x_120b_12x_70b_2x_34b/),120B,GGUF,Q2_K,4K,Vicuna 1.1,18/18 ✓,18/18 ✓,✓,✓,
1,[Tess-XL-v1.0-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/185ff51/big_llm_comparisontest_3x_120b_12x_70b_2x_34b/),120B,GGUF,Q2_K,4K,Synthia,18/18 ✓,18/18 ✓,✓,✓,
1,[Nous-Capybara-34B-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/185ff51/big_llm_comparisontest_3x_120b_12x_70b_2x_34b/),34B,GGUF,Q4_0,16K,Vicuna 1.1,18/18 ✓,18/18 ✓,✓,✓,
1,[Venus-120b-v1.0](https://www.reddit.com/r/LocalLLaMA/comments/185ff51/big_llm_comparisontest_3x_120b_12x_70b_2x_34b/),120B,EXL2,3.0bpw,4K,Alpaca,18/18 ✓,18/18 ✓,✓,✗,
2,[wolfram/miqu-1-120b](https://www.reddit.com/r/LocalLLaMA/comments/1b5vp2e/llm_comparisontest_17_new_models_64_total_ranked/),120B,EXL2,3.0bpw,4K,Mistral,18/18 ✓,18/18 ✓,✗,,
3,[miquella-120b-3.0bpw-h6-exl2](https://www.reddit.com/r/LocalLLaMA/comments/1aix93e/llm_comparisontest_miqu_miqu_miqu_miquella_maid/),120B,EXL2,3.0bpw,~~32K~~ 4K,Mistral,18/18 ✓,17/18,✓,✓,
3,[lzlv_70B-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/185ff51/big_llm_comparisontest_3x_120b_12x_70b_2x_34b/),70B,GGUF,Q4_0,4K,Vicuna 1.1,18/18 ✓,17/18,✓,✓,
4,[Mixtral_34Bx2_MoE_60B](https://www.reddit.com/r/LocalLLaMA/comments/1916896/llm_comparisontest_confirm_leaderboard_big_news/),2x34B,HF,4-bit,~~200K~~ 4K,Alpaca,18/18 ✓,17/18,✓,✗,
5,[miquliz-120b-xs.gguf](https://www.reddit.com/r/LocalLLaMA/comments/1b5vp2e/llm_comparisontest_17_new_models_64_total_ranked/),120B,GGUF,IQ2_XS,~~32K~~ 4K,Mistral,18/18 ✓,17/18,✗,,
6,[GPT-4 Turbo](https://www.reddit.com/r/LocalLLaMA/comments/18yp9u4/llm_comparisontest_api_edition_gpt4_vs_gemini_vs/),GPT-4,API,,,,18/18 ✓,16/18,✓,✓,
6,[chronos007-70B-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/185ff51/big_llm_comparisontest_3x_120b_12x_70b_2x_34b/),70B,GGUF,Q4_0,4K,Alpaca,18/18 ✓,16/18,✓,✓,
6,[SynthIA-70B-v1.5-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/185ff51/big_llm_comparisontest_3x_120b_12x_70b_2x_34b/),70B,GGUF,Q4_0,4K,SynthIA,18/18 ✓,16/18,✓,✓,
6,[Gembo-v1-70b-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/1b5vp2e/llm_comparisontest_17_new_models_64_total_ranked/),70B,GGUF,Q5_K_M,4K,Alpaca,18/18 ✓,16/18,✓,,
6,[bagel-34b-v0.2](https://www.reddit.com/r/LocalLLaMA/comments/1916896/llm_comparisontest_confirm_leaderboard_big_news/),34B,HF,4-bit,~~200K~~ 4K,Alpaca,18/18 ✓,16/18,✓,✗,
7,[Mixtral-8x7B-Instruct-v0.1](https://www.reddit.com/r/LocalLLaMA/comments/18gz54r/llm_comparisontest_mixtral8x7b_mistral_decilm/),8x7B,HF,4-bit,~~32K~~ 4K,Mixtral,18/18 ✓,16/18,✗,✓,
8,[dolphin-2_2-yi-34b-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/185ff51/big_llm_comparisontest_3x_120b_12x_70b_2x_34b/),34B,GGUF,Q4_0,16K,ChatML,18/18 ✓,15/18,✗,✗,
9,[StellarBright-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/185ff51/big_llm_comparisontest_3x_120b_12x_70b_2x_34b/),70B,GGUF,Q4_0,4K,Vicuna 1.1,18/18 ✓,14/18,✓,✓,
10,[Dawn-v2-70B-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/185ff51/big_llm_comparisontest_3x_120b_12x_70b_2x_34b/),70B,GGUF,Q4_0,4K,Alpaca,18/18 ✓,14/18,✓,✗,
10,[Euryale-1.3-L2-70B-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/185ff51/big_llm_comparisontest_3x_120b_12x_70b_2x_34b/),70B,GGUF,Q4_0,4K,Alpaca,18/18 ✓,14/18,✓,✗,
10,[bagel-dpo-34b-v0.2](https://www.reddit.com/r/LocalLLaMA/comments/1916896/llm_comparisontest_confirm_leaderboard_big_news/),34B,HF,4-bit,~~200K~~ 4K,Alpaca,18/18 ✓,14/18,✓,✗,
10,[nontoxic-bagel-34b-v0.2](https://www.reddit.com/r/LocalLLaMA/comments/1916896/llm_comparisontest_confirm_leaderboard_big_news/),34B,HF,4-bit,~~200K~~ 4K,Alpaca,18/18 ✓,14/18,✓,✗,
11,[miquella-120b](https://www.reddit.com/r/LocalLLaMA/comments/1aix93e/llm_comparisontest_miqu_miqu_miqu_miquella_maid/),120B,GGUF,IQ3_XXS,~~32K~~ 4K,Mistral,18/18 ✓,13/18,✓,,
11,[sophosynthesis-70b-v1](https://www.reddit.com/r/LocalLLaMA/comments/185ff51/big_llm_comparisontest_3x_120b_12x_70b_2x_34b/),70B,EXL2,4.85bpw,4K,Vicuna 1.1,18/18 ✓,13/18,✓,✓,
12,[Mixtral_11Bx2_MoE_19B](https://www.reddit.com/r/LocalLLaMA/comments/1916896/llm_comparisontest_confirm_leaderboard_big_news/),2x11B,HF,—,~~200K~~ 4K,Alpaca,18/18 ✓,13/18,✗,✗,
13,[GodziLLa2-70B-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/185ff51/big_llm_comparisontest_3x_120b_12x_70b_2x_34b/),70B,GGUF,Q4_0,4K,Alpaca,18/18 ✓,12/18,✓,✓,
14,[miquliz-120b-v2.0-iMat.GGUF](https://www.reddit.com/r/LocalLLaMA/comments/1b5vp2e/llm_comparisontest_17_new_models_64_total_ranked/),120B,GGUF,IQ2_XS,~~32K~~ 4K,Mistral,18/18 ✓,11/18,✗,,
15,[Samantha-1.11-70B-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/185ff51/big_llm_comparisontest_3x_120b_12x_70b_2x_34b/),70B,GGUF,Q4_0,4K,Vicuna 1.1,18/18 ✓,10/18,✗,✗,
16,[miquella-120b](https://www.reddit.com/r/LocalLLaMA/comments/1aix93e/llm_comparisontest_miqu_miqu_miqu_miquella_maid/),120B,GGUF,Q2_K,~~32K~~ 4K,Mistral,17/18,17/18,✓,,
17,[MegaDolphin-120b-exl2](https://www.reddit.com/r/LocalLLaMA/comments/19d1fjp/llm_comparisontest_6_new_models_from_16b_to_120b/),120B,EXL2,3.0bpw,4K,ChatML,17/18,16/18,✓,,
17,[Airoboros-L2-70B-3.1.2-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/185ff51/big_llm_comparisontest_3x_120b_12x_70b_2x_34b/),70B,GGUF,Q4_K_M,4K,Llama 2 Chat,17/18,16/18,✓,✗,
18,[Midnight-Miqu-70B-v1.0-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/1b5vp2e/llm_comparisontest_17_new_models_64_total_ranked/),70B,GGUF,Q4_K_M,~~32K~~ 4K,Vicuna 1.1,17/18,16/18,✗,,
18,[Gemini Pro](https://www.reddit.com/r/LocalLLaMA/comments/18yp9u4/llm_comparisontest_api_edition_gpt4_vs_gemini_vs/),Gemini,API,,,,17/18,16/18,✗,✗,
19,[miquliz-120b-v2.0-i1-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/1b5vp2e/llm_comparisontest_17_new_models_64_total_ranked/),120B,GGUF,IQ1_S,~~32K~~ 4K,Mistral,17/18,15/18,✗,,
19,[Nous-Hermes-2-Mixtral-8x7B-DPO-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/1b5vp2e/llm_comparisontest_17_new_models_64_total_ranked/),8x7B,GGUF,Q4_K_M,~~32K~~ 4K,ChatML,17/18,15/18,✗,,
19,[SauerkrautLM-UNA-SOLAR-Instruct](https://www.reddit.com/r/LocalLLaMA/comments/1916896/llm_comparisontest_confirm_leaderboard_big_news/),11B,HF,—,4K,User-Ass.-Newlines,17/18,15/18,✗,✗,
19,[UNA-SOLAR-10.7B-Instruct-v1.0](https://www.reddit.com/r/LocalLLaMA/comments/1916896/llm_comparisontest_confirm_leaderboard_big_news/),11B,HF,—,4K,User-Ass.-Newlines,17/18,15/18,✗,✗,
20,[Senku-70B-Full-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/1b5vp2e/llm_comparisontest_17_new_models_64_total_ranked/),70B,GGUF,Q5_K_M,~~32K~~ 4K,ChatML,17/18,14/18,✓,,
21,[Rogue-Rose-103b-v0.2](https://www.reddit.com/r/LocalLLaMA/comments/18ft8f5/updated_llm_comparisontest_with_new_rp_model/),103B,EXL2,3.2bpw,4K,Rogue Rose,17/18,14/18,✗,✗,
21,[laserxtral](https://www.reddit.com/r/LocalLLaMA/comments/19d1fjp/llm_comparisontest_6_new_models_from_16b_to_120b/),4x7B,GGUF,Q6_K,8K,Alpaca,17/18,14/18,✗,,
21,[SOLAR-10.7B-Instruct-v1.0](https://www.reddit.com/r/LocalLLaMA/comments/1916896/llm_comparisontest_confirm_leaderboard_big_news/),11B,HF,—,4K,User-Ass.-Newlines,17/18,14/18,✗,✗,
22,[MiquMaid-v1-70B-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/1aix93e/llm_comparisontest_miqu_miqu_miqu_miquella_maid/),70B,GGUF,Q5_K_M,~~32K~~ 4K,Alpaca,17/18,13/18,✓,,
22,[miqu-1-70b](https://www.reddit.com/r/LocalLLaMA/comments/1aix93e/llm_comparisontest_miqu_miqu_miqu_miquella_maid/),70B,GGUF,Q5_K_M,32K,Mistral,17/18,13/18,✗,,
22,[miqu-1-70b](https://www.reddit.com/r/LocalLLaMA/comments/1aix93e/llm_comparisontest_miqu_miqu_miqu_miquella_maid/),70B,GGUF,Q4_K_M,~~32K~~ 4K,Mistral,17/18,13/18,✗,,
22,[MIstral-QUantized-70b_Miqu-1-70b-iMat.GGUF](https://www.reddit.com/r/LocalLLaMA/comments/1aix93e/llm_comparisontest_miqu_miqu_miqu_miquella_maid/),70B,GGUF,Q4_K_S,~~32K~~ 4K,Mistral,17/18,13/18,✗,,
23,[Midnight-Rose-70B-v2.0.3-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/1b5vp2e/llm_comparisontest_17_new_models_64_total_ranked/),70B,GGUF,IQ3_XXS,4K,Vicuna 1.1,17/18,11/18,✓,,
24,[GPT-3.5 Turbo Instruct](https://www.reddit.com/r/LocalLLaMA/comments/185ff51/big_llm_comparisontest_3x_120b_12x_70b_2x_34b/),GPT-3.5,API,,,,17/18,11/18,✗,✗,
24,[mistral-small](https://www.reddit.com/r/LocalLLaMA/comments/18yp9u4/llm_comparisontest_api_edition_gpt4_vs_gemini_vs/),Mistral,API,,,,17/18,11/18,✗,✗,
25,[WestLake-7B-v2](https://www.reddit.com/r/LocalLLaMA/comments/1b5vp2e/llm_comparisontest_17_new_models_64_total_ranked/),7B,HF,,4K,ChatML,17/18,10/18,✗,,
25,[SOLARC-M-10.7B](https://www.reddit.com/r/LocalLLaMA/comments/1916896/llm_comparisontest_confirm_leaderboard_big_news/),11B,HF,—,4K,User-Ass.-Newlines,17/18,10/18,✗,✗,
26 🆕,[claude-3-sonnet-20240229](https://www.anthropic.com/claude),Claude 3 Sonnet,API,,,,17/18,9/18,✗,✓,
26,[Synthia-MoE-v3-Mixtral-8x7B](https://www.reddit.com/r/LocalLLaMA/comments/18gz54r/llm_comparisontest_mixtral8x7b_mistral_decilm/),8x7B,HF,4-bit,~~32K~~ 4K,~~Synthia~~ Llama 2 Chat,17/18,9/18,✗,✗,
27,[Nous-Hermes-2-Mixtral-8x7B-SFT](https://www.reddit.com/r/LocalLLaMA/comments/1916896/llm_comparisontest_confirm_leaderboard_big_news/),8x7B,HF,4-bit,32K,ChatML,17/18,5/18,✓,,
28,[miqu-1-70b-exl2](https://www.reddit.com/r/LocalLLaMA/comments/1aix93e/llm_comparisontest_miqu_miqu_miqu_miquella_maid/),70B,EXL2,3.0bpw,~~32K~~ 4K,Mistral,16/18,16/18,✗,,
29,[SOLAR-10.7B-Instruct-v1.0-uncensored](https://www.reddit.com/r/LocalLLaMA/comments/1916896/llm_comparisontest_confirm_leaderboard_big_news/),11B,HF,—,4K,User-Ass.-Newlines,16/18,15/18,✗,✗,
30,[bagel-dpo-8x7b-v0.2](https://www.reddit.com/r/LocalLLaMA/comments/1916896/llm_comparisontest_confirm_leaderboard_big_news/),8x7B,HF,4-bit,~~200K~~ 4K,Alpaca,16/18,14/18,✓,✗,
31,[dolphin-2.2-70B-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/185ff51/big_llm_comparisontest_3x_120b_12x_70b_2x_34b/),70B,GGUF,Q4_0,4K,ChatML,16/18,14/18,✗,✓,
31,[miqu-1-103b-i1-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/1b5vp2e/llm_comparisontest_17_new_models_64_total_ranked/),103B,GGUF,IQ2_XS,~~32K~~ 4K,Mistral,16/18,14/18,✗,,
31,[WestLake-7B-v2-laser](https://www.reddit.com/r/LocalLLaMA/comments/1b5vp2e/llm_comparisontest_17_new_models_64_total_ranked/),7B,HF,,4K,ChatML,16/18,14/18,✗,,
32,[Beyonder-4x7B-v2-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/19d1fjp/llm_comparisontest_6_new_models_from_16b_to_120b/),4x7B,GGUF,Q8_0,8K,ChatML,16/18,13/18,✓,,
33,[mistral-ft-optimized-1218](https://www.reddit.com/r/LocalLLaMA/comments/18u122l/llm_comparisontest_ranking_updated_with_10_new/),7B,HF,—,~~32K~~ 8K,Alpaca,16/18,13/18,✗,✓,
34,[SauerkrautLM-SOLAR-Instruct](https://www.reddit.com/r/LocalLLaMA/comments/1916896/llm_comparisontest_confirm_leaderboard_big_news/),11B,HF,—,4K,User-Ass.-Newlines,16/18,13/18,✗,✗,
34,[OpenHermes-2.5-Mistral-7B](https://www.reddit.com/r/LocalLLaMA/comments/18u122l/llm_comparisontest_ranking_updated_with_10_new/),7B,HF,—,~~32K~~ 8K,ChatML,16/18,13/18,✗,✗,
35,[Nous-Hermes-2-Mixtral-8x7B-SFT-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/1b5vp2e/llm_comparisontest_17_new_models_64_total_ranked/),8x7B,GGUF,Q4_K_M,~~32K~~ 4K,ChatML,16/18,12/18,✓,,
36,[SOLARC-MOE-10.7Bx4](https://www.reddit.com/r/LocalLLaMA/comments/1916896/llm_comparisontest_confirm_leaderboard_big_news/),4x11B,HF,4-bit,4K,User-Ass.-Newlines,16/18,12/18,✗,✗,
36,[Nous-Hermes-2-SOLAR-10.7B](https://www.reddit.com/r/LocalLLaMA/comments/1916896/llm_comparisontest_confirm_leaderboard_big_news/),11B,HF,—,4K,User-Ass.-Newlines,16/18,12/18,✗,✗,
36,[Sakura-SOLAR-Instruct](https://www.reddit.com/r/LocalLLaMA/comments/1916896/llm_comparisontest_confirm_leaderboard_big_news/),11B,HF,—,4K,User-Ass.-Newlines,16/18,12/18,✗,✗,
36,[Mistral-7B-Instruct-v0.2](https://www.reddit.com/r/LocalLLaMA/comments/18gz54r/llm_comparisontest_mixtral8x7b_mistral_decilm/),7B,HF,—,32K,Mistral,16/18,12/18,✗,✗,
37,[DeciLM-7B-instruct](https://www.reddit.com/r/LocalLLaMA/comments/18gz54r/llm_comparisontest_mixtral8x7b_mistral_decilm/),7B,HF,—,32K,Mistral,16/18,11/18,✗,✗,
37,[Marcoroni-7B-v3](https://www.reddit.com/r/LocalLLaMA/comments/18u122l/llm_comparisontest_ranking_updated_with_10_new/),7B,HF,—,~~32K~~ 8K,Alpaca,16/18,11/18,✗,✗,
37,[SauerkrautLM-7b-HerO](https://www.reddit.com/r/LocalLLaMA/comments/18u122l/llm_comparisontest_ranking_updated_with_10_new/),7B,HF,—,~~32K~~ 8K,ChatML,16/18,11/18,✗,✗,
38,[mistral-medium](https://www.reddit.com/r/LocalLLaMA/comments/18yp9u4/llm_comparisontest_api_edition_gpt4_vs_gemini_vs/),Mistral,API,,,,15/18,17/18,✗,✗,
39,[mistral-ft-optimized-1227](https://www.reddit.com/r/LocalLLaMA/comments/18u122l/llm_comparisontest_ranking_updated_with_10_new/),7B,HF,—,~~32K~~ 8K,Alpaca,15/18,14/18,✗,✓,
40,[GPT-3.5 Turbo](https://www.reddit.com/r/LocalLLaMA/comments/185ff51/big_llm_comparisontest_3x_120b_12x_70b_2x_34b/),GPT-3.5,API,,,,15/18,14/18,✗,✗,
41,[dolphin-2.5-mixtral-8x7b](https://www.reddit.com/r/LocalLLaMA/comments/18gz54r/llm_comparisontest_mixtral8x7b_mistral_decilm/),8x7B,HF,4-bit,~~32K~~ 4K,ChatML,15/18,13/18,✗,✓,
42,[Starling-LM-7B-alpha](https://www.reddit.com/r/LocalLLaMA/comments/18u122l/llm_comparisontest_ranking_updated_with_10_new/),7B,HF,—,8K,OpenChat (GPT4 Correct),15/18,13/18,✗,✗,
43,[dolphin-2.6-mistral-7b-dpo](https://www.reddit.com/r/LocalLLaMA/comments/18w9hak/llm_comparisontest_brand_new_models_for_2024/),7B,HF,—,16K,ChatML,15/18,12/18,✗,✗,
44,[Mixtral_7Bx2_MoE](https://www.reddit.com/r/LocalLLaMA/comments/19d1fjp/llm_comparisontest_6_new_models_from_16b_to_120b/),2x7B,HF,—,8K,ChatML,15/18,11/18,✓,,
45,[Nous-Hermes-2-Mixtral-8x7B-DPO](https://www.reddit.com/r/LocalLLaMA/comments/1916896/llm_comparisontest_confirm_leaderboard_big_news/),8x7B,HF,4-bit,32K,ChatML,15/18,10/18,✓,,
46,[sparsetral-16x7B-v2](https://www.reddit.com/r/LocalLLaMA/comments/1b5vp2e/llm_comparisontest_17_new_models_64_total_ranked/),16x7B,HF,,4K,ChatML,15/18,7/18,✓,,
47,[openchat-3.5-1210](https://www.reddit.com/r/LocalLLaMA/comments/18u122l/llm_comparisontest_ranking_updated_with_10_new/),7B,HF,—,8K,OpenChat (GPT4 Correct),15/18,7/18,✗,✗,
48,[dolphin-2.7-mixtral-8x7b](https://www.reddit.com/r/LocalLLaMA/comments/18w9hak/llm_comparisontest_brand_new_models_for_2024/),8x7B,HF,4-bit,32K,ChatML,15/18,6/18,✗,✗,
49,[dolphin-2.6-mixtral-8x7b](https://www.reddit.com/r/LocalLLaMA/comments/18u122l/llm_comparisontest_ranking_updated_with_10_new/),8x7B,HF,4-bit,~~32K~~ 16K,ChatML,14/18,12/18,✗,✗,
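
The header declares 11 columns, but the data rows added in this commit end with a trailing comma, so a strict parser would see 12 fields per row. Below is a minimal sketch of one way to read the file tolerantly with Python's standard `csv` module. The names `load_rows`, `parse_score`, `SAMPLE`, and `LINK` and the extra `URL` and `New` keys are hypothetical choices for illustration, not part of the repository, and the sample is two rows copied from the table above.

```python
import csv
import io
import re

# Two rows copied from the table above; the real file would be read from disk.
SAMPLE = """Rank,Model,Size,Format,Quant,Context,Prompt,1st Score,2nd Score,OK,+/-
1 🆕,[claude-3-opus-20240229](https://www.anthropic.com/claude),Claude 3 Opus,API,,,,18/18 ✓,18/18 ✓,✗,✓,
3,[lzlv_70B-GGUF](https://www.reddit.com/r/LocalLLaMA/comments/185ff51/big_llm_comparisontest_3x_120b_12x_70b_2x_34b/),70B,GGUF,Q4_0,4K,Vicuna 1.1,18/18 ✓,17/18,✓,✓,
"""

# Matches a Markdown link of the form [name](url), as used in the Model column.
LINK = re.compile(r"\[(?P<name>[^\]]+)\]\((?P<url>[^)]+)\)")


def parse_score(cell):
    """Turn '18/18 ✓' or '17/18' into a (correct, total) tuple."""
    correct, total = cell.replace("✓", "").strip().split("/")
    return int(correct), int(total)


def load_rows(text):
    """Parse the leaderboard CSV text into a list of dicts (hypothetical helper)."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = []
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        # Drop the extra empty field produced by the trailing comma.
        record = dict(zip(header, fields[:len(header)]))
        match = LINK.fullmatch(record["Model"])
        if match:
            record["Model"], record["URL"] = match["name"], match["url"]
        # The 🆕 marker is stored inside the Rank cell; split it out.
        record["New"] = "🆕" in record["Rank"]
        record["Rank"] = int(record["Rank"].replace("🆕", "").strip())
        record["1st Score"] = parse_score(record["1st Score"])
        record["2nd Score"] = parse_score(record["2nd Score"])
        rows.append(record)
    return rows


rows = load_rows(SAMPLE)
```

Truncating each row to the header length keeps the sketch working whether or not the trailing comma is later removed from the file.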
