awesome-reasoning


Adding reasoning to your AI? Take these resources; they may help you on your way.


Datasets

AGI/CAUSALITY/FORMAL GRAMMAR
Deepmind Chomsky Hierarchy Problems crafted for FSM/PDA/TM [1]
automata a neurallambda tool to generate from grammars [1]
im a strange dataset Tough for LLMs because of self-references. [1]
DiagGSM8k NL Reasoning Benchmark [1]
CLadder Causal reasoning [1]
Cause-Effect Pairs 108 datasets of 2-variable dynamics (not NL) [1]
MNLI Entailment sentence parsing + entailment [1]
AGENT/TOOL
THUDM AgentInstruct long form dialogs [1]
WANG AgentInstruct gpt3 synthesized instructions [1]
KnowLM Tool prompt + tool call + answer [1]
Glaive Tool Usage sys prompt says tools + prompt + answer [1]
opentoolformer retrieval prompt + tool call [1]
CODE
rosetta same program, many diff languages [1]
EvoEval Tool Use 100 prompt + code + tests [1]
MATH/LOGIC
gsm8k Grade School Math 8k [1]
MetaMath one-shot math [1]
MetaMathFewShot few-shot math [1]
MathPile 9B tok from filtered internet [1]
LogiQA NL multi choice, requires abstraction [1]
Logic-LM a model combining auto theorem provers and llms [1]
Coq Facts 270k coq theorem prover programs [1]
NATURAL LANGUAGE
UltraInteract_sft GPT generated iterated reasoning dialogs [1]
MUD videogames (various; could be training data)
Winogrande ambiguous sentences, fill in 1 word [1]
Winograd_wsc ambiguous sentences, choose the right word [1]
Contradiction 2 phrases, do they contradict [1]
Recognizing Textual Entailment 2 phrases, do they entail each other [1]
Textual Entailment Pool more entailment [1]
Answer Validation 2 phrases, does the answer solve question [1]
Monotonicity Entailment x is true, does y follow [1]
entailment passage, question -> T/F [1]
Commonsense QA multi choice QA [1]
GLUE several datasets [1]
custom multi-hop use wikipedia's graph of articles
TOY PROBLEMS
Big Bench Hard 23 challenges (only 6k datapoints) [1]
logical entailment dataset logic strings by deepmind [1]
logical entailment dataset code (generate it yourself) [1]
FSM Game generate strings according to grammar
Adaptive Grammar grammar rule might change
String/Graph Rewriting string_rewriting.py
LibraryOfLogic generate NL from multiple games [1]
AB-XY Game
word ladder
parser
longest common subsequence
string reversal
wisconsin card sorting
anagram
palindrome
permutation composition
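
Several of the toy problems above can be generated on the fly instead of downloaded. As a minimal sketch of the "FSM Game" idea (generate strings according to a grammar), the snippet below random-walks a small finite-state machine and labels strings as accepted or rejected. The machine, its state names, and the helper names `generate`, `accepts`, and `make_example` are all hypothetical, chosen only for illustration; they do not come from any dataset listed here.

```python
import random

# Hypothetical toy FSM for the regular language (ab)+c.
# States and transitions are illustrative assumptions, not from a listed dataset.
TRANSITIONS = {
    "q0": {"a": "q1"},
    "q1": {"b": "q2"},
    "q2": {"a": "q1", "c": "accept"},
}
START, ACCEPT = "q0", "accept"


def generate(rng, max_len=20):
    """Random-walk the FSM from START to ACCEPT and return the emitted string."""
    state, out = START, []
    while state != ACCEPT:
        options = TRANSITIONS[state]
        # Once the string is long enough, take the exit symbol when available.
        if len(out) >= max_len and "c" in options:
            symbol = "c"
        else:
            symbol = rng.choice(sorted(options))
        out.append(symbol)
        state = options[symbol]
    return "".join(out)


def accepts(s):
    """Return True if the FSM accepts the string s."""
    state = START
    for ch in s:
        state = TRANSITIONS.get(state, {}).get(ch)
        if state is None:
            return False
    return state == ACCEPT


def make_example(rng):
    """Return (string, label); about half the strings get one symbol resampled."""
    s = generate(rng)
    if rng.random() < 0.5:
        i = rng.randrange(len(s))
        s = s[:i] + rng.choice("abc") + s[i + 1:]
    # The label is computed after corruption, so it is always correct.
    return s, accepts(s)
```

Calling `make_example(random.Random(0))` repeatedly yields labeled strings; swapping in a different `TRANSITIONS` table changes the grammar without touching the rest of the code.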



Algorithms

TOKEN AUGMENTED REASONING
Reasoning tokens Self-Reasoning Tokens, teaching models to think ahead [1]
Quiet-STaR LLMs Can Teach Themselves to Think Before Speaking [1]
Multi-token Prediction Multi-token prediction is favorable for the development of induction heads and algorithmic reasoning capabilities https://arxiv.org/abs/2404.19737



Prompt Engineering

INDIRECT REASONING (IR)
Contrapositive and Contradiction for Automated Reasoning use logic of contrapositives and contradictions for factual reasoning and mathematical proofs https://arxiv.org/pdf/2402.03667
DIRECT REASONING (DR)
Graph of Thoughts (GoT) Model the information generated by an LLM as an arbitrary graph https://arxiv.org/abs/2308.09687
Self-Consistency Self-consistency leverages the intuition that a complex reasoning problem typically admits multiple different ways of thinking leading to its unique correct answer https://arxiv.org/abs/2203.11171
Chain of Thoughts chain of thought -- a series of intermediate reasoning steps -- significantly improves the ability of large language models to perform complex reasoning https://arxiv.org/abs/2201.11903
Chain of thoughts without prompting CoT reasoning paths can be elicited from pre-trained LLMs by simply altering the decoding process https://arxiv.org/abs/2402.10200
Iterative Reasoning Preference Optimization Iterated DPO, but for CoT, repeated until performance saturates on reasoning tasks https://arxiv.org/pdf/2404.19733
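
The Self-Consistency entry above boils down to sampling several reasoning paths and keeping the final answer most of them agree on. A minimal sketch of that voting step follows. It assumes a hypothetical `sample_answer` callable that runs one sampled chain of thought and returns only the extracted final answer; how you call your model and parse its output is left out.

```python
from collections import Counter


def self_consistency(sample_answer, question, n_samples=10):
    """Sample n_samples answers and return (majority answer, vote share).

    sample_answer is a hypothetical callable: question -> final answer string.
    """
    answers = [sample_answer(question) for _ in range(n_samples)]
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / n_samples
```

The vote share can double as a rough confidence signal: a low share suggests the sampled reasoning paths disagree and the question may deserve more samples or a different prompt.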