LLM Reasoning

Survey

Reasoning

DeepSeek-R1-Lite-Preview is now live: unleashing supercharged reasoning power 𝕏
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games, arXiv, 2411.13543, arxiv, pdf, citation: -1

Davide Paglieri, Bartłomiej Cupiał, Samuel Coward, ..., Jack Parker-Holder, Tim Rocktäschel
🌟 Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding, arXiv, 2411.04282, arxiv, pdf, citation: -1

Haolin Chen, Yihao Feng, Zuxin Liu, ..., Caiming Xiong, Huan Wang · (LaTRO - SalesforceAIResearch)
Large Language Models Can Self-Improve in Long-context Reasoning, arXiv, 2411.08147, arxiv, pdf, citation: -1

Siheng Li, Cheng Yang, Zesen Cheng, ..., Yujiu Yang, Wai Lam
🌟 Combining Induction and Transduction for Abstract Reasoning, arXiv, 2411.02272, arxiv, pdf, citation: -1

Wen-Ding Li, Keya Hu, Carter Larsen, ..., Yewen Pu, Kevin Ellis · (𝕏)
🌟 The Surprising Effectiveness of Test-Time Training for Abstract Reasoning

· (𝕏) · (marc - ekinakyurek)
Can Language Models Learn to Skip Steps?, arXiv, 2411.01855, arxiv, pdf, citation: -1

Tengxiao Liu, Qipeng Guo, Xiangkun Hu, ..., Xipeng Qiu, Zheng Zhang
SocialGPT: Prompting LLMs for Social Relation Reasoning via Greedy Segment Optimization, arXiv, 2410.21411, arxiv, pdf, citation: -1

Wanhua Li, Zibin Meng, Jiawei Zhou, ..., Chuang Gan, Hanspeter Pfister · (SocialGPT - Mengzibin)
A Pointer Network-based Approach for Joint Extraction and Detection of Multi-Label Multi-Class Intents, arXiv, 2410.22476, arxiv, pdf, citation: -1

Ankan Mullick, Sombit Bose, Abhilash Nandy, ..., Gajula Sai Chaitanya, Pawan Goyal
Improve Vision Language Model Chain-of-thought Reasoning, arXiv, 2410.16198, arxiv, pdf, citation: -1

Ruohong Zhang, Bowen Zhang, Yanghao Li, ..., Ruoming Pang, Yiming Yang

· (LLaVA-Reasoner-DPO - RifleZhang)

Math Reasoning

FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI, arXiv, 2411.04872, arxiv, pdf, citation: -1

Elliot Glazer, Ege Erdil, Tamay Besiroglu, ..., Tetiana Grechuk, Shreepranav Varma Enugandla · (epochai) · (𝕏)
Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning, arXiv, 2410.22304, arxiv, pdf, citation: -1

Yihe Deng, Paul Mineiro
Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics, arXiv, 2410.21272, arxiv, pdf, citation: -1

Yaniv Nikankin, Anja Reusch, Aaron Mueller, ..., Yonatan Belinkov · (x)
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models, arXiv, 2410.07985, arxiv, pdf, citation: -1

Bofei Gao, Feifan Song, Zhe Yang, ..., Tianyu Liu, Baobao Chang · (Omni-MATH - KbsdJames) · (omni-math.github) · (huggingface) · (huggingface)

O1 Reasoning

Exploring OpenAI O1 Model Replication
Open-O1 - Open-Source-O1

A Model Matching Proprietary Power with Open-Source Innovation
Patience Is The Key to Large Language Model Reasoning, arXiv, 2411.13082, arxiv, pdf, citation: -1

Yijiong Yu

· (huggingface)
🌟 O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?, arXiv, 2411.16489, arxiv, pdf, citation: -1

Zhen Huang, Haoyang Zou, Xuefeng Li, ..., Weizhe Yuan, Pengfei Liu · (O1-Journey - GAIR-NLP)
Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision, arXiv, 2411.16579, arxiv, pdf, citation: -1

Zhiheng Xi, Dingwen Yang, Jixuan Huang, ..., Xuanjing Huang, Yu-Gang Jiang · (mathcritique.github)
QwQ: Reflect Deeply on the Boundaries of the Unknown

· (huggingface)
Skywork o1 Open model series 🤗
🌟 O1-Journey - GAIR-NLP
Beyond Decoding: Meta-Generation Algorithms for Large Language Models

· (simons.berkeley)
🌟 From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models, arXiv, 2406.16838, arxiv, pdf, citation: -1

Sean Welleck, Amanda Bertsch, Matthew Finlayson, ..., Ilia Kulikov, Zaid Harchaoui · (cmu-l3.github)
Honorable mentions to Nous Forge Reasoning API and Fireworks f1; DeepSeek appears to have made the first convincing attempt
DeepSeek-R1-Lite-Preview is now live: unleashing supercharged reasoning power! 𝕏

· (t)
🌟 Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions, arXiv, 2411.14405, arxiv, pdf, citation: -1

Yu Zhao, Huifeng Yin, Bo Zeng, ..., Weihua Luo, Kaifu Zhang · (Marco-o1 - AIDC-AI)
🌟 entropix - xjdr-alt
Thinking-Claude - richards199999
Speculations on Test-Time Scaling (o1) 🎬
Tess-R1 is designed with test-time compute in mind and can produce Chain-of-Thought (CoT) reasoning before producing the final output. 🤗
LLaMA-O1 - SimpleBerry

Open Large Reasoning Model Frameworks For Training, Inference and Evaluation With PyTorch and HuggingFace · (qbitai)
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model, arXiv, 2410.13639, arxiv, pdf, citation: -1

Siwei Wu, Zhongyuan Peng, Xinrun Du, ..., Chenghua Lin, J. H. Liu
Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces, arXiv, 2410.09918, arxiv, pdf, citation: -1

DiJia Su, Sainbayar Sukhbaatar, Michael Rabbat, ..., Yuandong Tian, Qinqing Zheng
O1-Journey - GAIR-NLP

A Strategic Progress Report

Disentanglement

Disentangling Memory and Reasoning Ability in Large Language Models, arXiv, 2411.13504, arxiv, pdf, citation: -1

Mingyu Jin, Weidi Luo, Sitao Cheng, ..., William Yang Wang, Yongfeng Zhang · (Disentangling-Memory-and-Reasoning - MingyuJ666)

Self Correction

Knowledge

Context Learning

Chain Of Thought

A Theoretical Understanding of Chain-of-Thought: Coherent Reasoning and Error-Aware Demonstration, arXiv, 2410.16540, arxiv, pdf, citation: -1

Yingqian Cui, Pengfei He, Xianfeng Tang, ..., Jiliang Tang, Yue Xing · (𝕏)
Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse, arXiv, 2410.21333, arxiv, pdf, citation: -1

Ryan Liu, Jiayi Geng, Addison J. Wu, ..., Tania Lombrozo, Thomas L. Griffiths

Prompt

Prompt Design at Character.AI
Does Prompt Formatting Have Any Impact on LLM Performance?, arXiv, 2411.10541, arxiv, pdf, citation: -1

Jia He, Mukund Rungta, David Koleczek, ..., Franklin X Wang, Sadid Hasan
automatic prompt optimization algorithms 𝕏
Automatic Prompt Optimization
MacOS 15.1 Apple Intelligence Prompt Templates 𝕏
use RL to automatically improve our prompts 𝕏
ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs, arXiv, 2410.12405, arxiv, pdf, citation: -1

Jingming Zhuo, Songyang Zhang, Xinyu Fang, ..., Dahua Lin, Kai Chen · (ProSA - open-compass)

Projects

prompt-poet - character-ai
V0-system-prompt - 2-fly-4-ai

· (reddit)
steiner-preview: Reasoning models trained on synthetic data using reinforcement learning. 🤗

Misc

[Exclusive analysis by the PKU Alignment Team: OpenAI o1 ushers in a new reinforcement learning paradigm for the "post-training" era] 🎬
The Problem with Reasoners

Planning

Revealing the Barriers of Language Agents in Planning, arXiv, 2410.12409, arxiv, pdf, citation: -1

Jian Xie, Kexun Zhang, Jiangjie Chen, ..., Lei Li, Yanghua Xiao
