LLM Reasoning

Survey

Reasoning

  • DeepSeek-R1-Lite-Preview is now live: unleashing supercharged reasoning power 𝕏

  • BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games, arXiv, 2411.13543, arxiv, pdf, citation: -1

    Davide Paglieri, Bartłomiej Cupiał, Samuel Coward, ..., Jack Parker-Holder, Tim Rocktäschel

  • 🌟 Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding, arXiv, 2411.04282, arxiv, pdf, citation: -1

    Haolin Chen, Yihao Feng, Zuxin Liu, ..., Caiming Xiong, Huan Wang · (LaTRO - SalesforceAIResearch)

  • Large Language Models Can Self-Improve in Long-context Reasoning, arXiv, 2411.08147, arxiv, pdf, citation: -1

    Siheng Li, Cheng Yang, Zesen Cheng, ..., Yujiu Yang, Wai Lam

  • 🌟 Combining Induction and Transduction for Abstract Reasoning, arXiv, 2411.02272, arxiv, pdf, citation: -1

    Wen-Ding Li, Keya Hu, Carter Larsen, ..., Yewen Pu, Kevin Ellis · (𝕏)

  • 🌟 The Surprising Effectiveness of Test-Time Training for Abstract Reasoning

    · (𝕏) · (marc - ekinakyurek)

  • Can Language Models Learn to Skip Steps?, arXiv, 2411.01855, arxiv, pdf, citation: -1

    Tengxiao Liu, Qipeng Guo, Xiangkun Hu, ..., Xipeng Qiu, Zheng Zhang

  • SocialGPT: Prompting LLMs for Social Relation Reasoning via Greedy Segment Optimization, arXiv, 2410.21411, arxiv, pdf, citation: -1

    Wanhua Li, Zibin Meng, Jiawei Zhou, ..., Chuang Gan, Hanspeter Pfister · (SocialGPT - Mengzibin)

  • A Pointer Network-based Approach for Joint Extraction and Detection of Multi-Label Multi-Class Intents, arXiv, 2410.22476, arxiv, pdf, citation: -1

    Ankan Mullick, Sombit Bose, Abhilash Nandy, ..., Gajula Sai Chaitanya, Pawan Goyal

  • Improve Vision Language Model Chain-of-thought Reasoning, arXiv, 2410.16198, arxiv, pdf, citation: -1

    Ruohong Zhang, Bowen Zhang, Yanghao Li, ..., Ruoming Pang, Yiming Yang

    · (LLaVA-Reasoner-DPO - RifleZhang)

Math Reasoning

  • FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI, arXiv, 2411.04872, arxiv, pdf, citation: -1

    Elliot Glazer, Ege Erdil, Tamay Besiroglu, ..., Tetiana Grechuk, Shreepranav Varma Enugandla · (epochai) · (𝕏)

  • Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning, arXiv, 2410.22304, arxiv, pdf, citation: -1

    Yihe Deng, Paul Mineiro

  • Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics, arXiv, 2410.21272, arxiv, pdf, citation: -1

    Yaniv Nikankin, Anja Reusch, Aaron Mueller, ..., Yonatan Belinkov · (𝕏)

  • Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models, arXiv, 2410.07985, arxiv, pdf, citation: -1

    Bofei Gao, Feifan Song, Zhe Yang, ..., Tianyu Liu, Baobao Chang · (Omni-MATH - KbsdJames) · (omni-math.github) · (huggingface) · (huggingface)

O1 Reasoning

Disentanglement

  • Disentangling Memory and Reasoning Ability in Large Language Models, arXiv, 2411.13504, arxiv, pdf, citation: -1

    Mingyu Jin, Weidi Luo, Sitao Cheng, ..., William Yang Wang, Yongfeng Zhang · (Disentangling-Memory-and-Reasoning - MingyuJ666)

Self Correction

Knowledge

Context Learning

Chain Of Thought

  • A Theoretical Understanding of Chain-of-Thought: Coherent Reasoning and Error-Aware Demonstration, arXiv, 2410.16540, arxiv, pdf, citation: -1

    Yingqian Cui, Pengfei He, Xianfeng Tang, ..., Jiliang Tang, Yue Xing · (𝕏)

  • Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse, arXiv, 2410.21333, arxiv, pdf, citation: -1

    Ryan Liu, Jiayi Geng, Addison J. Wu, ..., Tania Lombrozo, Thomas L. Griffiths

Prompt

Projects

Misc

Planning

  • Revealing the Barriers of Language Agents in Planning, arXiv, 2410.12409, arxiv, pdf, citation: -1

    Jian Xie, Kexun Zhang, Jiangjie Chen, ..., Lei Li, Yanghua Xiao