-
DeepSeek-R1-Lite-Preview is now live: unleashing supercharged reasoning power 𝕏
-
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games,
arXiv, 2411.13543
, arxiv, pdf, citation: -1
Davide Paglieri, Bartłomiej Cupiał, Samuel Coward, ..., Jack Parker-Holder, Tim Rocktäschel
-
🌟 Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding,
arXiv, 2411.04282
, arxiv, pdf, citation: -1
Haolin Chen, Yihao Feng, Zuxin Liu, ..., Caiming Xiong, Huan Wang · (LaTRO - SalesforceAIResearch)
-
Large Language Models Can Self-Improve in Long-context Reasoning,
arXiv, 2411.08147
, arxiv, pdf, citation: -1
Siheng Li, Cheng Yang, Zesen Cheng, ..., Yujiu Yang, Wai Lam
-
🌟 Combining Induction and Transduction for Abstract Reasoning,
arXiv, 2411.02272
, arxiv, pdf, citation: -1
Wen-Ding Li, Keya Hu, Carter Larsen, ..., Yewen Pu, Kevin Ellis · (𝕏)
-
🌟 The Surprising Effectiveness of Test-Time Training for Abstract Reasoning
-
Can Language Models Learn to Skip Steps?,
arXiv, 2411.01855
, arxiv, pdf, citation: -1
Tengxiao Liu, Qipeng Guo, Xiangkun Hu, ..., Xipeng Qiu, Zheng Zhang
-
SocialGPT: Prompting LLMs for Social Relation Reasoning via Greedy Segment Optimization,
arXiv, 2410.21411
, arxiv, pdf, citation: -1
Wanhua Li, Zibin Meng, Jiawei Zhou, ..., Chuang Gan, Hanspeter Pfister · (SocialGPT - Mengzibin)
-
A Pointer Network-based Approach for Joint Extraction and Detection of Multi-Label Multi-Class Intents,
arXiv, 2410.22476
, arxiv, pdf, citation: -1
Ankan Mullick, Sombit Bose, Abhilash Nandy, ..., Gajula Sai Chaitanya, Pawan Goyal
-
Improve Vision Language Model Chain-of-thought Reasoning,
arXiv, 2410.16198
, arxiv, pdf, citation: -1
Ruohong Zhang, Bowen Zhang, Yanghao Li, ..., Ruoming Pang, Yiming Yang
· (LLaVA-Reasoner-DPO - RifleZhang)
-
FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI,
arXiv, 2411.04872
, arxiv, pdf, citation: -1
Elliot Glazer, Ege Erdil, Tamay Besiroglu, ..., Tetiana Grechuk, Shreepranav Varma Enugandla · (epochai) · (𝕏)
-
Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning,
arXiv, 2410.22304
, arxiv, pdf, citation: -1
Yihe Deng, Paul Mineiro
-
Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics,
arXiv, 2410.21272
, arxiv, pdf, citation: -1
Yaniv Nikankin, Anja Reusch, Aaron Mueller, ..., Yonatan Belinkov · (𝕏)
-
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models,
arXiv, 2410.07985
, arxiv, pdf, citation: -1
Bofei Gao, Feifan Song, Zhe Yang, ..., Tianyu Liu, Baobao Chang · (Omni-MATH - KbsdJames) · (omni-math.github) · (huggingface) · (huggingface)
-
Open-O1 - Open-Source-O1
A Model Matching Proprietary Power with Open-Source Innovation
-
Patience Is The Key to Large Language Model Reasoning,
arXiv, 2411.13082
, arxiv, pdf, citation: -1
Yijiong Yu
· (huggingface)
-
🌟 O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?,
arXiv, 2411.16489
, arxiv, pdf, citation: -1
Zhen Huang, Haoyang Zou, Xuefeng Li, ..., Weizhe Yuan, Pengfei Liu · (O1-Journey - GAIR-NLP)
-
Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision,
arXiv, 2411.16579
, arxiv, pdf, citation: -1
Zhiheng Xi, Dingwen Yang, Jixuan Huang, ..., Xuanjing Huang, Yu-Gang Jiang · (mathcritique.github)
-
QwQ: Reflect Deeply on the Boundaries of the Unknown
· (huggingface)
-
🌟 O1-Journey - GAIR-NLP
-
Beyond Decoding: Meta-Generation Algorithms for Large Language Models
· (simons.berkeley)
-
🌟 From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models,
arXiv, 2406.16838
, arxiv, pdf, citation: -1
Sean Welleck, Amanda Bertsch, Matthew Finlayson, ..., Ilia Kulikov, Zaid Harchaoui · (cmu-l3.github)
-
🌟 Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions,
arXiv, 2411.14405
, arxiv, pdf, citation: -1
Yu Zhao, Huifeng Yin, Bo Zeng, ..., Weihua Luo, Kaifu Zhang · (Marco-o1 - AIDC-AI)
-
🌟 entropix - xjdr-alt
-
Thinking-Claude - richards199999
-
LLaMA-O1 - SimpleBerry
Open Large Reasoning Model Frameworks For Training, Inference and Evaluation With PyTorch and HuggingFace · (qbitai)
-
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model,
arXiv, 2410.13639
, arxiv, pdf, citation: -1
Siwei Wu, Zhongyuan Peng, Xinrun Du, ..., Chenghua Lin, J. H. Liu
-
Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces,
arXiv, 2410.09918
, arxiv, pdf, citation: -1
DiJia Su, Sainbayar Sukhbaatar, Michael Rabbat, ..., Yuandong Tian, Qinqing Zheng
-
O1-Journey - GAIR-NLP
A Strategic Progress Report
-
Disentangling Memory and Reasoning Ability in Large Language Models,
arXiv, 2411.13504
, arxiv, pdf, citation: -1
Mingyu Jin, Weidi Luo, Sitao Cheng, ..., William Yang Wang, Yongfeng Zhang · (Disentangling-Memory-and-Reasoning - MingyuJ666)
-
A Theoretical Understanding of Chain-of-Thought: Coherent Reasoning and Error-Aware Demonstration,
arXiv, 2410.16540
, arxiv, pdf, citation: -1
Yingqian Cui, Pengfei He, Xianfeng Tang, ..., Jiliang Tang, Yue Xing · (𝕏)
-
Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse,
arXiv, 2410.21333
, arxiv, pdf, citation: -1
Ryan Liu, Jiayi Geng, Addison J. Wu, ..., Tania Lombrozo, Thomas L. Griffiths
-
Does Prompt Formatting Have Any Impact on LLM Performance?,
arXiv, 2411.10541
, arxiv, pdf, citation: -1
Jia He, Mukund Rungta, David Koleczek, ..., Franklin X Wang, Sadid Hasan
-
ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs,
arXiv, 2410.12405
, arxiv, pdf, citation: -1
Jingming Zhuo, Songyang Zhang, Xinyu Fang, ..., Dahua Lin, Kai Chen · (ProSA - open-compass)
-
prompt-poet - character-ai
-
V0-system-prompt - 2-fly-4-ai
· (reddit)