Benchmarks:
- GSM8k (Cobbe et al., 2021): Arithmetic reasoning over grade-school problems with natural language descriptions. Problems involve the arithmetic operations: addition, subtraction, multiplication, and division.
- Big-Bench-Hard (Suzgun et al., 2022)
- HumanEval (Chen et al., 2021)
- TheoremQA (Chen et al., 2023)
- SummEdits (Laban et al., 2023)
Benchmark | Task type | Construction | # Problems | # Problem types | Problem types | Prompt style
---|---|---|---|---|---|---
GSM8k | Arithmetic reasoning over multi-step math computations described in language | Manually written language descriptions | 8,500 | 4 | Addition, subtraction, multiplication, division | Multi-step reasoning (similar to chain-of-thought): problems take between 2 and 8 steps to solve, described in natural language
MATH | Math problems from mathematics competitions | | 12,000 | 7 | | Multi-step reasoning (similar to chain-of-thought)
MMLU | High-school and college-level common knowledge | | 15,000 | 57 | | Multiple choice; few-shot examples
BBH | Language and symbolic reasoning | | 6,500 | 23 | | Few-shot chain-of-thought exemplars
HumanEval | Python programming problems with text comments, docstrings, and test cases | Manually written programs | 164 | | |
TheoremQA | | | | | |
SummEdits | | | | | |
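
For concreteness, a minimal parsing sketch for GSM8k records. It assumes the commonly distributed format, in which each item has `question` and `answer` fields and the reference solution ends with a `#### <final answer>` marker:

```python
# Sketch: parsing a GSM8k-style record. Assumption: the common release format,
# where the reference solution ends with a "#### <final answer>" marker.
import re

sample = {
    "question": "Natalia sold clips to 48 of her friends in April, and then she "
                "sold half as many clips in May. How many clips did Natalia sell "
                "altogether in April and May?",
    "answer": "Natalia sold 48 / 2 = 24 clips in May. "
              "Natalia sold 48 + 24 = 72 clips altogether. #### 72",
}

def gold_answer(record):
    """Return the final numeric answer after the '####' marker."""
    match = re.search(r"####\s*([-0-9.,]+)", record["answer"])
    if match is None:
        raise ValueError("no '####' marker found")
    return match.group(1).replace(",", "")

print(gold_answer(sample))  # -> 72
```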
Examples:

Benchmark | Example
---|---
GSM8K | Question: Angelo and Melanie want to plan how many hours over the next week they should study together for their test next week. They have 2 chapters of their textbook to study and 4 worksheets to memorize. They figure out that they should dedicate 3 hours to each chapter of their textbook and 1.5 hours for each worksheet. If they plan to study no more than 4 hours each day, how many days should they plan to study total over the next week if they take a 10-minute break every hour, include 3 10-minute snack breaks each day, and 30 minutes for lunch each day? Let's think step by step. Angelo and Melanie think they should dedicate 3 hours to each of the 2 chapters, 3 hours x 2 chapters = 6 hours total. For the worksheets they plan to dedicate 1.5 hours for each worksheet, 1.5 hours x 4 worksheets = 6 hours total. Angelo and Melanie need to start with planning 12 hours to study, at 4 hours a day, 12 / 4 = 3 days. However, they need to include time for breaks and lunch. Every hour they want to include a 10-minute break, so 12 total hours x 10 minutes = 120 extra minutes for breaks. They also want to include 3 10-minute snack breaks, 3 x 10 minutes = 30 minutes. And they want to include 30 minutes for lunch each day, so 120 minutes for breaks + 30 minutes for snack breaks + 30 minutes for lunch = 180 minutes, or 180 / 60 minutes per hour = 3 extra hours. So Angelo and Melanie want to plan 12 hours to study + 3 hours of breaks = 15 hours total. They want to study no more than 4 hours each day, 15 hours / 4 hours each day = 3.75. They will need to plan to study 4 days to allow for all the time they need. The answer is 4
MATH | Question: The sum of two numbers is 6. The difference of their squares is 12. What is the positive difference of the two numbers? Let's think step by step. Call the two numbers x and y. We are given that x + y = 6 and x^2 - y^2 = 12. Because x^2 - y^2 = (x + y)(x - y), we can substitute in for x + y, giving 6(x - y) = 12, or x - y = 2. The answer is 2
MMLU | The following are multiple choice questions (with answers) about abstract algebra.<br/>Find all c in Z_3 such that Z_3[x]/(x^2 + c) is a field. A. 0 B. 1 C. 2 D. 3 Answer: B<br/>Statement 1: Every function from a finite set onto itself must be one to one. Statement 2: Every subgroup of an abelian group is abelian. A. True, True B. False, False C. True, False D. False, True Answer: A<br/>Find the degree for the given field extension Q(sqrt(2), sqrt(3), sqrt(18)) over Q. A. 0 B. 4 C. 2 D. 6 Answer: B
BBH | Input: "If you follow these instructions, do you return to the starting point? Always face forward. Take 1 step backward. Take 9 steps left. Take 2 steps backward. Take 6 steps forward. Take 4 steps forward. Take 4 steps backward. Take 3 steps right. Options: Yes or No" Let's think step by step. Target: "No"
HumanEval | Prompt: def incr_list(l: list):<br/>    """Return list with elements incremented by 1.<br/>    >>> incr_list([1, 2, 3])<br/>    [2, 3, 4]<br/>    >>> incr_list([5, 3, 5, 2, 3, 3, 9, 0])<br/>    [6, 4, 6, 3, 4, 4, 10, 1]<br/>    """ Output: return [i + 1 for i in l]
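
The reasoning examples above all end with a line of the form "The answer is N". A minimal scoring sketch, under the assumption that solve rate means exact numeric match between that extracted answer and the gold answer:

```python
# Sketch: scoring chain-of-thought generations that end with "The answer is <N>".
# Assumption: solve rate = exact numeric match against the gold final answer.
import re

def extract_answer(generation):
    """Return the last number following 'The answer is', or None if absent."""
    matches = re.findall(r"The answer is\s*\$?([-0-9.,]+)", generation)
    return matches[-1].replace(",", "").rstrip(".") if matches else None

def solve_rate(generations, golds):
    correct = sum(extract_answer(g) == gold for g, gold in zip(generations, golds))
    return 100.0 * correct / len(golds)

gens = ["... 15 hours / 4 hours each day = 3.75. The answer is 4",
        "... giving 6(x - y) = 12, or x - y = 2. The answer is 2"]
print(solve_rate(gens, ["4", "2"]))  # -> 100.0
```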
Concepts:
- Fine-tuning: Use the language modeling objective to further train a pretrained language model on task-specific data.
- Verification: First train a generator on question-solution pairs. Then sample multiple generated solutions, assign each solution a binary score (whether the solution leads to the correct answer), and train a model on those scores. The model trained on the verification scores is called the verifier.
  - At test time, we sample multiple solutions to each test problem, rank them with the verifier, and return the one with the highest verifier score (see the sketch below).
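
A minimal sketch of this test-time procedure. `sample_solution` and `verifier_score` are hypothetical stand-ins for the trained generator and verifier; only the best-of-n ranking logic comes from the description above:

```python
# Sketch: test-time verification (best-of-n reranking). `sample_solution` and
# `verifier_score` are hypothetical stand-ins for the trained generator and
# verifier models; only the sampling-and-ranking logic is shown.
import random

def best_of_n(question, sample_solution, verifier_score, n=100):
    """Sample n candidate solutions; return the one the verifier scores highest."""
    candidates = [sample_solution(question) for _ in range(n)]
    return max(candidates, key=lambda sol: verifier_score(question, sol))

# Toy usage with dummy models:
pool = ["The answer is 3", "The answer is 4", "The answer is 5"]
sample_solution = lambda q: random.choice(pool)
verifier_score = lambda q, sol: 1.0 if sol.endswith("4") else 0.0  # toy verifier
print(best_of_n("How many days should they study?", sample_solution,
                verifier_score, n=10))
```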
Prompting methods:

Prompt strategy | Prompts | In-context examples | Prompt generation | GSM8k solve rate (%; model: Codex)
---|---|---|---|---
Scratchpad (Nye et al., 2021) | Break a code function down and ask the model to output all intermediate steps of the code | (input, intermediate steps, output) | Manually designed based on an algorithm | ?
Chain-of-thought prompting (Wei et al., 2022) | Prompt the model with the rationale for solving a multi-step reasoning problem | (input, chain-of-thought, output) | Manually written | 63.1
Algorithmic prompting (Zhou et al., 2022) | Prompt the model with detailed rationales, including a description of the steps within an algorithm | (input, algorithmic prompt, output) | Manually written | 82.7
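
To make the (input, chain-of-thought, output) format concrete, a sketch that assembles a few-shot chain-of-thought prompt; the exemplar reuses the MATH example above, and the exact formatting is illustrative rather than the prompt used by Wei et al. (2022):

```python
# Sketch: assembling a few-shot chain-of-thought prompt, i.e. a sequence of
# (input, chain-of-thought, output) exemplars followed by the test question.
# Formatting is illustrative, not the exact prompt of Wei et al. (2022).

EXEMPLARS = [
    {
        "question": "The sum of two numbers is 6. The difference of their squares "
                    "is 12. What is the positive difference of the two numbers?",
        "chain_of_thought": "Call the two numbers x and y. We are given that "
                            "x + y = 6 and x^2 - y^2 = 12. Because "
                            "x^2 - y^2 = (x + y)(x - y), we can substitute in for "
                            "x + y, giving 6(x - y) = 12, or x - y = 2.",
        "answer": "2",
    },
]

def build_prompt(test_question):
    parts = []
    for ex in EXEMPLARS:
        parts.append(f"Question: {ex['question']}\nLet's think step by step. "
                     f"{ex['chain_of_thought']} The answer is {ex['answer']}\n")
    parts.append(f"Question: {test_question}\nLet's think step by step.")
    return "\n".join(parts)

print(build_prompt("What is the positive difference of 9 and 5?"))
```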
End-to-End Multi-Task Learning with Attention. CVPR 2019. paper
Latent Multi-task Architecture Learning. AAAI 2019. paper
Cross-stitch Networks for Multi-task Learning. CVPR 2016. paper
Learning Multiple Tasks with Multilinear Relationship Networks. NIPS 2017. paper
More multitask learning papers here
Meta-Learning in Neural Networks: A Survey. paper
(MANN) Meta-Learning with Memory-Augmented Neural Networks. ICML 2016. paper
Matching Networks for One-Shot Learning. NIPS 2016. paper
(SNAIL) A Simple Neural Attentive Meta-Learner. ICLR 2018. paper
(MAML) Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. ICML 2017. paper ⭐ (a minimal first-order sketch follows this group)
(Reptile; First-order method) On First-Order Meta-Learning Algorithms. arXiv 2018. paper
(Implicit MAML) Meta-Learning with Implicit Gradients. NIPS 2019. paper
(Implicit Differentiation; SVM) Meta-Learning with Differentiable Convex Optimization. CVPR 2019. paper
(Bayesian linear regression) Meta-Learning Priors for Efficient Online Bayesian Regression. Workshop on the Algorithmic Foundations of Robotics 2018. paper
(Ridge regression; Logistic regression) Meta-learning with Differentiable Closed-Form Solvers. ICLR 2019. paper
(MAML expressive power and universality) Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm. ICLR 2018. paper
(Map MAML to Bayes Framework) Recasting Gradient-Based Meta-Learning as Hierarchical Bayes. ICLR 2018. paper
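
As a concrete anchor for the gradient-based methods above, a minimal first-order MAML (FOMAML) sketch on toy 1-D regression tasks; the task distribution, model, and hyperparameters are all illustrative:

```python
# Sketch: first-order MAML (FOMAML) on toy 1-D linear-regression tasks
# y = a*x + b. Inner loop: one gradient step on the support set; outer loop:
# apply the post-adaptation query-set gradient to the shared initialization
# (the first-order approximation of the MAML meta-gradient).
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    a, b = rng.uniform(-2, 2, size=2)            # a task is a random line
    x = rng.uniform(-1, 1, size=(20, 1))
    return x, a * x + b

def mse_grad(theta, x, y):
    """Gradient of mean squared error for y_hat = w*x + c, with theta = (w, c)."""
    w, c = theta
    err = (w * x + c) - y
    return np.array([2 * np.mean(err * x), 2 * np.mean(err)])

theta = np.zeros(2)                              # meta-learned initialization
inner_lr, outer_lr = 0.1, 0.01
for _ in range(2000):
    x, y = sample_task()
    xs, ys, xq, yq = x[:10], y[:10], x[10:], y[10:]           # support / query
    theta_inner = theta - inner_lr * mse_grad(theta, xs, ys)  # inner adaptation
    theta = theta - outer_lr * mse_grad(theta_inner, xq, yq)  # FOMAML outer step
```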
Choose an architecture that is effective for the inner gradient step
Auto-Meta: Automated Gradient Based Meta Learner Search. NIPS 2018 Workshop on Meta-Learning. paper
Automatically learn a vector inner learning rate; tune the outer learning rate
Alpha MAML: Adaptive Model-Agnostic Meta-Learning. ICML 2019 Workshop on Automated Machine Learning. paper
Meta-SGD: Learning to Learn Quickly for Few-Shot Learning. arXiv 2017. paper
Optimize only a subset of the parameters in the inner loop
(DEML) Deep Meta-Learning: Learning to Learn in the Concept Space. arXiv 2018. paper
(CAVIA) Fast Context Adaptation via Meta-Learning. ICML 2019. paper
Decouple the inner learning rate and BN statistics per inner step
(MAML++) How to train your MAML. ICLR 2019. paper
Introduce context variables for increased expressive power
(CAVIA) Fast Context Adaptation via Meta-Learning. ICML 2019. paper
(Bias transformation) Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm. ICLR 2018. paper
Siamese Neural Networks for One-shot Image Recognition. ICML 2015. paper
Matching Networks for One Shot Learning. NIPS 2016. paper
Prototypical Networks for Few-shot Learning. NIPS 2017. paper (see the sketch after this group)
Learn non-linear relation module on embeddings
Learning to Compare: Relation Network for Few-Shot Learning. CVPR 2018. paper
Learn infinite mixture of prototypes
Infinite Mixture Prototypes for Few-Shot Learning. ICML 2019. paper
Perform message passing on embeddings
Few-Shot Learning with Graph Neural Networks. ICLR 2018. paper
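
A minimal sketch of the metric-based idea behind Prototypical Networks: class prototypes are mean embeddings of the support examples, and queries are classified by the nearest prototype. The identity `embed` function is a stand-in for a learned embedding network:

```python
# Sketch: the prototypical-network classification rule. Each class prototype is
# the mean embedding of its support examples; a query is assigned to the class
# of the nearest prototype. `embed` stands in for a learned embedding network.
import numpy as np

def embed(x):
    return x  # stand-in for a learned embedding network

def classify(query, support_x, support_y):
    """support_x: (n, d) inputs; support_y: (n,) integer class labels."""
    classes = np.unique(support_y)
    prototypes = np.stack([embed(support_x[support_y == c]).mean(axis=0)
                           for c in classes])
    dists = np.linalg.norm(prototypes - embed(query), axis=1)
    return classes[np.argmin(dists)]

# Toy 2-way, 2-shot episode:
sx = np.array([[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [0.8, 1.0]])
sy = np.array([0, 0, 1, 1])
print(classify(np.array([0.9, 0.9]), sx, sy))  # -> 1
```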
Amortized Bayesian Meta-Learning. ICLR 2019. paper
Bayesian Model-Agnostic Meta-Learning. NIPS 2018. paper
Probabilistic Model-Agnostic Meta-Learning. NIPS 2018. paper
Meta-Learning Probabilistic Inference for Prediction. ICLR 2019. paper
Meta-Learning with Latent Embedding Optimization. ICLR 2019. paper
Fast Context Adaptation via Meta-Learning. ICML 2019. paper
Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples. ICLR 2020. paper
Few-Shot Learning with Graph Neural Networks. ICLR 2018. paper
(CAML) Learning to Learn with Conditional Class Dependencies. ICLR 2019. paper
MAML and black-box meta-learning approaches can be directly applied to policy-gradient RL methods.
It is not easy to apply existing meta-learning approaches to value-based RL, because value-based RL is a dynamic programming method.
Meta-Q-Learning. ICLR 2020. paper
(Goal-Conditioned RL with hindsight relabeling)/(Multi-Task RL) Hindsight Experience Replay. NIPS 2017. paper (relabeling sketch below)
(better learning) Learning Latent Plans from Play. CoRL 2019. paper
(learn a better goal representation)
Universal Planning Networks. ICML 2018. paper
Unsupervised Visuomotor Control through Distributional Planning Networks. RSS 2019. paper
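
A minimal sketch of the hindsight-relabeling idea from Hindsight Experience Replay: store each failed trajectory again with the goal replaced by a state it actually achieved, so it becomes a success for that goal. The transition format and reward function here are illustrative:

```python
# Sketch: hindsight relabeling (the core idea of Hindsight Experience Replay).
# Each transition, stored as a dict with "goal" and "achieved" entries, is
# duplicated with the goal replaced by the state actually reached at the end
# of the trajectory, and the reward recomputed for that new goal.
import numpy as np

def reward(achieved, goal, eps=0.05):
    """Sparse goal-reaching reward: 0 on success, -1 otherwise."""
    return 0.0 if np.linalg.norm(achieved - goal) < eps else -1.0

def relabel(trajectory):
    """Return a copy of the trajectory relabeled with the final achieved state."""
    final = trajectory[-1]["achieved"]
    return [{**t, "goal": final, "reward": reward(t["achieved"], final)}
            for t in trajectory]

# Toy usage: a 2-step trajectory that missed its original goal of (1, 1).
traj = [
    {"goal": np.ones(2), "achieved": np.array([0.3, 0.3]), "reward": -1.0},
    {"goal": np.ones(2), "achieved": np.array([0.6, 0.6]), "reward": -1.0},
]
print(relabel(traj)[-1]["reward"])  # -> 0.0 (a success for the achieved goal)
```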
Meta-Learning for Low-Resource Neural Machine Translation. EMNLP 2018. paper
Few-shot Autoregressive Density Estimation: Towards Learning to Learn Distributions. ICLR 2018. paper
One-Shot Imitation Learning. NIPS 2017. paper
Massively Multitask Networks for Drug Discovery. ICML 2015. paper