Benchmarks:
- GSM8k (Cobbe et al., 2021): Arithmetic reasoning over grade-school problems with natural language descriptions. Problems involve the arithmetic operations: addition, subtraction, multiplication, and division.
- Big-Bench-Hard (Suzgun et al., 2022)
- HumanEval (Chen et al., 2021)
- TheoremQA (Chen et al., 2023)
- SummEdits (Laban et al., 2023)
Benchmark | Task type | Construction | # Problems | # Problem types | Problem types | Prompt style
---|---|---|---|---|---|---
GSM8k | Arithmetic reasoning over multi-step math computations described in language | Manually written language descriptions | 8,500 | 4 | Addition, subtraction, multiplication, division | Multi-step reasoning (similar to chain-of-thought): problems take between 2 and 8 steps to solve, described in natural language
MATH | Math problems from mathematics competitions | | 12,000 | 7 | | Multi-step reasoning (similar to chain-of-thought)
MMLU | High-school and college-level common knowledge | | 15,000 | 57 | | Multiple choice; few-shot examples
BBH | Language and symbolic reasoning | | 6,500 | 23 | | Few-shot chain-of-thought exemplars
HumanEval | Python programming problems with text comments, docstrings, and test cases | Manually written programs | 164 | | |
TheoremQA | | | | | |
SummEdits | | | | | |
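
For concreteness, a minimal parsing sketch for GSM8k records. It assumes the commonly distributed format, in which each item has `question` and `answer` fields and the reference solution ends with a `#### <final answer>` marker:

```python
# Sketch: parsing a GSM8k-style record. Assumption: the common release format,
# where the reference solution ends with a "#### <final answer>" marker.
import re

sample = {
    "question": "Natalia sold clips to 48 of her friends in April, and then she "
                "sold half as many clips in May. How many clips did Natalia sell "
                "altogether in April and May?",
    "answer": "Natalia sold 48 / 2 = 24 clips in May. "
              "Natalia sold 48 + 24 = 72 clips altogether. #### 72",
}

def gold_answer(record):
    """Return the final numeric answer after the '####' marker."""
    match = re.search(r"####\s*([-0-9.,]+)", record["answer"])
    if match is None:
        raise ValueError("no '####' marker found")
    return match.group(1).replace(",", "")

print(gold_answer(sample))  # -> 72
```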
Examples:

Benchmark | Example
---|---
GSM8K | Question: Angelo and Melanie want to plan how many hours over the next week they should study together for their test next week. They have 2 chapters of their textbook to study and 4 worksheets to memorize. They figure out that they should dedicate 3 hours to each chapter of their textbook and 1.5 hours for each worksheet. If they plan to study no more than 4 hours each day, how many days should they plan to study total over the next week if they take a 10-minute break every hour, include 3 10-minute snack breaks each day, and 30 minutes for lunch each day? Let's think step by step. Angelo and Melanie think they should dedicate 3 hours to each of the 2 chapters, 3 hours x 2 chapters = 6 hours total. For the worksheets they plan to dedicate 1.5 hours for each worksheet, 1.5 hours x 4 worksheets = 6 hours total. Angelo and Melanie need to start with planning 12 hours to study, at 4 hours a day, 12 / 4 = 3 days. However, they need to include time for breaks and lunch. Every hour they want to include a 10-minute break, so 12 total hours x 10 minutes = 120 extra minutes for breaks. They also want to include 3 10-minute snack breaks, 3 x 10 minutes = 30 minutes. And they want to include 30 minutes for lunch each day, so 120 minutes for breaks + 30 minutes for snack breaks + 30 minutes for lunch = 180 minutes, or 180 / 60 minutes per hour = 3 extra hours. So Angelo and Melanie want to plan 12 hours to study + 3 hours of breaks = 15 hours total. They want to study no more than 4 hours each day, 15 hours / 4 hours each day = 3.75. They will need to plan to study 4 days to allow for all the time they need. The answer is 4
MATH | Question: The sum of two numbers is 6. The difference of their squares is 12. What is the positive difference of the two numbers? Let's think step by step. Call the two numbers x and y. We are given that x + y = 6 and x^2 - y^2 = 12. Because x^2 - y^2 = (x + y)(x - y), we can substitute in for x + y, giving 6(x - y) = 12, or x - y = 2. The answer is 2
MMLU | The following are multiple choice questions (with answers) about abstract algebra.<br/>Find all c in Z_3 such that Z_3[x]/(x^2 + c) is a field. A. 0 B. 1 C. 2 D. 3 Answer: B<br/>Statement 1: Every function from a finite set onto itself must be one to one. Statement 2: Every subgroup of an abelian group is abelian. A. True, True B. False, False C. True, False D. False, True Answer: A<br/>Find the degree for the given field extension Q(sqrt(2), sqrt(3), sqrt(18)) over Q. A. 0 B. 4 C. 2 D. 6 Answer: B
BBH | Input: "If you follow these instructions, do you return to the starting point? Always face forward. Take 1 step backward. Take 9 steps left. Take 2 steps backward. Take 6 steps forward. Take 4 steps forward. Take 4 steps backward. Take 3 steps right. Options: Yes or No" Let's think step by step. Target: "No"
HumanEval | Prompt: def incr_list(l: list):<br/>    """Return list with elements incremented by 1.<br/>    >>> incr_list([1, 2, 3])<br/>    [2, 3, 4]<br/>    >>> incr_list([5, 3, 5, 2, 3, 3, 9, 0])<br/>    [6, 4, 6, 3, 4, 4, 10, 1]<br/>    """ Output: return [i + 1 for i in l]
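
The reasoning examples above all end with a line of the form "The answer is N". A minimal scoring sketch, under the assumption that solve rate means exact numeric match between that extracted answer and the gold answer:

```python
# Sketch: scoring chain-of-thought generations that end with "The answer is <N>".
# Assumption: solve rate = exact numeric match against the gold final answer.
import re

def extract_answer(generation):
    """Return the last number following 'The answer is', or None if absent."""
    matches = re.findall(r"The answer is\s*\$?([-0-9.,]+)", generation)
    return matches[-1].replace(",", "").rstrip(".") if matches else None

def solve_rate(generations, golds):
    correct = sum(extract_answer(g) == gold for g, gold in zip(generations, golds))
    return 100.0 * correct / len(golds)

gens = ["... 15 hours / 4 hours each day = 3.75. The answer is 4",
        "... giving 6(x - y) = 12, or x - y = 2. The answer is 2"]
print(solve_rate(gens, ["4", "2"]))  # -> 100.0
```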
Concepts:
- Fine-tuning: Use the language modeling objective to further train a pretrained language model on task-specific data.
- Verification: First train a generator on question-solution pairs. Then sample multiple generated solutions, assign each solution a binary score (whether the solution leads to the correct answer), and train a model on those scores. The model trained on the verification scores is called the verifier.
  - At test time, we sample multiple solutions to each test problem, rank them with the verifier, and return the one with the highest verifier score (see the sketch below).
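
A minimal sketch of this test-time procedure. `sample_solution` and `verifier_score` are hypothetical stand-ins for the trained generator and verifier; only the best-of-n ranking logic comes from the description above:

```python
# Sketch: test-time verification (best-of-n reranking). `sample_solution` and
# `verifier_score` are hypothetical stand-ins for the trained generator and
# verifier models; only the sampling-and-ranking logic is shown.
import random

def best_of_n(question, sample_solution, verifier_score, n=100):
    """Sample n candidate solutions; return the one the verifier scores highest."""
    candidates = [sample_solution(question) for _ in range(n)]
    return max(candidates, key=lambda sol: verifier_score(question, sol))

# Toy usage with dummy models:
pool = ["The answer is 3", "The answer is 4", "The answer is 5"]
sample_solution = lambda q: random.choice(pool)
verifier_score = lambda q, sol: 1.0 if sol.endswith("4") else 0.0  # toy verifier
print(best_of_n("How many days should they study?", sample_solution,
                verifier_score, n=10))
```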
Prompting methods:

Prompt strategy | Prompts | In-context examples | Prompt generation | GSM8k solve rate (%; model: Codex)
---|---|---|---|---
Scratchpad (Nye et al., 2021) | Break a code function down and ask the model to output all intermediate steps of the code | (input, intermediate steps, output) | Manually designed based on an algorithm | ?
Chain-of-thought prompting (Wei et al., 2022) | Prompt the model with the rationale for solving a multi-step reasoning problem | (input, chain-of-thought, output) | Manually written | 63.1
Algorithmic prompting (Zhou et al., 2022) | Prompt the model with detailed rationales, including a description of the steps within an algorithm | (input, algorithmic prompt, output) | Manually written | 82.7
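
To make the (input, chain-of-thought, output) format concrete, a sketch that assembles a few-shot chain-of-thought prompt; the exemplar reuses the MATH example above, and the exact formatting is illustrative rather than the prompt used by Wei et al. (2022):

```python
# Sketch: assembling a few-shot chain-of-thought prompt, i.e. a sequence of
# (input, chain-of-thought, output) exemplars followed by the test question.
# Formatting is illustrative, not the exact prompt of Wei et al. (2022).

EXEMPLARS = [
    {
        "question": "The sum of two numbers is 6. The difference of their squares "
                    "is 12. What is the positive difference of the two numbers?",
        "chain_of_thought": "Call the two numbers x and y. We are given that "
                            "x + y = 6 and x^2 - y^2 = 12. Because "
                            "x^2 - y^2 = (x + y)(x - y), we can substitute in for "
                            "x + y, giving 6(x - y) = 12, or x - y = 2.",
        "answer": "2",
    },
]

def build_prompt(test_question):
    parts = []
    for ex in EXEMPLARS:
        parts.append(f"Question: {ex['question']}\nLet's think step by step. "
                     f"{ex['chain_of_thought']} The answer is {ex['answer']}\n")
    parts.append(f"Question: {test_question}\nLet's think step by step.")
    return "\n".join(parts)

print(build_prompt("What is the positive difference of 9 and 5?"))
```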
End-to-End Multi-Task Learning with Attention. CVPR 2019. paper
Latent Multi-task Architecture Learning. AAAI 2019. paper
Cross-stitch Networks for Multi-task Learning. CVPR 2016. paper
Learning Multiple Tasks with Multilinear Relationship Networks. NIPS 2017. paper
More multitask learning papers here
Meta-Learning in Neural Networks: A Survey. paper
(MANN) Meta-Learning with Memory-Augmented Neural Networks. ICML 2016. paper
Matching Networks for One-Shot Learning. NIPS 2016. paper
(SNAIL) A Simple Neural Attentive Meta-Learner. ICLR 2018. paper
(MAML) Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. ICML 2017. paper ⭐ (a minimal first-order sketch follows this group)
(Reptile; First-order method) On First-Order Meta-Learning Algorithms. arXiv 2018. paper
(Implicit MAML) Meta-Learning with Implicit Gradients. NIPS 2019. paper
(Implicit Differentiation; SVM) Meta-Learning with Differentiable Convex Optimization. CVPR 2019. paper
(Bayesian linear regression) Meta-Learning Priors for Efficient Online Bayesian Regression. Workshop on the Algorithmic Foundations of Robotics 2018. paper
(Ridge regression; Logistic regression) Meta-learning with Differentiable Closed-Form Solvers. ICLR 2019. paper
(MAML expressive power and universality) Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm. ICLR 2018. paper
(Map MAML to Bayes Framework) Recasting Gradient-Based Meta-Learning as Hierarchical Bayes. ICLR 2018. paper
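
As a concrete anchor for the gradient-based methods above, a minimal first-order MAML (FOMAML) sketch on toy 1-D regression tasks; the task distribution, model, and hyperparameters are all illustrative:

```python
# Sketch: first-order MAML (FOMAML) on toy 1-D linear-regression tasks
# y = a*x + b. Inner loop: one gradient step on the support set; outer loop:
# apply the post-adaptation query-set gradient to the shared initialization
# (the first-order approximation of the MAML meta-gradient).
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    a, b = rng.uniform(-2, 2, size=2)            # a task is a random line
    x = rng.uniform(-1, 1, size=(20, 1))
    return x, a * x + b

def mse_grad(theta, x, y):
    """Gradient of mean squared error for y_hat = w*x + c, with theta = (w, c)."""
    w, c = theta
    err = (w * x + c) - y
    return np.array([2 * np.mean(err * x), 2 * np.mean(err)])

theta = np.zeros(2)                              # meta-learned initialization
inner_lr, outer_lr = 0.1, 0.01
for _ in range(2000):
    x, y = sample_task()
    xs, ys, xq, yq = x[:10], y[:10], x[10:], y[10:]           # support / query
    theta_inner = theta - inner_lr * mse_grad(theta, xs, ys)  # inner adaptation
    theta = theta - outer_lr * mse_grad(theta_inner, xq, yq)  # FOMAML outer step
```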
Choose an architecture that is effective for the inner gradient step
Auto-Meta: Automated Gradient Based Meta Learner Search. NIPS 2018 Workshop on Meta-Learning. paper
Automatically learn a vector inner learning rate; tune the outer learning rate
Alpha MAML: Adaptive Model-Agnostic Meta-Learning. ICML 2019 Workshop on Automated Machine Learning. paper
Meta-SGD: Learning to Learn Quickly for Few-Shot Learning. arXiv 2017. paper
Optimize only a subset of the parameters in the inner loop
(DEML) Deep Meta-Learning: Learning to Learn in the Concept Space. arXiv 2018. paper
(CAVIA) Fast Context Adaptation via Meta-Learning. ICML 2019. paper
Decouple the inner learning rate and BN statistics per inner step
(MAML++) How to train your MAML. ICLR 2019. paper
Introduce context variables for increased expressive power
(CAVIA) Fast Context Adaptation via Meta-Learning. ICML 2019. paper
(Bias transformation) Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm. ICLR 2018. paper
Siamese Neural Networks for One-shot Image Recognition. ICML 2015. paper
Matching Networks for One Shot Learning. NIPS 2016. paper
Prototypical Networks for Few-shot Learning. NIPS 2017. paper (see the sketch after this group)
Learn non-linear relation module on embeddings
Learning to Compare: Relation Network for Few-Shot Learning. CVPR 2018. paper
Learn infinite mixture of prototypes
Infinite Mixture Prototypes for Few-Shot Learning. ICML 2019. paper
Perform message passing on embeddings
Few-Shot Learning with Graph Neural Networks. ICLR 2018. paper
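
A minimal sketch of the metric-based idea behind Prototypical Networks: class prototypes are mean embeddings of the support examples, and queries are classified by the nearest prototype. The identity `embed` function is a stand-in for a learned embedding network:

```python
# Sketch: the prototypical-network classification rule. Each class prototype is
# the mean embedding of its support examples; a query is assigned to the class
# of the nearest prototype. `embed` stands in for a learned embedding network.
import numpy as np

def embed(x):
    return x  # stand-in for a learned embedding network

def classify(query, support_x, support_y):
    """support_x: (n, d) inputs; support_y: (n,) integer class labels."""
    classes = np.unique(support_y)
    prototypes = np.stack([embed(support_x[support_y == c]).mean(axis=0)
                           for c in classes])
    dists = np.linalg.norm(prototypes - embed(query), axis=1)
    return classes[np.argmin(dists)]

# Toy 2-way, 2-shot episode:
sx = np.array([[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [0.8, 1.0]])
sy = np.array([0, 0, 1, 1])
print(classify(np.array([0.9, 0.9]), sx, sy))  # -> 1
```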
Amortized Bayesian Meta-Learning. ICLR 2019. paper
Bayesian Model-Agnostic Meta-Learning. NIPS 2018. paper
Probabilistic Model-Agnostic Meta-Learning. NIPS 2018. paper
Meta-Learning Probabilistic Inference for Prediction. ICLR 2019. paper
Meta-Learning with Latent Embedding Optimization. ICLR 2019. paper
Fast Context Adaptation via Meta-Learning. ICML 2019. paper
Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples. ICLR 2020. paper
Few-Shot Learning with Graph Neural Networks. ICLR 2018. paper
(CAML) Learning to Learn with Conditional Class Dependencies. ICLR 2019. paper
MAML and black-box meta-learning approaches can be directly applied to policy-gradient RL methods.
It is not easy to apply existing meta-learning approaches to value-based RL, because value-based RL is a dynamic programming method.
Meta-Q-Learning. ICLR 2020. paper
(Goal-Conditioned RL with hindsight relabeling)/(Multi-Task RL) Hindsight Experience Replay. NIPS 2017. paper (relabeling sketch below)
(better learning) Learning Latent Plans from Play. CoRL 2019. paper
(learn a better goal representation)
Universal Planning Networks. ICML 2018. paper
Unsupervised Visuomotor Control through Distributional Planning Networks. RSS 2019. paper
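
A minimal sketch of the hindsight-relabeling idea from Hindsight Experience Replay: store each failed trajectory again with the goal replaced by a state it actually achieved, so it becomes a success for that goal. The transition format and reward function here are illustrative:

```python
# Sketch: hindsight relabeling (the core idea of Hindsight Experience Replay).
# Each transition, stored as a dict with "goal" and "achieved" entries, is
# duplicated with the goal replaced by the state actually reached at the end
# of the trajectory, and the reward recomputed for that new goal.
import numpy as np

def reward(achieved, goal, eps=0.05):
    """Sparse goal-reaching reward: 0 on success, -1 otherwise."""
    return 0.0 if np.linalg.norm(achieved - goal) < eps else -1.0

def relabel(trajectory):
    """Return a copy of the trajectory relabeled with the final achieved state."""
    final = trajectory[-1]["achieved"]
    return [{**t, "goal": final, "reward": reward(t["achieved"], final)}
            for t in trajectory]

# Toy usage: a 2-step trajectory that missed its original goal of (1, 1).
traj = [
    {"goal": np.ones(2), "achieved": np.array([0.3, 0.3]), "reward": -1.0},
    {"goal": np.ones(2), "achieved": np.array([0.6, 0.6]), "reward": -1.0},
]
print(relabel(traj)[-1]["reward"])  # -> 0.0 (a success for the achieved goal)
```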
Meta-Learning for Low-Resource Neural Machine Translation. EMNLP 2018. paper
Few-shot Autoregressive Density Estimation: Towards Learning to Learn Distributions. ICLR 2018. paper
One-Shot Imitation Learning. NIPS 2017. paper
Massively Multitask Networks for Drug Discovery. ICML 2015. paper