A curated list for Efficient Large Language Models
Updated Nov 17, 2024 - Python
- Pruner-Zero: Evolving Symbolic Pruning Metric from Scratch for LLMs
- [ICLR 2024] Jaiswal, A., Gan, Z., Du, X., Zhang, B., Wang, Z., & Yang, Y. Compressing LLMs: The Truth Is Rarely Pure and Never Simple.
- LLM Inference on AWS Lambda
- Papers on LLM compression
- [CAAI AIR'24] Minimize Quantization Output Error with Bias Compensation
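The bias-compensation entry above lends itself to a short illustration. The sketch below is a generic, minimal NumPy example of the underlying idea (not the listed repo's implementation): after quantizing a linear layer's weights, the mean output error over calibration data is absorbed into the bias, b' = b + E[(W - Wq)x]. All array shapes and the 4-bit symmetric quantizer are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear layer y = x @ W.T + b with made-up calibration activations.
W = rng.normal(size=(8, 16))
b = rng.normal(size=8)
x = rng.normal(size=(256, 16))

def quantize(w, bits=4):
    # Symmetric uniform quantization (illustrative, per-tensor scale).
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    return np.round(w / scale) * scale

Wq = quantize(W)

# Bias compensation: fold the mean output error into the bias,
# b' = b + E[(W - Wq) x].
err = x @ (W - Wq).T            # per-sample output error from quantization
b_comp = b + err.mean(axis=0)

y_ref = x @ W.T + b
mse_plain = np.mean((x @ Wq.T + b - y_ref) ** 2)
mse_comp = np.mean((x @ Wq.T + b_comp - y_ref) ** 2)
```

Since subtracting the per-output mean of the error can only reduce its average squared magnitude, `mse_comp` is never larger than `mse_plain` on the calibration set, at zero inference cost.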