A curated list of awesome matrix-matrix multiplication (A * B = C) frameworks, libraries and software

jssonx/awesome-gemm

awesome-gemm

Introduction: This repository collects frameworks, libraries, and software for optimizing matrix-matrix multiplication (A * B = C). It is intended as a reference for developers and researchers working on high-performance computing, numerical analysis, and the optimization of matrix operations.
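For readers new to the term, the operation that every entry in this list tries to speed up can be sketched as a plain triple loop. This is only a minimal reference, assuming dense matrices stored as row-major lists of lists; the function name `gemm_naive` is made up for illustration and does not come from any listed library.

```python
def gemm_naive(A, B):
    """Reference C = A * B for dense matrices given as lists of rows."""
    m, k = len(A), len(A[0])
    if len(B) != k:
        raise ValueError("inner dimensions of A and B must match")
    n = len(B[0])
    C = [[0.0] * n for _ in range(m)]
    for i in range(m):          # rows of A and C
        for p in range(k):      # shared (inner) dimension
            a_ip = A[i][p]
            for j in range(n):  # columns of B and C
                C[i][j] += a_ip * B[p][j]
    return C


if __name__ == "__main__":
    print(gemm_naive([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

An unoptimized loop like this is mainly useful as a correctness baseline when comparing against an optimized implementation.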

Table of Contents

Fundamental Theories and Concepts

General Optimization Techniques

Frameworks and Development Tools

  • BLIS: A software framework for instantiating high-performance BLAS-like dense linear algebra libraries. License: BSD-3-Clause.
  • BLISlab: A framework for experimenting with and learning about BLIS-like GEMM algorithms.
  • Tensile: AMD ROCm's tool for creating benchmark-driven backend libraries for GEMMs, GEMM-like problems, and tensor contractions on GPUs. License: MIT.

Libraries

CPU Libraries

GPU Libraries

Cross-Platform Libraries

Language-Specific Libraries

Development Software: Debugging and Profiling

  • Intel VTune Profiler: A performance analysis tool for profiling and optimizing applications, primarily on Intel architectures.
  • Intel Advisor: A tool that guides vectorization and memory-layout optimizations to improve application performance.
  • NVIDIA Nsight Systems: A system-wide performance analysis tool for visualizing application behavior and optimizing performance on NVIDIA GPUs. License: NVIDIA Software License Agreement.
  • NVIDIA Nsight Compute: A performance analysis tool for CUDA kernels, providing detailed performance metrics and API debugging.
  • Nsight Visual Studio Edition: A Visual Studio extension for debugging and profiling CUDA applications.
  • nvprof: NVIDIA's legacy command-line profiler for CUDA applications. License: NVIDIA End User License Agreement.
  • ROCm Profiler: AMD's performance analysis tool for profiling applications running on ROCm platforms. License: MIT.
  • HPCToolkit: An integrated suite of tools for program performance measurement and analysis across a range of architectures. License: BSD-3-Clause.
  • TAU (Tuning and Analysis Utilities): A performance evaluation tool framework for high-performance parallel programs.
  • Perf: A Linux performance analysis tool for profiling CPU performance counters and system-level metrics. License: GPLv2.
  • gprof: A performance analysis tool for Unix applications, useful for identifying program bottlenecks. License: GPLv3.
  • gprofng: The next-generation GNU profiling tool with improved capabilities. License: GPLv3.
  • LIKWID: A suite of command-line tools for performance-oriented programmers to profile and optimize their applications. License: GPLv3.
  • VAMPIR: A tool suite for performance analysis and visualization of parallel programs. License: Proprietary.
  • Extrae: A package that generates trace files for performance analysis, which can be visualized with Paraver. License: LGPLv2.1.
  • Memcheck (Valgrind): A memory error detector that helps identify issues such as memory leaks and invalid memory accesses. License: GPLv2.
  • FPChecker: A tool for detecting floating-point accuracy problems in applications. License: BSD-3-Clause.
  • MegPeak: A tool for measuring a processor's peak computational performance, useful for benchmarking. License: Apache-2.0.
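Before reaching for one of the profilers above, it often helps to know roughly how many GFLOP/s a GEMM implementation achieves. The sketch below uses the conventional count of 2·m·n·k floating-point operations for an m×k by k×n product. The helper names `gemm_flops` and `measure_gflops` are hypothetical and not part of any listed tool, and wall-clock timing like this is only a rough estimate compared with hardware-counter-based profiling.

```python
import time


def gemm_flops(m, n, k):
    """Conventional operation count: one multiply and one add per inner step."""
    return 2 * m * n * k


def measure_gflops(fn, m, n, k, repeats=3):
    """Call fn() several times and report GFLOP/s based on the fastest run.

    fn is assumed to perform one full (m x k) * (k x n) multiplication.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    best = max(best, 1e-12)  # guard against a zero reading on coarse timers
    return gemm_flops(m, n, k) / best / 1e9


if __name__ == "__main__":
    m = n = k = 64
    A = [[1.0] * k for _ in range(m)]
    B = [[1.0] * n for _ in range(k)]

    def run():
        # Pure-Python product, used only to have something to time.
        cols = list(zip(*B))
        return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

    print(f"{measure_gflops(run, m, n, k):.4f} GFLOP/s")
```

Comparing this number with the processor's measured peak gives a first indication of how much headroom remains.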

Learning Resources

University Courses & Tutorials

Selected Papers

Lecture Notes

Blogs

Other Resources

Example Implementations


This curated list aims to be a comprehensive resource for anyone interested in the optimization of matrix-matrix multiplication. Contributions and suggestions are welcome to help keep this list up-to-date and useful for the community.