Release MS-AMP v0.2.0
MS-AMP 0.2.0 Release Notes
MS-AMP Improvements
- Add the O3 optimization level to support FP8 in distributed training frameworks
- Support ScalingTensor in functional.linear
- Support customized attributes in FP8Linear
- Improve performance
- Add Dockerfiles for PyTorch 1.14 + CUDA 11.8 and PyTorch 2.1 + CUDA 12.1
- Support PyTorch 2.1
- Add performance results and TE (Transformer Engine) results to the homepage
- Cache TE build in pipeline
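
As a rough illustration of the O3 item above, the sketch below builds a DeepSpeed-style configuration with an `msamp` section that selects `opt_level` `"O3"`. The `msamp` section and its key names are assumptions based on the MS-AMP usage documentation and should be checked against the version you install. The batch size and ZeRO stage are placeholder values, not recommendations.

```python
import json

# Assumption: MS-AMP's DeepSpeed integration reads an "msamp" section from
# the DeepSpeed config. Verify the key names against the MS-AMP docs.
ds_config = {
    "train_batch_size": 8,              # placeholder value
    "zero_optimization": {"stage": 2},  # placeholder; O3 targets ZeRO-style distributed training
    "msamp": {
        "enabled": True,
        "opt_level": "O3",
    },
}


def to_json(config: dict) -> str:
    """Serialize the config so it can be written to a DeepSpeed JSON file."""
    return json.dumps(config, indent=2)


print(to_json(ds_config))
```

The resulting JSON would typically be saved to a file and passed to the training script through DeepSpeed's usual config argument.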
MS-AMP-Examples Improvements
Add 3 examples using MS-AMP: