DeepSpeed Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales April 23, 2023
DeepSpeed Data Efficiency: A composable library that makes better use of data, increases training efficiency, and improves model quality December 11, 2022
DeepSpeed-MII: instant speedup on 24,000+ open-source DL models with up to 40x cheaper inference October 10, 2022
Azure empowers easy-to-use, high-performance, and hyperscale model training using DeepSpeed July 25, 2022
Supporting efficient large model training on AMD Instinct GPUs with DeepSpeed March 20, 2022
DeepSpeed: Advancing MoE inference and training to power next-generation AI scale January 18, 2022
Autotuning: Automatically discover the optimal DeepSpeed configuration that delivers good training speed November 16, 2021
DeepSpeed: Accelerating large-scale model inference and training via system optimizations and compression May 14, 2021
Mixture-of-Quantization: A novel quantization approach for reducing model size with minimal accuracy impact May 4, 2021
DeepSpeed Inference: Multi-GPU inference with customized inference kernels and quantization support March 15, 2021
Powering 10x longer sequences and 6x faster execution through DeepSpeed Sparse Attention September 8, 2020
ZeRO stage 1 with reduced communication March 17, 2020 Partition-aware ZeRO with up to 2x reduction in communication time!
Turing-NLG: A 17-billion-parameter language model by Microsoft February 13, 2020 DeepSpeed was used to train the world's largest language model.
ZeRO & DeepSpeed: New system optimizations enable training models with over 100 billion parameters February 13, 2020