Turing-NLG: A 17-billion-parameter language model by Microsoft
DeepSpeed was used to train the world’s largest language model.
ZeRO & DeepSpeed: New system optimizations enable training models with over 100 billion parameters
Developed by Microsoft AI & Research.