DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
https://www.deepspeed.ai/feed.xml
Generated by Jekyll, 2024-03-15T12:25:00-07:00

DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference (Chinese edition)
2023-11-05T16:00:00-08:00
https://www.deepspeed.ai/2023/11/05/deepspeed-fastgen-chinese

DeepSpeed-FastGen: Fast Text Generation for LLMs with MII and DeepSpeed-Inference (Japanese edition)
2023-11-05T16:00:00-08:00
https://www.deepspeed.ai/2023/11/05/deepspeed-fastgen-japanese

DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference
2023-11-05T16:00:00-08:00
https://www.deepspeed.ai/2023/11/05/deepspeed-fastgen

DeepSpeed-VisualChat: Multi-Round Images Plus Text, Showing You a Different Kind of AI Chat Appeal (Chinese edition)
2023-10-03T17:00:00-07:00
https://www.deepspeed.ai/2023/10/03/deepspeed-visualchat-chinese

DeepSpeed-VisualChat: Delivering an AI Chat Experience That Supports Multi-Round, Multi-Image Inputs (Japanese edition)
2023-10-03T17:00:00-07:00
https://www.deepspeed.ai/2023/10/03/deepspeed-visualchat-japanese

DeepSpeed-VisualChat: Improve Your Chat Experience with Multi-Round Multi-Image Inputs
2023-10-03T17:00:00-07:00
https://www.deepspeed.ai/2023/10/03/deepspeed-visualchat

DeepSpeed4Science: Enabling Scientific Discovery with Advanced AI System Optimization Technologies (Chinese edition)
2023-09-18T17:00:00-07:00
https://www.deepspeed.ai/2023/09/18/deepspeed4science-chinese

The DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery Through Sophisticated AI System Technologies (Japanese edition)
2023-09-18T17:00:00-07:00
https://www.deepspeed.ai/2023/09/18/deepspeed4science-japanese

Announcing the DeepSpeed4Science Initiative: Enabling large-scale scientific discovery through sophisticated AI system technologies
2023-09-18T17:00:00-07:00
https://www.deepspeed.ai/2023/09/18/deepspeed4science

Zero Inference
2023-09-12T00:00:00-07:00
https://www.deepspeed.ai/2023/09/12/ZeRO-Inference

title: “ZeRO-Inference: 20X faster inference through weight quantization and KV cache offloading”
excerpt: “”
link: https://github.com/microsoft/DeepSpeedExamples/blob/master/inference/huggingface/zero_inference/README.md
date: 2023-09-12 00:09:00
tags: inference ZeRO quantization English