vwxyzjn/Megatron-MoE-ModelZoo - Gitstar Ranking

vwxyzjn

Fetched on 2026/07/13 18:39

Best practices for training DeepSeek, Mixtral, Qwen and other MoE models using Megatron Core.

Star: 0

Rank: 14087811