Chinese 671B-parameter Mixture-of-Experts (MoE) model with strong reasoning performance, matching GPT-4o and Claude 3.5 at 5% of the training cost.