CTI-Inference-Opt/代码
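OwnerSunshine530 9f73505caa perf: MoE top-k weighting now uses scatter+mul+sum (on [E,N,D]), avoiding the large permute clone + gather (clone was 8% of the profile)
Mathematically equivalent (the top-k indices are distinct, so the scatter has no write conflicts); zero AUC risk. Continues the 'fewer kernels' direction.
moe_fused_weight is on by default; already covered by test_moe_dense_matches_loop.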

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 20:22:16 +08:00
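
The equivalence claimed in the commit message can be checked on a toy case. The sketch below is not the repository's code: it uses plain Python lists in place of tensors so it runs without any library, and the names `topk_weight_gather` and `topk_weight_scatter` are hypothetical. It assumes expert outputs laid out as [E][N][D] and per-token top-k indices and weights laid out as [N][K].

```python
def topk_weight_gather(out, idx, w):
    """Reference path: per token, pick the K selected expert rows and sum them weighted."""
    n_tokens, dim = len(idx), len(out[0][0])
    y = [[0.0] * dim for _ in range(n_tokens)]
    for n in range(n_tokens):
        for e, wk in zip(idx[n], w[n]):
            for d in range(dim):
                y[n][d] += wk * out[e][n][d]
    return y


def topk_weight_scatter(out, idx, w):
    """Fused path: scatter weights into a dense [E][N] table, multiply, sum over E."""
    n_experts, n_tokens, dim = len(out), len(idx), len(out[0][0])
    dense = [[0.0] * n_tokens for _ in range(n_experts)]
    for n in range(n_tokens):
        # Assumption from the commit message: top-k indices are distinct per token,
        # so no slot is written twice and overwrite-style scatter loses nothing.
        assert len(set(idx[n])) == len(idx[n])
        for e, wk in zip(idx[n], w[n]):
            dense[e][n] = wk
    # Unselected experts have weight 0.0 and drop out of the sum.
    return [[sum(dense[e][n] * out[e][n][d] for e in range(n_experts))
             for d in range(dim)]
            for n in range(n_tokens)]


# Toy shapes: E=3 experts, N=2 tokens, D=2 features, K=2 selected experts.
out = [[[1.0, 2.0], [3.0, 4.0]],
       [[5.0, 6.0], [7.0, 8.0]],
       [[9.0, 10.0], [11.0, 12.0]]]
idx = [[2, 0], [1, 2]]
w = [[0.75, 0.25], [0.5, 0.5]]

gathered = topk_weight_gather(out, idx, w)
scattered = topk_weight_scatter(out, idx, w)
print(gathered == scattered)
```

If a token's top-k indices could repeat, the overwrite in the scatter step would drop a weight and the two paths would diverge, which is why the distinctness condition matters. In a tensor library the fused path works directly on the [E,N,D] layout, so the permuted copy that the gather path needs is never materialized.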