Serendipity
/
CTI-Inference-Opt
CTI-Inference-Opt/代码/code at commit 3c9da9a47df671bace215c55fa99efcf4ab14e6d
OwnerSunshine530
3c9da9a47d
fix: INT8 MoE: convert the int32 result to fp32 and dequantize before casting to fp16 (a direct .half() overflows, about 8.3 million > 65504, producing NaN)
2026-06-20 01:45:05 +08:00
tests
perf: _triton_block_meta removes the last host sync (the grid uses an upper bound derived from shape; empty blocks run as masked no-ops inside the kernel)
2026-06-19 20:51:37 +08:00
bench.py
feat: INT8 dense MoE (torch._int_mm, 2D-concatenated W1_cat/W2_cat, top-k weights folded into GEMM2, per-tensor activation quantization)
2026-06-20 01:35:55 +08:00
build_env.sh
fix: simplify build_env.sh to a clean version (avoids errors caused by CUDA warm-up)
2026-06-12 21:55:09 +08:00
EXPERIMENTS.md
docs: wrap-up: final 67.998; record the RepEncoder precomputation attempt and its conclusions
2026-06-16 13:18:48 +08:00
infer.py
fix: INT8 MoE: convert the int32 result to fp32 and dequantize before casting to fp16 (a direct .half() overflows, about 8.3 million > 65504, producing NaN)
2026-06-20 01:45:05 +08:00
requirements.txt
revert: restore requirements.txt to the original full dependency list
2026-06-12 21:24:22 +08:00
RISKS.md
docs: notes on potential risks (compliance gray area of RepEncoder precomputation / max_feasign consistency) and a compliance fallback
2026-06-15 20:44:57 +08:00
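The latest infer.py commit message describes an fp16 overflow in the INT8 MoE path: an int32 accumulator of roughly 8.3 million exceeds the fp16 maximum of 65504 if it is cast to half precision before dequantization. The sketch below reproduces that arithmetic with the Python standard library only. It is not the code in infer.py: the helper names, the scale value of 1e-4, and the use of Python floats as a stand-in for fp32 are all assumptions made for illustration.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite float16 value


def to_fp16(x: float) -> float:
    """Round x to the nearest float16 value; overflow becomes +/-inf, as a tensor cast would."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def dequant_cast_first(acc: int, scale: float) -> float:
    # Problematic order: cast the int32 accumulator to fp16, then apply the scale.
    return to_fp16(to_fp16(float(acc)) * scale)


def dequant_scale_first(acc: int, scale: float) -> float:
    # Safer order: dequantize in a wider float type (stand-in for fp32), then cast to fp16.
    return to_fp16(float(acc) * scale)


acc = 8_300_000   # magnitude quoted in the commit message
scale = 1e-4      # hypothetical combined activation * weight scale

bad = dequant_cast_first(acc, scale)    # inf: 8.3e6 > FP16_MAX before scaling
good = dequant_scale_first(acc, scale)  # finite: 830 fits comfortably in fp16

print(bad, bad - bad, good)
```

Once an inf is present, later operations such as subtracting two overflowed values yield NaN, which is consistent with the symptom the commit message reports.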