Serendipity
/
CTI-Inference-Opt
292a0216798dbf42f2f4b07d4d1258728127562f
CTI-Inference-Opt
/
代码
/
code
OwnerSunshine530
292a021679
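experiment: triton_block_m=128 (half as many blocks = half as many launches); removing the sync cut 1.64s, which shows the evaluation is launch-sensitive → try larger blocks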
...
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 01:11:59 +08:00
..
tests
perf: _triton_block_meta removes the last host sync (the grid uses an upper bound derived from the shape; empty blocks run as masked no-ops inside the kernel)
2026-06-19 20:51:37 +08:00
bench.py
feat: truly sparse MoE (capacity-based grouping, computes only the top-k, cutlass baddbmm, no host sync)
2026-06-17 21:05:55 +08:00
build_env.sh
fix: simplify build_env.sh to a clean version (avoids errors caused by CUDA warm-up)
2026-06-12 21:55:09 +08:00
EXPERIMENTS.md
docs: wrap-up, final 67.998; record the RepEncoder precomputation attempt and its conclusions
2026-06-16 13:18:48 +08:00
infer.py
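experiment: triton_block_m=128 (half as many blocks = half as many launches); removing the sync cut 1.64s, which shows the evaluation is launch-sensitive → try larger blocks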
2026-06-20 01:11:59 +08:00
requirements.txt
revert: restore requirements.txt to the original full dependency list
2026-06-12 21:24:22 +08:00
RISKS.md
docs: notes on potential risks (compliance gray area of RepEncoder precomputation / max_feasign consistency) and a compliant fallback
2026-06-15 20:44:57 +08:00