Serendipity / CTI-Inference-Opt
6f7ff9fce8948eb6335651d2098f2f2b291c3070
CTI-Inference-Opt / 代码 / code
OwnerSunshine530
6f7ff9fce8
feat: warm up the Triton kernel in load_model (so the first batch does not include JIT compilation) + default attn=triton
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 12:23:11 +08:00
..
tests
feat: Triton varlen causal flash attention (block-diagonal, single kernel; eliminates per-block calls and mask-construction overhead)
2026-06-17 00:14:53 +08:00
bench.py
feat: Triton varlen causal flash attention (block-diagonal, single kernel; eliminates per-block calls and mask-construction overhead)
2026-06-17 00:14:53 +08:00
build_env.sh
fix: simplify build_env.sh to a clean version (avoids errors caused by CUDA warm-up)
2026-06-12 21:55:09 +08:00
EXPERIMENTS.md
docs: wrap-up; final 67.998; record the RepEncoder precomputation attempt and its conclusions
2026-06-16 13:18:48 +08:00
infer.py
feat: warm up the Triton kernel in load_model (so the first batch does not include JIT compilation) + default attn=triton
2026-06-17 12:23:11 +08:00
requirements.txt
revert: restore requirements.txt to the original full dependency list
2026-06-12 21:24:22 +08:00
RISKS.md
docs: notes on potential risks (compliance gray area of RepEncoder precomputation / max_feasign consistency) and a compliance fallback
2026-06-15 20:44:57 +08:00