CTI-Inference-Opt/代码/code/build_env.sh at 574399e8ac3fa4ad18585fc05b8cbddb73b6a095 - CTI-Inference-Opt - 人工智能协会仓库

Serendipity/CTI-Inference-Opt

Files

T

Serendipity 574399e8ac feat: Flash Attention + torch.compile（第二版优化方案）

- scaled_dot_product 替换为 F.scaled_dot_product_attention（自动启用 Flash Attention）
- load_model 中添加 torch.compile(mode='reduce-overhead')
- build_env.sh: 预热 torch inductor，避免编译耗时计入推理

2026-06-12 21:39:43 +08:00

18 lines

277 B

Bash

Raw Blame History

 #!/bin/bash
 set -e
 # 预热 torch inductor，避免推理时编译
 python -c "
 import torch
 @torch.compile(mode='reduce-overhead')
 def _warmup(x):
     return x * 2
 x = torch.randn(100, 100, device='cuda')
 _warmup(x)
 print('Inductor cache ready')
 "
 echo "build env success"