Install the environment dependencies:
apt install python3-dev
pip install --upgrade pip
pip install cmake
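Since flash-attn is compiled from source below, and its README asks for packaging and ninja to be present (ninja parallelizes the CUDA compilation and makes it far faster), it is worth installing both up front and sanity-checking the toolchain:
pip install packaging ninja
# Optional sanity check that the build tools are on PATH:
cmake --version && ninja --version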
Add the following CUDA environment variables to ~/.bashrc:
vim ~/.bashrc
export CUDA_HOME=/usr/local/cuda
export PATH=$PATH:$CUDA_HOME/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_HOME/lib64
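After saving ~/.bashrc, reload it and confirm the CUDA toolkit is visible; nvcc is required to compile flash-attn's kernels:
source ~/.bashrc
echo $CUDA_HOME
nvcc --version   # should report the CUDA toolkit release, e.g. 11.8 or 12.1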
Since the current flash-attn v2.6.3 wheels are built against cu118, you can install PyTorch 2.4.0 with cu121, or with cu118 or an earlier CUDA version:
pip install torch==2.4.0+cu121 torchaudio==2.4.0+cu121 torchvision==0.19.0+cu121 --index-url https://download.pytorch.org/whl/cu121
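To verify that the GPU-enabled build was installed rather than a CPU-only wheel, check the reported CUDA version and device availability:
python3 -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
# Expected output along the lines of: 2.4.0+cu121 12.1 True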
Build and install flash-attn from source:
git clone https://github.com/Dao-AILab/flash-attention
cd flash-attention && pip install . --no-build-isolation   # flash-attn's setup imports torch at build time, so build against the already-installed torch
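A quick import check confirms the extension compiled and links against the installed torch:
python3 -c "import flash_attn; print(flash_attn.__version__)"
If the source build is too slow, the flash-attention README also supports installing the prebuilt wheels directly with pip install flash-attn --no-build-isolation.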
# The installs below are optional and can be quite slow to build.
# pip install csrc/layer_norm
# pip install csrc/rotary
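Finally, a minimal smoke test of the kernel itself, assuming a CUDA-capable GPU is available. flash-attn expects fp16/bf16 tensors of shape (batch, seqlen, nheads, headdim); the sizes below are arbitrary:
python3 - <<'EOF'
import torch
from flash_attn import flash_attn_func

# Random half-precision tensors on the GPU; flash-attn does not support fp32.
q = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

out = flash_attn_func(q, k, v, causal=True)  # causal self-attention
print(out.shape)  # torch.Size([1, 128, 8, 64])
EOF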