DeepSpeedExamples/applications/DeepSpeed-Chat at master · microsoft/DeepSpeedExamples · GitHub
Step 1, SFT, is omitted here.
Step 2 is Reward Model training. Installing deepspeed raised an error along the way; the following blog post describes the fix:
[linux] No such file or directory ‘:/usr/local/cuda/bin/nvcc‘_心心喵的博客-CSDN博客
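The leading `:` in the path from that error message suggests that an environment variable such as `CUDA_HOME` was built by appending to an empty value. That is an assumption based on the message alone, not something the linked post is quoted as saying. Under that assumption, a minimal sketch of a check before reinstalling deepspeed might look like the following. `strip_leading_colons` is a hypothetical helper written for this note, and `/usr/local/cuda` is only the default location implied by the error message.

```shell
# Hypothetical helper: remove any leading ':' characters from a path-like value.
strip_leading_colons() {
  local value="$1"
  while [ "${value#:}" != "$value" ]; do
    value="${value#:}"
  done
  printf '%s\n' "$value"
}

# Assumption: the stray ':' came from CUDA_HOME; adjust if another variable is at fault.
CUDA_HOME="$(strip_leading_colons "${CUDA_HOME:-}")"
# Assumption: fall back to /usr/local/cuda, the path shown in the error message.
export CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"

# Report whether nvcc is actually present at the cleaned location.
if [ -x "$CUDA_HOME/bin/nvcc" ]; then
  echo "nvcc found at $CUDA_HOME/bin/nvcc"
else
  echo "nvcc not found under $CUDA_HOME; check the CUDA install path"
fi
```

If the stray `:` comes from a line in `~/.bashrc`, fixing that line is the lasting remedy; the snippet above only cleans the value for the current shell session.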
2. Reward Model
pip install transformers --use-feature=2020-resolver
pip install datasets
pip install -r requirements.txt
# Move into the second step of the pipeline
cd training/step2_reward_model_finetuning
# Run the training script
bash training_scripts/s