## 介绍 [SWIFT](https://github.com/modelscope/swift)(Scalable lightWeight Infrastructure for Fine-Tuning)是一个可扩展的轻量级一站式训练、推理深度学习框架。它集成了各种高效的微调方法,如LoRA、QLoRA、阿里云自研的ResTuning-Bypass等,以及开箱即用的训练推理脚本,使开发者可以在单张商业级显卡上微调推理LLM&AIGC模型。此外,SWIFT与PEFT完全兼容,使开发者可以在ModelScope模型体系中使用PEFT的能力。 ## 安装 ```shell # 设置pip全局镜像 pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/ # 安装ms-swift git clone https://github.com/modelscope/swift.git cd swift pip install -e .[llm] # 如果你想要使用deepspeed. pip install deepspeed -U # 如果你想要使用基于auto_gptq的qlora训练. (推荐, 效果优于bnb) # 支持auto_gptq的模型: `https://github.com/modelscope/swift/blob/main/docs/source/LLM/支持的模型和数据集.md#模型` # auto_gptq和cuda版本有对应关系,请按照`https://github.com/PanQiWei/AutoGPTQ#quick-installation`选择版本 pip install auto_gptq -U # 如果你想要使用基于bnb的qlora训练. pip install bitsandbytes -U # 环境对齐 (如果你运行错误, 可以跑下面的代码, 仓库使用最新环境测试) pip install -r requirements/framework.txt -U pip install -r requirements/llm.txt -U ``` ## webui使用 执行如下命令启动webui通过界面方式进行模型训练推理 ```shell swift web-ui ``` 界面示例如下 ![image](https://modelscope.oss-cn-beijing.aliyuncs.com/resource/swift_webui.jpg) ## 微调 ```python # Experimental environment: A10, 3090, V100, ... # 20GB GPU memory CUDA_VISIBLE_DEVICES=0 \ swift sft \ --model_id_or_path qwen/Qwen-7B-Chat \ --dataset blossom-math-zh \ --output_dir output \ # 使用自己的数据集 CUDA_VISIBLE_DEVICES=0 \ swift sft \ --model_id_or_path qwen/Qwen-7B-Chat \ --custom_train_dataset_path chatml.jsonl \ --output_dir output \ # 使用DDP # Experimental environment: 2 * 3090 # 2 * 23GB GPU memory CUDA_VISIBLE_DEVICES=0,1 \ NPROC_PER_NODE=2 \ swift sft \ --model_id_or_path qwen/Qwen-7B-Chat \ --dataset blossom-math-zh \ --output_dir output \ # 多机多卡 # node0 CUDA_VISIBLE_DEVICES=0,1,2,3 \ NNODES=2 \ NODE_RANK=0 \ MASTER_ADDR=127.0.0.1 \ NPROC_PER_NODE=4 \ swift sft \ --model_id_or_path qwen/Qwen-7B-Chat \ --dataset blossom-math-zh \ --output_dir output \ # node1 CUDA_VISIBLE_DEVICES=0,1,2,3 \ NNODES=2 \ NODE_RANK=1 \ MASTER_ADDR=xxx.xxx.xxx.xxx \ NPROC_PER_NODE=4 \ swift sft \ --model_id_or_path qwen/Qwen-7B-Chat \ --dataset blossom-math-zh \ --output_dir output \ ``` 更多微调方法参考[这里](https://github.com/modelscope/swift/blob/main/docs/source/LLM/LLM%E5%BE%AE%E8%B0%83%E6%96%87%E6%A1%A3.md#%E5%BE%AE%E8%B0%83) 已有微调代码示例 | 模型名称 | 训练方法 | |:-------------------|:---------------------------------------------------------------------------------------------------------------------------| | qwen_14b | [lora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b/lora_ddp_ds) | | qwen_14b | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b/qlora) | | qwen_14b | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b/qlora_ddp_ds) | | qwen_14b_chat | [lora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b_chat/lora_ddp_ds) | | qwen_14b_chat | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b_chat/qlora) | | qwen_14b_chat | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b_chat/qlora_ddp_ds) | | qwen_14b_chat_int4 | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b_chat_int4/qlora) | | qwen_14b_chat_int4 | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b_chat_int4/qlora_ddp_ds) | | qwen_14b_chat_int8 | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b_chat_int8/qlora) | | qwen_14b_chat_int8 | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_14b_chat_int8/qlora_ddp_ds) | | qwen_1_8b_chat | [full](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_1_8b_chat/full) | | qwen_1_8b_chat | [full_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_1_8b_chat/full_ddp) | | qwen_72b_chat | [lora_mp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_72b_chat/lora_mp) | | qwen_72b_chat | [lora_mp_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_72b_chat/lora_mp_ddp) | | qwen_72b_chat | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_72b_chat/qlora) | | qwen_72b_chat_int4 | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_72b_chat_int4/qlora_ddp_ds) | | qwen_72b_chat_int8 | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_72b_chat_int8/qlora_ddp_ds) | | qwen_7b | [lora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b/lora_ddp_ds) | | qwen_7b | [qlora_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b/qlora_ddp) | | qwen_7b_chat | [full](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/full) | | qwen_7b_chat | [full_freeze_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/full_freeze_ddp) | | qwen_7b_chat | [full_mp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/full_mp) | | qwen_7b_chat | [full_mp_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/full_mp_ddp) | | qwen_7b_chat | [lora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/lora) | | qwen_7b_chat | [lora_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/lora_ddp) | | qwen_7b_chat | [lora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/lora_ddp_ds) | | qwen_7b_chat | [lora_mp_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/lora_mp_ddp) | | qwen_7b_chat | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/qlora) | | qwen_7b_chat | [qlora_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/qlora_ddp) | | qwen_7b_chat | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat/qlora_ddp_ds) | | qwen_7b_chat_int4 | [qalora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat_int4/qalora) | | qwen_7b_chat_int4 | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat_int4/qlora) | | qwen_7b_chat_int4 | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat_int4/qlora_ddp_ds) | | qwen_7b_chat_int8 | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat_int8/qlora) | | qwen_7b_chat_int8 | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_7b_chat_int8/qlora_ddp_ds) | | qwen_audio_chat | [full_mp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_audio_chat/full_mp) | | qwen_audio_chat | [full_mp_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_audio_chat/full_mp_ddp) | | qwen_audio_chat | [lora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_audio_chat/lora) | | qwen_audio_chat | [lora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_audio_chat/lora_ddp_ds) | | qwen_vl | [lora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl/lora_ddp_ds) | | qwen_vl_chat | [full_mp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl_chat/full_mp) | | qwen_vl_chat | [full_mp_ddp](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl_chat/full_mp_ddp) | | qwen_vl_chat | [lora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl_chat/lora) | | qwen_vl_chat | [lora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl_chat/lora_ddp_ds) | | qwen_vl_chat | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl_chat/qlora) | | qwen_vl_chat_int4 | [qlora](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl_chat_int4/qlora) | | qwen_vl_chat_int4 | [qlora_ddp_ds](https://github.com/modelscope/swift/tree/main/examples/pytorch/llm/scripts/qwen_vl_chat_int4/qlora_ddp_ds) | ## 推理 ```python import os os.environ['CUDA_VISIBLE_DEVICES'] = '0' from swift.llm import ( get_model_tokenizer, get_template, inference, ModelType, get_default_template_type, ) from swift.utils import seed_everything model_type = ModelType.qwen_7b_chat template_type = get_default_template_type(model_type) print(f'template_type: {template_type}') # template_type: qwen kwargs = {} # kwargs['use_flash_attn'] = True # 使用flash_attn model, tokenizer = get_model_tokenizer(model_type, model_kwargs={'device_map': 'auto'}, **kwargs) # 修改max_new_tokens model.generation_config.max_new_tokens = 128 template = get_template(template_type, tokenizer) seed_everything(42) query = '浙江的省会在哪里?' response, history = inference(model, template, query) print(f'query: {query}') print(f'response: {response}') query = '这有什么好吃的?' response, history = inference(model, template, query, history) print(f'query: {query}') print(f'response: {response}') print(f'history: {history}') """Out[0] query: 浙江的省会在哪里? response: 浙江省的省会是杭州。 query: 这有什么好吃的? response: 杭州市有很多著名的美食,例如西湖醋鱼、龙井虾仁、糖醋排骨、毛血旺等。此外,还有杭州特色的点心,如桂花糕、荷花酥、艾窝窝等。 history: [('浙江的省会在哪里?', '浙江省的省会是杭州。'), ('这有什么好吃的?', '杭州市有很多著名的美食,例如西湖醋鱼、龙井虾仁、糖醋排骨、毛血旺等。此外,还有杭州特色的点心,如桂花糕、荷花酥、艾窝窝等。')] """ # 流式输出对话模板 inference(model, template, '第一个问题是什么', history, verbose=True, stream=True) """Out[1] [PROMPT]<|im_start|>system You are a helpful assistant.<|im_end|> <|im_start|>user 浙江的省会在哪里?<|im_end|> <|im_start|>assistant 浙江省的省会是杭州。<|im_end|> <|im_start|>user 这有什么好吃的?<|im_end|> <|im_start|>assistant 杭州市有很多著名的美食,例如西湖醋鱼、龙井虾仁、糖醋排骨、毛血旺等。此外,还有杭州特色的点心,如桂花糕、荷花酥、艾窝窝等。<|im_end|> <|im_start|>user 第一个问题是什么<|im_end|> <|im_start|>assistant [OUTPUT]你的第一个问题是“浙江的省会在哪里?”<|im_end|> """ ``` 更多推理使用请参考[这里](https://github.com/modelscope/swift/blob/main/docs/source/LLM/LLM%E6%8E%A8%E7%90%86%E6%96%87%E6%A1%A3.md)