{ "cells": [ { "cell_type": "markdown", "id": "6e6981ab-2d9a-4280-923f-235a166855ba", "metadata": {}, "source": [ "# LoRA Fine-Tuning Qwen-Chat Large Language Model (Single GPU)\n", "\n", "Tongyi Qianwen is a large language model developed by Alibaba Cloud based on the Transformer architecture, trained on an extensive set of pre-training data. The pre-training data is diverse and covers a wide range, including a large amount of internet text, specialized books, code, etc. In addition, an AI assistant called Qwen-Chat has been created based on the pre-trained model using alignment mechanism.\n", "\n", "This notebook uses Qwen-1.8B-Chat as an example to introduce how to LoRA fine-tune the Qianwen model using Deepspeed.\n", "\n", "## Environment Requirements\n", "\n", "Please refer to **requirements.txt** to install the required dependencies.\n", "\n", "## Preparation\n", "\n", "### Download Qwen-1.8B-Chat\n", "\n", "First, download the model files. You can choose to download directly from ModelScope." ] }, { "cell_type": "code", "execution_count": null, "id": "248488f9-4a86-4f35-9d56-50f8e91a8f11", "metadata": { "ExecutionIndicator": { "show": true }, "tags": [] }, "outputs": [], "source": [ "from modelscope.hub.snapshot_download import snapshot_download\n", "model_dir = snapshot_download('Qwen/Qwen-1_8B-Chat', cache_dir='.', revision='master')" ] }, { "attachments": {}, "cell_type": "markdown", "id": "7b2a92b1-f08e-4413-9f92-8f23761e6e1f", "metadata": {}, "source": [ "### Download Example Training Data\n", "\n", "Download the data required for training; here, we provide a tiny dataset as an example. It is sampled from [Belle](https://github.com/LianjiaTech/BELLE).\n", "\n", "Disclaimer: the dataset can be only used for the research purpose." ] }, { "cell_type": "code", "execution_count": null, "id": "ce195f08-fbb2-470e-b6c0-9a03457458c7", "metadata": { "tags": [] }, "outputs": [], "source": [ "!wget https://atp-modelzoo-sh.oss-cn-shanghai.aliyuncs.com/release/tutorials/qwen_recipes/Belle_sampled_qwen.json" ] }, { "cell_type": "markdown", "id": "7226bed0-171b-4d45-a3f9-b3d81ec2bb9f", "metadata": {}, "source": [ "You can also refer to this format to prepare the dataset. Below is a simple example list with 1 sample:\n", "\n", "```json\n", "[\n", " {\n", " \"id\": \"identity_0\",\n", " \"conversations\": [\n", " {\n", " \"from\": \"user\",\n", " \"value\": \"你好\"\n", " },\n", " {\n", " \"from\": \"assistant\",\n", " \"value\": \"我是一个语言模型,我叫通义千问。\"\n", " }\n", " ]\n", " }\n", "]\n", "```\n", "\n", "You can also use multi-turn conversations as the training set. Here is a simple example:\n", "\n", "```json\n", "[\n", " {\n", " \"id\": \"identity_0\",\n", " \"conversations\": [\n", " {\n", " \"from\": \"user\",\n", " \"value\": \"你好,能告诉我遛狗的最佳时间吗?\"\n", " },\n", " {\n", " \"from\": \"assistant\",\n", " \"value\": \"当地最佳遛狗时间因地域差异而异,请问您所在的城市是哪里?\"\n", " },\n", " {\n", " \"from\": \"user\",\n", " \"value\": \"我在纽约市。\"\n", " },\n", " {\n", " \"from\": \"assistant\",\n", " \"value\": \"纽约市的遛狗最佳时间通常在早晨6点至8点和晚上8点至10点之间,因为这些时间段气温较低,遛狗更加舒适。但具体时间还需根据气候、气温和季节变化而定。\"\n", " }\n", " ]\n", " }\n", "]\n", "```\n", "\n", "## Fine-Tune the Model\n", "\n", "You can directly run the prepared training script to fine-tune the model." ] }, { "cell_type": "code", "execution_count": null, "id": "7ab0581e-be85-45e6-a5b7-af9c42ea697b", "metadata": { "ExecutionIndicator": { "show": true }, "tags": [] }, "outputs": [], "source": [ "!export CUDA_VISIBLE_DEVICES=0\n", "!python ../../finetune.py \\\n", " --model_name_or_path \"Qwen/Qwen-1_8B-Chat/\"\\\n", " --data_path \"Belle_sampled_qwen.json\"\\\n", " --bf16 \\\n", " --output_dir \"output_qwen\" \\\n", " --num_train_epochs 5 \\\n", " --per_device_train_batch_size 1 \\\n", " --per_device_eval_batch_size 1 \\\n", " --gradient_accumulation_steps 16 \\\n", " --evaluation_strategy \"no\" \\\n", " --save_strategy \"steps\" \\\n", " --save_steps 1000 \\\n", " --save_total_limit 10 \\\n", " --learning_rate 1e-5 \\\n", " --weight_decay 0.1 \\\n", " --adam_beta2 0.95 \\\n", " --warmup_ratio 0.01 \\\n", " --lr_scheduler_type \"cosine\" \\\n", " --logging_steps 1 \\\n", " --report_to \"none\" \\\n", " --model_max_length 512 \\\n", " --gradient_checkpointing \\\n", " --lazy_preprocess \\\n", " --use_lora" ] }, { "cell_type": "markdown", "id": "5e6f28aa-1772-48ce-aa15-8cf29e7d67b5", "metadata": {}, "source": [ "## Merge Weights\n", "\n", "The training of both LoRA and Q-LoRA only saves the adapter parameters. You can load the fine-tuned model and merge weights as shown below:" ] }, { "cell_type": "code", "execution_count": null, "id": "4fd5ef2a-34f9-4909-bebe-7b3b086fd16a", "metadata": { "ExecutionIndicator": { "show": true }, "tags": [] }, "outputs": [], "source": [ "from transformers import AutoModelForCausalLM\n", "from peft import PeftModel\n", "import torch\n", "\n", "model = AutoModelForCausalLM.from_pretrained(\"Qwen/Qwen-1_8B-Chat/\", torch_dtype=torch.float16, device_map=\"auto\", trust_remote_code=True)\n", "model = PeftModel.from_pretrained(model, \"output_qwen/\")\n", "merged_model = model.merge_and_unload()\n", "merged_model.save_pretrained(\"output_qwen_merged\", max_shard_size=\"2048MB\", safe_serialization=True)" ] }, { "cell_type": "markdown", "id": "2e3f5b9f-63a1-4599-8d9b-a8d8f764838f", "metadata": {}, "source": [ "The tokenizer files are not saved in the new directory in this step. You can copy the tokenizer files or use the following code:" ] }, { "cell_type": "code", "execution_count": null, "id": "10fa5ea3-dd55-4901-86af-c045d4c56533", "metadata": { "tags": [] }, "outputs": [], "source": [ "from transformers import AutoTokenizer\n", "\n", "tokenizer = AutoTokenizer.from_pretrained(\n", " \"Qwen/Qwen-1_8B-Chat/\",\n", " trust_remote_code=True\n", ")\n", "\n", "tokenizer.save_pretrained(\"output_qwen_merged\")" ] }, { "cell_type": "markdown", "id": "804b84d8", "metadata": {}, "source": [ "## Test the Model\n", "\n", "After merging the weights, we can test the model as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from transformers import AutoModelForCausalLM, AutoTokenizer\n", "from transformers.generation import GenerationConfig\n", "\n", "tokenizer = AutoTokenizer.from_pretrained(\"output_qwen_merged\", trust_remote_code=True)\n", "model = AutoModelForCausalLM.from_pretrained(\n", " \"output_qwen_merged\",\n", " device_map=\"auto\",\n", " trust_remote_code=True\n", ").eval()\n", "\n", "response, history = model.chat(tokenizer, \"你好\", history=None)\n", "print(response)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.13" } }, "nbformat": 4, "nbformat_minor": 5 }