{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Qwen Quick Start Notebook\n",
"\n",
"This notebook shows how to train and infer the Qwen-7B-Chat model on a single GPU. Similarly, Qwen-1.8B-Chat, Qwen-14B-Chat can also be leveraged for the following steps. We only need to modify the corresponding `model name` and hyper-parameters. The training and inference of Qwen-72B-Chat requires higher GPU requirements and larger disk space."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Requirements\n",
"- Python 3.8 and above\n",
"- Pytorch 1.12 and above, 2.0 and above are recommended\n",
"- CUDA 11.4 and above are recommended (this is for GPU users, flash-attention users, etc.)\n",
"We test the training of the model on an A10 GPU (24GB)."
]
},
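{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before moving on, you can run a quick check of your environment against these requirements. This is a minimal, optional sketch (it assumes PyTorch is already installed):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import sys\n",
"\n",
"import torch\n",
"\n",
"# Minimal check of the environment against the requirements above.\n",
"print(\"Python:\", sys.version.split()[0])\n",
"print(\"PyTorch:\", torch.__version__)\n",
"print(\"CUDA available:\", torch.cuda.is_available())\n",
"if torch.cuda.is_available():\n",
"    print(\"CUDA version:\", torch.version.cuda)\n",
"    print(\"GPU:\", torch.cuda.get_device_name(0))"
]
},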
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Extra\n",
"If you need to speed up, you can install `flash-attention`. The details of the installation can be found [here](https://github.com/Dao-AILab/flash-attention)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!git clone https://github.com/Dao-AILab/flash-attention\n",
"!cd flash-attention && pip install .\n",
"# Below are optional. Installing them might be slow.\n",
"# !pip install csrc/layer_norm\n",
"# If the version of flash-attn is higher than 2.1.1, the following is not needed.\n",
"# !pip install csrc/rotary"
]
},
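{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you installed `flash-attention`, you can optionally confirm that it imports correctly (a minimal sanity check):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional: confirm that flash-attn was installed successfully.\n",
"import flash_attn\n",
"\n",
"print(flash_attn.__version__)"
]
},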
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Step 0: Install Package Requirements"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!pip install transformers>=4.32.0 accelerate tiktoken einops scipy transformers_stream_generator==0.0.4 peft deepspeed modelscope"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Step 1: Download Model\n",
"When using `transformers` in some regions, the model cannot be automatically downloaded due to network problems. We recommend using `modelscope` to download the model first, and then use `transformers` for inference."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from modelscope import snapshot_download\n",
"\n",
"# Downloading model checkpoint to a local dir model_dir.\n",
"model_dir = snapshot_download('Qwen/Qwen-7B-Chat', cache_dir='.', revision='master')"
]
},
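{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can confirm that the checkpoint was downloaded by inspecting the returned directory (an optional sanity check):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"# snapshot_download returns the local path of the checkpoint directory.\n",
"print(model_dir)\n",
"print(sorted(os.listdir(model_dir)))"
]
},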
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Step 2: Direct Model Inference \n",
"We recommend two ways to do model inference: `modelscope` and `transformers`.\n",
"\n",
"#### 2.1 Model Inference with ModelScope"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecutionIndicator": {
"show": true
},
"tags": []
},
"outputs": [],
"source": [
"from modelscope import AutoModelForCausalLM, AutoTokenizer\n",
"from modelscope import GenerationConfig\n",
"\n",
"# Note: The default behavior now has injection attack prevention off.\n",
"tokenizer = AutoTokenizer.from_pretrained(\"Qwen/Qwen-7B-Chat/\", trust_remote_code=True)\n",
"\n",
"# use bf16\n",
"# model = AutoModelForCausalLM.from_pretrained(\"qwen/Qwen-7B-Chat/\", device_map=\"auto\", trust_remote_code=True, bf16=True).eval()\n",
"# use fp16\n",
"# model = AutoModelForCausalLM.from_pretrained(\"qwen/Qwen-7B-Chat/\", device_map=\"auto\", trust_remote_code=True, fp16=True).eval()\n",
"# use cpu only\n",
"# model = AutoModelForCausalLM.from_pretrained(\"qwen/Qwen-7B-Chat/\", device_map=\"cpu\", trust_remote_code=True).eval()\n",
"# use auto mode, automatically select precision based on the device.\n",
"model = AutoModelForCausalLM.from_pretrained(\"Qwen/Qwen-7B-Chat/\", device_map=\"auto\", trust_remote_code=True).eval()\n",
"\n",
"# Specify hyperparameters for generation. But if you use transformers>=4.32.0, there is no need to do this.\n",
"# model.generation_config = GenerationConfig.from_pretrained(\"Qwen/Qwen-7B-Chat/\", trust_remote_code=True) # 可指定不同的生成长度、top_p等相关超参\n",
"\n",
"# 第一轮对话 1st dialogue turn\n",
"response, history = model.chat(tokenizer, \"你好\", history=None)\n",
"print(response)\n",
"# 你好!很高兴为你提供帮助。\n",
"\n",
"# 第二轮对话 2nd dialogue turn\n",
"response, history = model.chat(tokenizer, \"给我讲一个年轻人奋斗创业最终取得成功的故事。\", history=history)\n",
"print(response)\n",
"# 这是一个关于一个年轻人奋斗创业最终取得成功的故事。\n",
"# 故事的主人公叫李明,他来自一个普通的家庭,父母都是普通的工人。从小,李明就立下了一个目标:要成为一名成功的企业家。\n",
"# 为了实现这个目标,李明勤奋学习,考上了大学。在大学期间,他积极参加各种创业比赛,获得了不少奖项。他还利用课余时间去实习,积累了宝贵的经验。\n",
"# 毕业后,李明决定开始自己的创业之路。他开始寻找投资机会,但多次都被拒绝了。然而,他并没有放弃。他继续努力,不断改进自己的创业计划,并寻找新的投资机会。\n",
"# 最终,李明成功地获得了一笔投资,开始了自己的创业之路。他成立了一家科技公司,专注于开发新型软件。在他的领导下,公司迅速发展起来,成为了一家成功的科技企业。\n",
"# 李明的成功并不是偶然的。他勤奋、坚韧、勇于冒险,不断学习和改进自己。他的成功也证明了,只要努力奋斗,任何人都有可能取得成功。\n",
"\n",
"# 第三轮对话 3rd dialogue turn\n",
"response, history = model.chat(tokenizer, \"给这个故事起一个标题\", history=history)\n",
"print(response)\n",
"# 《奋斗创业:一个年轻人的成功之路》"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 2.2 Model Inference with transformers"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecutionIndicator": {
"show": true
},
"tags": []
},
"outputs": [],
"source": [
"from transformers import AutoModelForCausalLM, AutoTokenizer\n",
"from transformers.generation import GenerationConfig\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"Qwen/Qwen-7B-Chat/\", trust_remote_code=True)\n",
"\n",
"# use bf16\n",
"# model = AutoModelForCausalLM.from_pretrained(\"Qwen/Qwen-7B-Chat/\", device_map=\"auto\", trust_remote_code=True, bf16=True).eval()\n",
"# use fp16\n",
"# model = AutoModelForCausalLM.from_pretrained(\"Qwen/Qwen-7B-Chat/\", device_map=\"auto\", trust_remote_code=True, fp16=True).eval()\n",
"# use cpu only\n",
"# model = AutoModelForCausalLM.from_pretrained(\"Qwen/Qwen-7B-Chat/\", device_map=\"cpu\", trust_remote_code=True).eval()\n",
"# use auto mode, automatically select precision based on the device.\n",
"model = AutoModelForCausalLM.from_pretrained(\n",
" \"Qwen/Qwen-7B-Chat/\",\n",
" device_map=\"auto\",\n",
" trust_remote_code=True\n",
").eval()\n",
"\n",
"# Specify hyperparameters for generation. But if you use transformers>=4.32.0, there is no need to do this.\n",
"# model.generation_config = GenerationConfig.from_pretrained(\"Qwen/Qwen-7B-Chat/\", trust_remote_code=True)\n",
"\n",
"# 1st dialogue turn\n",
"response, history = model.chat(tokenizer, \"你好\", history=None)\n",
"print(response)\n",
"# 你好!很高兴为你提供帮助。\n",
"\n",
"# 2nd dialogue turn\n",
"response, history = model.chat(tokenizer, \"给我讲一个年轻人奋斗创业最终取得成功的故事。\", history=history)\n",
"print(response)\n",
"# 这是一个关于一个年轻人奋斗创业最终取得成功的故事。\n",
"# 故事的主人公叫李明,他来自一个普通的家庭,父母都是普通的工人。从小,李明就立下了一个目标:要成为一名成功的企业家。\n",
"# 为了实现这个目标,李明勤奋学习,考上了大学。在大学期间,他积极参加各种创业比赛,获得了不少奖项。他还利用课余时间去实习,积累了宝贵的经验。\n",
"# 毕业后,李明决定开始自己的创业之路。他开始寻找投资机会,但多次都被拒绝了。然而,他并没有放弃。他继续努力,不断改进自己的创业计划,并寻找新的投资机会。\n",
"# 最终,李明成功地获得了一笔投资,开始了自己的创业之路。他成立了一家科技公司,专注于开发新型软件。在他的领导下,公司迅速发展起来,成为了一家成功的科技企业。\n",
"# 李明的成功并不是偶然的。他勤奋、坚韧、勇于冒险,不断学习和改进自己。他的成功也证明了,只要努力奋斗,任何人都有可能取得成功。\n",
"\n",
"# 3rd dialogue turn\n",
"response, history = model.chat(tokenizer, \"给这个故事起一个标题\", history=history)\n",
"print(response)\n",
"# 《奋斗创业:一个年轻人的成功之路》"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Step 3: LoRA Fine-Tuning Model (Single GPU)\n",
"\n",
"#### 3.1 Download Example Training Data\n",
"Download the data required for training; here, we provide a tiny dataset as an example. It is sampled from [Belle](https://github.com/LianjiaTech/BELLE)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!wget https://atp-modelzoo-sh.oss-cn-shanghai.aliyuncs.com/release/tutorials/qwen_recipes/Belle_sampled_qwen.json"
]
},
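{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can load the downloaded file to see how the samples are structured (a minimal sketch; it assumes the `wget` above succeeded):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"# Peek at the example training data downloaded above.\n",
"with open(\"Belle_sampled_qwen.json\", \"r\", encoding=\"utf-8\") as f:\n",
"    data = json.load(f)\n",
"\n",
"print(f\"{len(data)} samples\")\n",
"print(json.dumps(data[0], ensure_ascii=False, indent=2))"
]
},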
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can refer to this format to prepare the dataset. Below is a simple example list with 1 sample:\n",
"\n",
"```json\n",
"[\n",
" {\n",
" \"id\": \"identity_0\",\n",
" \"conversations\": [\n",
" {\n",
" \"from\": \"user\",\n",
" \"value\": \"你好\"\n",
" },\n",
" {\n",
" \"from\": \"assistant\",\n",
" \"value\": \"我是一个语言模型,我叫通义千问。\"\n",
" }\n",
" ]\n",
" }\n",
"]\n",
"```\n",
"\n",
"You can also use multi-turn conversations as the training set. Here is a simple example:\n",
"\n",
"```json\n",
"[\n",
" {\n",
" \"id\": \"identity_0\",\n",
" \"conversations\": [\n",
" {\n",
" \"from\": \"user\",\n",
" \"value\": \"你好\"\n",
" },\n",
" {\n",
" \"from\": \"assistant\",\n",
" \"value\": \"你好我是一名AI助手我叫通义千问有需要请告诉我。\"\n",
" },\n",
" {\n",
" \"from\": \"user\",\n",
" \"value\": \"你都能做什么\"\n",
" },\n",
" {\n",
" \"from\": \"assistant\",\n",
" \"value\": \"我能做很多事情,包括但不限于回答各种领域的问题、提供实用建议和指导、进行多轮对话交流、文本生成等。\"\n",
" }\n",
" ]\n",
" }\n",
"]\n",
"```\n",
"\n",
"#### 3.2 Fine-Tune the Model\n",
"\n",
"You can directly run the prepared training script to fine-tune the model. Remember to check `model_name_or_path`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!python ../finetune/deepspeed/finetune.py \\\n",
" --model_name_or_path \"Qwen/Qwen-7B-Chat/\"\\\n",
" --data_path \"Belle_sampled_qwen.json\"\\\n",
" --bf16 \\\n",
" --output_dir \"output_qwen\" \\\n",
" --num_train_epochs 5 \\\n",
" --per_device_train_batch_size 1 \\\n",
" --per_device_eval_batch_size 1 \\\n",
" --gradient_accumulation_steps 16 \\\n",
" --evaluation_strategy \"no\" \\\n",
" --save_strategy \"steps\" \\\n",
" --save_steps 1000 \\\n",
" --save_total_limit 10 \\\n",
" --learning_rate 1e-5 \\\n",
" --weight_decay 0.1 \\\n",
" --adam_beta2 0.95 \\\n",
" --warmup_ratio 0.01 \\\n",
" --lr_scheduler_type \"cosine\" \\\n",
" --logging_steps 1 \\\n",
" --report_to \"none\" \\\n",
" --model_max_length 512 \\\n",
" --gradient_checkpointing \\\n",
" --lazy_preprocess \\\n",
" --use_lora"
]
},
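{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once training finishes, you can list the output directory to see what was saved. With `--use_lora`, you should find the PEFT adapter files (such as `adapter_config.json` and the adapter weights) rather than full model weights:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"# With --use_lora, only the LoRA adapter parameters are saved here.\n",
"print(sorted(os.listdir(\"output_qwen\")))"
]
},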
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.3 Merge Weights\n",
"\n",
"LoRA training only saves the adapter parameters. You can load the fine-tuned model and merge weights as shown below:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from transformers import AutoModelForCausalLM\n",
"from peft import PeftModel\n",
"import torch\n",
"\n",
"model = AutoModelForCausalLM.from_pretrained(\"Qwen/Qwen-7B-Chat/\", torch_dtype=torch.float16, device_map=\"auto\", trust_remote_code=True)\n",
"model = PeftModel.from_pretrained(model, \"output_qwen/\")\n",
"merged_model = model.merge_and_unload()\n",
"merged_model.save_pretrained(\"output_qwen_merged\", max_shard_size=\"2048MB\", safe_serialization=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The tokenizer files are not saved in the new directory in this step. You can copy the tokenizer files or use the following code:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from transformers import AutoTokenizer\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\n",
" \"Qwen/Qwen-7B-Chat/\",\n",
" trust_remote_code=True\n",
")\n",
"\n",
"tokenizer.save_pretrained(\"output_qwen_merged\")"
]
},
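{
"cell_type": "markdown",
"metadata": {},
"source": [
"At this point `output_qwen_merged` should contain both the merged model weights and the tokenizer files; you can verify this before testing (optional):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from pathlib import Path\n",
"\n",
"# The merged weights and tokenizer files should now live side by side.\n",
"for p in sorted(Path(\"output_qwen_merged\").iterdir()):\n",
"    print(p.name)"
]
},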
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.4 Test the Model\n",
"\n",
"After merging the weights, we can test the model as follows:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from transformers import AutoModelForCausalLM, AutoTokenizer\n",
"from transformers.generation import GenerationConfig\n",
"\n",
"tokenizer = AutoTokenizer.from_pretrained(\"output_qwen_merged\", trust_remote_code=True)\n",
"model = AutoModelForCausalLM.from_pretrained(\n",
" \"output_qwen_merged\",\n",
" device_map=\"auto\",\n",
" trust_remote_code=True\n",
").eval()\n",
"\n",
"response, history = model.chat(tokenizer, \"你好\", history=None)\n",
"print(response)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.13"
},
"vscode": {
"interpreter": {
"hash": "2d58e898dde0263bc564c6968b04150abacfd33eed9b19aaa8e45c040360e146"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}