{
"cells": [
{
"cell_type": "markdown",
"id": "6e6981ab-2d9a-4280-923f-235a166855ba",
"metadata": {},
"source": [
"# Fine-Tuning Qwen-Chat Large Language Model (Multiple GPUs)\n",
"\n",
"Tongyi Qianwen is a large language model developed by Alibaba Cloud based on the Transformer architecture, trained on an extensive set of pre-training data. The pre-training data is diverse and covers a wide range, including a large amount of internet text, specialized books, code, etc. In addition, an AI assistant called Qwen-Chat has been created based on the pre-trained model using alignment mechanism.\n",
"\n",
"This notebook uses Qwen-1.8B-Chat as an example to introduce how to fine-tune the Qianwen model using Deepspeed.\n",
"\n",
"## Environment Requirements\n",
"\n",
"Please refer to **requirements.txt** to install the required dependencies.\n",
"\n",
"## Preparation\n",
"\n",
"### Download Qwen-1.8B-Chat\n",
"\n",
"First, download the model files. You can choose to download directly from ModelScope."
"Download the data required for training; here, we provide a tiny dataset as an example. It is sampled from [Belle](https://github.com/LianjiaTech/BELLE).\n",
"\n",
"Disclaimer: the dataset can be only used for the research purpose."