@ -29,10 +29,11 @@ In brief, we have strong base language models, which have been stably pretrained
In this repo, you can figure out:
In this repo, you can figure out:
* Quickstart with Qwen, and enjoy the simple inference.
* Quickstart with Qwen, and enjoy the simple inference.
* Details about the quantization models, including usage, memory, inference speed. For comparison, we also provide the statistics of the BF16 models.
* Details about the quantization models, including GPTQ and KV cache quantization.
* Statistics of inference performance, including speed and memory.
* Tutorials on finetuning, including full-parameter tuning, LoRA, and Q-LoRA.
* Tutorials on finetuning, including full-parameter tuning, LoRA, and Q-LoRA.
* Instructions on building demos, including WebUI, CLI demo, etc.
* Instructions on building demos, including WebUI, CLI demo, etc.
* Instructions on building an OpenAI-style API for your model.
* Introduction to DashScope API service, as well as the instructions on building an OpenAI-style API for your model.
* Information about Qwen for tool use, agent, and code interpreter
* Information about Qwen for tool use, agent, and code interpreter
* Statistics of long-context understanding evaluation
* Statistics of long-context understanding evaluation
* License agreement
* License agreement
@ -556,14 +557,15 @@ The finetuning scripts allow you to perform:
- LoRA
- LoRA
- Q-LoRA
- Q-LoRA
Full-parameter parameter finetuning requires updating all parameters in the whole training process. To launch your training, run the following script:
Full-parameter finetuning requires updating all parameters in the whole training process. To launch your training, run the following script:
```bash
```bash
# Distributed training. We do not provide single-GPU training script as the insufficient GPU memory will break down the training.
# Distributed training. We do not provide single-GPU training script as the insufficient GPU memory will break down the training.
sh finetune/finetune_ds.sh
sh finetune/finetune_ds.sh
```
```
Remember to specify the correct model name or path, the data path, as well as the output directory in the shell scripts. Another thing to notice is that we use DeepSpeed ZeRO 3 in this script. If you want to make changes, just remove the argument `--deepspeed` or make changes in the DeepSpeed configuration json file based on your requirements. Additionally, this script supports mixed-precision training, and thus you can use `--bf16 True` or `--fp16 True`. Empirically we advise you to use bf16 to make your training consistent with our pretraining and alignment if your machine supports bf16, and thus we use it by default.
Remember to specify the correct model name or path, the data path, as well as the output directory in the shell scripts. Another thing to notice is that we use DeepSpeed ZeRO 3 in this script. If you want to make changes, just remove the argument `--deepspeed` or make changes in the DeepSpeed configuration json file based on your requirements. Additionally, this script supports mixed-precision training, and thus you can use `--bf16 True` or `--fp16 True`. Remember to use DeepSpeed when you use fp16 due to mixed precision training.
Empirically we advise you to use bf16 to make your training consistent with our pretraining and alignment if your machine supports bf16, and thus we use it by default.
Similarly, to run LoRA, use another script to run as shown below. Before you start, make sure that you have installed `peft`. Also, you need to specify your paths to your model, data, and output. We advise you to use absolute path for your pretrained model. This is because LoRA only saves the adapter and the absolute path in the adapter configuration json file is used for finding out the pretrained model to load. Also, this script support both bf16 and fp16.
Similarly, to run LoRA, use another script to run as shown below. Before you start, make sure that you have installed `peft`. Also, you need to specify your paths to your model, data, and output. We advise you to use absolute path for your pretrained model. This is because LoRA only saves the adapter and the absolute path in the adapter configuration json file is used for finding out the pretrained model to load. Also, this script support both bf16 and fp16.
@ -831,7 +833,7 @@ If you suffer from lack of GPU memory and you would like to run the model on mor
## Tool Usage
## Tool Usage
Qwen-Chat has been optimized for tool usage and function calling capabilities. Users can develop agents, LangChain applications, and even agument Qwen with a Python Code Interpreter.
Qwen-Chat has been optimized for tool usage and function calling capabilities. Users can develop agents, LangChain applications, and even augment Qwen with a Python Code Interpreter.
We provide documentation on how to implement tool calls based on the principle of ReAct Prompting, please refer to [the ReAct example](examples/react_prompt.md). Based on this principle, we provide support for function calling in [openai_api.py](openai_api.py).
We provide documentation on how to implement tool calls based on the principle of ReAct Prompting, please refer to [the ReAct example](examples/react_prompt.md). Based on this principle, we provide support for function calling in [openai_api.py](openai_api.py).
@ -543,7 +544,7 @@ model = AutoModelForCausalLM.from_pretrained(
sh finetune/finetune_ds.sh
sh finetune/finetune_ds.sh
```
```
尤其注意,你需要在脚本中指定正确的模型名称或路径、数据路径、以及模型输出的文件夹路径。在这个脚本中我们使用了DeepSpeed ZeRO 3。如果你想修改这个配置,可以删除掉`--deepspeed`这个输入或者自行根据需求修改DeepSpeed配置json文件。此外,我们支持混合精度训练,因此你可以设置`--bf16 True`或者`--fp16 True`。经验上,如果你的机器支持bf16,我们建议使用bf16,这样可以和我们的预训练和对齐训练保持一致,这也是为什么我们把默认配置设为它的原因。
尤其注意,你需要在脚本中指定正确的模型名称或路径、数据路径、以及模型输出的文件夹路径。在这个脚本中我们使用了DeepSpeed ZeRO 3。如果你想修改这个配置,可以删除掉`--deepspeed`这个输入或者自行根据需求修改DeepSpeed配置json文件。此外,我们支持混合精度训练,因此你可以设置`--bf16 True`或者`--fp16 True`。在使用fp16时,请使用DeepSpeed支持混合精度训练。经验上,如果你的机器支持bf16,我们建议使用bf16,这样可以和我们的预训练和对齐训练保持一致,这也是为什么我们把默认配置设为它的原因。