5. **Support of Plugins**. Qwen-7B-Chat is trained with plugin-related alignment data, so it can use tools such as APIs, models, and databases, and it can act as an agent.
The following sections include information that you might find helpful. In particular, we advise you to read the FAQ section before opening an issue.
<br><br>
## News and Updates
* 2023.9.12 We now support finetuning of the Qwen-7B models, including full-parameter finetuning, LoRA, and Q-LoRA.
* 2023.8.21 We release the Int4 quantized model for Qwen-7B-Chat, **Qwen-7B-Chat-Int4**, which lowers memory costs and improves inference speed, with no significant performance degradation on benchmark evaluation.
* 2023.8.3 We release both **Qwen-7B** and **Qwen-7B-Chat** on ModelScope and Hugging Face. We also provide a technical memo for more details about the model, including training details and model performance.
<br>
## Performance
Additionally, according to the third-party evaluation of large language models conducted by [OpenCompass](https://opencompass.org.cn/leaderboard-llm), Qwen-7B and Qwen-7B-Chat are the top 7B-parameter models. This evaluation covers a large number of public benchmarks for language understanding and generation, coding, mathematics, reasoning, etc.
For more experimental results (detailed model performance on more benchmark datasets) and details, please refer to our technical memo by clicking [here](tech_memo.md).
<br><br>
## Requirements
## Tokenizer
Our tokenizer, based on tiktoken, is different from other tokenizers, e.g., the SentencePiece tokenizer. You need to pay attention to special tokens, especially in finetuning. For more detailed information on the tokenizer and its use in finetuning, please refer to the [documentation](tokenization_note.md).
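As a quick sanity check, you can load the tokenizer through `transformers` and see how ordinary text is tokenized; a minimal sketch (the example string is arbitrary, and the comments reflect the default behaviour we assume, i.e., no special tokens are inserted automatically):

```python
from transformers import AutoTokenizer

# The tiktoken-based Qwen tokenizer ships as custom code on the Hub, hence trust_remote_code.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B", trust_remote_code=True)

ids = tokenizer("Hello, Qwen!")["input_ids"]
print(ids)                    # plain BPE token ids; no special tokens are added by default
print(tokenizer.decode(ids))  # should round-trip back to "Hello, Qwen!"
```

Special tokens such as `<|endoftext|>` are therefore something you handle explicitly, e.g., when preparing finetuning data; see the linked note for details.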
<br><br>
## Quantization
We also profile the peak GPU memory usage for encoding 2048 tokens as context (and generating a single token) and for generating 8192 tokens (with a single token as context), respectively. The results are shown below.

| Quantization | Peak Usage for Encoding 2048 Tokens | Peak Usage for Generating 8192 Tokens |
| :----------: | :---------------------------------: | :-----------------------------------: |
| Int4 | 8.21GB | 13.62GB |
The above speed and memory profiling are conducted using [this script](https://qianwen-res.oss-cn-beijing.aliyuncs.com/profile.py).
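To try the quantized model yourself, **Qwen-7B-Chat-Int4** can be loaded in much the same way as the BF16 checkpoint; a minimal sketch, assuming the GPTQ dependencies (e.g., `auto-gptq` and `optimum`) are installed and using an illustrative prompt:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-Chat-Int4", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-7B-Chat-Int4",
    device_map="auto",        # place the quantized weights on the available GPU(s)
    trust_remote_code=True,
).eval()

# chat() is the convenience method exposed by the Qwen remote code for chat models
response, history = model.chat(tokenizer, "Hi, who are you?", history=None)
print(response)
```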
<br><br>
## Finetuning
If you train with LoRA or Q-LoRA, only the adapter parameters are saved, so the finetuned model has to be loaded together with the base model, e.g.:

```python
from peft import AutoPeftModelForCausalLM

model = AutoPeftModelForCausalLM.from_pretrained(
    path_to_adapter,  # path to the output directory of your finetuning run
    device_map="auto",
    trust_remote_code=True
).eval()
```
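If you would like a standalone checkpoint that does not require `peft` at inference time, a plain LoRA adapter can usually be merged back into the base weights (this does not apply to Q-LoRA, whose base model is quantized); a sketch, with an assumed output directory name:

```python
# Fold the LoRA weights into the base model and save a regular checkpoint.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("qwen-7b-finetuned-merged", safe_serialization=True)
```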
The shell scripts use `torchrun` to run single-GPU or multi-GPU training. For multi-GPU training, you need to specify the proper hyperparameters for distributed training based on your machine.
Function calling is also supported (but only when `stream=False` for the moment). See the [example usage](examples/function_call_examples.py).
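As a rough illustration of such a call against the OpenAI-compatible API deployment, assuming the legacy `openai` Python client (<1.0); the server address, model name, and the weather function schema below are assumptions made for this sketch, not fixed values of the repo:

```python
import openai

openai.api_base = "http://localhost:8000/v1"  # assumed address of the local API server
openai.api_key = "none"

# A hypothetical tool definition in the standard OpenAI function-calling schema.
functions = [{
    "name": "get_current_weather",
    "description": "Get the current weather in a given location",
    "parameters": {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
}]

response = openai.ChatCompletion.create(
    model="Qwen-7B",
    messages=[{"role": "user", "content": "What is the weather like in Beijing?"}],
    functions=functions,
    stream=False,  # function calling currently requires stream=False
)
print(response.choices[0].message)
```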
<br><br>
## Deployment
If you would like to run the model on more than one GPU, you can load it with the `load_model_on_gpus` helper shipped with this repo:

```python
from utils import load_model_on_gpus

model = load_model_on_gpus('Qwen/Qwen-7B-Chat', num_gpus=2)
```
Then you can run the 7B chat model on 2 GPUs using the above scripts.
<br><br>
## Tool Usage
## Reproduction
We provide scripts to reproduce the model performance on benchmark datasets. Check [eval/EVALUATION.md](eval/EVALUATION.md) for more information. Note that the reproduction may lead to slight differences from our reported results.
<br><br>
## FAQ
If you run into problems, please consult the [FAQ](FAQ.md) and existing issues to search for a solution before opening a new issue.
<br><br>
## License Agreement
Researchers and developers are free to use the code and model weights of both Qwen-7B and Qwen-7B-Chat. We also allow commercial use. Check our license at [LICENSE](LICENSE) for more details. If you intend to use the models commercially, please fill out the [form](https://dashscope.console.aliyun.com/openModelApply/qianwen) to apply.