Merge pull request #326 from jxst539246/main

update readme
main
Iurnem 1 year ago committed by GitHub
commit 71205c89fc
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

@ -9,7 +9,7 @@
<br> <br>
<p align="center"> <p align="center">
🤗 <a href="https://huggingface.co/Qwen">Hugging Face</a>&nbsp&nbsp | &nbsp&nbsp🤖 <a href="https://modelscope.cn/models/qwen">ModelScope<a>&nbsp&nbsp | &nbsp&nbsp 📑 <a href="QWEN_TECHNICAL_REPORT.pdf">Paper</a> &nbsp&nbsp &nbsp&nbsp🖥 <a href="https://modelscope.cn/studios/qwen/Qwen-7B-Chat-Demo/summary">Demo</a> 🤗 <a href="https://huggingface.co/Qwen">Hugging Face</a>&nbsp&nbsp | &nbsp&nbsp🤖 <a href="https://modelscope.cn/models/qwen">ModelScope<a>&nbsp&nbsp | &nbsp&nbsp 📑 <a href="https://qianwen-res.oss-cn-beijing.aliyuncs.com/QWEN_TECHNICAL_REPORT.pdf">Paper</a> &nbsp&nbsp &nbsp&nbsp🖥 <a href="https://modelscope.cn/studios/qwen/Qwen-7B-Chat-Demo/summary">Demo</a>
<br> <br>
<a href="assets/wechat.png">WeChat (微信)</a>&nbsp&nbsp &nbsp&nbsp DingTalk (钉钉) &nbsp&nbsp | &nbsp&nbsp<a href="https://discord.gg/z3GAxXZ9Ce">Discord</a>&nbsp&nbsp <a href="assets/wechat.png">WeChat (微信)</a>&nbsp&nbsp &nbsp&nbsp DingTalk (钉钉) &nbsp&nbsp | &nbsp&nbsp<a href="https://discord.gg/z3GAxXZ9Ce">Discord</a>&nbsp&nbsp
</p> </p>
@ -22,7 +22,7 @@
We opensource our **Qwen** series, now including **Qwen**, the base language models, namely **Qwen-7B** and **Qwen-14B**, as well as **Qwen-Chat**, the chat models, namely **Qwen-7B-Chat** and **Qwen-14B-Chat**. Links are on the above table. Click them and check the model cards. Also, we release the **technical report**. Please click the paper link and check it out! We opensource our **Qwen** series, now including **Qwen**, the base language models, namely **Qwen-7B** and **Qwen-14B**, as well as **Qwen-Chat**, the chat models, namely **Qwen-7B-Chat** and **Qwen-14B-Chat**. Links are on the above table. Click them and check the model cards. Also, we release the **[technical report](https://qianwen-res.oss-cn-beijing.aliyuncs.com/QWEN_TECHNICAL_REPORT.pdf)**. Please click the paper link and check it out!
In brief, we have strong base language models, which have been stably pretrained for up to 3 trillion tokens of multilingual data with a wide coverage of domains, languages (with a focus on Chinese and English), etc. They are able to achieve competitive performance on benchmark datasets. Additionally, we have chat models that are aligned with human preference based on SFT and RLHF (not released yet), which are able to chat, create content, extract information, summarize, translate, code, solve math problems, and so on, and are able to use tools, play as agents, or even play as code interpreters, etc. In brief, we have strong base language models, which have been stably pretrained for up to 3 trillion tokens of multilingual data with a wide coverage of domains, languages (with a focus on Chinese and English), etc. They are able to achieve competitive performance on benchmark datasets. Additionally, we have chat models that are aligned with human preference based on SFT and RLHF (not released yet), which are able to chat, create content, extract information, summarize, translate, code, solve math problems, and so on, and are able to use tools, play as agents, or even play as code interpreters, etc.
@ -76,7 +76,7 @@ Qwen-14B and Qwen-7B (this is the new version trained with more tokens and the c
For all compared models, we report the best scores between their official reported results and [OpenCompass](https://opencompass.org.cn/leaderboard-llm). For all compared models, we report the best scores between their official reported results and [OpenCompass](https://opencompass.org.cn/leaderboard-llm).
For more experimental results (detailed model performance on more benchmark datasets) and details, please refer to our technical report by clicking [here](TODO). For more experimental results (detailed model performance on more benchmark datasets) and details, please refer to our technical report by clicking [here](https://qianwen-res.oss-cn-beijing.aliyuncs.com/QWEN_TECHNICAL_REPORT.pdf).
<br><br> <br><br>
## Requirements ## Requirements

@ -9,7 +9,7 @@
<br> <br>
<p align="center"> <p align="center">
🤗 <a href="https://huggingface.co/Qwen">Hugging Face</a>&nbsp&nbsp | &nbsp&nbsp🤖 <a href="https://modelscope.cn/models/qwen">魔搭社区<a>&nbsp&nbsp | &nbsp&nbsp 📑 <a href="QWEN_TECHNICAL_REPORT.pdf">论文</a> &nbsp&nbsp &nbsp&nbsp🖥 <a href="https://modelscope.cn/studios/qwen/Qwen-7B-Chat-Demo/summary">Demo</a> 🤗 <a href="https://huggingface.co/Qwen">Hugging Face</a>&nbsp&nbsp | &nbsp&nbsp🤖 <a href="https://modelscope.cn/models/qwen">魔搭社区<a>&nbsp&nbsp | &nbsp&nbsp 📑 <a href="https://qianwen-res.oss-cn-beijing.aliyuncs.com/QWEN_TECHNICAL_REPORT.pdf">论文</a> &nbsp&nbsp &nbsp&nbsp🖥 <a href="https://modelscope.cn/studios/qwen/Qwen-7B-Chat-Demo/summary">Demo</a>
<br> <br>
<a href="assets/wechat.png">微信</a>&nbsp&nbsp &nbsp&nbsp 钉钉 &nbsp&nbsp | &nbsp&nbsp<a href="https://discord.gg/z3GAxXZ9Ce">Discord</a>&nbsp&nbsp <a href="assets/wechat.png">微信</a>&nbsp&nbsp &nbsp&nbsp 钉钉 &nbsp&nbsp | &nbsp&nbsp<a href="https://discord.gg/z3GAxXZ9Ce">Discord</a>&nbsp&nbsp
</p> </p>
@ -20,7 +20,7 @@
| 7B | <a href="https://modelscope.cn/models/qwen/Qwen-7B-Chat/summary">🤖 <a> <a href="https://huggingface.co/Qwen/Qwen-7B-Chat">🤗</a> | <a href="https://modelscope.cn/models/qwen/Qwen-7B-Chat-Int4/summary">🤖 <a> <a href="https://huggingface.co/Qwen/Qwen-7B-Chat-Int4">🤗</a> | <a href="https://modelscope.cn/models/qwen/Qwen-7B/summary">🤖 <a> <a href="https://huggingface.co/Qwen/Qwen-7B">🤗</a> | | 7B | <a href="https://modelscope.cn/models/qwen/Qwen-7B-Chat/summary">🤖 <a> <a href="https://huggingface.co/Qwen/Qwen-7B-Chat">🤗</a> | <a href="https://modelscope.cn/models/qwen/Qwen-7B-Chat-Int4/summary">🤖 <a> <a href="https://huggingface.co/Qwen/Qwen-7B-Chat-Int4">🤗</a> | <a href="https://modelscope.cn/models/qwen/Qwen-7B/summary">🤖 <a> <a href="https://huggingface.co/Qwen/Qwen-7B">🤗</a> |
| 14B | <a href="https://modelscope.cn/models/qwen/Qwen-14B-Chat/summary">🤖 <a> <a href="https://huggingface.co/Qwen/Qwen-14B-Chat">🤗</a> | <a href="https://modelscope.cn/models/qwen/Qwen-14B-Chat-Int4/summary">🤖 <a> <a href="https://huggingface.co/Qwen/Qwen-14B-Chat-Int4">🤗</a> | <a href="https://modelscope.cn/models/qwen/Qwen-14B/summary">🤖 <a> <a href="https://huggingface.co/Qwen/Qwen-14B">🤗</a> | | 14B | <a href="https://modelscope.cn/models/qwen/Qwen-14B-Chat/summary">🤖 <a> <a href="https://huggingface.co/Qwen/Qwen-14B-Chat">🤗</a> | <a href="https://modelscope.cn/models/qwen/Qwen-14B-Chat-Int4/summary">🤖 <a> <a href="https://huggingface.co/Qwen/Qwen-14B-Chat-Int4">🤗</a> | <a href="https://modelscope.cn/models/qwen/Qwen-14B/summary">🤖 <a> <a href="https://huggingface.co/Qwen/Qwen-14B">🤗</a> |
我们开源了**Qwen**通义千问系列工作当前开源模型的参数规模为70亿7B和140亿14B。本次开源包括基础模型**Qwen**,即**Qwen-7B**和**Qwen-14B**,以及对话模型**Qwen-Chat**,即**Qwen-7B-Chat**和**Qwen-14B-Chat**。模型链接在表格中,请点击了解详情。同时,我们公开了我们的**技术报告**,请点击上方论文链接查看。 我们开源了**Qwen**通义千问系列工作当前开源模型的参数规模为70亿7B和140亿14B。本次开源包括基础模型**Qwen**,即**Qwen-7B**和**Qwen-14B**,以及对话模型**Qwen-Chat**,即**Qwen-7B-Chat**和**Qwen-14B-Chat**。模型链接在表格中,请点击了解详情。同时,我们公开了我们的**[技术报告](https://qianwen-res.oss-cn-beijing.aliyuncs.com/QWEN_TECHNICAL_REPORT.pdf)**,请点击上方论文链接查看。
当前基础模型已经稳定训练了大规模高质量且多样化的数据覆盖多语言当前绝以中文和英文为主总量高达3万亿token。在相关基准评测中Qwen系列模型拿出非常有竞争力的表现显著超出同规模模型并紧追一系列最强的闭源模型。此外我们利用SFT和RLHF技术实现对齐从基座模型训练得到对话模型。Qwen-Chat具备聊天、文字创作、摘要、信息抽取、翻译等能力同时还具备一定的代码生成和简单数学推理的能力。在此基础上我们针对LLM对接外部系统等方面针对性地做了优化当前具备较强的工具调用能力以及最近备受关注的Code Interpreter的能力和扮演Agent的能力。 当前基础模型已经稳定训练了大规模高质量且多样化的数据覆盖多语言当前绝以中文和英文为主总量高达3万亿token。在相关基准评测中Qwen系列模型拿出非常有竞争力的表现显著超出同规模模型并紧追一系列最强的闭源模型。此外我们利用SFT和RLHF技术实现对齐从基座模型训练得到对话模型。Qwen-Chat具备聊天、文字创作、摘要、信息抽取、翻译等能力同时还具备一定的代码生成和简单数学推理的能力。在此基础上我们针对LLM对接外部系统等方面针对性地做了优化当前具备较强的工具调用能力以及最近备受关注的Code Interpreter的能力和扮演Agent的能力。
@ -42,6 +42,7 @@
## 新闻 ## 新闻
* 2023年9月25日 🔥 在魔搭社区ModelScope和Hugging Face推出**Qwen-14B**和**Qwen-14B-Cha**t模型并同步更新**Qwen-7B**和**Qwen-7B-Chat**模型。相比原版Qwen-7B新版用了更多训练数据2.4T token序列长度从2048扩展至8192。整体中文能力以及代码能力提升较多。**请确保你使用的是最新的代码和模型!** * 2023年9月25日 🔥 在魔搭社区ModelScope和Hugging Face推出**Qwen-14B**和**Qwen-14B-Cha**t模型并同步更新**Qwen-7B**和**Qwen-7B-Chat**模型。相比原版Qwen-7B新版用了更多训练数据2.4T token序列长度从2048扩展至8192。整体中文能力以及代码能力提升较多。**请确保你使用的是最新的代码和模型!**
* 2023年9月12日 支持Qwen-7B和Qwen-7B-Chat的微调其中包括全参数微调、LoRA以及Q-LoRA。 * 2023年9月12日 支持Qwen-7B和Qwen-7B-Chat的微调其中包括全参数微调、LoRA以及Q-LoRA。
* 2023年8月21日 发布Qwen-7B-Chat的Int4量化模型Qwen-7B-Chat-Int4。该模型显存占用低推理速度相比半精度模型显著提升在基准评测上效果损失较小。 * 2023年8月21日 发布Qwen-7B-Chat的Int4量化模型Qwen-7B-Chat-Int4。该模型显存占用低推理速度相比半精度模型显著提升在基准评测上效果损失较小。
@ -75,7 +76,7 @@ Qwen-14B及Qwen-7B (最新版本使用更大量的token进行预训练)相比同
对于以上所有对比模型,我们列出了其官方汇报结果与[OpenCompass](https://opencompass.org.cn/leaderboard-llm)结果之间的最佳分数。 对于以上所有对比模型,我们列出了其官方汇报结果与[OpenCompass](https://opencompass.org.cn/leaderboard-llm)结果之间的最佳分数。
更多的实验结果和细节请查看我们的技术备忘录。点击[这里](TODO)。 更多的实验结果和细节请查看我们的技术备忘录。点击[这里](https://qianwen-res.oss-cn-beijing.aliyuncs.com/QWEN_TECHNICAL_REPORT.pdf)。
<br><br> <br><br>
## 要求 ## 要求

Loading…
Cancel
Save