From 6ebfdfb0b5ee036f7c716c407e5d9261df9547e6 Mon Sep 17 00:00:00 2001
From: Yang An
Date: Thu, 3 Aug 2023 17:22:18 +0800
Subject: [PATCH] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 9450d29..6fc75e3 100644
--- a/README.md
+++ b/README.md
@@ -20,7 +20,7 @@ Qwen-7B is the 7B-parameter version of the large language model series, Qwen (ab
 1. **Trained with high-quality pretraining data**. We have pretrained Qwen-7B on a self-constructed large-scale high-quality dataset of over 2.2 trillion tokens. The dataset includes plain texts and codes, and it covers a wide range of domains, including general domain data and professional damain data.
 2. **Strong performance**. In comparison with the models of the similar model size, we outperform the competitors on a series of benchmark datasets, which evaluates natural language understanding, mathematics, coding, etc.
-3. **Better support of languages**. Our tokenizer, based on a large vocabulary of over 150K tokens, is a more efficient one compared with other tokenizers. It is friendly to many languages, and it is helpful for users to further finetune `Qwen-7B` for the extension of understanding a certain language.
+3. **Better support of languages**. Our tokenizer, based on a large vocabulary of over 150K tokens, is a more efficient one compared with other tokenizers. It is friendly to many languages, and it is helpful for users to further finetune Qwen-7B for the extension of understanding a certain language.
 4. **Support of 8K Context Length**. Both Qwen-7B and Qwen-7B-Chat supports the context length of 8K, which allows inputs with long contexts.
 5. **Support of Plugins**. Qwen-7B-Chat is trained with plugin-related alignment data, and thus it is capable of using tools, including APIs, models, databases, etc., and it is capable of playing as an agent.