Update README.md

Branch: main
Junyang Lin committed 1 year ago (via GitHub)
parent 0c701d944c, commit 28ea8cb8be

@@ -33,7 +33,7 @@ Qwen-7B is the 7B-parameter version of the large language model series, Qwen (ab
In general, Qwen-7B outperforms the baseline models of similar size, and even outperforms larger models of around 13B parameters, on a series of benchmark datasets (e.g., MMLU, C-Eval, GSM8K, HumanEval, and WMT22) that evaluate the models' capabilities in natural language understanding, mathematical problem solving, coding, etc. See the results below.
| Model | MMLU | C-Eval | GSM8K | HumanEval | WMT22 (en-zh) |
| :---------------- | -------------: | -------------: | -------------: | -------------: | -------------: |
| LLaMA-7B | 35.1 | - | 11.0 | 10.5 | 8.7 |
| LLaMA 2-7B | 45.3 | - | 14.6 | 12.8 | 17.9 |
| Baichuan-7B | 42.3 | 42.8 | 9.7 | 9.2 | 26.6 |
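As an illustration, the baseline rows shown in the excerpt above can be compared programmatically. This is a minimal sketch, not part of the repository: the scores are copied verbatim from the table, missing ("-") entries are stored as `None`, and the `best_on` helper name is our own invention.

```python
# Benchmark scores copied from the baseline rows of the table above
# (higher is better; "-" entries in the table are recorded as None).
scores = {
    "LLaMA-7B":    {"MMLU": 35.1, "C-Eval": None, "GSM8K": 11.0, "HumanEval": 10.5, "WMT22 (en-zh)": 8.7},
    "LLaMA 2-7B":  {"MMLU": 45.3, "C-Eval": None, "GSM8K": 14.6, "HumanEval": 12.8, "WMT22 (en-zh)": 17.9},
    "Baichuan-7B": {"MMLU": 42.3, "C-Eval": 42.8, "GSM8K": 9.7,  "HumanEval": 9.2,  "WMT22 (en-zh)": 26.6},
}

def best_on(benchmark):
    """Return the model with the highest score on a benchmark, ignoring missing entries."""
    candidates = {m: s[benchmark] for m, s in scores.items() if s[benchmark] is not None}
    return max(candidates, key=candidates.get)

for bench in ["MMLU", "GSM8K", "HumanEval", "WMT22 (en-zh)"]:
    print(bench, "->", best_on(bench))
```

Note that this excerpt of the diff only shows the baseline rows; the full README table also includes Qwen-7B itself.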
@@ -239,7 +239,7 @@ To extend the context length and break the bottleneck of training sequence length
## Reproduction
For reproducing the model performance on benchmark datasets, we provide scripts to reproduce the results. Check [eval/EVALUATION.md](eval/EVALUATION.md) for more information. Note that reproduction may lead to slight differences from our reported results.
## License Agreement
