diff --git a/README_JA.md b/README_JA.md
index f1108da..d5cd64d 100644
--- a/README_JA.md
+++ b/README_JA.md
@@ -4,38 +4,52 @@
-        Qwen-7B 🤖 | 🤗 ｜ Qwen-7B-Chat 🤖 | 🤗 | Qwen-7B-Chat-Int4 🤗
+        🤗 Hugging Face   |   🤖 ModelScope   |   📑 Paper   ｜   🖥️ Demo
-WeChat   |   Discord   |   Demo  ｜  Report
+WeChat (微信)   ｜   DingTalk (钉钉)   |   Discord
Japanese documentation maintainers: Ikko Eltociear Ashimine & Junyang Lin
-| Model | MMLU | C-Eval | GSM8K | MATH | HumanEval | MBPP | BBH | CMMLU |
-|:------------------|:--------:|:--------:|:--------:|:--------:|:---------:|:---------:|:--------:|:--------:|
-| | 5-shot | 5-shot | 8-shot | 4-shot | 0-shot | 3-shot | 3-shot | 5-shot |
-| LLaMA2-7B | 46.8 | 32.5 | 16.7 | 3.3 | 12.8 | 20.8 | 38.2 | 31.8 |
-| LLaMA2-13B | 55.0 | 41.4 | 29.6 | 5.0 | 18.9 | 30.3 | 45.6 | 38.4 |
-| LLaMA2-34B | 62.6 | - | 42.2 | 6.2 | 22.6 | 33.0 | 44.1 | - |
-| ChatGLM2-6B | 47.9 | 51.7 | 32.4 | 6.5 | - | - | 33.7 | - |
-| InternLM-7B | 51.0 | 52.8 | 31.2 | 6.3 | 10.4 | 14.0 | 37.0 | 51.8 |
-| InternLM-20B | 62.1 | 58.8 | 52.6 | 7.9 | 25.6 | 35.6 | 52.5 | 59.0 |
-| Baichuan2-7B | 54.2 | 54.0 | 24.5 | 5.6 | 18.3 | 24.2 | 41.6 | 57.1 |
-| Baichuan2-13B | 59.2 | 58.1 | 52.8 | 10.1 | 17.1 | 30.2 | 48.8 | 62.0 |
-| **Qwen-7B** | 56.7 | 59.6 | 51.6 | - | 24.4 | 31.2 | 40.6 | 58.8 |
-| **Qwen-7B v1.1** | 58.2 | 63.5 | 51.7 | 11.6 | 29.9 | 31.6 | 45.0 | 62.2 |
-| **Qwen-14B** | **66.3** | **72.1** | **61.3** | **24.8** | **32.3** | **40.8** | **53.4** | **71.0** |
+| Model | MMLU | C-Eval | GSM8K | MATH | HumanEval | MBPP | BBH | CMMLU |
+|:-------------------|:--------:|:--------:|:--------:|:--------:|:---------:|:---------:|:--------:|:--------:|
+| | 5-shot | 5-shot | 8-shot | 4-shot | 0-shot | 3-shot | 3-shot | 5-shot |
+| LLaMA2-7B | 46.8 | 32.5 | 16.7 | 3.3 | 12.8 | 20.8 | 38.2 | 31.8 |
+| LLaMA2-13B | 55.0 | 41.4 | 29.6 | 5.0 | 18.9 | 30.3 | 45.6 | 38.4 |
+| LLaMA2-34B | 62.6 | - | 42.2 | 6.2 | 22.6 | 33.0 | 44.1 | - |
+| ChatGLM2-6B | 47.9 | 51.7 | 32.4 | 6.5 | - | - | 33.7 | - |
+| InternLM-7B | 51.0 | 52.8 | 31.2 | 6.3 | 10.4 | 14.0 | 37.0 | 51.8 |
+| InternLM-20B | 62.1 | 58.8 | 52.6 | 7.9 | 25.6 | 35.6 | 52.5 | 59.0 |
+| Baichuan2-7B | 54.2 | 54.0 | 24.5 | 5.6 | 18.3 | 24.2 | 41.6 | 57.1 |
+| Baichuan2-13B | 59.2 | 58.1 | 52.8 | 10.1 | 17.1 | 30.2 | 48.8 | 62.0 |
+| Qwen-7B (original) | 56.7 | 59.6 | 51.6 | 10.4 | 24.4 | 31.2 | 40.6 | 58.8 |
+| **Qwen-7B** | 58.2 | 63.5 | 51.7 | 11.6 | 29.9 | 31.6 | 45.0 | 62.2 |
+| **Qwen-14B** | **66.3** | **72.1** | **61.3** | **24.8** | **32.3** | **40.8** | **53.4** | **71.0** |
For all compared models, we report the best score between the officially reported results and [OpenCompass](https://opencompass.org.cn/leaderboard-llm).
@@ -75,7 +94,7 @@ Qwen-14B, on natural language benchmarks such as MMLU, C-Eval, GSM8K, HumanEval, and CMMLU,
## Quickstart
-Below, we provide simple examples of how to use Qwen-7B with 🤖 ModelScope and 🤗 Transformers.
+Below, we provide simple examples of how to use Qwen-Chat with 🤖 ModelScope and 🤗 Transformers.
Before running the code, make sure you have set up your environment and installed the required packages. Confirm that the requirements above are met, then install the dependent libraries.
@@ -97,13 +116,13 @@ cd flash-attention && pip install .
#### 🤗 Transformers
-To use Qwen-7B-Chat for inference, all you need to do is enter a few lines of code as shown below. **Make sure you are using the latest code.**
+To use Qwen-Chat for inference, all you need to do is enter a few lines of code as shown below. **Make sure you are using the latest code.**
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.generation import GenerationConfig
-# Note: in the default behavior, injection attack prevention is turned off.
+# Model names: "Qwen/Qwen-7B-Chat", "Qwen/Qwen-14B-Chat"
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-Chat", trust_remote_code=True)
# use bf16
@@ -139,15 +158,16 @@ print(response)
# 《奋斗创业：一个年轻人的成功之路》
```
-Running the pretrained base model of Qwen-7B is also simple.
+Running the pretrained base model of Qwen is also simple.
Running Qwen-7B
+ Running Qwen
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.generation import GenerationConfig
+# Model names: "Qwen/Qwen-7B", "Qwen/Qwen-14B"
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B", trust_remote_code=True)
# use bf16
# model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-7B", device_map="auto", trust_remote_code=True, bf16=True).eval()
@@ -178,6 +198,7 @@ ModelScope is an open-source platform for Model-as-a-Service (MaaS)
from modelscope import AutoModelForCausalLM, AutoTokenizer
from modelscope import GenerationConfig
+# Model names: "Qwen/Qwen-7B-Chat", "Qwen/Qwen-14B-Chat"
tokenizer = AutoTokenizer.from_pretrained("qwen/Qwen-7B-Chat", revision='v1.0.5', trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("qwen/Qwen-7B-Chat", revision='v1.0.5', device_map="auto", trust_remote_code=True, fp16=True).eval()
model.generation_config = GenerationConfig.from_pretrained("Qwen/Qwen-7B-Chat", revision='v1.0.5', trust_remote_code=True) # You can specify different generation lengths, top_p and other related hyperparameters
@@ -191,16 +212,11 @@ print(response)
```
-## Tokenizer
-
-Our tokenizer, based on tiktoken, differs from other tokenizers such as SentencePiece tokenizers. Pay particular attention to the special tokens, especially during fine-tuning. For more detailed information on the tokenizer and how to use it in fine-tuning, please refer to the [documentation](tokenization_note_ja.md).
-
-
## Quantization
### Usage
-**Note: We provide a new solution based on [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ) and have released the Int4 quantized model for Qwen-7B-Chat [(click here)](https://huggingface.co/Qwen/Qwen-7B-Chat-Int4), which achieves nearly lossless model quality compared with the previous solution while improving both memory cost and inference speed.**
+**Note: We provide a new solution based on [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ) and have released the Int4 quantized model for Qwen-Chat [(click here)](https://huggingface.co/Qwen/Qwen-7B-Chat-Int4), which achieves nearly lossless model quality compared with the previous solution while improving both memory cost and inference speed.**
Here we demonstrate how to use the quantized model for inference. Before you start, make sure you meet the requirements of auto-gptq (e.g., torch 2.0 or above, transformers 4.32.0 or above) and install the required packages:
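The install command and loading example themselves sit outside this hunk. As a minimal, hedged sketch (assuming the Int4 checkpoint named in the link above, with auto-gptq and optimum already installed), inference with the quantized model looks roughly like this:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the Int4 checkpoint from the link above; auto-gptq and optimum must be installed.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-Chat-Int4", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-7B-Chat-Int4",
    device_map="auto",
    trust_remote_code=True,
).eval()

response, history = model.chat(tokenizer, "Hi", history=None)
print(response)
```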
@@ -225,19 +241,23 @@ response, history = model.chat(tokenizer, "Hi", history=None)
We illustrate the performance of the BF16 and Int4 models on our benchmark. The results are shown below:
-| Quantization | MMLU | CEval (val) | GSM8K | Humaneval |
-| ------------- | :--------: | :----------: | :----: | :--------: |
-| BF16 | 53.9 | 54.2 | 41.1 | 24.4 |
-| Int4 | 52.6 | 52.9 | 38.1 | 23.8 |
+| Quantization | MMLU | CEval (val) | GSM8K | Humaneval |
+|----------------------|:----:|:-----------:|:-----:|:---------:|
+| Qwen-7B-Chat (BF16) | 53.9 | 54.2 | 41.1 | 24.4 |
+| Qwen-7B-Chat (Int4) | 52.6 | 52.9 | 38.1 | 23.8 |
+| Qwen-14B-Chat (BF16) | 64.6 | 69.8 | 61.0 | 43.9 |
+| Qwen-14B-Chat (Int4) | 63.3 | 69.0 | 59.8 | 45.7 |
### Inference Speed
We measured the average inference speed (tokens/s) of generating 2048 and 8192 tokens under BF16 precision and Int4 quantization, respectively.
-| Quantization | Speed (2048 tokens) | Speed (8192 tokens) |
-| ------------- | :------------------:| :------------------:|
-| BF16 | 30.34 | 29.32 |
-| Int4 | 43.56 | 33.92 |
+| Quantization | Speed (2048 tokens) | Speed (8192 tokens) |
+|----------------------|:-------------------:|:-------------------:|
+| Qwen-7B-Chat (BF16) | 30.34 | 29.32 |
+| Qwen-7B-Chat (Int4) | 43.56 | 33.92 |
+| Qwen-14B-Chat (BF16) | 30.70 | 21.73 |
+| Qwen-14B-Chat (Int4) | 37.11 | 26.11 |
In detail, the profiling setting generates 8192 new tokens from 1 context token. Profiling was run on a single A100-SXM4-80G GPU with PyTorch 2.0.1 and CUDA 11.4. The inference speed is averaged over the 8192 generated tokens.
@@ -245,17 +265,22 @@ Under BF16 precision and Int4 quantization, we measured the average speed of generating 2048 and
We also profiled the peak GPU memory usage for encoding 2048 tokens as context (and generating a single token) and for generating 8192 tokens (with a single token as context) under BF16 or Int4 quantization. The results are shown below.
-| Quantization Level | Peak Usage for Encoding 2048 Tokens | Peak Usage for Generating 8192 Tokens |
-| ------------------ | :---------------------------------: | :-----------------------------------: |
-| BF16 | 17.66GB | 22.58GB |
-| Int4 | 8.21GB | 13.62GB |
+| Quantization | Peak Usage for Encoding 2048 Tokens | Peak Usage for Generating 8192 Tokens |
+|----------------------|:-----------------------------------:|:-------------------------------------:|
+| Qwen-7B-Chat (BF16) | 17.66GB | 22.58GB |
+| Qwen-7B-Chat (Int4) | 8.21GB | 13.62GB |
+| Qwen-14B-Chat (BF16) | 30.15GB | 38.94GB |
+| Qwen-14B-Chat (Int4) | 13.00GB | 21.79GB |
The speed and memory profiling above was done with [this script](https://qianwen-res.oss-cn-beijing.aliyuncs.com/profile.py).
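The linked script is the reference; as a rough, hedged sketch of the kind of measurement it performs (the model name and token counts below are illustrative assumptions), generation speed and peak memory can be profiled like this:

```python
import time

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed model ID for illustration; the official profile.py is authoritative.
model_id = "Qwen/Qwen-7B-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", trust_remote_code=True).eval()

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
max_new_tokens = 2048  # or 8192

torch.cuda.reset_peak_memory_stats()
start = time.time()
out = model.generate(**inputs, max_new_tokens=max_new_tokens, min_new_tokens=max_new_tokens)
elapsed = time.time() - start

generated = out.shape[1] - inputs["input_ids"].shape[1]
print(f"speed: {generated / elapsed:.2f} tokens/s")
print(f"peak memory: {torch.cuda.max_memory_allocated() / 1024**3:.2f} GB")
```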
## Finetuning
-We currently provide the official training script, `finetune.py`, together with shell scripts that launch it so you can start finetuning with confidence. The script supports training with [DeepSpeed](https://github.com/microsoft/DeepSpeed) and [FSDP](https://engineering.fb.com/2021/07/15/open-source/fsdp/). Because the shell scripts we provide use DeepSpeed, we recommend installing DeepSpeed in advance:
+We currently provide the official training script, `finetune.py`, together with shell scripts that launch it so you can start finetuning with confidence. The script supports training with [DeepSpeed](https://github.com/microsoft/DeepSpeed) and [FSDP](https://engineering.fb.com/2021/07/15/open-source/fsdp/). Because the shell scripts we provide use DeepSpeed and Peft, we recommend installing DeepSpeed and Peft in advance:
+```bash
+pip install -r requirements_finetune.txt
+```
To prepare your training data, you need to put all the samples into a list and save it to a json file. Each sample is a dictionary consisting of an id and a conversation list. Below is a simple example list with one sample:
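(The JSON example itself falls outside this hunk.) As a hedged sketch of the expected shape, where the "id"/"conversations" field names mirror the repository's documented format but are shown here purely as an illustration, such a file could be written like this:

```python
import json

# Assumed illustration of the training-data schema described above:
# a list of samples, each a dict with an "id" and a "conversations" list of turns.
samples = [
    {
        "id": "identity_0",
        "conversations": [
            {"from": "user", "value": "Hello, who are you?"},
            {"from": "assistant", "value": "I am Qwen, a large language model."},
        ],
    }
]

with open("train_data.json", "w", encoding="utf-8") as f:
    json.dump(samples, f, ensure_ascii=False, indent=2)
```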
@@ -696,6 +721,11 @@ For how to write and use ReAct prompts, please refer to [the ReAct examples](ex
+## Tokenizer
+
+Our tokenizer, based on tiktoken, differs from other tokenizers such as SentencePiece tokenizers. Pay particular attention to the special tokens, especially during fine-tuning. For more detailed information on the tokenizer and how to use it in fine-tuning, please refer to the [documentation](tokenization_note_ja.md).
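As a small, hedged illustration (the linked note is the authoritative reference; the special-token names in the comment are taken from Qwen's chat format and should be treated as assumptions here), loading and inspecting the tokenizer looks roughly like this:

```python
from transformers import AutoTokenizer

# Assumed model ID; trust_remote_code loads the tiktoken-based Qwen tokenizer.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B", trust_remote_code=True)

ids = tokenizer.encode("Hello, Qwen!")
print(ids)                    # token ids produced by the tiktoken-based BPE
print(tokenizer.decode(ids))  # round-trips back to the original text

# Special tokens such as <|endoftext|>, <|im_start|> and <|im_end|> are not emitted by plain
# text encoding, so they must be handled explicitly during fine-tuning (see the note above).
```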
+
+
## Reproduction
To reproduce the model performance on benchmark datasets, we provide scripts to reproduce the results. Check [eval/EVALUATION.md](eval/EVALUATION.md) for details. Note that the reproduced results may differ slightly from our reported results.
@@ -708,7 +738,7 @@ For how to write and use ReAct prompts, please refer to [the ReAct examples](ex
## License Agreement
-The code and model weights of Qwen-7B and Qwen-7B-Chat can be used freely by researchers and developers. Commercial use is also allowed; see [LICENSE](LICENSE) for details. If you would like to use them commercially, please fill in the [request form](https://dashscope.console.aliyun.com/openModelApply/qianwen) to apply.
+The code and model weights of Qwen and Qwen-Chat can be used freely by researchers and developers. Commercial use is also allowed; see [LICENSE](LICENSE) for details. If you would like to use them commercially, please fill in the [request form](https://dashscope.console.aliyun.com/openModelApply/qianwen) to apply.
## Contact Us