Update README_CN.md

main
Yang An 1 year ago committed by GitHub
parent 9863a4896f
commit 460ea3418b
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

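@@ -244,7 +244,7 @@ model = AutoModelForCausalLM.from_pretrained(
 | Int8 | 52.8 | 10.44G |
 | NF4 | 48.9 | 7.79G |
-The memory usage figures in the table were measured on a single A100-SXM4-80G GPU with PyTorch 2.0.1 and cuda11.8, with flash attention enabled.
+The memory usage figures in the table were measured on a single A100-SXM4-80G GPU with PyTorch 2.0.1 and CUDA 11.8, with flash attention enabled.
 ## Inference Performance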
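@@ -258,7 +258,7 @@ model = AutoModelForCausalLM.from_pretrained(
 | Int8 (bnb) | 7.94 | 7.86 |
 | NF4 (bnb) | 21.43 | 20.37 |
-The evaluation fixes the input context length at 1 and the generation length at 2048. The test hardware is a single A100-SXM4-80G GPU, and the software environment is PyTorch 2.0.1 with cuda version 11.8. The reported figure is the average speed of generating the 2048-token sequence.
+The evaluation fixes the input context length at 1 and the generation length at 2048. The test hardware is a single A100-SXM4-80G GPU, and the software environment is PyTorch 2.0.1 with CUDA version 11.8. The reported figure is the average speed of generating the 2048-token sequence.
 ### GPU Memory Usage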
