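From 460ea3418b5696f04e1ddc998e3b6f5540ee1ca7 Mon Sep 17 00:00:00 2001
From: Yang An
Date: Sun, 13 Aug 2023 16:27:06 +0800
Subject: [PATCH] Update README_CN.md

---
 README_CN.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README_CN.md b/README_CN.md
index 18b8298..dfee8c5 100644
--- a/README_CN.md
+++ b/README_CN.md
@@ -244,7 +244,7 @@ model = AutoModelForCausalLM.from_pretrained(
 | Int8 | 52.8 | 10.44G |
 | NF4 | 48.9 | 7.79G |
 
-Note: the GPU memory usage in the table was measured on a single A100-SXM4-80G GPU with PyTorch 2.0.1, cuda11.8, and flash attention enabled
+Note: the GPU memory usage in the table was measured on a single A100-SXM4-80G GPU with PyTorch 2.0.1, CUDA 11.8, and flash attention enabled
 
 ## Inference Performance
 
@@ -258,7 +258,7 @@ model = AutoModelForCausalLM.from_pretrained(
 | Int8 (bnb) | 7.94 | 7.86 |
 | NF4 (bnb) | 21.43 | 20.37 |
 
-Evaluation method: the input context length is set to 1 and the generation length to 2048; the test hardware is a single A100-SXM4-80G GPU, the software environment is PyTorch 2.0.1 with cuda version 11.8, and the reported figure is the average speed of generating the 2048-token sequence
+Evaluation method: the input context length is set to 1 and the generation length to 2048; the test hardware is a single A100-SXM4-80G GPU, the software environment is PyTorch 2.0.1 with CUDA version 11.8, and the reported figure is the average speed of generating the 2048-token sequence
 
 ### GPU Memory Usage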