From 8ced635b223f7fcecf4b6a40ed721c1a8d77b2f1 Mon Sep 17 00:00:00 2001
From: Junyang Lin
Date: Thu, 3 Aug 2023 20:58:22 +0800
Subject: [PATCH] Update README.md

---
 README.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index b5cdbac..30975aa 100644
--- a/README.md
+++ b/README.md
@@ -159,8 +159,12 @@ print(f'Response: {response}')
 
 ## Quantization
 
-To load the model in lower precision, e.g., 4 bits and 8 bits, we provide examples to show how to load by adding quantization configuration:
+We provide examples showing how to load models in `NF4` and `Int8`. First, make sure you have installed `bitsandbytes`:
+```
+pip install bitsandbytes
+```
+Then you only need to add your quantization configuration to `AutoModelForCausalLM.from_pretrained`. See the example below:
 
 ```python
 from transformers import BitsAndBytesConfig