@@ -159,8 +159,12 @@ print(f'Response: {response}')
## Quantization
To load the model in lower precision, e.g., 4 bits or 8 bits, you only need to add a quantization configuration when loading it. We provide examples below showing how to load models in `NF4` and `Int8`. For starters, make sure you have installed `bitsandbytes`:
```
pip install bitsandbytes
```
Then you only need to pass your quantization configuration to `AutoModelForCausalLM.from_pretrained`. See the example below:
```python
from transformers import BitsAndBytesConfig
```
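For instance, an `NF4` load might look like the following minimal sketch. It is not verbatim from this README: the model ID is a placeholder, `device_map='auto'` is one common choice, and the `BitsAndBytesConfig` fields shown are the standard `transformers` options for NF4 and Int8.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 (4-bit) quantization configuration
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type='nf4',
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Int8 (8-bit) alternative:
# quantization_config = BitsAndBytesConfig(load_in_8bit=True)

model = AutoModelForCausalLM.from_pretrained(
    'your-model-id-or-path',   # placeholder: replace with your checkpoint
    device_map='auto',
    quantization_config=quantization_config,
    trust_remote_code=True,    # only needed for models that ship custom code
).eval()

# Optional sanity check: report the quantized model's memory footprint in GiB
print(f'{model.get_memory_footprint() / 1024**3:.2f} GiB')
```

Switching to `Int8` only requires swapping in the commented-out configuration; the `from_pretrained` call is unchanged.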