Update README.md

Junyang Lin 2 years ago committed by GitHub
parent 81288c389a
commit 8ced635b22

@@ -159,8 +159,12 @@ print(f'Response: {response}')
## Quantization
To load the model in lower precision, e.g., 4 bits and 8 bits, we provide examples showing how to do so by adding a quantization configuration. Specifically, we cover loading models in `NF4` and `Int8`. Before you start, make sure you have installed `bitsandbytes`:
```
pip install bitsandbytes
```
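To see what `Int8` loading does conceptually, here is a minimal sketch of blockwise absmax quantization, the idea underlying 8-bit weight storage. This is an illustration only, not the `bitsandbytes` implementation; the function names are hypothetical:

```python
def absmax_quantize_int8(block):
    # Absmax quantization: scale by the block's largest magnitude
    # so every value maps into the signed 8-bit range [-127, 127].
    scale = max(abs(w) for w in block) / 127.0
    return [round(w / scale) for w in block], scale

def dequantize_int8(q, scale):
    # Recover approximate float weights from the stored integers.
    return [v * scale for v in q]

weights = [0.5, -1.2, 0.03, 0.9]
q, scale = absmax_quantize_int8(weights)
restored = dequantize_int8(q, scale)
# reconstruction error is bounded by half a quantization step
max_err = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight costs one signed byte plus a shared per-block scale, which is where the memory savings come from; `NF4` follows the same idea but maps to 4-bit codes spaced for normally distributed weights.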
Then all you need to do is pass the quantization configuration to `AutoModelForCausalLM.from_pretrained`. See the example below:
```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 (4-bit) configuration; for Int8, use
# BitsAndBytesConfig(load_in_8bit=True) instead
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

# replace model_path with the checkpoint you are loading
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    quantization_config=quantization_config,
    device_map="auto",
    trust_remote_code=True
).eval()
```