In general, Qwen-7B outperforms baseline models of a similar size.
For more experimental results (detailed model performance on additional benchmark datasets), please refer to our technical memo [here](techmemo-draft.md).
We provide examples showing how to load models in `NF4` and `Int8`. Before you start, make sure you have installed `bitsandbytes`:
```
pip install bitsandbytes
```
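If you want to double-check the installation before loading a model, a quick import test works; this is a minimal sketch, and the printed version will vary with your environment:

```python
# Sanity check: confirm bitsandbytes is installed and importable.
import bitsandbytes

print(bitsandbytes.__version__)
```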
Then you only need to add your quantization configuration to `AutoModelForCausalLM.from_pretrained`. See the example below:
```python
from transformers import BitsAndBytesConfig
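# The remainder of this example is a minimal sketch: the checkpoint name
# "Qwen/Qwen-7B-Chat", the device map, and the compute dtype are
# illustrative assumptions, not prescribed by this document.
import torch
from transformers import AutoModelForCausalLM

# NF4 (4-bit) quantization configuration. For Int8, use
# BitsAndBytesConfig(load_in_8bit=True) instead.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Weights are quantized on the fly at load time, so no separately
# pre-quantized checkpoint is required.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-7B-Chat",  # placeholder checkpoint for illustration
    device_map="auto",
    quantization_config=quantization_config,
    trust_remote_code=True,
).eval()
```

Note that switching between `NF4` and `Int8` only changes the configuration object passed to `from_pretrained`; the rest of the loading code stays the same.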
Researchers and developers are free to use the code and model weights of both Qwen-7B and Qwen-7B-Chat.
If you would like to leave a message for either our research team or our product team, feel free to send an email to qianwen_opensource@alibabacloud.com.