Latest commit 2b565da220: change Old Evaluation Dataset (Version 20230803) to new version (11 months ago)

| File | Last updated |
|---|---|
| EVALUATION.md | 1 year ago |
| evaluate_ceval.py | 1 year ago |
| evaluate_chat_ceval.py | 1 year ago |
| evaluate_chat_gsm8k.py | 1 year ago |
| evaluate_chat_humaneval.py | 1 year ago |
| evaluate_chat_mmlu.py | 1 year ago |
| evaluate_cmmlu.py | 1 year ago |
| evaluate_gsm8k.py | 1 year ago |
| evaluate_humaneval.py | 1 year ago |
| evaluate_mmlu.py | 1 year ago |
| evaluate_plugin.py | 11 months ago |
| gsm8k_prompt.txt | 1 year ago |