How many shots are used for evaluation of HumanEval? #39
Comments
Hi @zhimin-z. If you are referring to the Self-Check baselines, then yes. Please refer to https://github.com/GAIR-NLP/factool/blob/main/factool/utils/prompts/self_check.yaml
Thanks for your quick replies.
Hi @zhimin-z. These results were evaluated using neither 0-shot nor 3-shot prompting. They were evaluated using FacTool. For some modules in certain tasks we did provide some demonstrations (e.g. query generation for KB-QA), while for others we use 0-shot prompting.
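For readers unfamiliar with the distinction drawn above, here is a minimal sketch of how a 0-shot prompt differs from one that includes demonstrations. This is not FacTool's code: `build_prompt`, the `Input:`/`Output:` layout, and the sample strings are all hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def build_prompt(
    instruction: str,
    query: str,
    demos: Sequence[Tuple[str, str]] = (),
) -> str:
    """Assemble a prompt (hypothetical helper, not part of FacTool).

    With no demos the result is a 0-shot prompt; with k demos it is k-shot.
    """
    parts = [instruction]
    # Each demonstration is shown as a solved input/output pair.
    for demo_input, demo_output in demos:
        parts.append(f"Input: {demo_input}\nOutput: {demo_output}")
    # The real query comes last, with the output left for the model to fill in.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


# 0-shot: instruction plus the query only.
zero_shot = build_prompt("Write a search query for the claim.", "Example claim")

# 1-shot: the same prompt with one made-up demonstration prepended.
one_shot = build_prompt(
    "Write a search query for the claim.",
    "Example claim",
    demos=[("Made-up claim", "made-up query")],
)
```

In a pipeline with several modules, each module can choose its own `demos`, which is why a single "k-shot" label may not describe the whole system.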
Is that 0-shot and 3-shot CoT?