Name		Name	Last commit message	Last commit date
parent directory ..
dataset		dataset
demo		demo
scripts		scripts
claude.py		claude.py
data_process.py		data_process.py
do_gsm8k.py		do_gsm8k.py
do_math.py		do_math.py
evaluate.py		evaluate.py
evaluate_falcon.py		evaluate_falcon.py
prompt_pattern.py		prompt_pattern.py
readme.md		readme.md
solve_claude.py		solve_claude.py
solve_text_002.py		solve_text_002.py
solve_text_003.py		solve_text_003.py
solve_turbo.py		solve_turbo.py
test_falcon_gsm8k.py		test_falcon_gsm8k.py
test_falcon_math.py		test_falcon_math.py

readme.md

Mathematical Reasoning

Datasets

We provide the processed dataset in folder dataset/ and few-shot exemplars in demo/.

Models

You have to download the checkpoints in hugging face version, and save them in folder model/ with the model name as the sub-folder name.

Environment

Here are the main environmental configurations

python==3.9.13
torch==1.12.1
transformers==4.29.1
datasets==2.10.1
sympy==1.11.1
openai

Specially, for running falcon on our code, you need pytorch 2.x version instead of pytorch 1.x.

torch==2.0.1

Evaluation

You can running the scripts in folder scripts/ to evaluate the LLMs and reproduce our experiment:

bash scripts/run_eval_color.sh
bash scripts/run_eval_penguins.sh
bash scripts/run_eval_gsm8k.sh
bash scripts/run_eval_math.sh

To better evaluate the results from prediction of LLMs with the golden labels, we use evaluate.py to clean the results from LLMs and evaluate the accuracy. You can find the details in the corresponding file.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MathematicalReasoning

MathematicalReasoning

readme.md

Mathematical Reasoning

Datasets

Models

Environment

Evaluation

Files

MathematicalReasoning

Directory actions

More options

Directory actions

More options

Latest commit

History

MathematicalReasoning

Folders and files

parent directory

readme.md

Mathematical Reasoning

Datasets

Models

Environment

Evaluation