Improving Code Generation by Training with Natural Language Feedback

Authors: Angelica Chen, Jérémy Scheurer, Tomasz Korbak, Jon Ander Campos, Jun Shern Chan, Samuel R. Bowman, Kyunghyun Cho, Ethan Perez

This repository contains the code and data (human-written feedback and refinements) for running the Imitation learning from Language Feedback (ILF) algorithm for code generation, as described in "Improving Code Generation by Training with Natural Language Feedback" by Chen et al. (2023).

Installation

Our code relies upon the jaxformer repository and open-source CodeGen-Mono checkpoints.

To install all dependencies and download the necessary model checkpoints:

git clone git@github.com:nyu-mll/ILF-for-code-generation.git
cd ILF-for-code-generation
conda env create -f environment.yml

# Install codegen repo and reset to old commit
git clone git@github.com:salesforce/CodeGen.git
cd CodeGen
git reset --hard 9cc1f971c83ad606cce5da292d3c58523dd920a2
git clean -df
pip3 install -r requirements.txt
cd ..

# To download codegen-6B-mono
wget -P checkpoints https://storage.googleapis.com/sfr-codegen-research/checkpoints/codegen-6B-mono.tar.gz && tar -xvf checkpoints/codegen-6B-mono.tar.gz -C checkpoints/

In our paper we use the CodeGen-Mono 6B checkpoint, but you can replace the URL in the wget command above with the download link for any other CodeGen model.
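
Swapping in a different checkpoint can be sketched as a pair of small shell helpers. This is a minimal sketch, not part of the repository: `checkpoint_url` and `download_checkpoint` are hypothetical names, and it assumes that other CodeGen checkpoints are published under the same URL pattern as the codegen-6B-mono link above.

```shell
# Hypothetical helper (not part of this repository): build the download URL
# for a CodeGen checkpoint. ASSUMPTION: every checkpoint follows the same
# URL pattern as the codegen-6B-mono link used in the installation steps.
checkpoint_url() {
    printf 'https://storage.googleapis.com/sfr-codegen-research/checkpoints/%s.tar.gz\n' "$1"
}

# Hypothetical helper: download a checkpoint archive into checkpoints/ and
# unpack it there, mirroring the wget/tar command in the installation steps.
download_checkpoint() {
    ckpt_name="$1"
    wget -P checkpoints "$(checkpoint_url "$ckpt_name")" \
        && tar -xvf "checkpoints/${ckpt_name}.tar.gz" -C checkpoints/
}

# Example (not run automatically; assumes a checkpoint with this name exists):
#   download_checkpoint codegen-2B-mono
```

Nothing is downloaded when these functions are defined; the download only starts when `download_checkpoint` is called with a model name.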

To run the ILF pipeline

To run the ILF pipeline using our dataset, run (from this directory):

source ilf_pipeline.sh -d $(pwd) -n <EXPERIMENT_NAME>

where <EXPERIMENT_NAME> is the name of the subdirectory in which you wish to store results.
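
Before launching the pipeline, it can help to confirm that the installation steps left the expected files in place. The check below is a hedged sketch: `check_ilf_setup` is a hypothetical helper, and the paths it looks for (the `CodeGen` clone and an extracted `checkpoints/codegen-6B-mono` directory) are assumptions inferred from the installation commands, not requirements documented by `ilf_pipeline.sh`.

```shell
# Hypothetical pre-flight check (not part of this repository).
# ASSUMPTION: the paths below are inferred from the installation steps;
# ilf_pipeline.sh itself may expect a different layout.
check_ilf_setup() {
    ilf_root="$1"
    ilf_missing=0
    for ilf_item in "ilf_pipeline.sh" "CodeGen" "checkpoints/codegen-6B-mono"; do
        if [ ! -e "$ilf_root/$ilf_item" ]; then
            # Report each missing path on stderr and remember the failure.
            echo "missing: $ilf_root/$ilf_item" >&2
            ilf_missing=1
        fi
    done
    return "$ilf_missing"
}

# Example (not run automatically; my_experiment is a placeholder name):
#   check_ilf_setup "$(pwd)" && source ilf_pipeline.sh -d "$(pwd)" -n my_experiment
```

The function returns a non-zero status if any path is missing, so chaining it with `&&` keeps the pipeline from starting against an incomplete setup.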
