
Improving Black-box Robustness with In-Context Rewriting

This repo contains the code to replicate the results in our paper and to extend our study to additional models and datasets. We use Make to abstract most of the commands needed to replicate our results. Don't hesitate to reach out to Kyle O'Brien with questions.
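At a high level, the method augments each test input with LLM-generated rewrites and aggregates the classifier's predictions over the original input and its rewrites. The following is a minimal sketch of that test-time augmentation loop; the names predict_with_tta, rewrite_with_llm, and classifier are hypothetical, not this repo's actual API.

# Minimal sketch of LLM test-time augmentation (illustrative only; the
# function names here are hypothetical, not this repo's API).
from collections import Counter

def predict_with_tta(text, classifier, rewrite_with_llm, num_rewrites=4):
    # Classify the original input alongside its LLM-generated rewrites,
    # then aggregate the predicted labels by majority vote.
    candidates = [text] + [rewrite_with_llm(text) for _ in range(num_rewrites)]
    predictions = [classifier(candidate) for candidate in candidates]
    return Counter(predictions).most_common(1)[0][0]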

Installation

We've tested our experiments using Python 3.10. Conda is recommended.

conda create -n llm-tta python=3.10
conda activate llm-tta

You can install the pip packages and datasets using the following command.

make install_depends

Generating augmentations is slow. To speed up experiments, we cache previously generated augmentations for each test input. You can pre-fill the cache with the following command; otherwise, the cache will populate automatically as experiments run.

make download_rewrites_cache

And clear the cache with:

make clear_rewrites_cache
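Conceptually, the cache is a key-value store mapping each test input to its generated rewrites, so repeated runs skip generation on a cache hit. Below is a minimal sketch of that mechanism, assuming a single JSON file on disk; the class and file name are hypothetical, and the repo's actual storage format may differ.

import hashlib
import json
import os

class RewriteCache:
    # Toy key-value cache mapping each test input to its generated
    # rewrites; the repo's actual on-disk format may differ.
    def __init__(self, path="rewrites_cache.json"):
        self.path = path
        self.cache = json.load(open(path)) if os.path.exists(path) else {}

    def get_or_generate(self, text, generate_fn):
        key = hashlib.sha256(text.encode()).hexdigest()
        if key not in self.cache:  # cache miss: generate and persist
            self.cache[key] = generate_fn(text)
            with open(self.path, "w") as f:
                json.dump(self.cache, f)
        return self.cache[key]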

Run Experiments

You can iterate through the main results using the following commands.

make main_results_async # Split across all GPUs (recommended)
make main_results_sync  # Single-GPU setup
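The async target presumably launches one worker per available GPU. A rough sketch of that dispatch pattern is below, pinning each worker to a device via CUDA_VISIBLE_DEVICES; this is an assumption about how the work is split, not the Makefile's actual contents, and run_experiment.py is a hypothetical script name.

import os
import subprocess

import torch

# Launch one experiment process per GPU, pinning each worker to a single
# device via CUDA_VISIBLE_DEVICES (script name is hypothetical).
procs = []
for gpu_id in range(torch.cuda.device_count()):
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
    procs.append(subprocess.Popen(["python", "run_experiment.py"], env=env))
for proc in procs:
    proc.wait()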

The Makefile contains several other preset experiment configurations and shows examples of the various command-line arguments.

Acknowledgments

We are grateful to EleutherAI for providing access to their compute resources for initial experiments. The welcoming and open research community on the EleutherAI Discord was especially helpful for the literature review, debugging PyTorch issues, and gathering the information necessary to conduct the parameter-count ablation experiment (Appendix A.2). In particular, we would like to thank Stella Biderman, Nora Belrose, and Hailey Schoelkopf.

Lydia O’Brien provided copy editing and feedback on figure design and engaged in extensive discussions that shaped the direction of this project.

M. Ghassemi’s work is supported in part by Quanta Computing and the Gordon and Betty Moore Foundation. The research of J. Mendez-Mendez is funded by an MIT-IBM Distinguished Postdoctoral Fellowship.
