Source code for the paper "What Has a Foundation Model Found? Using Inductive Bias to Probe for World Models" by Keyon Vafa, Peter G Chang, Ashesh Rambachan, and Sendhil Mullainathan.
If you find this repository useful for your research, please consider citing our work:
@inproceedings{vafa2025world,
  title={What Has a Foundation Model Found? Using Inductive Bias to Probe for World Models},
  author={Vafa, Keyon and Chang, Peter G and Rambachan, Ashesh and Mullainathan, Sendhil},
  booktitle={International Conference on Machine Learning},
  year={2025},
}
After downloading the repo, change into the inductive-bias-probes folder.
First, make sure you have pipx installed. On Debian/Ubuntu:
sudo apt install pipx
pipx install uv
pipx ensurepath
source ~/.bashrc
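If pipx isn't available on your system, uv's standalone installer is an alternative:
curl -LsSf https://astral.sh/uv/install.sh | sh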
Then, create a virtual environment and install the dependencies with uv.
uv venv ~/.venv --python 3.10.0
source ~/.venv/bin/activate
uv pip install -e '.[dev]'
uv pip install -e '.[ssm]' --no-build-isolation
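To sanity-check the environment (the training commands below use torchrun, so PyTorch should import cleanly):
python -c "import torch; print(torch.__version__)"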
In any new shell session, run source ~/.venv/bin/activate to re-activate the virtual environment.
You may also want to log into wandb via wandb login.
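If you'd rather not log anything, wandb's standard environment variable turns logging into a no-op (this assumes the training scripts initialize wandb in the usual way):
export WANDB_MODE=disabled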
The dataset used for pretraining in the physics experiments is large. Make sure the data directory can hold it: open inductivebiasprobes/paths.py and set the DATA_DIR variable to a location with enough free space.
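For reference, here is a minimal sketch of what that setting might look like in paths.py (the file's actual contents are an assumption; only the DATA_DIR variable is named in this README):
from pathlib import Path

# Hypothetical location; point this at a disk with plenty of free space.
DATA_DIR = Path("/mnt/big-disk/inductive-bias-probes/data")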
To generate data for pretraining, go to inductivebiasprobes/experiments/physics. Run the following command:
python generate_data.py
For quick prototyping, pass in --num_train_trajectories 1000.
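That is, for a small prototyping run:
python generate_data.py --num_train_trajectories 1000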
The training code is based on Andrej Karpathy's nanoGPT library. To train the model, run the following in inductivebiasprobes/experiments/physics; note that the command below assumes 8 GPUs. You can also download the checkpoints used in the paper from here.
NUM_GPUS=8
torchrun --nproc_per_node=$NUM_GPUS train_model.py --config ntp_config --gradient_accumulation_steps $NUM_GPUS --max_iters 600_000 --plot_trajectory --batch_size 64 --eval_interval 250 --eval_iters 1
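If you have fewer GPUs, the same command should work with a smaller --nproc_per_node. To keep the effective batch size comparable, you may want to leave --gradient_accumulation_steps at 8 (this assumes the script follows nanoGPT's convention of dividing gradient accumulation steps across processes):
NUM_GPUS=1
torchrun --nproc_per_node=$NUM_GPUS train_model.py --config ntp_config --gradient_accumulation_steps 8 --max_iters 600_000 --plot_trajectory --batch_size 64 --eval_interval 250 --eval_iters 1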
To plot an animation of the model's predictions, run the following:
python plot_solar_system_orbits.py
That will create a gif of the model's predicted solar system orbits in the figs directory.
Run the following command to fine-tune the model on the small sample of solar-system force vectors created by the generate_data.py script. The command below uses 1 GPU (make sure you're in the inductivebiasprobes/experiments/physics directory).
python train_model.py --config force_vector_config --pretrained next_token --learning_rate 2e-4 --max_iters 10_000 --batch_size 64 --eval_interval 10
When that's done, run the following command to plot the model's force predictions:
python plot_forces.py
This will create both static images and a gif of the model's force predictions.
To see what the oracle predictions look like, run the following command:
python fit_force_vector_oracle.py
This will save the oracle predictions to a folder called scratch/oracle. To plot them, replace the model predictions in plot_forces.py with the oracle predictions.
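For example, in plot_forces.py you might swap the load path (the file name below is hypothetical; match whatever plot_forces.py actually reads from scratch/oracle):
import numpy as np

# Hypothetical: load the oracle's saved extrapolations instead of the model's.
predictions = np.load("scratch/oracle/predictions.npy")  # hypothetical file name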
For the symbolic regressions, we fine-tune on data from multiple solar systems. Run the following command:
python train_model.py --config force_magnitude_config --pretrained next_token --max_iters 10_000 --batch_size 64 --eval_interval 50
To get the symbolic regression equations, run:
python fit_symbolic_regression.py
Note that the equations might be slightly different from the ones in the paper due to random seeds.
To get the LLM force predictions, make sure you have the following environment variables set:
export OPENAI_API_KEY=...
export ANTHROPIC_API_KEY=...
export GOOGLE_API_KEY=...
Then, set the model variable in get_llm_forces.py to the model you want to use (e.g. "o3", "claude-sonnet-4-20250514", "gemini-2.5-pro") and run the command below.
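In get_llm_forces.py, that setting looks something like this (the variable's exact placement in the file is an assumption):
model = "o3"  # or "claude-sonnet-4-20250514", "gemini-2.5-pro"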
python get_llm_forces.py
To plot the LLM force predictions, run the following command.
python plot_llm_forces.py
We're also working on sharing the larger pretraining datasets. Once you have a pretrained model, you can apply the inductive bias probe by following the steps below.
To generate the random binary output dataset used for the inductive bias probes, run the following command:
python generate_white_noise_and_oracle_predictions.py --val_different_sequence
To generate the dataset used in the modified setup described in Appendix B.1, run the following command:
python generate_white_noise_and_oracle_predictions.py
Note that in addition to generating the random output dataset, each of these commands will also fit a linear model and an MLP and save their extrapolations.
Run the following command to fine-tune the pretrained next-token predictor on the generated dataset:
python train_model.py --config white_noise_config --pretrained next_token --white_noise_dataset_size 100 --max_iters 100
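If you want to run several fine-tuning tasks (e.g., different dataset sizes), a simple sweep might look like this; the particular sizes are illustrative, not taken from the paper:
for n in 10 100 1000; do
    python train_model.py --config white_noise_config --pretrained next_token --white_noise_dataset_size $n --max_iters 100
done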
Once the fine-tuning tasks are finished, run the following command to compute the inductive bias metrics on the extrapolations generated by your model:
python compute_inductive_bias.py
This will generate a figure of the inductive bias results.
The code to reproduce results for the Lattice and Othello experiments can be found in inductivebiasprobes/experiments/gridworld/ and inductivebiasprobes/experiments/othello/, respectively. Refer to the README.md file in each directory for details.
