Experiments

This folder contains the implementation and the results of the experiments in the paper "Calibration tests in multi-class classification: A unifying framework" by Widmann, Lindsten, and Zachariah, which was presented at NeurIPS 2019.

Structure

The subfolder scripts contains the following scripts:

errors.jl for evaluating calibration error estimators for different calibrated and uncalibrated models,
pvalues.jl for evaluating bounds and approximations of the p-value of calibration error estimates, under the null hypothesis that the model is calibrated, for different calibrated and uncalibrated models,
nn.jl for evaluating calibration error estimators and p-value approximations for different neural networks pre-trained on the CIFAR-10 image data set,
timings.jl for benchmarking different calibration error estimators.

These scripts are written in a format supported by the literate programming tool Weave. The subfolder html contains HTML files that are generated from these scripts; they illustrate the results, which are saved as CSV files in the subfolder data.

View HTML files

You can use the HTML preview feature of GitHub to display the HTML versions of the experiments errors.jl, pvalues.jl, nn.jl, and timings.jl online.

Reproducibility

If you want to rerun the experiments, make sure to delete the relevant CSV files with our results in the subfolder data. Open a terminal in this folder and install the required Julia packages by running

julia --project=. -e 'using Pkg; Pkg.instantiate()'
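
If you would rather keep the published results than delete them outright, one option is to copy the subfolder data aside before removing its CSV files. The following is a minimal sketch, not part of this repository: the function name backup_and_clear_results and the backup location are hypothetical, and it assumes the results are stored as files ending in .csv somewhere below the given directory.

```shell
# Hypothetical helper (not part of this repository): keep a copy of the
# existing results, then remove the CSV files from the data directory.
backup_and_clear_results() {
    data_dir="$1"     # directory holding the CSV results, e.g. "data"
    backup_dir="$2"   # destination for the copy; must not exist yet

    # Refuse to overwrite or nest into an existing backup.
    if [ -e "$backup_dir" ]; then
        echo "backup target already exists: $backup_dir" >&2
        return 1
    fi

    # Copy the whole directory tree first so nothing is lost.
    cp -R "$data_dir" "$backup_dir" || return 1

    # Remove only the CSV files; other files and subfolders stay in place.
    find "$data_dir" -type f -name '*.csv' -exec rm -f {} +
}
```

Calling backup_and_clear_results data data_backup from this folder would clear all CSV files at once. To rerun only a single experiment, pass the directory that holds the CSV files of that experiment instead.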

You can run the desired experiment EXPERIMENT.jl in the subfolder scripts with

cd scripts
julia --project=.. EXPERIMENT.jl

Note that the experiments errors.jl, pvalues.jl, and nn.jl take multiple hours or even days to complete if you run them from scratch. For this reason, these scripts are heavily parallelized and use multiple cores where available. It is recommended to run them on a dedicated server and to use multi-core processing with

julia --project=.. -p=n EXPERIMENT.jl

where n is the number of additional local worker processes. If you specify auto instead of a number, Julia launches as many workers as there are local CPU threads (logical cores).
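
If you want to pick n explicitly, a small helper can derive it from the number of logical cores. The sketch below leaves one core free for the main Julia process; that is a common convention and an assumption here, not a recommendation from the paper or this repository. The function names worker_count and detect_cores are hypothetical.

```shell
# Hypothetical helper: derive a worker count from the number of logical
# cores, leaving one core for the main process (assumed convention).
worker_count() {
    cores="$1"
    if [ "$cores" -gt 1 ]; then
        echo $((cores - 1))
    else
        # Always request at least one worker.
        echo 1
    fi
}

# Query the number of logical cores: nproc where available (GNU coreutils),
# otherwise fall back to getconf.
detect_cores() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        getconf _NPROCESSORS_ONLN
    fi
}
```

With these definitions, an experiment could be started from the subfolder scripts with julia --project=.. -p "$(worker_count "$(detect_cores)")" EXPERIMENT.jl.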

The corresponding HTML files can be regenerated and updated by running

cd scripts
julia --project=.. -e 'using Weave; weave("EXPERIMENT.jl"; out_path = joinpath("..", "html"))'

It is recommended to perform the experiments and to make sure that the results are saved to the subfolder data before generating the HTML file.

Similarly, the corresponding Jupyter notebooks can be regenerated and updated by running

cd scripts
julia --project=.. -e 'using Weave; notebook("EXPERIMENT.jl"; out_path = joinpath("..", "notebooks"))'
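
To regenerate the HTML files and the notebooks for all four scripts in one go, the two commands above can be wrapped in a loop. This is a sketch under the assumption that each of errors.jl, pvalues.jl, nn.jl, and timings.jl has both an HTML and a notebook version; the function names weave_expr and regenerate_all are hypothetical.

```shell
# Hypothetical helper: print the Julia expression that regenerates one
# output type ("html" or "notebooks") for one script.
weave_expr() {
    script="$1"   # e.g. errors.jl
    kind="$2"     # "html" or "notebooks"; also the name of the output folder
    case "$kind" in
        html)      fn="weave" ;;
        notebooks) fn="notebook" ;;
        *) echo "unknown output kind: $kind" >&2; return 1 ;;
    esac
    printf 'using Weave; %s("%s"; out_path = joinpath("..", "%s"))\n' \
        "$fn" "$script" "$kind"
}

# Regenerate both output types for every script; stop at the first failure.
# Intended to be called from within the subfolder scripts.
regenerate_all() {
    for script in errors.jl pvalues.jl nn.jl timings.jl; do
        for kind in html notebooks; do
            julia --project=.. -e "$(weave_expr "$script" "$kind")" || return 1
        done
    done
}
```

After cd scripts, calling regenerate_all runs the same Weave commands as shown above, once per script and output type. As with the single commands, the results should already be saved in the subfolder data before you call it.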
