This repository contains source code to our NeurIPS 2023 paper:
OKRidge: Scalable Optimal k-Sparse Ridge Regression
- Documentation: https://okridge.readthedocs.io
- GitHub: https://github.com/jiachangliu/OKRidge
- PyPI: https://pypi.org/project/okridge/
- Free and open source software: BSD license
We consider the following optimization problem:
Optimal k-sparse ridge regression is a crucial ML problem that has many applications in statistics, machine learning, and data mining.
However, the problem is NP-hard, and existing algorithms are either slow (using commercial MIP solvers) or suboptimal (using convex or nonconvex regularizers to approximate
OKRidge is based on the branch-and-bound framework. The insight leading to the computational efficiency comes from a novel lower bound calculation involving, first, a saddle point formulation, and from there, either solving (i) a linear system or (ii) using an ADMM-based approach, where the proximal operators can be efficiently evaluated by solving another linear system and an isotonic regression problem. We also propose a method to warm-start our solver, which leverages a beam search.
Experimentally, our methods attain provable optimality with run times that are orders of magnitude faster than those of the existing MIP formulations solved by the commercial solver Gurobi.
$ pip install okridge
Please see the example.ipynb jupyter notebook on GitHub or Example Usage on Read the Docs for a detailed tutorial on how to use OKRidge in a python environment. The detailed descriptions of key functions can be found in the API Reference on Read the Docs.
from okridge.tree import BNBTree
k = 10 # cardinality constraint
lambda2 = 0.1 # l2 regularization parameter
gap_tol = 1e-4 # optimality gap tolerance
verbose = True # print out the progress
time_limit = 180 # time limit in seconds
BnB_optimizer = BNBTree(X=X, y=y, lambda2=lambda2)
upper_bound, betas, optimality_gap, max_lower_bound, running_time = BnB_optimizer.solve(k = k, gap_tol = gap_tol, verbose = verbose, time_limit = time_limit)
Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.
okridge
was created by Jiachang Liu. It is licensed under the terms of the BSD 3-Clause license.
okridge
was created with cookiecutter
and the py-pkgs-cookiecutter
template.
If you find our work useful in your research, please consider citing the following paper:
@inproceedings{liu2023okridge,
title={OKRidge: Scalable Optimal k-Sparse Ridge Regression},
author={Liu, Jiachang and Rosen, Sam and Zhong, Chudi and Rudin, Cynthia},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
year={2023}
}