openml/openml-python

OpenML-Python

The Python API for a World of Data and More 💫

Installation | Documentation | Contribution guidelines

OpenML-Python provides an easy-to-use Python interface for OpenML, an online platform for open science collaboration in machine learning. It can download data from OpenML and upload data to it, such as datasets and machine learning experiment results.

🕹️ Minimal Example

Use the following code to get the credit-g dataset:

import openml

dataset = openml.datasets.get_dataset("credit-g")  # or by ID: get_dataset(31)
X, y, categorical_indicator, attribute_names = dataset.get_data(target="class")
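
The `categorical_indicator` and `attribute_names` values returned above are parallel lists, so they can be combined to find the categorical columns. The snippet below is a minimal sketch of that idea; the helper name `categorical_columns` is hypothetical and not part of OpenML-Python, and it assumes one boolean flag per attribute name.

```python
from typing import List, Sequence


def categorical_columns(
    attribute_names: Sequence[str],
    categorical_indicator: Sequence[bool],
) -> List[str]:
    """Return the names of the attributes flagged as categorical.

    Hypothetical helper: assumes both sequences have the same length
    and are aligned position by position.
    """
    if len(attribute_names) != len(categorical_indicator):
        raise ValueError("attribute_names and categorical_indicator must align")
    return [
        name
        for name, is_categorical in zip(attribute_names, categorical_indicator)
        if is_categorical
    ]
```

With the values from the example above, `categorical_columns(attribute_names, categorical_indicator)` would list the categorical features of credit-g.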

Get a task for supervised classification on credit-g:

import openml

task = openml.tasks.get_task(31)
dataset = task.get_dataset()
X, y, categorical_indicator, attribute_names = dataset.get_data(target=task.target_name)
# get splits for the first fold of 10-fold cross-validation
train_indices, test_indices = task.get_train_test_split_indices(fold=0)
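
To go beyond the first fold, the same call can be repeated for every fold. The following is a rough sketch, not an official recipe: `iter_folds` is a hypothetical helper, and the default of 10 folds is an assumption based on the 10-fold cross-validation mentioned above. It only relies on `task.get_train_test_split_indices(fold=...)` as used in the example.

```python
from typing import Any, Iterator, Tuple


def iter_folds(task: Any, n_folds: int = 10) -> Iterator[Tuple[int, Any, Any]]:
    """Yield (fold, train_indices, test_indices) for each fold.

    Hypothetical helper: `task` is any object exposing
    get_train_test_split_indices(fold=...), such as the task fetched above.
    """
    for fold in range(n_folds):
        train_indices, test_indices = task.get_train_test_split_indices(fold=fold)
        yield fold, train_indices, test_indices


# Possible usage, assuming X and y are pandas objects:
# for fold, train_idx, test_idx in iter_folds(task):
#     X_train, y_train = X.iloc[train_idx], y.iloc[train_idx]
#     X_test, y_test = X.iloc[test_idx], y.iloc[test_idx]
```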

Use an OpenML benchmarking suite to get a curated list of machine-learning tasks:

import openml

suite = openml.study.get_suite("amlb-classification-all")  # Get a curated list of tasks for classification
for task_id in suite.tasks:
    task = openml.tasks.get_task(task_id)
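
The loop above overwrites `task` on each iteration. If you want to keep every task for later use, one option is to collect them in a dictionary keyed by task ID. This is only a sketch: `collect_tasks` is a hypothetical helper, and the `get_task` parameter is added here so the fetch function can be swapped out (for example in offline tests). By default it falls back to `openml.tasks.get_task`.

```python
from typing import Any, Callable, Dict, Iterable, Optional


def collect_tasks(
    task_ids: Iterable[int],
    get_task: Optional[Callable[[int], Any]] = None,
) -> Dict[int, Any]:
    """Fetch each task once and return a mapping of task ID to task.

    Hypothetical helper: when no fetch function is passed in,
    openml.tasks.get_task is used, which needs network access.
    """
    if get_task is None:
        import openml  # imported lazily so the helper can be used offline

        get_task = openml.tasks.get_task
    return {task_id: get_task(task_id) for task_id in task_ids}


# Possible usage with the suite from the example above:
# tasks = collect_tasks(suite.tasks)
```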

🪄 Installation

OpenML-Python is supported on Python 3.8 to 3.13 and is available on Linux, macOS, and Windows.

You can install OpenML-Python with:

pip install openml
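
If you are unsure whether your interpreter falls within the supported range stated in this section, a quick check like the one below may help before installing. It is an illustrative sketch only: `is_supported` and the two constants are hypothetical names, and the bounds are copied from the text above rather than read from the package metadata.

```python
import sys
from typing import Tuple

# Bounds taken from the supported range stated in this README (assumption:
# both ends are inclusive).
MIN_SUPPORTED = (3, 8)
MAX_SUPPORTED = (3, 13)


def is_supported(version: Tuple[int, int]) -> bool:
    """Return True if a (major, minor) version lies in the supported range."""
    return MIN_SUPPORTED <= version <= MAX_SUPPORTED


if __name__ == "__main__":
    current = (sys.version_info[0], sys.version_info[1])
    print(f"Python {current[0]}.{current[1]} supported: {is_supported(current)}")
```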

📄 Citing OpenML-Python

If you use OpenML-Python in a scientific publication, we would appreciate a reference to the following paper:

Matthias Feurer, Jan N. van Rijn, Arlind Kadra, Pieter Gijsbers, Neeratyoy Mallik, Sahithya Ravi, Andreas Müller, Joaquin Vanschoren, Frank Hutter
OpenML-Python: an extensible Python API for OpenML
Journal of Machine Learning Research, 22(100):1–5, 2021

Bibtex entry:

@article{JMLR:v22:19-920,
  author  = {Matthias Feurer and Jan N. van Rijn and Arlind Kadra and Pieter Gijsbers and Neeratyoy Mallik and Sahithya Ravi and Andreas Müller and Joaquin Vanschoren and Frank Hutter},
  title   = {OpenML-Python: an extensible Python API for OpenML},
  journal = {Journal of Machine Learning Research},
  year    = {2021},
  volume  = {22},
  number  = {100},
  pages   = {1--5},
  url     = {http://jmlr.org/papers/v22/19-920.html}
}