Model-probing mislabeled examples detection in machine learning datasets
A ModelProbingDetector assigns trust_scores to training examples by probing an Ensemble of machine learning models.
pip install git+https://github.com/orange-opensource/mislabeled
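import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import LabelEncoder, MinMaxScaler
# ModelProbingDetector, Representer and ProgressiveEnsemble are provided by mislabeled.
# Train an MLP on MNIST.
X, y = fetch_openml("mnist_784", return_X_y=True, as_frame=False)
y = LabelEncoder().fit_transform(y)
mlp = make_pipeline(MinMaxScaler(), MLPClassifier())
mlp.fit(X, y)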
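# Compute the Representer values of the trained MLP.
probe = Representer()
representer_values = probe(mlp, X, y)
# Inspect the examples with the highest values.
top_k = 10  # assumed value: number of examples to display
suspicious = np.argsort(-representer_values)[0:top_k]
for i in suspicious:
    plt.imshow(X[i].reshape(28, 28))
    plt.show()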
detector = ModelProbingDetector(mlp, Representer(), ProgressiveEnsemble(), "var")
var_representer_values = detector.trust_scores(X, y)
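Once trust scores are available, a common next step is to rank the examples and set aside the least trusted ones. The snippet below is a minimal NumPy-only sketch of that step. The score values are made up for illustration, and it assumes that lower scores mean less trusted; check the convention of the detector you use before relying on it.

```python
import numpy as np

# Hypothetical trust scores, standing in for the output of
# detector.trust_scores(X, y). The values are invented for this example.
trust_scores = np.array([0.9, -1.2, 0.4, 2.1, -0.3, 1.5])

# Assumption: lower scores mean less trusted.
# Flag the lowest-scoring third of the examples.
n_flag = len(trust_scores) // 3
order = np.argsort(trust_scores)  # ascending: least trusted first
suspicious = order[:n_flag]

# Boolean mask selecting the examples that were not flagged.
keep_mask = np.ones(len(trust_scores), dtype=bool)
keep_mask[suspicious] = False
```

The mask can then be used to index the dataset, for example `X[keep_mask], y[keep_mask]`, before retraining or manual review.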
Detector | Paper | Code (from mislabeled.detect.detectors) |
---|---|---|
Area Under the Margin (AUM) | NeurIPS 2020 | import AreaUnderMargin |
Influence | Paper 1974 | import InfluenceDetector |
Representer | Paper 1972 | import RepresenterDetector |
TracIn | NeurIPS 2020 | import TracIn |
Forget Scores | ICLR 2019 | import ForgetScores |
VoG | CVPR 2022 | import VoLG, VoSG, LinearVoSG |
Small Loss | ICML 2018 | import SmallLoss |
CleanLab | JAIR 2021 | import ConfidentLearning |
Consensus (C-Scores) | Applied Intelligence 2011 | import ConsensusConsistency |
AGRA | ECML 2023 | import AGRA |
Many other combinations are possible by using ModelProbingDetector with any probe and Ensemble from the library.
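To make the probe, ensemble, and aggregation decomposition concrete, here is a rough NumPy-only sketch of the idea. It is not the library's implementation: the toy dataset, the `margin_probe` helper, and the hand-written logistic regression are all assumptions chosen for illustration. The "ensemble" is the sequence of snapshots of one model during training, the probe is a signed margin, and the aggregation is a mean over snapshots, in the spirit of Area Under the Margin.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary dataset: two Gaussian blobs, with 10% of the labels
# flipped on purpose to simulate mislabeled examples.
n = 400
y_clean = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 2)) + 3.0 * (2 * y_clean[:, None] - 1)
flipped = rng.choice(n, size=40, replace=False)
y = y_clean.copy()
y[flipped] = 1 - y[flipped]


def margin_probe(w, b, X, y):
    """Signed logit: positive when the model agrees with the given label."""
    signs = 2.0 * y - 1.0
    return signs * (X @ w + b)


# "Ensemble": snapshots of one logistic regression after each gradient step.
w, b = np.zeros(X.shape[1]), 0.0
probes = []
for _ in range(50):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted P(y = 1)
    w -= 0.1 * X.T @ (p - y) / n            # gradient step on the log loss
    b -= 0.1 * np.mean(p - y)
    probes.append(margin_probe(w, b, X, y))
probes = np.stack(probes)  # shape (n_steps, n_samples)

# Aggregate the probe over the ensemble: the mean margin per example.
trust_scores = probes.mean(axis=0)
```

In this sketch the flipped examples end up with negative mean margins while the clean ones stay positive. Swapping the probe, the way the ensemble is built, or the aggregation (for instance a variance instead of a mean) gives a different detector with the same overall structure.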
For more details and examples, check the notebooks!
If you use this library in a research project, please consider citing the corresponding paper with the following BibTeX entry:
@article{george2024mislabeled,
title={Mislabeled examples detection viewed as probing machine learning models: concepts, survey and extensive benchmark},
author={Thomas George and Pierre Nodet and Alexis Bondu and Vincent Lemaire},
journal={Transactions on Machine Learning Research},
issn={2835-8856},
year={2024},
url={https://openreview.net/forum?id=3YlOr7BHkx},
note={}
}
Install hatch.
To format and lint:
hatch fmt
To run tests:
hatch test