scikit-fair

Fairness-aware machine learning toolkit with a scikit-learn compatible API.

scikit-fair (skfair) is a Python library for fairness-aware binary classification. It covers the full pipeline — preprocessing, evaluation, auditing, comparison, and experimentation — and integrates seamlessly with scikit-learn and imbalanced-learn workflows.

What it does

Algorithmic fairness is concerned with preventing machine learning models from discriminating against individuals based on sensitive attributes such as race, sex, or age. scikit-fair provides tools for every stage of the fairness workflow:

  • Preprocessing: transform data before training — weighting, resampling, and feature transformation techniques
  • Metrics: nine group-fairness metrics and nine performance metrics with a unified API
  • Datasets: five standard fairness benchmark datasets with convenient loaders
  • Audit: pre-model data analysis (BiasAuditor) and post-model fairness evaluation (FairnessAuditor); see the sketch after this list
  • Comparison: visual comparison of multiple preprocessing methods across datasets and classifiers (ComparisonReport)
  • Experimentation: automated dataset × method × classifier experiments with cross-validation (Experiment)
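
As a taste of the audit workflow, a pre- and post-model pass might look roughly like the following. This is a hypothetical sketch: only the class names BiasAuditor and FairnessAuditor come from this README, while the module path, constructor parameters, and method names are assumptions (the 03_audit notebook shows the real API). The variables continue from the quick-start below.

from skfair.audit import BiasAuditor, FairnessAuditor  # module path assumed

# Pre-model: scan the raw training data for group imbalance and label
# skew (constructor and method names here are hypothetical)
data_audit = BiasAuditor(sens_attr="sex", priv_group=1)
data_audit.fit(X_train, y_train)
print(data_audit.report())

# Post-model: score a trained classifier's predictions against the
# group-fairness metrics (again, method names are hypothetical)
model_audit = FairnessAuditor(sens_attr="sex", priv_group=1)
print(model_audit.report(y_test, y_pred, X_test["sex"].values))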

At a glance

from skfair.preprocessing import Massaging
from skfair.metrics import disparate_impact, statistical_parity_difference
from skfair.datasets import load_adult

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_adult(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Reduce discrimination via label massaging
sampler = Massaging(sens_attr="sex", priv_group=1)
X_fair, y_fair = sampler.fit_resample(X_train, y_train)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_fair, y_fair)
y_pred = clf.predict(X_test)

# Measure fairness of the resulting predictions
print(disparate_impact(y_test.values, y_pred, X_test["sex"].values))               # parity at 1.0
print(statistical_parity_difference(y_test.values, y_pred, X_test["sex"].values))  # parity at 0.0

Algorithms

Class                         Type                     Reference
Reweighing                    Weighting                Kamiran & Calders (2012)
FairBalance                   Weighting                Yu et al. (2024)
ReweighingClassifier          Meta-estimator
FairBalanceClassifier         Meta-estimator
Massaging                     Label modification       Kamiran & Calders (2012)
FairwayRemover                Undersampling            Fairway (2019)
FairOversampling              Oversampling             Dablan et al.
FairSmote                     Oversampling             Chakraborty et al. (2021)
FAWOS                         Oversampling             Salazar et al. (2021)
HeterogeneousFOS              Oversampling             Sonoda et al. (2023)
DisparateImpactRemover        Feature transformation   Feldman et al. (2015)
OptimizedPreprocessing        Feature transformation   Calmon et al. (2017)
LearningFairRepresentations   Feature transformation   Zemel et al. (2013)
FairMask                      Meta-estimator           Peng et al. (2021)
IntersectionalBinarizer       Utility
DropColumns                   Utility
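
ReweighingClassifier and FairBalanceClassifier are meta-estimator counterparts of the weighting methods: they wrap a scikit-learn base estimator, so the weighting presumably happens inside fit and travels with the model through pipelines and cross-validation. A minimal sketch of that usage, with the caveat that the module path and constructor parameters below are assumptions mirrored from Massaging in the quick-start:

from sklearn.linear_model import LogisticRegression
from skfair.preprocessing import ReweighingClassifier  # module path assumed

# Wrap any scikit-learn classifier; Reweighing's per-instance weights
# (Kamiran & Calders, 2012) would be computed inside fit and forwarded
# to the wrapped estimator (parameter names are assumptions)
clf = ReweighingClassifier(LogisticRegression(max_iter=1000),
                           sens_attr="sex", priv_group=1)
clf.fit(X_train, y_train)     # X_train, y_train as in the quick-start
y_pred = clf.predict(X_test)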

Metrics

Nine group fairness metrics and nine performance metrics are included. All follow a consistent signature: metric(y_true, y_pred, sensitive_attr).
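
For instance, with the two metrics used in the quick-start (a self-contained sketch; the ideal values noted in the comments, 1.0 for the ratio-based disparate impact and 0.0 for the difference-based statistical parity, follow from the standard definitions rather than anything library-specific):

import numpy as np
from skfair.metrics import disparate_impact, statistical_parity_difference

# Toy predictions for eight individuals; 1 marks the privileged group
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])
sex    = np.array([1, 1, 1, 1, 0, 0, 0, 0])

# Both groups receive positive predictions at rate 0.5, so the metrics
# sit at their ideal values: a ratio of 1.0 and a difference of 0.0
print(disparate_impact(y_true, y_pred, sex))               # -> 1.0
print(statistical_parity_difference(y_true, y_pred, sex))  # -> 0.0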

Example notebooks

The examples/ folder contains step-by-step Jupyter notebooks that walk through every module:

Notebook               Description
01_datasets            Loading, exploring, and preprocessing the bundled datasets
02_methods             Using fairness methods — transformers, samplers, and meta-estimators
03_audit               Pre-model bias analysis and post-model fairness auditing
04_comparison          Comparing methods side-by-side with ComparisonReport
05_experiment          Running cross-validated experiments with Experiment
05a_experiment_config  Configuring experiments from Python and YAML
05b_custom_datasets    Using custom (user-provided) datasets in experiments
06_benchmark           Full-scale benchmark driven by a YAML config file