Comparison

The skfair.comparison module provides ComparisonReport, a visualization tool for comparing multiple fairness preprocessing methods across datasets and classifiers.


Creating a report

ComparisonReport takes a results DataFrame with columns dataset, method, classifier, plus one column per metric (e.g. accuracy, spd). Optional {metric}_std columns are preserved but not used in plots.
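
A minimal hand-built DataFrame in this layout looks like the following (the numbers are illustrative placeholders, not real results):

import pandas as pd

# One row per (dataset, method, classifier) combination,
# one column per metric; the values below are placeholders.
results = pd.DataFrame(
    {
        "dataset":    ["adult", "adult", "compas", "compas"],
        "method":     ["Massaging", "FairSmote", "Massaging", "FairSmote"],
        "classifier": ["LogReg", "LogReg", "LogReg", "LogReg"],
        "accuracy":   [0.84, 0.83, 0.67, 0.66],
        "spd":        [-0.05, -0.03, -0.09, -0.07],
    }
)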

The easiest way to get this DataFrame is from an Experiment (see Experimentation):

from skfair.experimentation import Experiment
from skfair.comparison import ComparisonReport

exp = Experiment(
    datasets=["adult", "compas"],
    methods=["Massaging", "FairSmote", "ReweighingClassifier"],
    n_splits=5,
)
results = exp.run()

report = ComparisonReport(results)
# or equivalently:
report = exp.to_report()

On construction, ComparisonReport auto-detects which columns are performance metrics and which are fairness metrics.

You can also filter the report at construction time:

report = ComparisonReport(
    results,
    datasets=["adult"],
    methods=["Massaging", "FairSmote"],
    classifiers=["LogReg"],
)

Summary tables

tables = report.summary_tables()

Returns a dictionary of DataFrames — one per metric — with method means averaged over classifiers, pivoted by dataset.
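
Since each value is a plain pandas DataFrame keyed by metric name (assuming the keys match the metric columns, e.g. "accuracy" and "spd"), individual tables can be inspected or exported with the usual pandas tooling:

print(tables["accuracy"])                    # methods as rows, datasets as columns
tables["spd"].to_csv("spd_by_dataset.csv")   # export a single metric table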

The classifier parameter controls how classifiers are aggregated:

  • None or "average" (default) — average over all classifiers.
  • "best" — keep only the best-performing classifier per method (by accuracy).
  • A specific name (e.g. "LogReg") — filter to that classifier only.

# Only show the best classifier per method
tables = report.summary_tables(classifier="best")

# Filter to a specific classifier
tables = report.summary_tables(classifier="LogReg")

Plot methods

All plot methods return (fig, axes) tuples.
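
Because the underlying matplotlib objects are returned, any plot can be customised or saved with standard matplotlib calls; for example:

import matplotlib.pyplot as plt

fig, axes = report.plot_metric_bar(metric="accuracy")
fig.savefig("accuracy_bar.png", dpi=150, bbox_inches="tight")
plt.close(fig)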

Metric bar chart

report.plot_metric_bar(metric="accuracy")   # performance
report.plot_metric_bar(metric="spd")        # fairness (auto reference line)

Grouped bar chart for any single metric across datasets. For fairness metrics, a reference line is added automatically.

Fairness–performance tradeoff

report.plot_tradeoff(fairness_metric="spd", performance_metric="accuracy")

Scatter plot of |fairness metric| vs. performance metric for each method, faceted by dataset. Helps identify methods that strike a good balance between fairness and predictive performance.

Method ranking

report.plot_ranking()

Heatmap of method rankings per dataset across all metrics. Lower rank (closer to 1) is better.

The classifier parameter works the same way as in summary_tables():

# Rank using the best classifier per method
report.plot_ranking(classifier="best")

# Rank using a specific classifier
report.plot_ranking(classifier="LogReg")

All plots at once

report.plot_all(fairness_metric="spd")

Runs all plot methods (one per metric, plus tradeoff and ranking) and returns a list of (fig, axes) tuples.
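
A small sketch for writing every generated chart to disk (the file names here are purely illustrative; the order is whatever plot_all produces):

figures = report.plot_all(fairness_metric="spd")
for i, (fig, axes) in enumerate(figures):
    fig.savefig(f"comparison_{i:02d}.png", dpi=150, bbox_inches="tight")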


HTML report

report.to_html("report.html")

Generates a self-contained interactive HTML file with embedded matplotlib charts (base64-encoded), tab navigation, filtering checkboxes, and a PDF export button.
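
Since the file is fully self-contained, one way to view it right away is to open it from Python with the standard library (nothing here is specific to skfair):

import webbrowser
from pathlib import Path

report.to_html("report.html")
webbrowser.open(Path("report.html").resolve().as_uri())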

Report tabs

The HTML report is organized into five tabs:

Tab           Contents
Tables        Summary tables for every metric, with methods as rows and datasets as columns
Performance   Bar charts for each performance metric, grouped by dataset and method
Fairness      Bar charts for each fairness metric, with automatic reference lines at zero
Rankings      Heatmap of method rankings across datasets and metrics (lower rank = better)
Tradeoff      Scatter plot of fairness vs. performance, faceted by dataset

Interactive controls

Each tab includes filtering checkboxes for datasets, classifiers, and metrics. Use the Select All / Deselect All buttons to quickly toggle visibility. Filtering is applied client-side — no re-generation needed.

PDF export

Click the Export PDF button in the top-right corner to produce a print-friendly version. Collapsed sections stay collapsed in the PDF output.

to_html() parameters

Parameter            Type        Default      Description
path                 str         –            Output file path (e.g. "report.html")
datasets             list[str]   None         Filter to specific datasets
methods              list[str]   None         Filter to specific methods
classifiers          list[str]   None         Filter to specific classifiers
metrics              list[str]   None         Metrics to include (defaults to all detected)
fairness_metric      str         "spd"        Fairness metric for tradeoff and detail charts
performance_metric   str         "accuracy"   Performance metric for tradeoff chart
classifier           str         None         Classifier aggregation: None/"average", "best", or a specific name

# Example combining several parameters
report.to_html(
    "report.html",
    datasets=["adult"],
    methods=["Massaging", "FairSmote"],
    fairness_metric="spd",
    performance_metric="accuracy",
    classifier="best",
)

Future benchmark

The comparison module will be expanded as the authors design their own comprehensive fairness preprocessing benchmark. Future versions will include additional analysis tools and visualization options.