skfair.experimentation

from skfair.experimentation import Experiment, DATASET_REGISTRY, METHOD_REGISTRY, METRIC_REGISTRY

skfair.experimentation.Experiment

Run a fairness-method comparison experiment.

Parameters:
  • datasets (list of str or dict, default: None ) –

    Each element is either a string (a key of DATASET_REGISTRY) or a dict with keys "name", "data", "sens_attr" and, optionally, "priv_group" (default 1). Example:

        datasets=[
            "ricci",
            {"name": "my_data", "data": (X, y),
             "sens_attr": "gender", "priv_group": 1},
        ]

    Default: ["adult"].

  • methods (list of str, default: None ) –

    Method names (keys of METHOD_REGISTRY). Default: all methods.

  • classifiers (dict or list, default: None ) –

    Either {"name": estimator_instance} or a list of dotted import paths (e.g. ["sklearn.svm.SVC"]). Default: {"LogReg": LogisticRegression(...)}.

  • metrics (list of str, default: None ) –

    Metric names (keys of METRIC_REGISTRY). Default: all 6 metrics.

  • n_splits (int, default: 5 ) –

    Number of CV folds (1 for a single train/test split).

  • random_state (int, default: 42 ) –

    Random seed.

  • dataset_config (dict, default: None ) –

    Per-dataset overrides, e.g. {"adult": {"sens_attr": "race", "priv_group": 1}}.

  • method_config (dict, default: None ) –

    Per-method param overrides, e.g. {"FairSmote": {"random_state": 0}}.

  • std (bool, default: False ) –

    If True, include {metric}_std columns in the results DataFrame.

  • audit_bias (bool, default: False ) –

    If True, create a BiasAuditor per dataset after loading.

  • audit_fairness (bool, default: False ) –

    If True, store out-of-fold predictions so that get_fairness_auditor can build a FairnessAuditor later.

  • save_models (dict or None, default: None ) –

    When not None, fitted pipelines are persisted. Keys:

    • "full_data_retrain" (bool, default True): retrain on the full dataset before saving. If False, save the model from the last CV fold.
    • "models" ("all" or list of dicts, default "all"): "all" saves every combination; a list of {"method": ..., "classifier": ...} dicts restricts which combinations are saved.

  • config (str, default: None ) –

    Path to a YAML configuration file, or the raw YAML string itself. When provided, all other arguments are ignored and every setting is read from the YAML.
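
The constructor arguments above can be assembled as plain Python data before an Experiment is created. The sketch below is illustrative only: the helper names (custom_dataset, check_dataset_entry, build_experiment_kwargs, make_experiment) are hypothetical and not part of skfair, and the validation simply mirrors the documented dict keys. The library import is deferred so the helpers can be tried without skfair installed.

```python
# Hypothetical helpers (not part of skfair) that assemble Experiment arguments.
REQUIRED_DATASET_KEYS = {"name", "data", "sens_attr"}


def custom_dataset(name, X, y, sens_attr, priv_group=1):
    """Build a dict-style entry for the `datasets` argument."""
    return {"name": name, "data": (X, y),
            "sens_attr": sens_attr, "priv_group": priv_group}


def check_dataset_entry(entry):
    """Accept a registry key (str) or a dict carrying the documented keys."""
    if isinstance(entry, str):
        return entry
    missing = REQUIRED_DATASET_KEYS - set(entry)
    if missing:
        raise ValueError(f"dataset entry is missing keys: {sorted(missing)}")
    # "priv_group" is optional and documented to default to 1.
    return {"priv_group": 1, **entry}


def build_experiment_kwargs(X, y):
    """Keyword arguments using only options named in the parameter list."""
    entries = ["ricci", custom_dataset("my_data", X, y, "gender")]
    return {
        "datasets": [check_dataset_entry(e) for e in entries],
        "classifiers": ["sklearn.svm.SVC"],  # dotted import path form
        "n_splits": 5,
        "random_state": 42,
        "method_config": {"FairSmote": {"random_state": 0}},
        "std": True,
        "audit_fairness": True,
        "save_models": {"full_data_retrain": True, "models": "all"},
    }


def make_experiment(X, y):
    """Create the Experiment; requires skfair to be installed."""
    from skfair.experimentation import Experiment  # deferred import
    return Experiment(**build_experiment_kwargs(X, y))
```

Calling make_experiment(X, y) with your own feature table and labels would then hand these arguments to Experiment; any key left out falls back to the defaults listed above.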

dataset_names property

Return list of dataset display names.

from_config(path) classmethod

Create an Experiment from a YAML configuration file/string.

get_fairness_auditor(dataset, method, classifier, aggregate=False)

Create a FairnessAuditor from stored out-of-fold predictions.

Parameters:
  • dataset (str) –

    Dataset name, exactly as it appears in the results DataFrame.

  • method (str) –

    Method name, exactly as it appears in the results DataFrame.

  • classifier (str) –

    Classifier name, exactly as it appears in the results DataFrame.

  • aggregate (bool, default: False ) –

    If True, all out-of-fold predictions are concatenated into a single FairnessAuditor (one metric computation on all data). If False (default), fairness metrics are computed per fold and averaged, consistent with how the comparison report works.

Returns:
  • FairnessAuditor

    When aggregate=False, the auditor is built from the first fold, but its fairness_metrics method is monkey-patched to return the cross-fold average instead.
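
The two aggregation modes can give different numbers when folds differ in size or group composition. The following is a minimal sketch of that difference, not the library's implementation: it uses a hand-written demographic parity difference as a stand-in metric, and the fold data is made up for illustration.

```python
def positive_rate(preds):
    """Share of positive (1) predictions."""
    return sum(preds) / len(preds)


def dp_difference(y_pred, groups, priv_group=1):
    """Positive rate of the privileged group minus that of everyone else."""
    priv = [p for p, g in zip(y_pred, groups) if g == priv_group]
    unpriv = [p for p, g in zip(y_pred, groups) if g != priv_group]
    return positive_rate(priv) - positive_rate(unpriv)


# Illustrative out-of-fold data: (predictions, sensitive attribute) per fold.
folds = [
    ([1, 1, 0, 0], [1, 1, 0, 0]),
    ([1, 0, 1, 1, 0, 0], [1, 0, 0, 0, 0, 0]),
]

# aggregate=False analogue: compute the metric per fold, then average.
per_fold = [dp_difference(preds, groups) for preds, groups in folds]
averaged = sum(per_fold) / len(per_fold)

# aggregate=True analogue: concatenate all folds, compute the metric once.
pooled_preds = [p for preds, _ in folds for p in preds]
pooled_groups = [g for _, groups in folds for g in groups]
pooled = dp_difference(pooled_preds, pooled_groups)

print(f"per-fold average: {averaged:.4f}, pooled: {pooled:.4f}")
```

Averaging weights every fold equally, while pooling weights every sample equally, which is why the two values diverge here.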

load(path) classmethod

Load a previously saved Experiment from a .pkl file.

run(verbose=True)

Execute the experiment.

Returns:
  • DataFrame

    One row per (dataset, method, classifier) with metric columns. Includes {metric}_std columns when std=True.
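
As a rough picture of what one results row holds, the sketch below collapses per-fold scores into a mean column and, optionally, a {metric}_std column. It is an assumption-laden illustration: the helper summarize, the metric name "accuracy", the column names "dataset", "method" and "classifier", and the use of the sample standard deviation are all hypothetical choices, not confirmed library behavior.

```python
from statistics import mean, stdev


def summarize(dataset, method, classifier, fold_scores, std=False):
    """Collapse per-fold scores into one row (hypothetical helper)."""
    row = {"dataset": dataset, "method": method, "classifier": classifier}
    for metric, scores in fold_scores.items():
        row[metric] = mean(scores)
        if std:
            # Sample standard deviation is an assumption; one fold gives 0.0.
            row[f"{metric}_std"] = stdev(scores) if len(scores) > 1 else 0.0
    return row


# Made-up scores for five folds of a single combination.
fold_scores = {"accuracy": [0.80, 0.82, 0.78, 0.81, 0.79]}
row = summarize("adult", "FairSmote", "LogReg", fold_scores, std=True)
print(row)
```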

save(path=None, results_csv=None, object_pkl=None, report_html=None, models=None)

Save experiment outputs.

Parameters:
  • path (str, default: None ) –

    Base path (without extension). Defaults to self.save_path.

  • results_csv (bool, default: None ) –

    Write results DataFrame to {path}.csv. Defaults to self.save_results_csv.

  • object_pkl (bool, default: None ) –

    Pickle full Experiment to {path}.pkl. Defaults to self.save_object_pkl.

  • report_html (bool, default: None ) –

    Generate HTML report to {path}.html. Defaults to self.save_report_html.

  • models (bool, default: None ) –

    Save fitted models to {path}_models/. Defaults to True when self.save_models is not None.
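
The fallback rule for these arguments (None means "use the value stored on the instance") can be sketched as follows. The function resolve_save_targets and the defaults dict are hypothetical stand-ins for the instance attributes named above; the sketch only lists which outputs would be written and touches no files.

```python
def resolve_save_targets(defaults, path=None, results_csv=None,
                         object_pkl=None, report_html=None, models=None):
    """Return the output paths a save call would produce (hypothetical)."""
    def pick(value, fallback):
        # An explicit True/False wins; None falls back to the stored default.
        return fallback if value is None else value

    base = pick(path, defaults["save_path"])
    targets = []
    if pick(results_csv, defaults["save_results_csv"]):
        targets.append(f"{base}.csv")
    if pick(object_pkl, defaults["save_object_pkl"]):
        targets.append(f"{base}.pkl")
    if pick(report_html, defaults["save_report_html"]):
        targets.append(f"{base}.html")
    # Models default to being saved only when save_models was configured.
    if pick(models, defaults["save_models"] is not None):
        targets.append(f"{base}_models/")
    return targets


# Stand-in for attributes stored on the Experiment instance.
defaults = {
    "save_path": "out/exp",
    "save_results_csv": True,
    "save_object_pkl": False,
    "save_report_html": True,
    "save_models": None,
}

print(resolve_save_targets(defaults))
print(resolve_save_targets(defaults, path="run1", object_pkl=True, report_html=False))
```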

to_report()

Wrap results in a ComparisonReport.
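
Putting the methods together, a typical session might look like the sketch below. It uses only the calls documented on this page, but it is untested against the library: it assumes "FairSmote" is a key of METHOD_REGISTRY, that the default classifier appears in the results under the name "LogReg", and that the working directory is writable. The wrapper function run_comparison is hypothetical and defers the import so that defining it does not require skfair.

```python
def run_comparison(base_path="ricci_demo"):
    """Run, save, report on and reload one experiment (illustrative)."""
    from skfair.experimentation import Experiment  # deferred import

    exp = Experiment(
        datasets=["ricci"],
        methods=["FairSmote"],   # assumed to be a METHOD_REGISTRY key
        n_splits=5,
        random_state=42,
        std=True,                # adds {metric}_std columns
        audit_fairness=True,     # keeps out-of-fold predictions
    )
    results = exp.run(verbose=False)

    # Write the results table and the pickled Experiment only.
    exp.save(path=base_path, results_csv=True, object_pkl=True,
             report_html=False, models=False)

    report = exp.to_report()

    # Names must match the results DataFrame; "LogReg" is an assumption.
    auditor = exp.get_fairness_auditor(
        exp.dataset_names[0], "FairSmote", "LogReg", aggregate=True
    )

    restored = Experiment.load(f"{base_path}.pkl")
    return results, report, auditor, restored
```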