art.metrics

Module providing metrics and verifications.

Clique Method Robustness Verification

class art.metrics.RobustnessVerificationTreeModelsCliqueMethod(classifier: ClassifierDecisionTree, verbose: bool = True)

Robustness verification for decision-tree-based models, following the implementation at https://github.com/chenhongge/treeVerification (MIT License, 9 August 2019).

__init__(classifier: ClassifierDecisionTree, verbose: bool = True) → None

Create robustness verification for a decision-tree-based classifier.

Parameters:
  • classifier – A trained decision-tree-based classifier.

  • verbose (bool) – Show progress bars.

verify(x: ndarray, y: ndarray, eps_init: float, norm: float = inf, nb_search_steps: int = 10, max_clique: int = 2, max_level: int = 2) → Tuple[float, float]

Verify the robustness of the classifier on the dataset (x, y).

Parameters:
  • x (ndarray) – Feature data of shape (nb_samples, nb_features).

  • y (ndarray) – Labels, one-hot-encoded of shape (nb_samples, nb_classes) or indices of shape (nb_samples,).

  • eps_init (float) – Attack budget for the first search step.

  • norm (float) – The norm to use for the attack budget epsilon.

  • nb_search_steps (int) – The number of search steps.

  • max_clique (int) – The maximum number of nodes in a clique.

  • max_level (int) – The maximum number of clique search levels.

Returns:

A tuple of the average robustness bound and the verification error at eps.
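
A minimal usage sketch (illustrative only, not part of the API reference): it assumes that ART's SklearnClassifier wrapper yields a decision-tree-aware classifier for a scikit-learn random forest, and the data, model, and epsilon values below are placeholders.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from art.estimators.classification import SklearnClassifier
    from art.metrics import RobustnessVerificationTreeModelsCliqueMethod

    # Toy data and model; sizes and values are illustrative only.
    x = np.random.rand(20, 4).astype(np.float32)
    y = np.random.randint(0, 2, size=20)
    classifier = SklearnClassifier(model=RandomForestClassifier(n_estimators=5).fit(x, y))

    verifier = RobustnessVerificationTreeModelsCliqueMethod(classifier=classifier, verbose=False)
    # verify returns the average robustness bound and the verification error at eps.
    avg_bound, verified_error = verifier.verify(
        x=x, y=y, eps_init=0.3, nb_search_steps=5, max_clique=2, max_level=2
    )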

Loss Sensitivity

art.metrics.loss_sensitivity(classifier: CLASSIFIER_LOSS_GRADIENTS_TYPE, x: ndarray, y: ndarray) → ndarray

Local loss sensitivity estimated through the gradients of the prediction at points in x.

Return type:

ndarray

Parameters:
  • classifier – A trained model.

  • x (ndarray) – Data sample of a shape that can be fed into the classifier.

  • y (ndarray) – Labels for sample x, one-hot encoded.

Returns:

The average loss sensitivity of the model.
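
A minimal sketch of a call to loss_sensitivity, assuming a PyTorch model wrapped with ART's PyTorchClassifier; the model architecture and data below are placeholders.

    import numpy as np
    import torch
    from art.estimators.classification import PyTorchClassifier
    from art.metrics import loss_sensitivity

    # Placeholder model: one linear layer over 4 features, 3 classes.
    model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(4, 3))
    classifier = PyTorchClassifier(
        model=model, loss=torch.nn.CrossEntropyLoss(), input_shape=(4,), nb_classes=3
    )

    x = np.random.rand(8, 4).astype(np.float32)
    y = np.eye(3, dtype=np.float32)[np.random.randint(0, 3, size=8)]  # one-hot labels
    sensitivity = loss_sensitivity(classifier, x, y)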

Empirical Robustness

art.metrics.empirical_robustness(classifier: CLASSIFIER_TYPE, x: ndarray, attack_name: str, attack_params: Dict[str, Any] | None = None) → float | ndarray

Compute the Empirical Robustness of a classifier object over the sample x for a given adversarial crafting method. This is equivalent to computing the minimal perturbation that the attacker must introduce for a successful attack.

Parameters:
  • classifier – A trained model.

  • x (ndarray) – Data sample of a shape that can be fed into the classifier.

  • attack_name (str) – A string specifying the attack to be used; it serves as a key into art.metrics.metrics.SUPPORTED_METHODS.

  • attack_params – A dictionary with attack-specific parameters. If the attack has a norm attribute, then it will be used as the norm for calculating the robustness; otherwise the standard Euclidean distance is used (norm=2).

Returns:

The average empirical robustness computed on x.
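
For illustration, a sketch that requests the fast gradient method via the "fgsm" key; the wrapped model, the eps value, and the data are placeholders, and the available keys are those defined in art.metrics.metrics.SUPPORTED_METHODS.

    import numpy as np
    import torch
    from art.estimators.classification import PyTorchClassifier
    from art.metrics import empirical_robustness

    # Placeholder classifier; any ART classifier supported by the chosen attack works.
    model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(4, 3))
    classifier = PyTorchClassifier(
        model=model, loss=torch.nn.CrossEntropyLoss(), input_shape=(4,), nb_classes=3
    )
    x = np.random.rand(8, 4).astype(np.float32)

    # "fgsm" selects the Fast Gradient Sign Method; eps is an attack-specific parameter.
    robustness = empirical_robustness(classifier, x, attack_name="fgsm", attack_params={"eps": 0.1})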

CLEVER

art.metrics.clever_u(classifier: CLASSIFIER_CLASS_LOSS_GRADIENTS_TYPE, x: ndarray, nb_batches: int, batch_size: int, radius: float, norm: int, c_init: float = 1.0, pool_factor: int = 10, verbose: bool = True) → float

Compute CLEVER score for an untargeted attack.

Return type:

float

Parameters:
  • classifier – A trained model.

  • x (ndarray) – One input sample.

  • nb_batches (int) – Number of repetitions of the estimate.

  • batch_size (int) – Number of random examples to sample per batch.

  • radius (float) – Radius of the maximum perturbation.

  • norm (int) – Current support: 1, 2, np.inf.

  • c_init (float) – Initialization of Weibull distribution.

  • pool_factor (int) – The factor to create a pool of random samples with size pool_factor x batch_size.

  • verbose (bool) – Show progress bars.

Returns:

CLEVER score.
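
A minimal sketch of an untargeted CLEVER computation, assuming a PyTorch classifier wrapper; the model, the single sample, and the parameter values are placeholders.

    import numpy as np
    import torch
    from art.estimators.classification import PyTorchClassifier
    from art.metrics import clever_u

    model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(4, 3))
    classifier = PyTorchClassifier(
        model=model, loss=torch.nn.CrossEntropyLoss(), input_shape=(4,), nb_classes=3
    )

    x_sample = np.random.rand(4).astype(np.float32)  # one input sample, no batch dimension
    score = clever_u(
        classifier, x_sample, nb_batches=10, batch_size=5, radius=0.3, norm=2, verbose=False
    )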

art.metrics.clever_t(classifier: CLASSIFIER_CLASS_LOSS_GRADIENTS_TYPE, x: ndarray, target_class: int, nb_batches: int, batch_size: int, radius: float, norm: float, c_init: float = 1.0, pool_factor: int = 10) → float

Compute CLEVER score for a targeted attack.

Return type:

float

Parameters:
  • classifier – A trained model.

  • x (ndarray) – One input sample.

  • target_class (int) – Targeted class.

  • nb_batches (int) – Number of repetitions of the estimate.

  • batch_size (int) – Number of random examples to sample per batch.

  • radius (float) – Radius of the maximum perturbation.

  • norm (float) – Current support: 1, 2, np.inf.

  • c_init (float) – Initialization of Weibull distribution.

  • pool_factor (int) – The factor to create a pool of random samples with size pool_factor x batch_size.

Returns:

CLEVER score.
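
The targeted variant differs only in the extra target_class argument; a sketch under the same placeholder assumptions as the clever_u example above.

    import numpy as np
    import torch
    from art.estimators.classification import PyTorchClassifier
    from art.metrics import clever_t

    model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(4, 3))
    classifier = PyTorchClassifier(
        model=model, loss=torch.nn.CrossEntropyLoss(), input_shape=(4,), nb_classes=3
    )

    x_sample = np.random.rand(4).astype(np.float32)
    # target_class=1 is an arbitrary choice among the 3 placeholder classes.
    score = clever_t(
        classifier, x_sample, target_class=1, nb_batches=10, batch_size=5, radius=0.3, norm=2
    )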

Wasserstein Distance

art.metrics.wasserstein_distance(u_values: ndarray, v_values: ndarray, u_weights: ndarray | None = None, v_weights: ndarray | None = None) → ndarray

Compute the first Wasserstein distance between two 1D distributions.

Return type:

ndarray

Parameters:
  • u_values (ndarray) – Values of first distribution with shape (nb_samples, feature_dim_1, …, feature_dim_n).

  • v_values (ndarray) – Values of second distribution with shape (nb_samples, feature_dim_1, …, feature_dim_n).

  • u_weights – Weight for each value in u_values. If None, equal weights will be used.

  • v_weights – Weight for each value in v_values. If None, equal weights will be used.

Returns:

The Wasserstein distance between the two distributions.
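
A self-contained sketch with two toy distributions; the shapes and values are arbitrary.

    import numpy as np
    from art.metrics import wasserstein_distance

    # 100 samples each, with a single feature dimension; v is shifted by 0.5.
    u_values = np.random.rand(100, 1).astype(np.float32)
    v_values = (np.random.rand(100, 1) + 0.5).astype(np.float32)
    distances = wasserstein_distance(u_values, v_values)  # equal weights by default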

Pointwise Differential Training Privacy

art.metrics.PDTP(target_estimator: CLASSIFIER_TYPE, extra_estimator: CLASSIFIER_TYPE, x: ndarray, y: ndarray, indexes: ndarray | None = None, num_iter: int = 10, comparison_type: ComparisonType | None = ComparisonType.RATIO) → Tuple[ndarray, ndarray, ndarray]

Compute the pointwise differential training privacy metric for the given classifier and training set.

Parameters:
  • target_estimator – The classifier to be analyzed.

  • extra_estimator – Another classifier of the same type as the target classifier, but not yet fit.

  • x (ndarray) – The training data of the classifier.

  • y (ndarray) – Target values (class labels) of x, one-hot-encoded of shape (nb_samples, nb_classes) or indices of shape (nb_samples,).

  • indexes – The subset of indexes of x on which to compute the PDTP metric. If not supplied, PDTP will be computed for all samples in x.

  • num_iter (int) – The number of iterations of the PDTP computation to run for each sample. If not supplied, defaults to 10. The result is the average across iterations.

  • comparison_type – The way in which to compare the model outputs between models trained with and without a certain sample. The default is to compute the ratio.

Returns:

A tuple of three arrays containing, respectively, the average, worst-case, and standard deviation of the PDTP value for each sample in the training set. The higher the value, the higher the privacy leakage for that sample.
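
A minimal sketch, assuming ART's SklearnClassifier wrapper accepts a not-yet-fitted scikit-learn model for extra_estimator (PDTP fits it internally); the data, model, and num_iter value are placeholders.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier
    from art.estimators.classification import SklearnClassifier
    from art.metrics import PDTP

    x = np.random.rand(30, 4).astype(np.float32)
    y = np.random.randint(0, 2, size=30)

    target = SklearnClassifier(model=DecisionTreeClassifier().fit(x, y))
    # Same model type, deliberately left unfitted.
    extra = SklearnClassifier(model=DecisionTreeClassifier())

    avg_leakage, worst_leakage, std_leakage = PDTP(target, extra, x, y, num_iter=2)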

SHAPr Membership Privacy Risk

art.metrics.SHAPr(target_estimator: CLASSIFIER_TYPE, x_train: ndarray, y_train: ndarray, x_test: ndarray, y_test: ndarray, knn_metric: str | None = None) → ndarray

Compute the SHAPr membership privacy risk metric for the given classifier and training set.

Return type:

ndarray

Parameters:
  • target_estimator – The classifier to be analyzed.

  • x_train (ndarray) – The training data of the classifier.

  • y_train (ndarray) – Target values (class labels) of x_train, one-hot-encoded of shape (nb_samples, nb_classes) or indices of shape (nb_samples,).

  • x_test (ndarray) – The test data of the classifier.

  • y_test (ndarray) – Target values (class labels) of x_test, one-hot-encoded of shape (nb_samples, nb_classes) or indices of shape (nb_samples,).

  • knn_metric – The distance metric to use for the KNN classifier (the default is ‘minkowski’, which with its default power parameter corresponds to Euclidean distance).

Returns:

An array containing the SHAPr score for each sample in the training set. The higher the value, the higher the privacy leakage for that sample. Any value above 0 should be considered a privacy leak.
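
A minimal sketch, assuming a scikit-learn model wrapped with ART's SklearnClassifier; the data and model are placeholders.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from art.estimators.classification import SklearnClassifier
    from art.metrics import SHAPr

    x_train = np.random.rand(50, 4).astype(np.float32)
    y_train = np.random.randint(0, 2, size=50)
    x_test = np.random.rand(20, 4).astype(np.float32)
    y_test = np.random.randint(0, 2, size=20)

    target = SklearnClassifier(model=RandomForestClassifier(n_estimators=5).fit(x_train, y_train))
    scores = SHAPr(target, x_train, y_train, x_test, y_test)  # one score per training sample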