art.metrics

Module providing metrics and verifications.

Clique Method Robustness Verification

class art.metrics.RobustnessVerificationTreeModelsCliqueMethod(classifier: ClassifierDecisionTree, verbose: bool = True)

Robustness verification for decision-tree-based models, following the implementation at https://github.com/chenhongge/treeVerification (MIT License, 9 August 2019).

__init__(classifier: ClassifierDecisionTree, verbose: bool = True) → None

Create robustness verification for a decision-tree-based classifier.

Parameters
  • classifier – A trained decision-tree-based classifier.

  • verbose (bool) – Show progress bars.

verify(x: numpy.ndarray, y: numpy.ndarray, eps_init: float, norm: int = inf, nb_search_steps: int = 10, max_clique: int = 2, max_level: int = 2) → Tuple[float, float]

Verify the robustness of the classifier on the dataset (x, y).

Return type

Tuple

Parameters
  • x (ndarray) – Feature data of shape (nb_samples, nb_features).

  • y (ndarray) – Labels, one-vs-rest encoding of shape (nb_samples, nb_classes).

  • eps_init (float) – Attack budget for the first search step.

  • norm (int) – The norm in which the attack budget epsilon is measured.

  • nb_search_steps (int) – The number of search steps.

  • max_clique (int) – The maximum number of nodes in a clique.

  • max_level (int) – The maximum number of clique search levels.

Returns

A tuple of the average robustness bound and the verification error at eps.
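
Example

A minimal usage sketch follows. It assumes an XGBoost model wrapped in ART's XGBoostClassifier; the toy data, model settings, and verification parameters are illustrative only, not part of this reference.

    # Hedged sketch: verify a small XGBoost ensemble with the clique method.
    import numpy as np
    import xgboost as xgb
    from art.estimators.classification import XGBoostClassifier
    from art.metrics import RobustnessVerificationTreeModelsCliqueMethod

    # Toy training data: 100 samples, 10 features, 3 classes.
    x_train = np.random.rand(100, 10).astype(np.float32)
    y_train = np.random.randint(0, 3, size=100)

    model = xgb.XGBClassifier(n_estimators=5, max_depth=3)
    model.fit(x_train, y_train)

    # Wrap the trained booster so ART can traverse its decision trees.
    classifier = XGBoostClassifier(model=model, nb_features=10, nb_classes=3)

    # verify() expects one-vs-rest encoded labels.
    y_ovr = np.eye(3)[y_train]

    verification = RobustnessVerificationTreeModelsCliqueMethod(classifier=classifier, verbose=False)
    average_bound, verified_error = verification.verify(
        x=x_train[:20],
        y=y_ovr[:20],
        eps_init=0.3,        # attack budget for the first search step
        norm=np.inf,
        nb_search_steps=5,
        max_clique=2,
        max_level=2,
    )
    print(average_bound, verified_error)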

Loss Sensitivity

art.metrics.loss_sensitivity(classifier: CLASSIFIER_LOSS_GRADIENTS_TYPE, x: numpy.ndarray, y: numpy.ndarray) → numpy.ndarray

Local loss sensitivity estimated through the gradients of the prediction at points in x.

Return type

ndarray

Parameters
  • classifier – A trained model.

  • x (ndarray) – Data sample with a shape that can be fed into classifier.

  • y (ndarray) – Labels for sample x, one-hot encoded.

Returns

The average loss sensitivity of the model.
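
Example

The sketch below illustrates a typical call; the PyTorchClassifier wrapper, the toy model, and the random data are assumptions used only for demonstration.

    # Hedged sketch: loss sensitivity of a small PyTorch model (assumed setup).
    import numpy as np
    import torch
    from art.estimators.classification import PyTorchClassifier
    from art.metrics import loss_sensitivity

    model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
    classifier = PyTorchClassifier(
        model=model,
        loss=torch.nn.CrossEntropyLoss(),
        input_shape=(1, 28, 28),
        nb_classes=10,
    )

    x = np.random.rand(16, 1, 28, 28).astype(np.float32)
    y = np.eye(10)[np.random.randint(0, 10, size=16)]  # one-hot labels

    sensitivity = loss_sensitivity(classifier, x, y)
    print(sensitivity)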

Empirical Robustness

art.metrics.empirical_robustness(classifier: CLASSIFIER_TYPE, x: numpy.ndarray, attack_name: str, attack_params: Optional[Dict[str, Any]] = None) → Union[float, numpy.ndarray]

Compute the Empirical Robustness of a classifier object over the sample x for a given adversarial crafting method attack. This is equivalent to computing the minimal perturbation that the attacker must introduce for a successful attack.

Parameters
  • classifier – A trained model.

  • x (ndarray) – Data sample with a shape that can be fed into classifier.

  • attack_name (str) – A string specifying the attack to be used. Currently supported attacks are `fgsm` (Fast Gradient Sign Method) and `hsj` (Hop Skip Jump).

  • attack_params – A dictionary with attack-specific parameters. If the attack has a norm attribute, then it will be used as the norm for calculating the robustness; otherwise the standard Euclidean distance is used (norm=2).

Returns

The average empirical robustness computed on x.
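
Example

A short sketch follows; it reuses the assumed classifier and data from the loss-sensitivity example above, and the FGSM parameters shown are illustrative.

    # Hedged sketch: empirical robustness under FGSM, reusing the assumed
    # `classifier` and `x` from the loss-sensitivity sketch above.
    from art.metrics import empirical_robustness

    er = empirical_robustness(
        classifier,
        x,
        attack_name="fgsm",
        attack_params={"eps": 0.2, "eps_step": 0.01},  # illustrative values
    )
    print(er)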

CLEVER

art.metrics.clever_u(classifier: CLASSIFIER_CLASS_LOSS_GRADIENTS_TYPE, x: numpy.ndarray, nb_batches: int, batch_size: int, radius: float, norm: int, c_init: float = 1.0, pool_factor: int = 10, verbose: bool = True) → float

Compute CLEVER score for an untargeted attack.

Return type

float

Parameters
  • classifier – A trained model.

  • x (ndarray) – One input sample.

  • nb_batches (int) – Number of repetitions of the estimate.

  • batch_size (int) – Number of random examples to sample per batch.

  • radius (float) – Radius of the maximum perturbation.

  • norm (int) – Current support: 1, 2, np.inf.

  • c_init (float) – Initialization of Weibull distribution.

  • pool_factor (int) – The factor to create a pool of random samples with size pool_factor x batch_size.

  • verbose (bool) – Show progress bars.

Returns

CLEVER score.
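
Example

The sketch below computes an untargeted CLEVER score for a single sample; the classifier and sample are assumed to come from the loss-sensitivity example above, and all parameter values are illustrative.

    # Hedged sketch: untargeted CLEVER score for one sample (assumed setup,
    # reusing `classifier` and `x` from the loss-sensitivity sketch).
    from art.metrics import clever_u

    score = clever_u(
        classifier,
        x[0],            # a single input sample
        nb_batches=10,   # repetitions of the estimate
        batch_size=5,    # random examples sampled per batch
        radius=0.3,      # maximum perturbation radius
        norm=2,
        c_init=1.0,
        pool_factor=10,
        verbose=False,
    )
    print(score)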

art.metrics.clever_t(classifier: CLASSIFIER_CLASS_LOSS_GRADIENTS_TYPE, x: numpy.ndarray, target_class: int, nb_batches: int, batch_size: int, radius: float, norm: int, c_init: float = 1.0, pool_factor: int = 10) → float

Compute CLEVER score for a targeted attack.

Return type

float

Parameters
  • classifier – A trained model.

  • x (ndarray) – One input sample.

  • target_class (int) – Targeted class.

  • nb_batches (int) – Number of repetitions of the estimate.

  • batch_size (int) – Number of random examples to sample per batch.

  • radius (float) – Radius of the maximum perturbation.

  • norm (int) – Current support: 1, 2, np.inf.

  • c_init (float) – Initialization of Weibull distribution.

  • pool_factor (int) – The factor to create a pool of random samples with size pool_factor x batch_size.

Returns

CLEVER score.
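
Example

A targeted variant of the previous sketch; the target class is hypothetical and should differ from the class the classifier predicts for the sample.

    # Hedged sketch: targeted CLEVER score (same assumed setup as the
    # untargeted example above).
    from art.metrics import clever_t

    targeted_score = clever_t(
        classifier,
        x[0],
        target_class=3,  # hypothetical target class, not the predicted class
        nb_batches=10,
        batch_size=5,
        radius=0.3,
        norm=2,
    )
    print(targeted_score)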

Wasserstein Distance

art.metrics.wasserstein_distance(u_values: numpy.ndarray, v_values: numpy.ndarray, u_weights: Optional[numpy.ndarray] = None, v_weights: Optional[numpy.ndarray] = None) → numpy.ndarray

Compute the first Wasserstein distance between two 1D distributions.

Return type

ndarray

Parameters
  • u_values (ndarray) – Values of first distribution with shape (nb_samples, feature_dim_1, …, feature_dim_n).

  • v_values (ndarray) – Values of second distribution with shape (nb_samples, feature_dim_1, …, feature_dim_n).

  • u_weights – Weight for each value. If None, equal weights will be used.

  • v_weights – Weight for each value. If None, equal weights will be used.

Returns

The Wasserstein distance between the two distributions.
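
Example

A minimal sketch with synthetic arrays; the shapes and values are illustrative only.

    # Hedged sketch: Wasserstein distance between two synthetic sample sets.
    import numpy as np
    from art.metrics import wasserstein_distance

    u = np.random.rand(100, 28, 28)  # (nb_samples, feature_dim_1, feature_dim_2)
    v = np.random.rand(100, 28, 28)

    wd = wasserstein_distance(u, v)  # equal weights when u_weights/v_weights are None
    print(wd)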

Pointwise Differential Training Privacy

art.metrics.PDTP(target_estimator: Classifier, extra_estimator: Classifier, x: numpy.ndarray, y: numpy.ndarray, indexes: Optional[numpy.ndarray] = None, num_iter: Optional[int] = 10) → numpy.ndarray

Compute the pointwise differential training privacy metric for the given classifier and training set.

Return type

ndarray

Parameters
  • target_estimator – The classifier to be analyzed.

  • extra_estimator – Another classifier of the same type as the target classifier, but not yet fit.

  • x (ndarray) – The training data of the classifier.

  • y (ndarray) – Target values (class labels) of x, one-hot-encoded of shape (nb_samples, nb_classes) or indices of shape (nb_samples,).

  • indexes – The subset of indexes of x on which to compute the PDTP metric. If not supplied, PDTP will be computed for all samples in x.

  • num_iter – The number of iterations of PDTP computation to run for each sample. If not supplied, defaults to 10. The result is the average across iterations.

Returns

An array containing the average PDTP value for each sample in the training set. The higher the value, the higher the privacy leakage for that sample.
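
Example

The sketch below wraps a scikit-learn decision tree with ART's ScikitlearnDecisionTreeClassifier; the wrapper choice, the toy data, and the parameter values are assumptions for illustration.

    # Hedged sketch: PDTP for a decision-tree classifier (assumed setup).
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier
    from art.estimators.classification.scikitlearn import ScikitlearnDecisionTreeClassifier
    from art.metrics import PDTP

    # Toy training data: 200 samples, 4 features, 2 classes.
    x_train = np.random.rand(200, 4).astype(np.float32)
    y_train = np.eye(2)[np.random.randint(0, 2, size=200)]  # one-hot labels

    target = ScikitlearnDecisionTreeClassifier(DecisionTreeClassifier())
    target.fit(x_train, y_train)

    # A second, not-yet-fitted classifier of the same type.
    extra = ScikitlearnDecisionTreeClassifier(DecisionTreeClassifier())

    leakage = PDTP(target, extra, x_train, y_train, indexes=np.arange(10), num_iter=5)
    print(leakage)  # one value per requested sample; higher means more leakage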