art.defences.detector.evasion

Module implementing detector-based defences against evasion attacks.

Base Class

class art.defences.detector.evasion.EvasionDetector

Abstract base class for all evasion detectors.

abstract detect(x: ndarray, batch_size: int = 128, **kwargs) -> Tuple[dict, ndarray]

Perform detection of adversarial data and return prediction as tuple.

Parameters:
  • x (ndarray) – Data sample on which to perform detection.

  • batch_size (int) – Size of batches.

  • kwargs – Defence-specific parameters used by child classes.

Returns:

(report, is_adversarial): report is a dictionary containing information specific to the detection defence; is_adversarial is a boolean array with one entry per sample, indicating whether that sample is predicted to be adversarial.
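The return contract can be illustrated with a minimal sketch. ToyNormDetector and its threshold parameter are hypothetical and not part of ART; plain Python lists stand in for ndarray so the snippet runs without any dependencies.

```python
from typing import Dict, List, Sequence, Tuple


class ToyNormDetector:
    """Hypothetical stand-in that mirrors the documented detect() contract."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def detect(
        self, x: Sequence[Sequence[float]], batch_size: int = 128, **kwargs
    ) -> Tuple[Dict[str, List[float]], List[bool]]:
        scores: List[float] = []
        # Walk over the input in batches, as a real detector would.
        for start in range(0, len(x), batch_size):
            for sample in x[start:start + batch_size]:
                # Toy score: largest absolute feature value of the sample.
                scores.append(max(abs(v) for v in sample))
        report = {"scores": scores}
        # One boolean per input sample, in input order.
        is_adversarial = [s > self.threshold for s in scores]
        return report, is_adversarial


samples = [[0.1, -0.2], [0.05, 3.0], [-0.4, 0.3]]
report, is_adversarial = ToyNormDetector(threshold=1.0).detect(samples, batch_size=2)
```

The second element always has one entry per input sample, regardless of batch_size.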

abstract fit(x: ndarray, y: ndarray, batch_size: int = 128, nb_epochs: int = 20, **kwargs) -> None

Fit the detection classifier if necessary.

Parameters:
  • x (ndarray) – Training set to fit the detector.

  • y (ndarray) – Labels for the training set.

  • batch_size (int) – Size of batches.

  • nb_epochs (int) – Number of epochs to use for training.

  • kwargs – Other parameters.

get_params() -> Dict[str, Any]

Return a dictionary of the parameters used to run the defence.

Returns:

Dictionary of parameters of the method.

set_params(**kwargs) -> None

Take in a dictionary of parameters and apply defence-specific checks before saving them as attributes.

Parameters:

kwargs – A dictionary of defence-specific parameters.
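A minimal sketch of the check-then-save pattern described for set_params, paired with get_params. The ToyDefence class, its defence_params list, and the specific checks are assumptions made for illustration, not ART's implementation.

```python
from typing import Any, Dict, List


class ToyDefence:
    """Hypothetical class illustrating the get_params / set_params pattern."""

    # Attribute names treated as defence parameters (an assumption of this sketch).
    defence_params: List[str] = ["threshold", "batch_size"]

    def __init__(self, threshold: float = 0.5, batch_size: int = 128) -> None:
        self.threshold = threshold
        self.batch_size = batch_size

    def get_params(self) -> Dict[str, Any]:
        """Return the current defence parameters as a dictionary."""
        return {name: getattr(self, name) for name in self.defence_params}

    def set_params(self, **kwargs: Any) -> None:
        """Check the proposed values, then save them as attributes."""
        unknown = sorted(set(kwargs) - set(self.defence_params))
        if unknown:
            raise ValueError(f"Unknown parameters: {unknown}")
        # Validate the merged parameters first so a rejected update changes nothing.
        merged = {**self.get_params(), **kwargs}
        if not 0.0 <= merged["threshold"] <= 1.0:
            raise ValueError("threshold must lie in [0, 1]")
        if merged["batch_size"] <= 0:
            raise ValueError("batch_size must be positive")
        for name, value in kwargs.items():
            setattr(self, name, value)


defence = ToyDefence()
defence.set_params(threshold=0.9)
params = defence.get_params()

try:
    defence.set_params(batch_size=0)
    rejected = False
except ValueError:
    rejected = True
```

Validating before assignment means an invalid update leaves the object in its previous, valid state.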

Binary Input Detector

class art.defences.detector.evasion.BinaryInputDetector(detector: CLASSIFIER_NEURALNETWORK_TYPE)

Binary detector of adversarial samples coming from evasion attacks. The detector uses an architecture provided by the user and trains it on data labeled as clean (label 0) or adversarial (label 1).

detect(x: ndarray, batch_size: int = 128, **kwargs) -> Tuple[dict, ndarray]

Perform detection of adversarial data and return prediction as tuple.

Parameters:
  • x (ndarray) – Data sample on which to perform detection.

  • batch_size (int) – Size of batches.

Returns:

(report, is_adversarial): report is a dictionary containing the detector model's output predictions; is_adversarial is a boolean array of per-sample predictions, with the same first dimension as x, indicating whether each sample is adversarial.
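One plausible way to relate the two returned values is sketched below. It assumes the detector model emits one score per class for each sample, with column 0 meaning clean and column 1 meaning adversarial; that layout and the helper name predictions_to_flags are assumptions for illustration. Plain lists stand in for ndarray.

```python
from typing import List, Sequence


def predictions_to_flags(predictions: Sequence[Sequence[float]]) -> List[bool]:
    """Flag a sample as adversarial when class 1 has the strictly highest score."""
    flags: List[bool] = []
    for row in predictions:
        # Index of the largest score; ties resolve to the lowest index.
        predicted_class = max(range(len(row)), key=lambda i: row[i])
        flags.append(predicted_class == 1)
    return flags


predictions = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
flags = predictions_to_flags(predictions)
```

A tie between the two classes is treated as clean here, because the first maximum wins.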

fit(x: ndarray, y: ndarray, batch_size: int = 128, nb_epochs: int = 20, **kwargs) -> None

Fit the detector using clean and adversarial samples.

Parameters:
  • x (ndarray) – Training set to fit the detector.

  • y (ndarray) – Labels for the training set.

  • batch_size (int) – Size of batches.

  • nb_epochs (int) – Number of epochs to use for training.

  • kwargs – Other parameters.
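The labelling convention (clean is 0, adversarial is 1) can be sketched as a small helper that assembles x and y before fitting. The function name build_detector_training_set and the seed argument are hypothetical; plain lists stand in for ndarray.

```python
import random
from typing import List, Sequence, Tuple


def build_detector_training_set(
    clean: Sequence[Sequence[float]],
    adversarial: Sequence[Sequence[float]],
    seed: int = 0,
) -> Tuple[List[Sequence[float]], List[int]]:
    """Combine clean (label 0) and adversarial (label 1) samples, then shuffle."""
    x = list(clean) + list(adversarial)
    y = [0] * len(clean) + [1] * len(adversarial)
    # Shuffle samples and labels with the same permutation so pairs stay aligned.
    order = list(range(len(x)))
    random.Random(seed).shuffle(order)
    return [x[i] for i in order], [y[i] for i in order]


clean = [[0.0, 0.1], [0.2, 0.1]]
adversarial = [[0.9, 0.8]]
x_train, y_train = build_detector_training_set(clean, adversarial)
```

The resulting x_train and y_train are what would be passed as x and y to fit.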

Binary Activation Detector

class art.defences.detector.evasion.BinaryActivationDetector(classifier: CLASSIFIER_NEURALNETWORK_TYPE, detector: CLASSIFIER_NEURALNETWORK_TYPE, layer: int | str)

Binary detector of adversarial samples coming from evasion attacks. The detector uses an architecture provided by the user and is trained on the values of the activations of a classifier at a given layer.

detect(x: ndarray, batch_size: int = 128, **kwargs) -> Tuple[dict, ndarray]

Perform detection of adversarial data and return prediction as tuple.

Parameters:
  • x (ndarray) – Data sample on which to perform detection.

  • batch_size (int) – Size of batches.

Returns:

(report, is_adversarial): report is a dictionary containing the detector model's output predictions; is_adversarial is a boolean array of per-sample predictions, with the same first dimension as x, indicating whether each sample is adversarial.

fit(x: ndarray, y: ndarray, batch_size: int = 128, nb_epochs: int = 20, **kwargs) -> None

Fit the detector using training data.

Parameters:
  • x (ndarray) – Training set to fit the detector.

  • y (ndarray) – Labels for the training set.

  • batch_size (int) – Size of batches.

  • nb_epochs (int) – Number of epochs to use for training.

  • kwargs – Other parameters.
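The idea of training on activations rather than raw inputs can be sketched with a toy model. Here a "classifier" is just a list of layer functions, and activations_at is a hypothetical helper; neither reflects how ART extracts activations internally.

```python
from typing import Callable, List, Sequence

Layer = Callable[[List[float]], List[float]]


def activations_at(layers: Sequence[Layer], x: List[float], layer: int) -> List[float]:
    """Run x through layers[0..layer] and return the output of that layer."""
    out = x
    for fn in layers[: layer + 1]:
        out = fn(out)
    return out


def scale(v: List[float]) -> List[float]:
    """Toy linear layer: multiply every feature by 2."""
    return [2.0 * a for a in v]


def relu(v: List[float]) -> List[float]:
    """Toy non-linearity: clamp negative values to zero."""
    return [max(0.0, a) for a in v]


toy_classifier = [scale, relu]  # hypothetical two-layer model
hidden = activations_at(toy_classifier, [1.0, -1.5], layer=0)
final = activations_at(toy_classifier, [1.0, -1.5], layer=1)
```

In this picture the detector would be fitted on vectors such as hidden, paired with the same clean or adversarial labels, instead of on the original inputs.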

Subset Scanning Detector

class art.defences.detector.evasion.SubsetScanningDetector(classifier: CLASSIFIER_NEURALNETWORK_TYPE, bgd_data: ndarray, layer: int | str, scoring_function: Literal['BerkJones', 'HigherCriticism', 'KolmarovSmirnov'] = 'BerkJones', verbose: bool = True)

Fast generalized subset scan based detector by McFowland, E., Speakman, S., and Neill, D. B. (2013).

detect(x: ndarray, batch_size: int = 128, **kwargs) -> Tuple[dict, ndarray]

Perform detection of adversarial data and return prediction as tuple.

Parameters:
  • x (ndarray) – Data sample on which to perform detection.

  • batch_size (int) – Size of batches.

Returns:

(report, is_adversarial): report is a dictionary containing information specified by the subset scanning method; is_adversarial is a boolean array of per-sample predictions, with the same first dimension as x, indicating whether each sample is adversarial.

fit(x: ndarray, y: ndarray, batch_size: int = 128, nb_epochs: int = 20, **kwargs) -> None

This detector does not support fitting: the classifier is assumed to be already trained, and calling this method raises an exception.

Raises:

NotImplementedError – This method is not supported for this detector.

scan(clean_x: ndarray, adv_x: ndarray, clean_size: int | None = None, adv_size: int | None = None, run: int = 10) -> Tuple[ndarray, ndarray, float]

Return the scores of the highest-scoring subsets.

Parameters:
  • clean_x (ndarray) – Data presumably without anomalies.

  • adv_x (ndarray) – Data presumably with anomalies (adversarial samples).

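  • clean_size (int | None) – Optional; defaults to None.

  • adv_size (int | None) – Optional; defaults to None.

  • run (int) – Defaults to 10.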

Returns:

(clean_scores, adv_scores, detection_power).
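To show how the three returned values might relate, the sketch below computes a detection-power figure from two lists of scores. It assumes detection power is read as an area-under-ROC-curve style measure, that is, the probability that a randomly chosen adversarial score exceeds a randomly chosen clean score. That interpretation and the helper name are assumptions, not a statement of ART's implementation.

```python
from typing import Sequence


def detection_power(clean_scores: Sequence[float], adv_scores: Sequence[float]) -> float:
    """AUC via pairwise comparison: P(adv score > clean score), ties count half."""
    wins = 0.0
    for a in adv_scores:
        for c in clean_scores:
            if a > c:
                wins += 1.0
            elif a == c:
                wins += 0.5
    return wins / (len(adv_scores) * len(clean_scores))


clean_scores = [0.1, 0.4, 0.35]
adv_scores = [0.8, 0.3]
power = detection_power(clean_scores, adv_scores)

# Perfectly separated scores give the maximum value of 1.0.
perfect = detection_power([0.1, 0.2], [0.7, 0.9])
```

Under this reading, a value near 0.5 means the scores do not separate clean from adversarial data, and a value near 1.0 means they separate them almost perfectly.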