`art.defences.detector.evasion`

Module implementing detector-based defences against evasion attacks.

Base Class

class art.defences.detector.evasion.EvasionDetector

Abstract base class for all evasion detectors.

abstract detect(x: ndarray, batch_size: int = 128, **kwargs) → Tuple[dict, ndarray]

Perform detection of adversarial data and return prediction as tuple.

Parameters:

x (ndarray) – Data sample on which to perform detection.
batch_size (int) – Size of batches.
kwargs – Defence-specific parameters used by child classes.

Returns:

(report, is_adversarial): report is a dictionary containing information specific to the detection defence; is_adversarial is a boolean array of per-sample predictions indicating whether each sample is adversarial.
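To make the return contract concrete, the following is a minimal sketch of a detector that follows the same `detect` signature. `NormThresholdDetector` and its `threshold` argument are hypothetical and not part of ART; the class deliberately does not inherit from `EvasionDetector` so that the snippet runs with NumPy alone.

```python
from typing import Tuple

import numpy as np


class NormThresholdDetector:
    """Hypothetical stand-in that mirrors the `detect` contract (not an ART class)."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def detect(self, x: np.ndarray, batch_size: int = 128, **kwargs) -> Tuple[dict, np.ndarray]:
        scores = []
        # Process the input in batches of at most `batch_size` samples.
        for start in range(0, x.shape[0], batch_size):
            batch = x[start:start + batch_size]
            # Toy score: L2 norm of each flattened sample.
            scores.append(np.linalg.norm(batch.reshape(batch.shape[0], -1), axis=1))
        all_scores = np.concatenate(scores)
        report = {"scores": all_scores}  # defence-specific information
        is_adversarial = all_scores > self.threshold  # one boolean per sample
        return report, is_adversarial


x = np.array([[0.1, 0.2], [3.0, 4.0], [0.0, 0.5]])
report, is_adversarial = NormThresholdDetector(threshold=1.0).detect(x, batch_size=2)
x_kept = x[~is_adversarial]  # keep only the samples that were not flagged
```

Because `is_adversarial` has one entry per sample, it can be used directly as a boolean mask on `x`, as in the last line.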

abstract fit(x: ndarray, y: ndarray, batch_size: int = 128, nb_epochs: int = 20, **kwargs) → None

Fit the detection classifier if necessary.

Parameters:

x (ndarray) – Training set to fit the detector.
y (ndarray) – Labels for the training set.
batch_size (int) – Size of batches.
nb_epochs (int) – Number of epochs to use for training.
kwargs – Other parameters.

get_params() → Dict[str, Any]

Return the dictionary of parameters used to run the defence.

Returns: Dictionary of parameters of the method.

set_params(**kwargs) → None

Take in a dictionary of parameters and apply defence-specific checks before saving them as attributes.

Parameters: kwargs – A dictionary of defence-specific parameters.
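The check-then-save behaviour described for `set_params` can be sketched as follows. This is an illustration of the pattern only: `ToyDetector`, its `threshold` parameter, the `defence_params` list and the `_check_params` helper are assumptions made for this example and should not be read as ART's implementation.

```python
from typing import Any, Dict


class ToyDetector:
    """Hypothetical class illustrating the get_params / set_params pattern."""

    # Attribute names treated as defence parameters (an assumption of this sketch).
    defence_params = ["threshold"]

    def __init__(self, threshold: float = 0.5) -> None:
        self._check_params({"threshold": threshold})
        self.threshold = threshold

    @staticmethod
    def _check_params(params: Dict[str, Any]) -> None:
        # Defence-specific check, run before any value is stored.
        if not 0.0 <= params["threshold"] <= 1.0:
            raise ValueError("threshold must lie in [0, 1].")

    def get_params(self) -> Dict[str, Any]:
        return {name: getattr(self, name) for name in self.defence_params}

    def set_params(self, **kwargs: Any) -> None:
        candidate = {**self.get_params(), **kwargs}
        unknown = set(candidate) - set(self.defence_params)
        if unknown:
            raise ValueError(f"Unknown parameters: {sorted(unknown)}")
        self._check_params(candidate)  # validate first ...
        for name, value in candidate.items():
            setattr(self, name, value)  # ... then save as attributes


detector = ToyDetector()
detector.set_params(threshold=0.9)
params = detector.get_params()
```

Validating the merged dictionary before calling `setattr` means a rejected update leaves the object's existing parameters untouched.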

Binary Input Detector

class art.defences.detector.evasion.BinaryInputDetector(detector: CLASSIFIER_NEURALNETWORK_TYPE)

Binary detector of adversarial samples coming from evasion attacks. The detector uses an architecture provided by the user and trains it on data labeled as clean (label 0) or adversarial (label 1).

detect(x: ndarray, batch_size: int = 128, **kwargs) → Tuple[dict, ndarray]

Perform detection of adversarial data and return prediction as tuple.

Parameters:

x (ndarray) – Data sample on which to perform detection.
batch_size (int) – Size of batches.

Returns:

(report, is_adversarial): report is a dictionary containing the detector model's output predictions; is_adversarial is a boolean array of per-sample predictions indicating whether each sample is adversarial, with the same first dimension (number of samples) as x.

fit(x: ndarray, y: ndarray, batch_size: int = 128, nb_epochs: int = 20, **kwargs) → None

Fit the detector using clean and adversarial samples.

Parameters:

x (ndarray) – Training set to fit the detector.
y (ndarray) – Labels for the training set.
batch_size (int) – Size of batches.
nb_epochs (int) – Number of epochs to use for training.
kwargs – Other parameters.
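As an illustration of the training data this detector expects, the sketch below assembles a combined set of clean and adversarial samples labeled 0 and 1. The arrays are random placeholders (no real attack is run), and the one-hot variant at the end is an assumption about what a two-class detector model might require rather than something stated here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder data: 4 clean samples and 4 perturbed copies standing in for adversarial ones.
x_clean = rng.random((4, 2, 2))
x_adv = np.clip(x_clean + 0.1 * np.sign(rng.standard_normal(x_clean.shape)), 0.0, 1.0)

# Stack both sets and label them: 0 = clean, 1 = adversarial.
x_detector = np.concatenate([x_clean, x_adv], axis=0)
y_detector = np.concatenate([np.zeros(len(x_clean), dtype=int), np.ones(len(x_adv), dtype=int)])

# Shuffle samples and labels with the same permutation so they stay aligned.
order = rng.permutation(len(x_detector))
x_detector, y_detector = x_detector[order], y_detector[order]

# One-hot variant, in case the wrapped detector model has two output classes (assumption).
y_detector_one_hot = np.eye(2)[y_detector]
```

Arrays built this way would then be passed as `x` and `y` to `fit`.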

Binary Activation Detector

class art.defences.detector.evasion.BinaryActivationDetector(classifier: CLASSIFIER_NEURALNETWORK_TYPE, detector: CLASSIFIER_NEURALNETWORK_TYPE, layer: int | str)

Binary detector of adversarial samples coming from evasion attacks. The detector uses an architecture provided by the user and is trained on the values of the activations of a classifier at a given layer.

detect(x: ndarray, batch_size: int = 128, **kwargs) → Tuple[dict, ndarray]

Perform detection of adversarial data and return prediction as tuple.

Parameters:

x (ndarray) – Data sample on which to perform detection.
batch_size (int) – Size of batches.

Returns:

(report, is_adversarial): report is a dictionary containing the detector model's output predictions; is_adversarial is a boolean array of per-sample predictions indicating whether each sample is adversarial, with the same first dimension (number of samples) as x.

fit(x: ndarray, y: ndarray, batch_size: int = 128, nb_epochs: int = 20, **kwargs) → None

Fit the detector using training data.

Parameters:

x (ndarray) – Training set to fit the detector.
y (ndarray) – Labels for the training set.
batch_size (int) – Size of batches.
nb_epochs (int) – Number of epochs to use for training.
kwargs – Other parameters.
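The idea of training on a classifier's activations rather than on raw inputs can be illustrated with a toy NumPy model. Everything in this sketch is hypothetical: the random weights, the single hidden layer, and the `hidden_activations` helper stand in for a real classifier and its chosen `layer`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical classifier layer: 4 input features -> 3 hidden units.
w_hidden = rng.standard_normal((4, 3))


def hidden_activations(x: np.ndarray) -> np.ndarray:
    """Return the ReLU activations of the hidden layer (the monitored layer)."""
    return np.maximum(x @ w_hidden, 0.0)


x = rng.random((5, 4))  # 5 input samples
features = hidden_activations(x)  # shape (5, 3): one activation vector per sample
```

In this picture, the detector model sees `features` instead of `x`, so its input shape has to match the output shape of the chosen layer.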

Subset Scanning Detector

class art.defences.detector.evasion.SubsetScanningDetector(classifier: CLASSIFIER_NEURALNETWORK_TYPE, bgd_data: ndarray, layer: int | str, scoring_function: Literal['BerkJones', 'HigherCriticism', 'KolmarovSmirnov'] = 'BerkJones', verbose: bool = True)

Fast generalized subset scan based detector by McFowland, E., Speakman, S., and Neill, D. B. (2013).

Paper link: https://www.cs.cmu.edu/~neill/papers/mcfowland13a.pdf
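For orientation on the default `scoring_function`, the sketch below gives the textbook form of the Berk-Jones statistic used in nonparametric subset scanning: the number of p-values in a subset times the KL divergence between the observed fraction of significant p-values and the significance level. The function name `berk_jones` and its arguments are chosen for this example; how ART computes the score internally is not described here, so treat this as an assumption-laden illustration.

```python
import math


def berk_jones(n_alpha: int, n: int, alpha: float) -> float:
    """Berk-Jones score: n * KL(observed fraction || alpha).

    n_alpha: number of p-values in the subset that are significant at level alpha
    n:       total number of p-values in the subset
    alpha:   significance level, strictly between 0 and 1
    """
    observed = n_alpha / n
    if observed <= alpha:
        return 0.0  # no excess of significant p-values, so nothing anomalous
    if observed == 1.0:
        return n * math.log(1.0 / alpha)  # limit of the KL term as observed -> 1
    return n * (
        observed * math.log(observed / alpha)
        + (1.0 - observed) * math.log((1.0 - observed) / (1.0 - alpha))
    )


score = berk_jones(n_alpha=5, n=10, alpha=0.1)
```

The score grows as the share of significant p-values exceeds what would be expected under the null hypothesis, which is what makes a subset stand out as anomalous.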

detect(x: ndarray, batch_size: int = 128, **kwargs) → Tuple[dict, ndarray]

Perform detection of adversarial data and return prediction as tuple.

Parameters:

x (ndarray) – Data sample on which to perform detection.
batch_size (int) – Size of batches.

Returns:

(report, is_adversarial): report is a dictionary containing information specified by the subset scanning method; is_adversarial is a boolean array of per-sample predictions indicating whether each sample is adversarial, with the same first dimension (number of samples) as x.

fit(x: ndarray, y: ndarray, batch_size: int = 128, nb_epochs: int = 20, **kwargs) → None

Fit the detector using training data. Assumes that the classifier is already trained.

Raises: NotImplementedException – This method is not supported for this detector.

scan(clean_x: ndarray, adv_x: ndarray, clean_size: int | None = None, adv_size: int | None = None, run: int = 10) → Tuple[ndarray, ndarray, float]

Returns scores of highest scoring subsets.

Parameters:

clean_x (ndarray) – Data presumably without anomalies.
adv_x (ndarray) – Data presumably with anomalies (adversarial samples).
clean_size –
adv_size –
run (int) –

Returns:

(clean_scores, adv_scores, detection_power).
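This page does not define how `detection_power` is obtained. One common way to summarise how well two score sets separate, assumed here purely for illustration, is the area under the ROC curve: the probability that a randomly chosen adversarial score exceeds a randomly chosen clean score. The helper `auc_from_scores` and the score values below are hypothetical.

```python
import numpy as np


def auc_from_scores(clean_scores: np.ndarray, adv_scores: np.ndarray) -> float:
    """ROC AUC via pairwise comparison; ties count as one half."""
    clean = np.asarray(clean_scores, dtype=float)
    adv = np.asarray(adv_scores, dtype=float)
    # diff[i, j] = adv[i] - clean[j] for every adversarial/clean pair.
    diff = adv[:, None] - clean[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)


clean_scores = np.array([0.1, 0.4, 0.3])
adv_scores = np.array([0.5, 0.35, 0.9])
power = auc_from_scores(clean_scores, adv_scores)
```

A value near 1.0 means adversarial subsets almost always score higher than clean ones, while 0.5 means the scores carry no separating signal.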
