art.defences.detector.evasion
Module implementing detector-based defences against evasion attacks.
Binary Input Detector

class art.defences.detector.evasion.BinaryInputDetector(detector: ClassifierNeuralNetwork)
Binary detector of adversarial samples coming from evasion attacks. The detector uses an architecture provided by the user and trains it on data labeled as clean (label 0) or adversarial (label 1).
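The labeling convention above (clean = 0, adversarial = 1) can be illustrated with a minimal NumPy sketch of how a training set for such a detector might be assembled. The helper name `build_detector_dataset`, the one-hot encoding, and the shuffling step are assumptions made for illustration; they are not part of the library.

```python
import numpy as np


def build_detector_dataset(x_clean, x_adv, seed=0):
    """Stack clean and adversarial samples and label them 0 and 1 (hypothetical helper)."""
    x = np.concatenate([x_clean, x_adv], axis=0)
    # Integer labels: 0 = clean, 1 = adversarial.
    labels = np.concatenate([
        np.zeros(len(x_clean), dtype=int),
        np.ones(len(x_adv), dtype=int),
    ])
    # One-hot encode to shape (nb_samples, 2).
    y = np.eye(2, dtype=np.float32)[labels]
    # Shuffle so that batches mix both kinds of samples.
    order = np.random.default_rng(seed).permutation(len(x))
    return x[order], y[order]


x_clean = np.zeros((4, 8), dtype=np.float32)
x_adv = np.ones((3, 8), dtype=np.float32)  # stand-in for real adversarial samples
x_train, y_train = build_detector_dataset(x_clean, x_adv)
```

Arrays built this way could then be passed as the x and y arguments of fit, together with batch_size and nb_epochs.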

property channels_first
 Returns
Boolean indicating whether the color channels come first (True) or last (False) in the sample x.

class_gradient(x: numpy.ndarray, label: Optional[Union[int, List[int]]] = None, training_mode: bool = False, **kwargs) → numpy.ndarray
Compute per-class derivatives w.r.t. x.
 Return type
ndarray
 Parameters
x (ndarray) – Sample input with shape as expected by the model.
label – Index of a specific per-class derivative. If an integer is provided, the gradient of that class output is computed for all samples. If multiple values are provided, the first dimension should match the batch size of x, and each value will be used as target for its corresponding sample in x. If None, then gradients for all classes will be computed for each sample.
training_mode (bool) – True for model set to training mode and False for model set to evaluation mode.
 Returns
Array of gradients of input features w.r.t. each class in the form (batch_size, nb_classes, input_shape) when computing for all classes, otherwise shape becomes (batch_size, 1, input_shape) when label parameter is specified.
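The output shapes described above can be checked on a toy linear model, where the derivative of class output c with respect to x is simply column c of the weight matrix. This is a NumPy sketch of the shape contract only; `toy_class_gradient` and `weights` are hypothetical names, and the real method differentiates through the wrapped model.

```python
import numpy as np


def toy_class_gradient(x, weights, label=None):
    """Gradients of the outputs x @ weights w.r.t. x for a toy linear model."""
    batch_size = x.shape[0]
    # For a linear model, d(output_c)/dx is column c of `weights`,
    # independent of x. Shape: (nb_classes, nb_features).
    per_class = weights.T
    if label is None:
        # All classes for every sample: (batch_size, nb_classes, nb_features).
        return np.broadcast_to(per_class, (batch_size,) + per_class.shape).copy()
    if isinstance(label, (int, np.integer)):
        # The same class for every sample.
        label = [int(label)] * batch_size
    label = np.asarray(label)
    if label.shape != (batch_size,):
        raise ValueError("label must be an int or have one entry per sample")
    # One class per sample: (batch_size, 1, nb_features).
    return per_class[label][:, np.newaxis, :]


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))
weights = rng.normal(size=(4, 3))

grads_all = toy_class_gradient(x, weights)                          # label=None
grads_one = toy_class_gradient(x, weights, label=2)                 # single integer
grads_each = toy_class_gradient(x, weights, label=[0, 1, 2, 0, 1])  # one per sample
```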

property clip_values
Return the clip values of the input samples.
 Returns
Clip values (min, max).

compute_loss(x: numpy.ndarray, y: numpy.ndarray, **kwargs) → numpy.ndarray
Compute the loss of the neural network for samples x.
 Parameters
x (ndarray) – Samples of shape (nb_samples, nb_features) or (nb_samples, nb_pixels_1, nb_pixels_2, nb_channels) or (nb_samples, nb_channels, nb_pixels_1, nb_pixels_2).
y (ndarray) – Target values (class labels) one-hot encoded of shape (nb_samples, nb_classes) or indices of shape (nb_samples,).
 Returns
Loss values.
 Return type
Format as expected by the model
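The two accepted label formats for y (one-hot rows or integer indices) describe the same targets. The sketch below shows one way to normalize them before computing a per-sample cross-entropy loss. Both `to_one_hot` and `cross_entropy` are hypothetical helpers written for this illustration, and cross-entropy is an assumed choice of loss; the actual loss depends on the wrapped model.

```python
import numpy as np


def to_one_hot(y, nb_classes):
    """Accept indices of shape (nb_samples,) or one-hot rows of shape (nb_samples, nb_classes)."""
    y = np.asarray(y)
    if y.ndim == 1:
        return np.eye(nb_classes, dtype=np.float32)[y.astype(int)]
    if y.ndim == 2 and y.shape[1] == nb_classes:
        return y.astype(np.float32)
    raise ValueError("y must have shape (nb_samples,) or (nb_samples, nb_classes)")


def cross_entropy(probabilities, y, eps=1e-12):
    """Per-sample cross-entropy; eps guards against log(0)."""
    y_one_hot = to_one_hot(y, probabilities.shape[1])
    return -np.sum(y_one_hot * np.log(probabilities + eps), axis=1)


probs = np.array([[0.9, 0.1], [0.2, 0.8]])
loss_from_indices = cross_entropy(probs, np.array([0, 1]))
loss_from_one_hot = cross_entropy(probs, np.array([[1, 0], [0, 1]]))
```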

fit(*args, **kwargs)
Fit the detector using clean and adversarial samples.
 Parameters
x – Training set to fit the detector.
y – Labels for the training set.
batch_size – Size of batches.
nb_epochs – Number of epochs to use for training.
kwargs – Other parameters.

fit_generator(generator: DataGenerator, nb_epochs: int = 20, **kwargs) → None
Fit the classifier using the given generator, which yields batches as specified. This function is not supported for this detector.
 Raises
NotImplementedException – This method is not supported for detectors.

get_activations(x: numpy.ndarray, layer: Union[int, str], batch_size: int, framework: bool = False) → numpy.ndarray
Return the output of the specified layer for input x. layer is specified by layer index (between 0 and nb_layers - 1) or by name. The number of layers can be determined by counting the results returned by calling layer_names. This function is not supported for this detector.
 Raises
NotImplementedException – This method is not supported for detectors.

property input_shape
Return the shape of one input sample.
 Returns
Shape of one input sample.

loss_gradient(x: numpy.ndarray, y: numpy.ndarray, training_mode: bool = False, **kwargs) → numpy.ndarray
Compute the gradient of the loss function w.r.t. x.
 Return type
ndarray
 Parameters
x (ndarray) – Sample input with shape as expected by the model.
y (ndarray) – Target values (class labels) one-hot encoded of shape (nb_samples, nb_classes) or indices of shape (nb_samples,).
training_mode (bool) – True for model set to training mode and False for model set to evaluation mode.
 Returns
Array of gradients of the same shape as x.
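To make the "same shape as x" contract concrete, here is a NumPy sketch for a toy linear model with softmax cross-entropy, where the gradient has the closed form (p - y) Wᵀ. The model, the loss, and the names `toy_loss` and `toy_loss_gradient` are assumptions for illustration; the library computes this gradient through whatever model is wrapped.

```python
import numpy as np


def softmax(z):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def toy_loss(x, y_one_hot, weights):
    """Summed cross-entropy of softmax(x @ weights) against one-hot labels."""
    p = softmax(x @ weights)
    return -np.sum(y_one_hot * np.log(p))


def toy_loss_gradient(x, y_one_hot, weights):
    """Gradient of toy_loss w.r.t. x; has the same shape as x."""
    p = softmax(x @ weights)
    # For softmax + cross-entropy, d(loss)/d(logits) = p - y.
    return (p - y_one_hot) @ weights.T


rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
weights = rng.normal(size=(4, 2))
y_one_hot = np.eye(2)[[0, 1, 0]]

grads = toy_loss_gradient(x, y_one_hot, weights)
```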

property nb_classes
Return the number of output classes.
 Returns
Number of classes in the data.

predict(*args, **kwargs)
Perform detection of adversarial data and return prediction as tuple.
 Parameters
x – Data sample on which to perform detection.
batch_size – Size of batches.
 Returns
Per-sample prediction of whether the data is adversarial, where 0 means non-adversarial. The returned array has the same first dimension (batch size) as x.
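As a sketch of how per-sample flags might be derived from a two-class detector, the snippet below takes the argmax over a score array. It assumes the underlying detector emits scores of shape (nb_samples, 2) with column 1 meaning adversarial; that layout and the helper `flag_adversarial` are assumptions, not something this page specifies.

```python
import numpy as np


def flag_adversarial(scores):
    """Return 0 (non-adversarial) or 1 (adversarial) per sample from two-column scores."""
    scores = np.asarray(scores)
    if scores.ndim != 2 or scores.shape[1] != 2:
        raise ValueError("expected scores of shape (nb_samples, 2)")
    # argmax picks the higher-scoring class; ties resolve to 0 (non-adversarial).
    return np.argmax(scores, axis=1)


scores = np.array([[0.9, 0.1], [0.3, 0.7], [0.5, 0.5]])
flags = flag_adversarial(scores)
```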

save(filename: str, path: Optional[str] = None) → None
Save the detector model.
 Parameters
filename – The name of the saved file.
path – The path to the location of the saved file.

Binary Activation Detector

class art.defences.detector.evasion.BinaryActivationDetector(classifier: ClassifierNeuralNetwork, detector: ClassifierNeuralNetwork, layer: Union[int, str])
Binary detector of adversarial samples coming from evasion attacks. The detector uses an architecture provided by the user and is trained on the values of the activations of a classifier at a given layer.
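The data flow described above, where the detector is trained on a classifier's activations and not on raw inputs, can be sketched in NumPy with a toy one-hidden-layer classifier. Everything here is hypothetical: the network, the `hidden_activations` helper, the stand-in perturbation, and the 0/1 labels (assumed to follow the clean/adversarial convention). This section does not state whether the library performs the activation step internally.

```python
import numpy as np


def hidden_activations(x, w_hidden, b_hidden):
    """ReLU activations of a toy classifier's single hidden layer."""
    return np.maximum(x @ w_hidden + b_hidden, 0.0)


rng = np.random.default_rng(0)
w_hidden = rng.normal(size=(8, 5))
b_hidden = np.zeros(5)

x_clean = rng.normal(size=(4, 8))
# Stand-in perturbation only; this is not a real evasion attack.
x_adv = x_clean + 0.1 * np.sign(rng.normal(size=(4, 8)))

# The detector's inputs are the classifier's activations at the chosen layer.
act_clean = hidden_activations(x_clean, w_hidden, b_hidden)
act_adv = hidden_activations(x_adv, w_hidden, b_hidden)
x_detector = np.concatenate([act_clean, act_adv], axis=0)
# Assumed labels: 0 = clean, 1 = adversarial.
y_detector = np.concatenate([np.zeros(4, dtype=int), np.ones(4, dtype=int)])
```

One consequence of this design is that the detector's input shape is set by the chosen layer's output (5 features here), not by the shape of the original samples.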

property channels_first
 Returns
Boolean indicating whether the color channels come first (True) or last (False) in the sample x.

class_gradient(x: numpy.ndarray, label: Optional[Union[int, List[int]]] = None, training_mode: bool = False, **kwargs) → numpy.ndarray
Compute per-class derivatives w.r.t. x.
 Return type
ndarray
 Parameters
x (ndarray) – Sample input with shape as expected by the model.
label – Index of a specific per-class derivative. If an integer is provided, the gradient of that class output is computed for all samples. If multiple values are provided, the first dimension should match the batch size of x, and each value will be used as target for its corresponding sample in x. If None, then gradients for all classes will be computed for each sample.
training_mode (bool) – True for model set to training mode and False for model set to evaluation mode.
 Returns
Array of gradients of input features w.r.t. each class in the form (batch_size, nb_classes, input_shape) when computing for all classes, otherwise shape becomes (batch_size, 1, input_shape) when label parameter is specified.

property clip_values
Return the clip values of the input samples.
 Returns
Clip values (min, max).

compute_loss(x: numpy.ndarray, y: numpy.ndarray, **kwargs) → numpy.ndarray
Compute the loss of the neural network for samples x.
 Parameters
x (ndarray) – Samples of shape (nb_samples, nb_features) or (nb_samples, nb_pixels_1, nb_pixels_2, nb_channels) or (nb_samples, nb_channels, nb_pixels_1, nb_pixels_2).
y (ndarray) – Target values (class labels) one-hot encoded of shape (nb_samples, nb_classes) or indices of shape (nb_samples,).
 Returns
Loss values.
 Return type
Format as expected by the model

fit(*args, **kwargs)
Fit the detector using training data.
 Parameters
x – Training set to fit the detector.
y – Labels for the training set.
batch_size – Size of batches.
nb_epochs – Number of epochs to use for training.
kwargs – Other parameters.

fit_generator(generator: DataGenerator, nb_epochs: int = 20, **kwargs) → None
Fit the classifier using the given generator, which yields batches as specified. This function is not supported for this detector.
 Raises
NotImplementedException – This method is not supported for detectors.

get_activations(x: numpy.ndarray, layer: Union[int, str], batch_size: int, framework: bool = False) → numpy.ndarray
Return the output of the specified layer for input x. layer is specified by layer index (between 0 and nb_layers - 1) or by name. The number of layers can be determined by counting the results returned by calling layer_names. This function is not supported for this detector.
 Raises
NotImplementedException – This method is not supported for detectors.

property input_shape
Return the shape of one input sample.
 Returns
Shape of one input sample.

property layer_names
Return the names of the hidden layers in the model, if applicable.
 Returns
The names of the hidden layers in the model; input and output layers are ignored.
Warning
layer_names tries to infer the internal structure of the model. This feature comes with no guarantees on the correctness of the result. The intended order of the layers tries to match their order in the model, but this is not guaranteed either.

loss_gradient(x: numpy.ndarray, y: numpy.ndarray, training_mode: bool = False, **kwargs) → numpy.ndarray
Compute the gradient of the loss function w.r.t. x.
 Return type
ndarray
 Parameters
x (ndarray) – Sample input with shape as expected by the model.
y (ndarray) – Target values (class labels) one-hot encoded of shape (nb_samples, nb_classes) or indices of shape (nb_samples,).
training_mode (bool) – True for model set to training mode and False for model set to evaluation mode.
 Returns
Array of gradients of the same shape as x.

property nb_classes
Return the number of output classes.
 Returns
Number of classes in the data.

predict(*args, **kwargs)
Perform detection of adversarial data and return prediction as tuple.
 Parameters
x – Data sample on which to perform detection.
batch_size – Size of batches.
 Returns
Per-sample prediction of whether the data is adversarial, where 0 means non-adversarial. The returned array has the same first dimension (batch size) as x.

save(filename: str, path: Optional[str] = None) → None
Save the detector model.
 Parameters
filename – The name of the saved file.
path – The path to the location of the saved file.
