art.defences.detector.evasion
¶
Module implementing detector-based defences against evasion attacks.
Binary Input Detector¶

class
art.defences.detector.evasion.
BinaryInputDetector
(detector: ClassifierNeuralNetwork)¶ Binary detector of adversarial samples coming from evasion attacks. The detector uses an architecture provided by the user and trains it on data labeled as clean (label 0) or adversarial (label 1).
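The labeling convention above (0 for clean, 1 for adversarial) can be illustrated with a minimal sketch of how a training set for the detector might be assembled. The helper build_detector_training_set is hypothetical and not part of the library. Plain Python lists stand in for the arrays that would normally be used, and whether the wrapped detector expects integer or one-hot labels depends on the model supplied by the user.

```python
from typing import List, Sequence, Tuple


def build_detector_training_set(
    clean: Sequence[Sequence[float]],
    adversarial: Sequence[Sequence[float]],
) -> Tuple[List[Sequence[float]], List[int]]:
    """Hypothetical helper: stack both sample sets and attach binary labels."""
    x = list(clean) + list(adversarial)
    # Label 0 marks clean samples, label 1 marks adversarial samples.
    y = [0] * len(clean) + [1] * len(adversarial)
    return x, y


clean_samples = [[0.1, 0.2], [0.3, 0.4]]
adversarial_samples = [[0.9, 0.8]]
x_train, y_train = build_detector_training_set(clean_samples, adversarial_samples)
```

The resulting x_train and y_train are the kind of inputs that fit would receive as x and y.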

property
channel_index
¶  Returns
Index of the axis containing the color channels in the samples x.

property
channels_first
¶  Returns
Boolean indicating whether the color channels come first (True) or last (False) among the axes of a sample x.

class_gradient
(x: numpy.ndarray, label: Optional[Union[int, List[int]]] = None, **kwargs) → numpy.ndarray¶ Compute per-class derivatives w.r.t. x.
 Return type
ndarray
 Parameters
x (np.ndarray or pandas.DataFrame) – Samples.
label – Index of a specific per-class derivative. If an integer is provided, the gradient of that class output is computed for all samples. If multiple values are provided, the first dimension should match the batch size of x, and each value will be used as the target for its corresponding sample in x. If None, then gradients for all classes will be computed for each sample.
 Returns
Gradients of input features w.r.t. each class in the form (batch_size, nb_classes, input_shape) when computing for all classes, otherwise shape becomes (batch_size, 1, input_shape) when label parameter is specified.
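The shape rule described above can be restated as a short sketch. The function class_gradient_shape is a hypothetical illustration that only computes the expected output shape from the documented rule. It does not call the library or compute any gradient.

```python
from typing import List, Optional, Tuple, Union


def class_gradient_shape(
    batch_size: int,
    nb_classes: int,
    input_shape: Tuple[int, ...],
    label: Optional[Union[int, List[int]]] = None,
) -> Tuple[int, ...]:
    """Hypothetical helper: expected shape of the class_gradient result."""
    if label is None:
        # Gradients for every class are returned for each sample.
        return (batch_size, nb_classes) + tuple(input_shape)
    if not isinstance(label, int) and len(label) != batch_size:
        # A list of labels must provide exactly one target per sample.
        raise ValueError("label must have one entry per sample in x")
    # A single class (shared or per sample) yields a class axis of size 1.
    return (batch_size, 1) + tuple(input_shape)


all_classes = class_gradient_shape(4, 2, (28, 28, 1))
one_class = class_gradient_shape(4, 2, (28, 28, 1), label=1)
per_sample = class_gradient_shape(4, 2, (28, 28, 1), label=[0, 1, 1, 0])
```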

property
clip_values
¶ Return the clip values of the input samples.
 Returns
Clip values (min, max).

fit
(*args, **kwargs)¶ Fit the detector using clean and adversarial samples.
 Parameters
x – Training set to fit the detector.
y – Labels for the training set.
batch_size – Size of batches.
nb_epochs – Number of epochs to use for training.
kwargs – Other parameters.

fit_generator
(generator: DataGenerator, nb_epochs: int = 20, **kwargs) → None¶ Fit the classifier using a data generator that yields batches as specified. This function is not supported for this detector.
 Raises
NotImplementedError – This method is not supported for detectors.

get_activations
(*args, **kwargs)¶ Return the output of the specified layer for input x. layer is specified by layer index (between 0 and nb_layers - 1) or by name. The number of layers can be determined by counting the results returned by calling layer_names. This function is not supported for this detector.
 Raises
NotImplementedError – This method is not supported for detectors.

property
input_shape
¶ Return the shape of one input sample.
 Returns
Shape of one input sample.

property
learning_phase
¶ The learning phase set by the user. Possible values are True for training or False for prediction and None if it has not been set by the library. In the latter case, the library does not do any explicit learning phase manipulation and the current value of the backend framework is used. If a value has been set by the user for this property, it will impact all following computations for model fitting, prediction and gradients.
 Returns
Learning phase.
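The three-valued semantics of the learning phase can be summarised in a small sketch. The function effective_learning_phase and its backend_phase argument are hypothetical names used only to illustrate the rule stated above: an explicit True or False takes effect, while None leaves the backend's current value in use.

```python
from typing import Optional


def effective_learning_phase(user_phase: Optional[bool], backend_phase: bool) -> bool:
    """Hypothetical helper: which phase applies to subsequent computations."""
    if user_phase is None:
        # Nothing was set explicitly, so the backend's current value is used.
        return backend_phase
    # An explicitly set value applies to fitting, prediction and gradients.
    return user_phase


unset_uses_backend = effective_learning_phase(None, backend_phase=True)
explicit_prediction = effective_learning_phase(False, backend_phase=True)
```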

loss
(x: numpy.ndarray, y: numpy.ndarray, **kwargs) → numpy.ndarray¶ Compute the loss of the neural network for samples x.
 Parameters
x (ndarray) – Samples of shape (nb_samples, nb_features) or (nb_samples, nb_pixels_1, nb_pixels_2, nb_channels) or (nb_samples, nb_channels, nb_pixels_1, nb_pixels_2).
y (ndarray) – Target values (class labels), one-hot encoded with shape (nb_samples, nb_classes), or class indices with shape (nb_samples,).
 Returns
Loss values.
 Return type
Format as expected by the model
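The two accepted label layouts for y differ only in encoding. As a minimal sketch, the hypothetical helper to_one_hot below converts class indices of shape (nb_samples,) into the one-hot layout of shape (nb_samples, nb_classes). Plain lists are used in place of arrays to keep the example self-contained.

```python
from typing import List, Sequence


def to_one_hot(indices: Sequence[int], nb_classes: int) -> List[List[float]]:
    """Hypothetical helper: one-hot encode a sequence of class indices."""
    encoded = []
    for index in indices:
        if not 0 <= index < nb_classes:
            raise ValueError(f"class index {index} is outside [0, {nb_classes})")
        row = [0.0] * nb_classes
        row[index] = 1.0  # Mark the position of the true class.
        encoded.append(row)
    return encoded


# For a binary detector: 0 is clean, 1 is adversarial.
labels = to_one_hot([0, 1, 1], nb_classes=2)
```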

loss_gradient
(x: numpy.ndarray, y: numpy.ndarray, **kwargs) → numpy.ndarray¶ Compute the gradient of the loss function w.r.t. x.
 Parameters
x (Format as expected by the model) – Samples.
y (Format as expected by the model) – Target values.
 Returns
Loss gradients w.r.t. x in the same format as x.
 Return type
Format as expected by the model

property
nb_classes
¶ Return the number of output classes.
 Returns
Number of classes in the data.

predict
(*args, **kwargs)¶ Perform detection of adversarial data and return prediction as tuple.
 Parameters
x – Data sample on which to perform detection.
batch_size – Size of batches.
 Returns
Per-sample prediction whether data is adversarial or not, where 0 means non-adversarial. Return variable has the same batch_size (first dimension) as x.
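One way to act on the per-sample predictions is to separate the indices of flagged samples from the rest. The sketch below assumes, and this is an assumption rather than something stated in this reference, that the detector returns two scores per sample, with column 0 for clean and column 1 for adversarial. The helper split_by_detection is hypothetical.

```python
from typing import List, Sequence, Tuple


def split_by_detection(scores: Sequence[Sequence[float]]) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: split sample indices into (clean, flagged)."""
    clean: List[int] = []
    flagged: List[int] = []
    for i, row in enumerate(scores):
        # Assumed layout: the index of the highest score is the predicted label.
        predicted = max(range(len(row)), key=lambda k: row[k])
        if predicted == 0:
            clean.append(i)  # 0 means non-adversarial.
        else:
            flagged.append(i)
    return clean, flagged


clean_idx, flagged_idx = split_by_detection([[2.0, -1.0], [0.1, 0.9], [0.7, 0.3]])
```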

set_learning_phase
(train: bool) → None¶ Set the learning phase for the backend framework.
 Parameters
train (bool) – True if the learning phase is training, otherwise False.

Binary Activation Detector¶

class
art.defences.detector.evasion.
BinaryActivationDetector
(classifier: ClassifierNeuralNetwork, detector: ClassifierNeuralNetwork, layer: Union[int, str])¶ Binary detector of adversarial samples coming from evasion attacks. The detector uses an architecture provided by the user and is trained on the values of the activations of a classifier at a given layer.
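The data flow implied by this description can be sketched as follows: samples are first passed through the classifier to obtain activations at the chosen layer, and the detector is then fitted and queried on those activations. All classes below (ActivationDetectorSketch, ToyClassifier, ToyDetector) are hypothetical stand-ins written for illustration. They are not the library's implementation and only mimic the method names documented on this page.

```python
class ToyClassifier:
    """Stand-in classifier: its 'activation' is the sum of the input features."""

    def get_activations(self, x, layer):
        return [[sum(sample)] for sample in x]


class ToyDetector:
    """Stand-in detector: learns a threshold between clean and adversarial."""

    def __init__(self):
        self.threshold = None

    def fit(self, x, y, **kwargs):
        clean = [row[0] for row, label in zip(x, y) if label == 0]
        adversarial = [row[0] for row, label in zip(x, y) if label == 1]
        self.threshold = (max(clean) + min(adversarial)) / 2

    def predict(self, x, **kwargs):
        return [1 if row[0] > self.threshold else 0 for row in x]


class ActivationDetectorSketch:
    """Illustrative composition: the detector sees activations, not raw inputs."""

    def __init__(self, classifier, detector, layer):
        self.classifier = classifier
        self.detector = detector
        self.layer = layer

    def fit(self, x, y, **kwargs):
        activations = self.classifier.get_activations(x, self.layer)
        self.detector.fit(activations, y, **kwargs)

    def predict(self, x, **kwargs):
        activations = self.classifier.get_activations(x, self.layer)
        return self.detector.predict(activations, **kwargs)


sketch = ActivationDetectorSketch(ToyClassifier(), ToyDetector(), layer=0)
sketch.fit([[0.1, 0.2], [0.2, 0.1], [0.9, 0.8]], [0, 0, 1])
result = sketch.predict([[0.1, 0.1], [1.0, 0.9]])
```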

property
channel_index
¶  Returns
Index of the axis containing the color channels in the samples x.

property
channels_first
¶  Returns
Boolean indicating whether the color channels come first (True) or last (False) among the axes of a sample x.

class_gradient
(x: numpy.ndarray, label: Optional[Union[int, List[int]]] = None, **kwargs) → numpy.ndarray¶ Compute per-class derivatives w.r.t. x.
 Return type
ndarray
 Parameters
x (np.ndarray or pandas.DataFrame) – Samples.
label – Index of a specific per-class derivative. If an integer is provided, the gradient of that class output is computed for all samples. If multiple values are provided, the first dimension should match the batch size of x, and each value will be used as the target for its corresponding sample in x. If None, then gradients for all classes will be computed for each sample.
 Returns
Gradients of input features w.r.t. each class in the form (batch_size, nb_classes, input_shape) when computing for all classes, otherwise shape becomes (batch_size, 1, input_shape) when label parameter is specified.

property
clip_values
¶ Return the clip values of the input samples.
 Returns
Clip values (min, max).

fit
(*args, **kwargs)¶ Fit the detector using training data.
 Parameters
x – Training set to fit the detector.
y – Labels for the training set.
batch_size – Size of batches.
nb_epochs – Number of epochs to use for training.
kwargs – Other parameters.

fit_generator
(generator: DataGenerator, nb_epochs: int = 20, **kwargs) → None¶ Fit the classifier using a data generator that yields batches as specified. This function is not supported for this detector.
 Raises
NotImplementedError – This method is not supported for detectors.

get_activations
(*args, **kwargs)¶ Return the output of the specified layer for input x. layer is specified by layer index (between 0 and nb_layers - 1) or by name. The number of layers can be determined by counting the results returned by calling layer_names. This function is not supported for this detector.
 Raises
NotImplementedError – This method is not supported for detectors.

property
input_shape
¶ Return the shape of one input sample.
 Returns
Shape of one input sample.

property
layer_names
¶ Return the names of the hidden layers in the model, if applicable.
 Returns
The names of the hidden layers in the model, input and output layers are ignored.
Warning
layer_names tries to infer the internal structure of the model. This feature comes with no guarantees on the correctness of the result. The intended order of the layers tries to match their order in the model, but this is not guaranteed either.
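Because a layer may be given either as an index or as a name, a caller may want to normalise it against the list returned by layer_names. The helper resolve_layer below is a hypothetical sketch of such a check and is not part of the library. Given the warning above, the result is only as reliable as the inferred list of names.

```python
from typing import Sequence, Union


def resolve_layer(layer: Union[int, str], layer_names: Sequence[str]) -> str:
    """Hypothetical helper: map a layer index or name to a layer name."""
    if isinstance(layer, int):
        # Valid indices run from 0 to nb_layers - 1.
        if not 0 <= layer < len(layer_names):
            raise ValueError(f"layer index {layer} is outside [0, {len(layer_names) - 1}]")
        return layer_names[layer]
    if layer not in layer_names:
        raise ValueError(f"layer name {layer!r} was not found")
    return layer


names = ["conv1", "dense1", "dense2"]
by_index = resolve_layer(1, names)
by_name = resolve_layer("dense2", names)
```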

property
learning_phase
¶ The learning phase set by the user. Possible values are True for training or False for prediction and None if it has not been set by the library. In the latter case, the library does not do any explicit learning phase manipulation and the current value of the backend framework is used. If a value has been set by the user for this property, it will impact all following computations for model fitting, prediction and gradients.
 Returns
Learning phase.

loss
(x: numpy.ndarray, y: numpy.ndarray, **kwargs) → numpy.ndarray¶ Compute the loss of the neural network for samples x.
 Parameters
x (ndarray) – Samples of shape (nb_samples, nb_features) or (nb_samples, nb_pixels_1, nb_pixels_2, nb_channels) or (nb_samples, nb_channels, nb_pixels_1, nb_pixels_2).
y (ndarray) – Target values (class labels), one-hot encoded with shape (nb_samples, nb_classes), or class indices with shape (nb_samples,).
 Returns
Loss values.
 Return type
Format as expected by the model

loss_gradient
(x: numpy.ndarray, y: numpy.ndarray, **kwargs) → numpy.ndarray¶ Compute the gradient of the loss function w.r.t. x.
 Parameters
x (Format as expected by the model) – Samples.
y (Format as expected by the model) – Target values.
 Returns
Loss gradients w.r.t. x in the same format as x.
 Return type
Format as expected by the model

property
nb_classes
¶ Return the number of output classes.
 Returns
Number of classes in the data.

predict
(*args, **kwargs)¶ Perform detection of adversarial data and return prediction as tuple.
 Parameters
x – Data sample on which to perform detection.
batch_size – Size of batches.
 Returns
Per-sample prediction whether data is adversarial or not, where 0 means non-adversarial. Return variable has the same batch_size (first dimension) as x.

set_learning_phase
(train: bool) → None¶ Set the learning phase for the backend framework.
 Parameters
train (bool) – True if the learning phase is training, otherwise False.

property