art.estimators.certification.randomized_smoothing

Randomized smoothing estimators.

Randomized Smoothing Mixin Base Class

class art.estimators.certification.randomized_smoothing.RandomizedSmoothingMixin(sample_size: int, *args, scale: float = 0.1, alpha: float = 0.001, **kwargs)

Implementation of Randomized Smoothing applied to classifier predictions and gradients, as introduced in Cohen et al. (2019).

certify(x: ndarray, n: int, batch_size: int = 32) Tuple[ndarray, ndarray]

Computes the certifiable radius around input x and returns the certified radius r together with the prediction.

Parameters:
  • x (ndarray) – Sample input with shape as expected by the model.

  • n (int) – Number of samples used to estimate the certifiable radius.

  • batch_size (int) – Batch size.

Returns:

Tuple of length 2 of the selected class and certified radius.
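
For illustration, a minimal sketch of a certification loop, assuming smoothed_classifier is a fitted instance of a concrete subclass (such as PyTorchRandomizedSmoothing below) and x_test is a NumPy array of test inputs; both names are hypothetical placeholders:

    # Hypothetical names: smoothed_classifier is a fitted concrete subclass,
    # x_test is a NumPy array of test inputs.
    predictions, radii = smoothed_classifier.certify(x_test, n=500, batch_size=32)

    for i, (cls, radius) in enumerate(zip(predictions, radii)):
        print(f"sample {i}: predicted class {cls}, certified L2 radius {radius:.4f}")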

fit(x: ndarray, y: ndarray, batch_size: int = 128, nb_epochs: int = 10, **kwargs) None

Fit the classifier on the training set (x, y).

Parameters:
  • x (ndarray) – Training data.

  • y (ndarray) – Target values (class labels) one-hot-encoded of shape (nb_samples, nb_classes) or indices of shape (nb_samples,).

  • batch_size (int) – Batch size.

  • nb_epochs (int) – Number of epochs to use for training.

  • kwargs – Dictionary of framework-specific arguments. This parameter is not currently supported for PyTorch and providing it has no effect.

predict(x: ndarray, batch_size: int = 128, verbose: bool = False, **kwargs) ndarray

Perform prediction of the given classifier for a batch of inputs, taking an expectation over transformations.

Return type:

ndarray

Parameters:
  • x (ndarray) – Input samples.

  • batch_size (int) – Batch size.

  • verbose (bool) – Display training progress bar.

  • is_abstain (bool) – True if the function should abstain from prediction and return zeros for abstained samples. Default: True

Returns:

Array of predictions of shape (nb_inputs, nb_classes).
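
As a brief sketch of handling the abstention flag (same hypothetical names as above): with is_abstain=True, abstained samples are returned as all-zero rows, which can be filtered out before further evaluation:

    predictions = smoothed_classifier.predict(x_test, batch_size=128, is_abstain=True)

    # With is_abstain=True, abstained samples come back as all-zero rows.
    abstained = ~predictions.any(axis=1)
    print(f"abstained on {int(abstained.sum())} of {len(predictions)} samples")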

PyTorch Randomized Smoothing Classifier

class art.estimators.certification.randomized_smoothing.PyTorchRandomizedSmoothing(model: torch.nn.Module, loss: torch.nn.modules.loss._Loss, input_shape: Tuple[int, ...], nb_classes: int, optimizer: torch.optim.Optimizer | None = None, channels_first: bool = True, clip_values: CLIP_VALUES_TYPE | None = None, preprocessing_defences: Preprocessor | List[Preprocessor] | None = None, postprocessing_defences: Postprocessor | List[Postprocessor] | None = None, preprocessing: PREPROCESSING_TYPE = (0.0, 1.0), device_type: str = 'gpu', sample_size: int = 32, scale: float = 0.1, alpha: float = 0.001)

Implementation of Randomized Smoothing applied to classifier predictions and gradients, as introduced in Cohen et al. (2019).

__init__(model: torch.nn.Module, loss: torch.nn.modules.loss._Loss, input_shape: Tuple[int, ...], nb_classes: int, optimizer: torch.optim.Optimizer | None = None, channels_first: bool = True, clip_values: CLIP_VALUES_TYPE | None = None, preprocessing_defences: Preprocessor | List[Preprocessor] | None = None, postprocessing_defences: Postprocessor | List[Postprocessor] | None = None, preprocessing: PREPROCESSING_TYPE = (0.0, 1.0), device_type: str = 'gpu', sample_size: int = 32, scale: float = 0.1, alpha: float = 0.001)

Create a randomized smoothing classifier.

Parameters:
  • model – PyTorch model. The output of the model can be logits, probabilities or anything else. Logits output should be preferred where possible to ensure attack efficiency.

  • loss – The loss function for which to compute gradients for training. The target label must be raw categorical, i.e. not converted to one-hot encoding.

  • input_shape – The shape of one input instance.

  • nb_classes (int) – The number of classes of the model.

  • optimizer – The optimizer used to train the classifier.

  • channels_first (bool) – Set channels first or last.

  • clip_values – Tuple of the form (min, max) of floats or np.ndarray representing the minimum and maximum values allowed for features. If floats are provided, these will be used as the range of all features. If arrays are provided, each value will be considered the bound for a feature, thus the shape of clip values needs to match the total number of features.

  • preprocessing_defences – Preprocessing defence(s) to be applied by the classifier.

  • postprocessing_defences – Postprocessing defence(s) to be applied by the classifier.

  • preprocessing – Tuple of the form (subtrahend, divisor) of floats or np.ndarray of values to be used for data preprocessing. The first value will be subtracted from the input. The input will then be divided by the second one.

  • device_type (str) – Type of device on which the classifier is run, either gpu or cpu.

  • sample_size (int) – Number of samples for smoothing.

  • scale (float) – Standard deviation of Gaussian noise added.

  • alpha (float) – The failure probability of smoothing.
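
A minimal end-to-end construction and training sketch; the network architecture, the MNIST-like shapes, and the x_train/y_train arrays are illustrative assumptions, not part of the API:

    import torch

    from art.estimators.certification.randomized_smoothing import PyTorchRandomizedSmoothing

    # Illustrative model; any torch.nn.Module producing logits will do.
    model = torch.nn.Sequential(
        torch.nn.Flatten(),
        torch.nn.Linear(28 * 28, 128),
        torch.nn.ReLU(),
        torch.nn.Linear(128, 10),
    )

    classifier = PyTorchRandomizedSmoothing(
        model=model,
        loss=torch.nn.CrossEntropyLoss(),
        optimizer=torch.optim.Adam(model.parameters(), lr=1e-3),
        input_shape=(1, 28, 28),
        nb_classes=10,
        clip_values=(0.0, 1.0),
        sample_size=100,  # noise samples drawn per input
        scale=0.25,       # standard deviation of the Gaussian noise
        alpha=0.001,      # failure probability of the certificate
    )

    # x_train: hypothetical array of shape (nb_samples, 1, 28, 28),
    # y_train: hypothetical index labels of shape (nb_samples,).
    classifier.fit(x_train, y_train, batch_size=128, nb_epochs=10)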

certify(x: ndarray, n: int, batch_size: int = 32) Tuple[ndarray, ndarray]

Computes the certifiable radius around input x and returns the certified radius r together with the prediction.

Parameters:
  • x (ndarray) – Sample input with shape as expected by the model.

  • n (int) – Number of samples used to estimate the certifiable radius.

  • batch_size (int) – Batch size.

Returns:

Tuple of length 2 of the selected class and certified radius.

property channels_first: bool
Returns:

Boolean to indicate index of the color channels in the sample x.

class_gradient(x: ndarray, label: int | List[int] | ndarray | None = None, training_mode: bool = False, **kwargs) ndarray

Compute per-class derivatives of the original classifier w.r.t. x.

Return type:

ndarray

Parameters:
  • x (ndarray) – Sample input with shape as expected by the model.

  • label – Index of a specific per-class derivative. If an integer is provided, the gradient of that class output is computed for all samples. If multiple values are provided, the first dimension should match the batch size of x, and each value will be used as target for its corresponding sample in x. If None, then gradients for all classes will be computed for each sample.

  • training_mode (bool) – True for model set to training mode and False for model set to evaluation mode.

Returns:

Array of gradients of input features w.r.t. each class in the form (batch_size, nb_classes, input_shape) when computing for all classes, otherwise shape becomes (batch_size, 1, input_shape) when label parameter is specified.
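
For example, using the hypothetical classifier and a batch x_batch from the construction sketch above:

    # Gradients of all class outputs w.r.t. the inputs:
    grads_all = classifier.class_gradient(x_batch)           # (batch_size, nb_classes, 1, 28, 28)

    # Gradient of a single class output (here class 3) for every sample:
    grads_one = classifier.class_gradient(x_batch, label=3)  # (batch_size, 1, 1, 28, 28)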

property clip_values: CLIP_VALUES_TYPE | None

Return the clip values of the input samples.

Returns:

Clip values (min, max).

clone_for_refitting() PyTorchClassifier

Create a copy of the classifier that can be refit from scratch. Will inherit the same architecture, the same type of optimizer and the same initialization as the original classifier, but without weights.

Returns:

new estimator

compute_loss(x: ndarray | torch.Tensor, y: ndarray | torch.Tensor, reduction: str = 'none', **kwargs) ndarray | torch.Tensor

Compute the loss.

Parameters:
  • x – Sample input with shape as expected by the model.

  • y – Target values (class labels) one-hot-encoded of shape (nb_samples, nb_classes) or indices of shape (nb_samples,).

  • reduction (str) – Specifies the reduction to apply to the output: ‘none’ | ‘mean’ | ‘sum’. ‘none’: no reduction will be applied ‘mean’: the sum of the output will be divided by the number of elements in the output, ‘sum’: the output will be summed.

Returns:

Array of losses of the same shape as x.
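
As a short illustration of the reduction modes (hypothetical x_batch and y_batch as above):

    # Unreduced, per-element losses:
    per_sample_loss = classifier.compute_loss(x_batch, y_batch, reduction="none")

    # Single scalar, averaged over the batch:
    mean_loss = classifier.compute_loss(x_batch, y_batch, reduction="mean")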

compute_loss_from_predictions(pred: ndarray, y: ndarray, **kwargs) ndarray

Compute the loss of the estimator for predictions pred.

Return type:

ndarray

Parameters:
  • pred (ndarray) – Model predictions.

  • y (ndarray) – Target values.

Returns:

Loss values.

compute_losses(x: ndarray | torch.Tensor, y: ndarray | torch.Tensor, reduction: str = 'none') Dict[str, ndarray | torch.Tensor]

Compute all loss components.

Parameters:
  • x – Sample input with shape as expected by the model.

  • y – Target values (class labels) one-hot-encoded of shape (nb_samples, nb_classes) or indices of shape (nb_samples,).

  • reduction (str) – Specifies the reduction to apply to the output: ‘none’ | ‘mean’ | ‘sum’. ‘none’: no reduction will be applied ‘mean’: the sum of the output will be divided by the number of elements in the output, ‘sum’: the output will be summed.

Returns:

Dictionary of loss components.

custom_loss_gradient(loss_fn, x: ndarray | torch.Tensor, y: ndarray | torch.Tensor, layer_name, training_mode: bool = False) ndarray | torch.Tensor

Compute the gradient of the loss function w.r.t. x.

Parameters:
  • loss_fn – Loss function w.r.t. which the gradient is to be computed.

  • x – Sample input with shape as expected by the model (base image).

  • y – Sample input with shape as expected by the model (target image).

  • layer_name – Name of the layer from which the activations are to be extracted.

  • training_mode (bool) – True for model set to training mode and False for model set to evaluation mode.

Returns:

Array of gradients of the same shape as x.

property device: torch.device

Get current used device.

Returns:

Current used device.

property device_type: str

Return the type of device on which the estimator is run.

Returns:

Type of device on which the estimator is run, either gpu or cpu.

fit(*args, **kwargs)

Fit the classifier on the training set (x, y).

Parameters:
  • x – Training data.

  • y – Target values (class labels) one-hot-encoded of shape (nb_samples, nb_classes) or index labels of shape (nb_samples,).

  • batch_size – Size of batches.

  • nb_epochs – Number of epochs to use for training.

  • training_mode – True for model set to training mode and False for model set to evaluation mode.

  • drop_last – Set to True to drop the last incomplete batch, if the dataset size is not divisible by the batch size. If False and the size of dataset is not divisible by the batch size, then the last batch will be smaller. (default: False)

  • scheduler – Learning rate scheduler to run at the start of every epoch.

  • verbose – Display the training progress bar.

  • kwargs – Dictionary of framework-specific arguments. This parameter is not currently supported for PyTorch and providing it has no effect.

fit_generator(generator: DataGenerator, nb_epochs: int = 20, verbose: bool = False, **kwargs) None

Fit the classifier using the generator that yields batches as specified.

Parameters:
  • generator – Batch generator providing (x, y) for each epoch.

  • nb_epochs (int) – Number of epochs to use for training.

  • verbose (bool) – Display the training progress bar.

  • kwargs – Dictionary of framework-specific arguments. This parameter is not currently supported for PyTorch and providing it has no effect.

get_activations(x: ndarray | torch.Tensor, layer: int | str | None = None, batch_size: int = 128, framework: bool = False) ndarray | torch.Tensor

Return the output of the specified layer for input x. layer is specified by layer index (between 0 and nb_layers - 1) or by name. The number of layers can be determined by counting the results returned by calling layer_names.

Parameters:
  • x – Input for computing the activations.

  • layer – Layer for computing the activations.

  • batch_size (int) – Size of batches.

  • framework (bool) – If true, return the intermediate tensor representation of the activation.

Returns:

The output of layer, where the first dimension is the batch size corresponding to x.
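
A brief sketch of inspecting intermediate activations (hypothetical names as before):

    # Layer indices follow the order reported by layer_names.
    print(classifier.layer_names)

    # Activations of the first reported layer for a batch of inputs:
    activations = classifier.get_activations(x_batch, layer=0, batch_size=128)
    print(activations.shape)  # first dimension equals the batch size of x_batch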

get_params() Dict[str, Any]

Get all parameters and their values of this estimator.

Returns:

A dictionary of string parameter names to their value.

property input_shape: Tuple[int, ...]

Return the shape of one input sample.

Returns:

Shape of one input sample.

property layer_names: List[str] | None

Return the names of the hidden layers in the model, if applicable.

Returns:

The names of the hidden layers in the model, input and output layers are ignored.

Warning

layer_names tries to infer the internal structure of the model. This feature comes with no guarantees on the correctness of the result. The intended order of the layers tries to match their order in the model, but this is not guaranteed either.

property loss: torch.nn.modules.loss._Loss

Return the loss function.

Returns:

The loss function.

loss_gradient(x: ndarray, y: ndarray, training_mode: bool = False, **kwargs) ndarray

Compute the gradient of the loss function w.r.t. x.

Return type:

ndarray

Parameters:
  • x (ndarray) – Sample input with shape as expected by the model.

  • y (ndarray) – Target values (class labels) one-hot-encoded of shape (nb_samples, nb_classes) or indices of shape (nb_samples,).

  • training_mode (bool) – True for model set to training mode and False for model set to evaluation mode.

  • sampling (bool) – True if loss gradients should be determined with Monte Carlo sampling.

Returns:

Array of gradients of the same shape as x.
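
For example, comparing the plain gradient with the Monte Carlo smoothed estimate (hypothetical x_batch and y_batch):

    # Gradient of the loss of the underlying classifier:
    grads = classifier.loss_gradient(x_batch, y_batch)

    # Gradient estimated with Monte Carlo sampling over the Gaussian noise:
    grads_smoothed = classifier.loss_gradient(x_batch, y_batch, sampling=True)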

property loss_scale: float | str

Return the loss scaling value.

Returns:

Loss scaling. Possible values for string: a string representing a number, e.g., “1.0”, or the string “dynamic”.

property model: torch.nn.Module

Return the model.

Returns:

The model.

property nb_classes: int

Return the number of output classes.

Returns:

Number of classes in the data.

property opt_level: str

Return a string specifying a pure or mixed precision optimization level.

Returns:

A string specifying a pure or mixed precision optimization level. Possible values are O0, O1, O2, and O3.

property optimizer: torch.optim.Optimizer

Return the optimizer.

Returns:

The optimizer.

predict(*args, **kwargs)

Perform prediction of the given classifier for a batch of inputs, taking an expectation over transformations.

Parameters:
  • x – Input samples.

  • batch_size – Batch size.

  • verbose – Display training progress bar.

  • is_abstain (bool) – True if the function should abstain from prediction and return zeros for abstained samples. Default: True

Returns:

Array of predictions of shape (nb_inputs, nb_classes).

reduce_labels(y: ndarray | torch.Tensor) ndarray | torch.Tensor

Reduce labels from one-hot encoded to index labels.

reset() None

Resets the weights of the classifier so that it can be refit from scratch.

save(filename: str, path: str | None = None) None

Save a model to file in the format specific to the backend framework.

Parameters:
  • filename (str) – Name of the file where to store the model.

  • path – Path of the folder where to store the model. If no path is specified, the model will be stored in the default data location of the library ART_DATA_PATH.

set_batchnorm(train: bool) None

Set all batch normalization layers into train or eval mode.

Parameters:

train (bool) – False for evaluation mode.

set_dropout(train: bool) None

Set all dropout layers into train or eval mode.

Parameters:

train (bool) – False for evaluation mode.

set_multihead_attention(train: bool) None

Set all multi-head attention layers into train or eval mode.

Parameters:

train (bool) – False for evaluation mode.

set_params(**kwargs) None

Take a dictionary of parameters and apply checks before setting them as attributes.

Parameters:

kwargs – A dictionary of attributes.

property use_amp: bool

Return a boolean indicating whether to use the automatic mixed precision tool.

Returns:

Whether to use the automatic mixed precision tool.

TensorFlow V2 Randomized Smoothing Classifier

class art.estimators.certification.randomized_smoothing.TensorFlowV2RandomizedSmoothing(model, nb_classes: int, input_shape: Tuple[int, ...], loss_object: tf.Tensor | None = None, optimizer: tf.keras.optimizers.Optimizer | None = None, train_step: Callable | None = None, channels_first: bool = False, clip_values: CLIP_VALUES_TYPE | None = None, preprocessing_defences: Preprocessor | List[Preprocessor] | None = None, postprocessing_defences: Postprocessor | List[Postprocessor] | None = None, preprocessing: PREPROCESSING_TYPE = (0.0, 1.0), sample_size: int = 32, scale: float = 0.1, alpha: float = 0.001)

Implementation of Randomized Smoothing applied to classifier predictions and gradients, as introduced in Cohen et al. (2019).

__init__(model, nb_classes: int, input_shape: Tuple[int, ...], loss_object: tf.Tensor | None = None, optimizer: tf.keras.optimizers.Optimizer | None = None, train_step: Callable | None = None, channels_first: bool = False, clip_values: CLIP_VALUES_TYPE | None = None, preprocessing_defences: Preprocessor | List[Preprocessor] | None = None, postprocessing_defences: Postprocessor | List[Postprocessor] | None = None, preprocessing: PREPROCESSING_TYPE = (0.0, 1.0), sample_size: int = 32, scale: float = 0.1, alpha: float = 0.001)

Create a randomized smoothing classifier.

Parameters:
  • model (function or callable class) – a Python function or callable class defining the model and providing its prediction as output.

  • nb_classes (int) – the number of classes in the classification task.

  • input_shape – Shape of one input for the classifier, e.g. for MNIST input_shape=(28, 28, 1).

  • loss_object – The loss function for which to compute gradients. This parameter is applied for training the model and computing gradients of the loss w.r.t. the input.

  • optimizer – The optimizer used to train the classifier.

  • train_step – A function that applies a gradient update to the trainable variables with signature train_step(model, images, labels). This will override the default training loop that uses the provided loss_object and optimizer parameters. It is recommended to use the @tf.function decorator, if possible, for efficient training.

  • channels_first (bool) – Set channels first or last.

  • clip_values – Tuple of the form (min, max) of floats or np.ndarray representing the minimum and maximum values allowed for features. If floats are provided, these will be used as the range of all features. If arrays are provided, each value will be considered the bound for a feature, thus the shape of clip values needs to match the total number of features.

  • preprocessing_defences – Preprocessing defence(s) to be applied by the classifier.

  • postprocessing_defences – Postprocessing defence(s) to be applied by the classifier.

  • preprocessing – Tuple of the form (subtrahend, divisor) of floats or np.ndarray of values to be used for data preprocessing. The first value will be subtracted from the input. The input will then be divided by the second one.

  • sample_size (int) – Number of samples for smoothing.

  • scale (float) – Standard deviation of Gaussian noise added.

  • alpha (float) – The failure probability of smoothing.
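
A minimal construction sketch for the TensorFlow v2 variant; the Keras model and its shapes are illustrative assumptions:

    import tensorflow as tf

    from art.estimators.certification.randomized_smoothing import TensorFlowV2RandomizedSmoothing

    # Illustrative Keras model producing logits.
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10),
    ])

    classifier = TensorFlowV2RandomizedSmoothing(
        model=model,
        nb_classes=10,
        input_shape=(28, 28, 1),
        loss_object=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
        clip_values=(0.0, 1.0),
        sample_size=100,  # noise samples drawn per input
        scale=0.25,       # standard deviation of the Gaussian noise
        alpha=0.001,      # failure probability of the certificate
    )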

certify(x: ndarray, n: int, batch_size: int = 32) Tuple[ndarray, ndarray]

Computes the certifiable radius around input x and returns the certified radius r together with the prediction.

Parameters:
  • x (ndarray) – Sample input with shape as expected by the model.

  • n (int) – Number of samples used to estimate the certifiable radius.

  • batch_size (int) – Batch size.

Returns:

Tuple of length 2 of the selected class and certified radius.

property channels_first: bool
Returns:

Boolean to indicate index of the color channels in the sample x.

class_gradient(x: ndarray, label: int | List[int] | ndarray | None = None, training_mode: bool = False, **kwargs) ndarray

Compute per-class derivatives of the original classifier w.r.t. x.

Return type:

ndarray

Parameters:
  • x (ndarray) – Sample input with shape as expected by the model.

  • label – Index of a specific per-class derivative. If an integer is provided, the gradient of that class output is computed for all samples. If multiple values are provided, the first dimension should match the batch size of x, and each value will be used as target for its corresponding sample in x. If None, then gradients for all classes will be computed for each sample.

  • training_mode (bool) – True for model set to training mode and False for model set to evaluation mode.

Returns:

Array of gradients of input features w.r.t. each class in the form (batch_size, nb_classes, input_shape) when computing for all classes, otherwise shape becomes (batch_size, 1, input_shape) when label parameter is specified.

property clip_values: CLIP_VALUES_TYPE | None

Return the clip values of the input samples.

Returns:

Clip values (min, max).

clone_for_refitting() TensorFlowV2Classifier

Create a copy of the classifier that can be refit from scratch. Will inherit the same architecture, optimizer and initialization as the original classifier, but without weights.

Returns:

new estimator

compute_loss(x: ndarray, y: ndarray, reduction: str = 'none', training_mode: bool = False, **kwargs) ndarray

Compute the loss of the neural network for samples x.

Parameters:
  • x (ndarray) – Samples of shape (nb_samples, nb_features) or (nb_samples, nb_pixels_1, nb_pixels_2, nb_channels) or (nb_samples, nb_channels, nb_pixels_1, nb_pixels_2).

  • y (ndarray) – Target values (class labels) one-hot-encoded of shape (nb_samples, nb_classes) or indices of shape (nb_samples,).

  • reduction (str) – Specifies the reduction to apply to the output: ‘none’ | ‘mean’ | ‘sum’. ‘none’: no reduction will be applied ‘mean’: the sum of the output will be divided by the number of elements in the output, ‘sum’: the output will be summed.

  • training_mode (bool) – True for model set to training mode and False for model set to evaluation mode.

Returns:

Loss values.

Return type:

Format as expected by the model

compute_loss_from_predictions(pred: ndarray, y: ndarray, **kwargs) ndarray

Compute the loss of the estimator for predictions pred.

Return type:

ndarray

Parameters:
  • pred (ndarray) – Model predictions.

  • y (ndarray) – Target values.

Returns:

Loss values.

compute_losses(x: ndarray | tf.Tensor, y: ndarray | tf.Tensor, reduction: str = 'none') Dict[str, ndarray | tf.Tensor]

Compute all loss components.

Parameters:
  • x – Sample input with shape as expected by the model.

  • y – Target values (class labels) one-hot-encoded of shape (nb_samples, nb_classes) or indices of shape (nb_samples,).

  • reduction (str) – Specifies the reduction to apply to the output: ‘none’ | ‘mean’ | ‘sum’. ‘none’: no reduction will be applied ‘mean’: the sum of the output will be divided by the number of elements in the output, ‘sum’: the output will be summed.

Returns:

Dictionary of loss components.

fit(*args, **kwargs)

Fit the classifier on the training set (x, y).

Parameters:
  • x – Training data.

  • y – Labels, one-hot-encoded of shape (nb_samples, nb_classes) or index labels of shape (nb_samples,).

  • batch_size – Size of batches.

  • nb_epochs – Number of epochs to use for training.

  • verbose – Display the training progress bar.

  • kwargs – Dictionary of framework-specific arguments. This parameter currently only supports “scheduler” which is an optional function that will be called at the end of every epoch to adjust the learning rate.

fit_generator(generator: DataGenerator, nb_epochs: int = 20, verbose: bool = False, **kwargs) None

Fit the classifier using the generator that yields batches as specified.

Parameters:
  • generator – Batch generator providing (x, y) for each epoch. If the generator can be used for native training in TensorFlow, it will be.

  • nb_epochs (int) – Number of epochs to use for training.

  • verbose (bool) – Display training progress bar.

  • kwargs – Dictionary of framework-specific arguments. This parameter currently supports “scheduler” which is an optional function that will be called at the end of every epoch to adjust the learning rate.

get_activations(x: ndarray, layer: int | str, batch_size: int = 128, framework: bool = False) ndarray | None

Return the output of the specified layer for input x. layer is specified by layer index (between 0 and nb_layers - 1) or by name. The number of layers can be determined by counting the results returned by calling layer_names.

Parameters:
  • x (ndarray) – Input for computing the activations.

  • layer – Layer for computing the activations.

  • batch_size (int) – Batch size.

  • framework (bool) – Return activation as tensor.

Returns:

The output of layer, where the first dimension is the batch size corresponding to x.

get_params() Dict[str, Any]

Get all parameters and their values of this estimator.

Returns:

A dictionary of string parameter names to their value.

property input_shape: Tuple[int, ...]

Return the shape of one input sample.

Returns:

Shape of one input sample.

property layer_names: List[str] | None

Return the hidden layers in the model, if applicable.

Returns:

The hidden layers in the model, input and output layers excluded.

Warning

layer_names tries to infer the internal structure of the model. This feature comes with no guarantees on the correctness of the result. The intended order of the layers tries to match their order in the model, but this is not guaranteed either.

loss_gradient(x: ndarray, y: ndarray, training_mode: bool = False, **kwargs) ndarray

Compute the gradient of the loss function w.r.t. x.

Return type:

ndarray

Parameters:
  • x (ndarray) – Sample input with shape as expected by the model.

  • y (ndarray) – Correct labels, one-vs-rest encoding.

  • training_mode (bool) – True for model set to training mode and False for model set to evaluation mode.

  • sampling (bool) – True if loss gradients should be determined with Monte Carlo sampling.

Returns:

Array of gradients of the same shape as x.

property loss_object: tf.keras.losses.Loss

Return the loss function.

Returns:

The loss function.

property model

Return the model.

Returns:

The model.

property nb_classes: int

Return the number of output classes.

Returns:

Number of classes in the data.

property optimizer: tf.keras.optimizers.Optimizer

Return the optimizer.

Returns:

The optimizer.

predict(*args, **kwargs)

Perform prediction of the given classifier for a batch of inputs, taking an expectation over transformations.

Parameters:
  • x – Input samples.

  • batch_size – Batch size.

  • verbose – Display training progress bar.

  • is_abstain (bool) – True if the function should abstain from prediction and return zeros for abstained samples. Default: True

Returns:

Array of predictions of shape (nb_inputs, nb_classes).

reset() None

Resets the weights of the classifier so that it can be refit from scratch.

save(filename: str, path: str | None = None) None

Save a model to file in the format specific to the backend framework. For TensorFlow, .ckpt is used.

Parameters:
  • filename (str) – Name of the file where to store the model.

  • path – Path of the folder where to store the model. If no path is specified, the model will be stored in the default data location of the library ART_DATA_PATH.

set_params(**kwargs) None

Take a dictionary of parameters and apply checks before setting them as attributes.

Parameters:

kwargs – A dictionary of attributes.

property train_step: Callable

Return the function that applies a gradient update to the trainable variables.

Returns:

The function that applies a gradient update to the trainable variables.