art.defences.trainer

Module implementing training-based defences against adversarial attacks.

Base Class Trainer

class art.defences.trainer.Trainer(classifier: Classifier, **kwargs)

Abstract base class for training defences.

__init__(classifier: Classifier, **kwargs) → None

Create an adversarial training object.

abstract fit(x: numpy.ndarray, y: numpy.ndarray, **kwargs) → None

Train the model.

Parameters
  • x (ndarray) – Training data.

  • y (ndarray) – Labels for the training data.

  • kwargs – Other parameters.

get_classifier() → Classifier

Return the classifier trained via adversarial training.

Returns

The classifier.
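Concrete defences subclass Trainer, implement fit, and return the wrapped model via get_classifier. A minimal sketch of that contract, with an illustrative stand-in classifier (DummyClassifier and PlainTrainer are hypothetical names, not part of the library):

```python
import numpy as np

class DummyClassifier:
    """Stand-in for an ART Classifier; records what it was fitted on."""
    def __init__(self):
        self.fitted_on = None

    def fit(self, x, y, **kwargs):
        self.fitted_on = (x.shape, y.shape)

class PlainTrainer:
    """Minimal concrete trainer following the Trainer contract:
    fit trains the model, get_classifier returns the trained model."""
    def __init__(self, classifier, **kwargs):
        self._classifier = classifier

    def fit(self, x, y, **kwargs):
        self._classifier.fit(x, y, **kwargs)

    def get_classifier(self):
        return self._classifier

trainer = PlainTrainer(DummyClassifier())
trainer.fit(np.zeros((8, 4)), np.zeros((8, 2)))
```

A real subclass would generate adversarial samples inside fit before delegating to the classifier; this sketch only shows the interface.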

Adversarial Training

class art.defences.trainer.AdversarialTrainer(classifier: Classifier, attacks: Union[EvasionAttack, List[EvasionAttack]], ratio: float = 0.5)

Class performing adversarial training based on a model architecture and one or multiple attack methods.

Incorporates original adversarial training, ensemble adversarial training (https://arxiv.org/abs/1705.07204), training on all adversarial data, and other common setups. If multiple attacks are specified, they are rotated across batches. If an attack targets a model other than the one being trained, the adversarial samples are transferred from that model. The ratio determines how many of the clean samples in each batch are replaced with their adversarial counterparts.

Warning

Both successful and unsuccessful adversarial samples are used for training. In the case of unbounded attacks (e.g., DeepFool), this can result in invalid (very noisy) samples being included.

Please keep in mind the limitations of defences. While adversarial training is widely regarded as a promising, principled approach to making classifiers more robust (see https://arxiv.org/abs/1802.00420), very careful evaluations are required to assess its effectiveness case by case (see https://arxiv.org/abs/1902.06705).

__init__(classifier: Classifier, attacks: Union[EvasionAttack, List[EvasionAttack]], ratio: float = 0.5) → None

Create an AdversarialTrainer instance.

Parameters
  • classifier – Model to train adversarially.

  • attacks – Attacks to use for data augmentation in adversarial training.

  • ratio (float) – The proportion of samples in each batch to be replaced with their adversarial counterparts. Setting this value to 1 trains only on adversarial samples.
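The role of ratio in each batch can be sketched in NumPy (illustrative only; in the real trainer the adversarial counterparts come from the configured attacks, and mix_batch is a hypothetical helper name):

```python
import numpy as np

def mix_batch(x_batch, x_adv_batch, ratio):
    """Replace the first `ratio` fraction of a clean batch with its
    adversarial counterparts, mirroring the per-batch mixing described above."""
    nb_adv = int(np.ceil(ratio * x_batch.shape[0]))
    mixed = x_batch.copy()
    mixed[:nb_adv] = x_adv_batch[:nb_adv]
    return mixed, nb_adv

x = np.zeros((8, 2))       # clean batch (all zeros, for illustration)
x_adv = np.ones((8, 2))    # its adversarial counterparts (all ones)
mixed, nb_adv = mix_batch(x, x_adv, ratio=0.5)
# with ratio=0.5, half of the 8 samples are adversarial
```

With ratio=1 every sample in the batch is replaced, i.e. training happens on adversarial data only.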

fit(x: numpy.ndarray, y: numpy.ndarray, validation_data: Optional[numpy.ndarray] = None, batch_size: int = 128, nb_epochs: int = 20, **kwargs) → None

Train a model adversarially. See class documentation for more information on the exact procedure.

Parameters
  • x (ndarray) – Training set.

  • y (ndarray) – Labels for the training set.

  • validation_data – Validation data, not used.

  • batch_size (int) – Size of batches.

  • nb_epochs (int) – Number of epochs to use for training.

  • kwargs – Dictionary of framework-specific arguments. These will be passed as such to the fit function of the target classifier.

fit_generator(generator: DataGenerator, nb_epochs: int = 20, **kwargs) → None

Train a model adversarially using a data generator. See class documentation for more information on the exact procedure.

Parameters
  • generator – Data generator.

  • nb_epochs (int) – Number of epochs to use for training.

  • kwargs – Dictionary of framework-specific arguments. These will be passed as such to the fit function of the target classifier.

predict(x: numpy.ndarray, **kwargs) → numpy.ndarray

Perform prediction using the adversarially trained classifier.

Return type

ndarray

Parameters
  • x (ndarray) – Test set.

  • kwargs – Other parameters to be passed on to the predict function of the classifier.

Returns

Predictions for test set.

Adversarial Training Madry PGD

class art.defences.trainer.AdversarialTrainerMadryPGD(classifier: ClassifierGradients, nb_epochs: int = 391, batch_size: int = 128, eps: float = 8.0, eps_step: float = 2.0, max_iter: int = 7, num_random_init: int = True)

Class performing adversarial training following Madry’s Protocol.

Please keep in mind the limitations of defences. While adversarial training is widely regarded as a promising, principled approach to making classifiers more robust (see https://arxiv.org/abs/1802.00420), very careful evaluations are required to assess its effectiveness case by case (see https://arxiv.org/abs/1902.06705).

__init__(classifier: ClassifierGradients, nb_epochs: int = 391, batch_size: int = 128, eps: float = 8.0, eps_step: float = 2.0, max_iter: int = 7, num_random_init: int = True) → None

Create an AdversarialTrainerMadryPGD instance.

Default values are for CIFAR-10 in pixel range 0-255.

Parameters
  • classifier – Classifier to train adversarially.

  • nb_epochs (int) – Number of training epochs.

  • batch_size (int) – Size of the batch on which adversarial samples are generated.

  • eps (float) – Maximum perturbation that the attacker can introduce.

  • eps_step (float) – Attack step size (input variation) at each iteration.

  • max_iter (int) – The maximum number of iterations.

  • num_random_init (int) – Number of random initialisations within the epsilon ball. With num_random_init=0, the attack starts at the original input.
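The roles of eps, eps_step, max_iter and num_random_init can be sketched with a generic L-infinity projected-gradient loop (illustrative only; grad_fn is a stand-in for the classifier's loss gradient, and pgd is a hypothetical helper, not the library's attack class):

```python
import numpy as np

def pgd(x, grad_fn, eps=8.0, eps_step=2.0, max_iter=7, num_random_init=1, seed=0):
    """Generic L-infinity PGD: optional random start inside the eps-ball,
    then max_iter signed-gradient steps of size eps_step, each projected
    back into the ball around x."""
    rng = np.random.default_rng(seed)
    x_adv = x.copy()
    if num_random_init > 0:
        x_adv = x + rng.uniform(-eps, eps, size=x.shape)
    for _ in range(max_iter):
        x_adv = x_adv + eps_step * np.sign(grad_fn(x_adv))
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project into the eps-ball
    return x_adv

x = np.zeros(4)
x_adv = pgd(x, grad_fn=lambda z: np.ones_like(z))  # toy constant gradient
# the perturbation never exceeds eps in any coordinate
```

The defaults above mirror the class defaults (eps=8.0, eps_step=2.0, max_iter=7), which are meant for inputs in the 0-255 pixel range.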

fit(x: numpy.ndarray, y: numpy.ndarray, validation_data: Optional[numpy.ndarray] = None, **kwargs) → None

Train a model adversarially. See class documentation for more information on the exact procedure.

Parameters
  • x (ndarray) – Training data.

  • y (ndarray) – Labels for the training data.

  • validation_data – Validation data.

  • kwargs – Dictionary of framework-specific arguments.

get_classifier() → Classifier

Return the classifier trained via adversarial training.

Returns

The classifier.

Base Class Adversarial Training Fast is Better than Free

class art.defences.trainer.AdversarialTrainerFBF(classifier, eps=8, **kwargs)

This is the abstract base class for backend-specific implementations of the Fast is Better than Free protocol for adversarial training.

__init__(classifier, eps=8, **kwargs)

Create an AdversarialTrainerFBF instance.

Parameters
  • classifier (Classifier) – Model to train adversarially.

  • eps (float) – Maximum perturbation that the attacker can introduce.
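The core idea of Fast is Better than Free is to replace the multi-step PGD inner loop with a single FGSM step taken from a random start inside the eps-ball. A sketch of that perturbation step (grad_fn is a stand-in for the loss gradient and fbf_perturb is a hypothetical helper; the reference protocol suggests a step size of roughly 1.25 * eps):

```python
import numpy as np

def fbf_perturb(x, grad_fn, eps=8.0, alpha=10.0, seed=0):
    """One FGSM step from a uniform random start, with the total
    perturbation clipped back into the eps-ball around x."""
    rng = np.random.default_rng(seed)
    delta = rng.uniform(-eps, eps, size=x.shape)     # random start
    delta = delta + alpha * np.sign(grad_fn(x + delta))  # single FGSM step
    delta = np.clip(delta, -eps, eps)                # stay inside the ball
    return x + delta

x = np.zeros(3)
x_adv = fbf_perturb(x, grad_fn=lambda z: -np.ones_like(z))  # toy gradient
```

Because only one gradient computation per batch is needed for the perturbation, this is much cheaper than multi-step PGD training.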

abstract fit(x, y, validation_data=None, batch_size=128, nb_epochs=20, **kwargs)

Train a model adversarially with FBF. See class documentation for more information on the exact procedure.

Parameters
  • x (np.ndarray) – Training set.

  • y (np.ndarray) – Labels for the training set.

  • batch_size (int) – Size of batches.

  • nb_epochs (int) – Number of epochs to use for training.

  • kwargs (dict) – Dictionary of framework-specific arguments. These will be passed as such to the fit function of the target classifier.

Returns

None

abstract fit_generator(generator, nb_epochs=20, **kwargs)

Train a model adversarially using a data generator. See class documentation for more information on the exact procedure.

Parameters
  • generator (DataGenerator) – Data generator.

  • nb_epochs (int) – Number of epochs to use for training.

  • kwargs (dict) – Dictionary of framework-specific arguments. These will be passed as such to the fit function of the target classifier.

Returns

None

predict(x, **kwargs)

Perform prediction using the adversarially trained classifier.

Parameters
  • x (np.ndarray) – Test set.

  • kwargs (dict) – Other parameters to be passed on to the predict function of the classifier.

Returns

Predictions for test set.

Return type

np.ndarray

Adversarial Training Fast is Better than Free - PyTorch

class art.defences.trainer.AdversarialTrainerFBFPyTorch(classifier, eps=8, use_amp=False, **kwargs)

Class performing adversarial training following Fast is Better Than Free protocol.

The effectiveness of this protocol has been found to be sensitive to techniques such as data augmentation, gradient clipping and learning rate schedules. Optionally, mixed-precision arithmetic via the apex library can significantly reduce training time, making this one of the fastest adversarial training protocols.

__init__(classifier, eps=8, use_amp=False, **kwargs)

Create an AdversarialTrainerFBFPyTorch instance.

Parameters
  • classifier (Classifier) – Model to train adversarially.

  • eps (float) – Maximum perturbation that the attacker can introduce.

  • use_amp (bool) – Whether to use the apex library for mixed-precision arithmetic during training.

fit(x, y, validation_data=None, batch_size=128, nb_epochs=20, **kwargs)

Train a model adversarially with FBF protocol. See class documentation for more information on the exact procedure.

Parameters
  • x (np.ndarray) – Training set.

  • y (np.ndarray) – Labels for the training set.

  • validation_data (np.ndarray) – Tuple consisting of validation data.

  • batch_size (int) – Size of batches.

  • nb_epochs (int) – Number of epochs to use for training.

  • kwargs (dict) – Dictionary of framework-specific arguments. These will be passed as such to the fit function of the target classifier.

Returns

None

fit_generator(generator, nb_epochs=20, **kwargs)

Train a model adversarially with FBF protocol using a data generator. See class documentation for more information on the exact procedure.

Parameters
  • generator (DataGenerator) – Data generator.

  • nb_epochs (int) – Number of epochs to use for training.

  • kwargs (dict) – Dictionary of framework-specific arguments. These will be passed as such to the fit function of the target classifier.

Returns

None