art.defences.transformer.poisoning

Module implementing transformer-based defences against poisoning attacks.

Neural Cleanse

class art.defences.transformer.poisoning.NeuralCleanse(classifier: CLASSIFIER_TYPE)

Implementation of the methods described in "Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks", Wang et al. (2019).

__call__(transformed_classifier: CLASSIFIER_TYPE, steps: int = 1000, init_cost: float = 0.001, norm: Union[int, float] = 2, learning_rate: float = 0.1, attack_success_threshold: float = 0.99, patience: int = 5, early_stop: bool = True, early_stop_threshold: float = 0.99, early_stop_patience: int = 10, cost_multiplier: float = 1.5, batch_size: int = 32) → art.estimators.certification.neural_cleanse.keras.KerasNeuralCleanse

Returns a new classifier that implements the methods described in "Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks", Wang et al. (2019).

Namely, the returned classifier gains a new method, mitigate(). Using it can also affect the behaviour of the predict() function.

Return type

KerasNeuralCleanse

Parameters
  • transformed_classifier – An ART classifier

  • steps (int) – The maximum number of steps to run the Neural Cleanse optimization

  • init_cost (float) – The initial value for the cost tensor in the Neural Cleanse optimization

  • norm – The norm to use for the Neural Cleanse optimization; can be 1, 2, or np.inf

  • learning_rate (float) – The learning rate for the Neural Cleanse optimization

  • attack_success_threshold (float) – The threshold at which the generated backdoor is successful enough to stop the Neural Cleanse optimization

  • patience (int) – How long to wait before changing the cost multiplier in the Neural Cleanse optimization

  • early_stop (bool) – Whether or not to allow early stopping in the Neural Cleanse optimization

  • early_stop_threshold (float) – How close values must come to the maximum value before steps start counting towards early stopping

  • early_stop_patience (int) – How long to wait to determine early stopping in the Neural Cleanse optimization

  • cost_multiplier (float) – How much to change the cost in the Neural Cleanse optimization

  • batch_size (int) – The batch size to use in the Neural Cleanse optimization
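
The interplay between init_cost, attack_success_threshold, patience and cost_multiplier can be hard to picture from the descriptions alone. The following is a minimal sketch of one plausible schedule, loosely modelled on the dynamic cost adjustment described by Wang et al. (2019); it is not ART's actual code. The function name simulate_cost_schedule, the use of a single multiplier in both directions, and the counter-reset behaviour are all assumptions made for illustration.

```python
from typing import List, Sequence


def simulate_cost_schedule(
    success_rates: Sequence[float],
    init_cost: float = 0.001,
    attack_success_threshold: float = 0.99,
    patience: int = 5,
    cost_multiplier: float = 1.5,
) -> List[float]:
    """Return the cost after each step (illustrative only, not ART's code)."""
    cost = init_cost
    above = 0  # consecutive steps at or above the success threshold
    below = 0  # consecutive steps below the success threshold
    history: List[float] = []
    for rate in success_rates:
        if rate >= attack_success_threshold:
            above += 1
            below = 0
        else:
            below += 1
            above = 0
        if above >= patience:
            # The generated backdoor succeeds reliably: raise the cost
            # (assumed here to penalise the trigger size more strongly).
            cost *= cost_multiplier
            above = 0
        elif below >= patience:
            # The backdoor keeps failing: relax the cost again.
            cost /= cost_multiplier
            below = 0
        history.append(cost)
    return history
```

Under these assumptions, a larger patience makes the cost change less often, while a larger cost_multiplier makes each change more aggressive.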

__init__(classifier: CLASSIFIER_TYPE) → None

Create an instance of the neural cleanse defence.

Parameters

classifier – A trained classifier.
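
The constructor and __call__ documented above are meant to be used together: construct the defence from a trained classifier, then call it to obtain the new classifier. A minimal sketch of that pattern follows. The helper apply_neural_cleanse and its defence_cls parameter are hypothetical and exist only so the pattern can be exercised without ART installed; passing the same classifier to both the constructor and the call is also an assumption.

```python
from typing import Any, Callable, Optional


def apply_neural_cleanse(
    classifier: Any,
    defence_cls: Optional[Callable[[Any], Any]] = None,
    **overrides: Any,
) -> Any:
    """Wrap `classifier` with the Neural Cleanse defence (hypothetical helper)."""
    if defence_cls is None:
        # Imported lazily so this sketch does not require ART at import time.
        from art.defences.transformer.poisoning import NeuralCleanse

        defence_cls = NeuralCleanse

    # __init__(classifier): create the defence from a trained classifier.
    defence = defence_cls(classifier)
    # __call__(transformed_classifier, ...): keyword overrides such as
    # steps=... or batch_size=... replace the documented defaults.
    return defence(classifier, **overrides)
```

With ART and a trained Keras classifier available, a call such as apply_neural_cleanse(keras_classifier, steps=200, batch_size=64) would be expected to return a KerasNeuralCleanse instance exposing mitigate(); the arguments of mitigate() are not documented in this section.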

fit(x: numpy.ndarray, y: Optional[numpy.ndarray] = None, **kwargs) → None

No parameters to learn for this method; do nothing.