art.attacks.extraction

Module providing extraction attacks under a common interface.

Copycat CNN

class art.attacks.extraction.CopycatCNN(classifier: CLASSIFIER_TYPE, batch_size_fit: int = 1, batch_size_query: int = 1, nb_epochs: int = 10, nb_stolen: int = 1)

Implementation of the Copycat CNN attack from Rodrigues Correia-Silva et al. (2018).

__init__(classifier: CLASSIFIER_TYPE, batch_size_fit: int = 1, batch_size_query: int = 1, nb_epochs: int = 10, nb_stolen: int = 1) → None

Create a Copycat CNN attack instance.

Parameters
  • classifier – A victim classifier.

  • batch_size_fit (int) – Size of batches for fitting the thieved classifier.

  • batch_size_query (int) – Size of batches for querying the victim classifier.

  • nb_epochs (int) – Number of epochs to use for training.

  • nb_stolen (int) – Number of queries submitted to the victim classifier to steal it.

extract(*args, **kwargs)

Extract a thieved classifier.

Parameters
  • x – An array of source inputs used to query the victim classifier.

  • y – Target values (class labels), either one-hot-encoded of shape (nb_samples, nb_classes) or as indices of shape (nb_samples,). Not used in this attack.

  • thieved_classifier (Classifier) – An untrained classifier that is trained as the stolen copy; currently always trained on one-hot labels.

Returns

The stolen classifier.
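
A minimal usage sketch. The names victim and thief are assumptions: any pre-built ART classifiers (e.g. PyTorchClassifier or KerasClassifier instances) wrapping the victim model and an untrained substitute model; the query pool below is random placeholder data:

    import numpy as np
    from art.attacks.extraction import CopycatCNN

    # Assumptions: `victim` is a trained ART classifier wrapping the model to
    # be stolen; `thief` is an untrained ART classifier of any architecture.
    x_pool = np.random.rand(1000, 28, 28, 1).astype(np.float32)  # illustrative unlabeled pool

    attack = CopycatCNN(
        classifier=victim,
        batch_size_fit=64,
        batch_size_query=64,
        nb_epochs=10,
        nb_stolen=1000,  # number of queries sent to the victim
    )
    # extract() labels samples drawn from x_pool with the victim's predictions
    # and fits the thieved classifier on those predicted labels.
    stolen = attack.extract(x=x_pool, thieved_classifier=thief)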

Functionally Equivalent Extraction

class art.attacks.extraction.FunctionallyEquivalentExtraction(classifier: CLASSIFIER_TYPE, num_neurons: Optional[int] = None)

Implementation of the Functionally Equivalent Extraction attack from Jagielski et al. (2019) for neural networks with two dense layers, ReLU activation after the first layer, and logits output after the second layer.

__init__(classifier: CLASSIFIER_TYPE, num_neurons: Optional[int] = None) → None

Create a FunctionallyEquivalentExtraction instance.

Parameters
  • classifier – A trained ART classifier.

  • num_neurons – The number of neurons in the first dense layer.

extract(*args, **kwargs)

Extract the targeted model.

Parameters
  • x – Samples of input data of shape (num_samples, num_features).

  • y – Correct labels or target labels for x, depending on whether the attack is targeted. This parameter is only used by some of the attacks.

  • delta_0 – Initial step size of binary search.

  • fraction_true – Fraction of output predictions that must fulfill the criteria for a critical point.

  • rel_diff_slope – Relative slope difference at critical points.

  • rel_diff_value – Relative value difference at critical points.

  • delta_init_value – Initial delta of weight value search.

  • delta_value_max – Maximum delta of weight value search.

  • d2_min – Minimum acceptable value of sum of absolute second derivatives.

  • d_step – Step size of delta increase.

  • delta_sign – Delta of weight sign search.

  • unit_vector_scale – Multiplicative scale of the unit vector e_j.

Returns

ART BlackBoxClassifier of the extracted model.
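
A sketch of running the attack. The name victim is an assumption for an ART classifier with the required two-dense-layer architecture; num_neurons and the keyword values shown are illustrative, not tuned:

    import numpy as np
    from art.attacks.extraction import FunctionallyEquivalentExtraction

    # Assumption: `victim` wraps a network with two dense layers, ReLU after
    # the first and logits after the second; its hidden layer has 16 neurons.
    x_probe = np.random.rand(100, 784).astype(np.float32)  # (num_samples, num_features)

    fee = FunctionallyEquivalentExtraction(classifier=victim, num_neurons=16)
    extracted = fee.extract(
        x_probe,
        delta_0=0.05,       # initial binary-search step size (illustrative value)
        fraction_true=0.3,  # fraction of predictions that must meet the critical-point criteria
    )
    # The result is an ART BlackBoxClassifier that mimics the victim.
    predictions = extracted.predict(x_probe)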

Knockoff Nets

class art.attacks.extraction.KnockoffNets(classifier: CLASSIFIER_TYPE, batch_size_fit: int = 1, batch_size_query: int = 1, nb_epochs: int = 10, nb_stolen: int = 1, sampling_strategy: str = 'random', reward: str = 'all')

Implementation of the Knockoff Nets attack from Orekondy et al. (2018).

__init__(classifier: CLASSIFIER_TYPE, batch_size_fit: int = 1, batch_size_query: int = 1, nb_epochs: int = 10, nb_stolen: int = 1, sampling_strategy: str = 'random', reward: str = 'all') → None

Create a KnockoffNets attack instance. Note that both the victim classifier and the thieved classifier are assumed to produce logit outputs.

Parameters
  • classifier – A victim classifier.

  • batch_size_fit (int) – Size of batches for fitting the thieved classifier.

  • batch_size_query (int) – Size of batches for querying the victim classifier.

  • nb_epochs (int) – Number of epochs to use for training.

  • nb_stolen (int) – Number of queries submitted to the victim classifier to steal it.

  • sampling_strategy (str) – Sampling strategy, either 'random' or 'adaptive'.

  • reward (str) – Reward type, one of 'cert', 'div', 'loss', or 'all'.

extract(*args, **kwargs)

Extract a thieved classifier.

Parameters
  • x – An array of source inputs used to query the victim classifier.

  • y – Target values (class labels), either one-hot-encoded of shape (nb_samples, nb_classes) or as indices of shape (nb_samples,). Required when sampling_strategy is 'adaptive'.

  • thieved_classifier – An untrained classifier that is trained as the stolen copy.

Returns

The stolen classifier.
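
A usage sketch. The names victim and thief are assumptions for ART classifiers that output logits; x_pool and y_pool stand in for the attacker's surrogate dataset, and y_pool is supplied here because the adaptive strategy uses labels to steer sampling:

    import numpy as np
    from art.attacks.extraction import KnockoffNets

    # Illustrative surrogate data; shapes depend on the victim's input space.
    x_pool = np.random.rand(1000, 28, 28, 1).astype(np.float32)
    y_pool = np.random.randint(0, 10, size=1000)  # class indices for the pool

    attack = KnockoffNets(
        classifier=victim,
        batch_size_fit=64,
        batch_size_query=64,
        nb_epochs=10,
        nb_stolen=1000,
        sampling_strategy='adaptive',  # or 'random'
        reward='all',                  # one of 'cert', 'div', 'loss', 'all'
    )
    stolen = attack.extract(x=x_pool, y=y_pool, thieved_classifier=thief)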