art.attacks.extraction
Module providing extraction attacks under a common interface.
Copycat CNN
- class art.attacks.extraction.CopycatCNN(classifier: CLASSIFIER_TYPE, batch_size_fit: int = 1, batch_size_query: int = 1, nb_epochs: int = 10, nb_stolen: int = 1, use_probability: bool = False)
Implementation of the Copycat CNN attack from Rodrigues Correia-Silva et al. (2018).
Paper link: https://arxiv.org/abs/1806.05476
- __init__(classifier: CLASSIFIER_TYPE, batch_size_fit: int = 1, batch_size_query: int = 1, nb_epochs: int = 10, nb_stolen: int = 1, use_probability: bool = False) → None
Create a Copycat CNN attack instance.
- Parameters:
classifier – A victim classifier.
batch_size_fit (int) – Size of batches for fitting the thieved classifier.
batch_size_query (int) – Size of batches for querying the victim classifier.
nb_epochs (int) – Number of epochs to use for training.
nb_stolen (int) – Number of queries submitted to the victim classifier to steal it.
- extract(x: ndarray, y: Optional[ndarray] = None, **kwargs) → CLASSIFIER_TYPE
Extract a thieved classifier.
- Parameters:
x (ndarray) – An array with the source input to the victim classifier.
y – Target values (class labels) one-hot-encoded of shape (nb_samples, nb_classes) or indices of shape (nb_samples,). Not used in this attack.
thieved_classifier (Classifier) – The classifier to be trained as the stolen copy, currently always trained on one-hot labels.
- Returns:
The stolen classifier.
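A minimal usage sketch, not taken from the ART documentation: it assumes two small untrained PyTorchClassifier wrappers standing in for a trained victim and a fresh thieved model, and random data standing in for the unlabeled query set; all sizes and hyperparameters are placeholders.

    import numpy as np
    import torch
    from torch import nn

    from art.attacks.extraction import CopycatCNN
    from art.estimators.classification import PyTorchClassifier

    def make_classifier() -> PyTorchClassifier:
        # Tiny dense net standing in for a real CNN (illustrative only).
        model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
        return PyTorchClassifier(
            model=model,
            loss=nn.CrossEntropyLoss(),
            optimizer=torch.optim.Adam(model.parameters(), lr=1e-3),
            input_shape=(1, 28, 28),
            nb_classes=10,
        )

    victim = make_classifier()   # stands in for a classifier trained beforehand
    thieved = make_classifier()  # fresh model that will be trained as the copy

    # Unlabeled queries; the paper uses natural images from outside the
    # victim's problem domain, random noise is only a stand-in here.
    x_query = np.random.rand(100, 1, 28, 28).astype(np.float32)

    attack = CopycatCNN(
        classifier=victim,
        batch_size_fit=16,
        batch_size_query=16,
        nb_epochs=2,
        nb_stolen=100,  # number of queries sent to the victim
    )
    stolen = attack.extract(x_query, thieved_classifier=thieved)

Note that extract() requires the thieved_classifier keyword argument; the victim is only queried, never fitted.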
Functionally Equivalent Extraction
- class art.attacks.extraction.FunctionallyEquivalentExtraction(classifier: CLASSIFIER_TYPE, num_neurons: Optional[int] = None)
This class implements the Functionally Equivalent Extraction attack for neural networks with two dense layers, ReLU activation at the first layer, and logits output after the second layer.
Paper link: https://arxiv.org/abs/1909.01838
- __init__(classifier: CLASSIFIER_TYPE, num_neurons: Optional[int] = None) → None
Create a FunctionallyEquivalentExtraction instance.
- Parameters:
classifier – A trained ART classifier.
num_neurons – The number of neurons in the first dense layer.
- extract(x: ndarray, y: Optional[ndarray] = None, delta_0: float = 0.05, fraction_true: float = 0.3, rel_diff_slope: float = 1e-05, rel_diff_value: float = 1e-06, delta_init_value: float = 0.1, delta_value_max: int = 50, d2_min: float = 0.0004, d_step: float = 0.01, delta_sign: float = 0.02, unit_vector_scale: int = 10000, ftol: float = 1e-08, **kwargs) → BlackBoxClassifier
Extract the targeted model.
- Return type:
BlackBoxClassifier
- Parameters:
x (ndarray) – Samples of input data of shape (num_samples, num_features).
y – Correct labels or target labels for x, depending on whether the attack is targeted or not. This parameter is only used by some of the attacks.
delta_0 (float) – Initial step size of binary search.
fraction_true (float) – Fraction of output predictions that have to fulfill the criteria for a critical point.
rel_diff_slope (float) – Relative slope difference at critical points.
rel_diff_value (float) – Relative value difference at critical points.
delta_init_value (float) – Initial delta of the weight value search.
delta_value_max (int) – Maximum delta of the weight value search.
d2_min (float) – Minimum acceptable value of the sum of absolute second derivatives.
d_step (float) – Step size of the delta increase.
delta_sign (float) – Delta of the weight sign search.
unit_vector_scale (int) – Multiplicative scale of the unit vector e_j.
ftol (float) – Tolerance for termination by the change of the cost function.
- Returns:
ART BlackBoxClassifier of the extracted model.
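A sketch of the call pattern, under assumed setup: the victim must already have the two-dense-layer architecture this attack supports; below, an untrained PyTorchClassifier with 16 hidden ReLU units and logits output stands in, probed with random data. On a real target the extraction issues a very large number of queries, and on a randomly initialized network the critical-point search may be slow or fail, so treat this purely as an API illustration.

    import numpy as np
    from torch import nn

    from art.attacks.extraction import FunctionallyEquivalentExtraction
    from art.estimators.classification import PyTorchClassifier

    # Exactly the supported architecture: dense + ReLU, then dense logits.
    model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 4))
    victim = PyTorchClassifier(
        model=model,
        loss=nn.CrossEntropyLoss(),
        input_shape=(10,),
        nb_classes=4,
    )

    # Probe inputs of shape (num_samples, num_features).
    x = np.random.rand(100, 10).astype(np.float32)

    attack = FunctionallyEquivalentExtraction(classifier=victim, num_neurons=16)
    stolen = attack.extract(x)      # an ART BlackBoxClassifier
    print(stolen.predict(x[:3]))    # query the extracted model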
Knockoff Nets
- class art.attacks.extraction.KnockoffNets(classifier: CLASSIFIER_TYPE, batch_size_fit: int = 1, batch_size_query: int = 1, nb_epochs: int = 10, nb_stolen: int = 1, sampling_strategy: str = 'random', reward: str = 'all', verbose: bool = True, use_probability: bool = False)
Implementation of the Knockoff Nets attack from Orekondy et al. (2018).
Paper link: https://arxiv.org/abs/1812.02766
- __init__(classifier: CLASSIFIER_TYPE, batch_size_fit: int = 1, batch_size_query: int = 1, nb_epochs: int = 10, nb_stolen: int = 1, sampling_strategy: str = 'random', reward: str = 'all', verbose: bool = True, use_probability: bool = False) → None
Create a KnockoffNets attack instance. Note that both the victim classifier and the thieved classifier are assumed to produce logit outputs.
- Parameters:
classifier – A victim classifier.
batch_size_fit (int) – Size of batches for fitting the thieved classifier.
batch_size_query (int) – Size of batches for querying the victim classifier.
nb_epochs (int) – Number of epochs to use for training.
nb_stolen (int) – Number of queries submitted to the victim classifier to steal it.
sampling_strategy (str) – Sampling strategy, either 'random' or 'adaptive'.
reward (str) – Reward type, one of 'cert', 'div', 'loss', 'all'.
verbose (bool) – Show progress bars.
- extract(x: ndarray, y: Optional[ndarray] = None, **kwargs) → CLASSIFIER_TYPE
Extract a thieved classifier.
- Parameters:
x (ndarray) – An array with the source input to the victim classifier.
y – Target values (class labels) one-hot-encoded of shape (nb_samples, nb_classes) or indices of shape (nb_samples,).
thieved_classifier – The classifier to be trained as the stolen copy.
- Returns:
The stolen classifier.
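A sketch mirroring the CopycatCNN example above; victim, thieved, and x_query are assumed to be set up the same hypothetical way, and only the attack object changes:

    from art.attacks.extraction import KnockoffNets

    # `victim`, `thieved`, and `x_query` as in the CopycatCNN sketch above.
    attack = KnockoffNets(
        classifier=victim,
        batch_size_fit=16,
        batch_size_query=16,
        nb_epochs=2,
        nb_stolen=100,
        sampling_strategy="random",  # 'adaptive' additionally requires labels y
        reward="all",                # reward type, used by the adaptive strategy
    )
    stolen = attack.extract(x_query, thieved_classifier=thieved)

With sampling_strategy='adaptive', labels y must also be passed to extract().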