art.attacks.inference.model_inversion

Module providing model inversion attacks.

Model Inversion MIFace

class art.attacks.inference.model_inversion.MIFace(classifier: CLASSIFIER_CLASS_LOSS_GRADIENTS_TYPE, max_iter: int = 10000, window_length: int = 100, threshold: float = 0.99, learning_rate: float = 0.1, batch_size: int = 1)

Implementation of the MIFace algorithm from Fredrikson et al. (2015). Although that paper demonstrates the attack specifically against face recognition models, it applies more broadly to classifiers with continuous features that expose class gradients.

__init__(classifier: CLASSIFIER_CLASS_LOSS_GRADIENTS_TYPE, max_iter: int = 10000, window_length: int = 100, threshold: float = 0.99, learning_rate: float = 0.1, batch_size: int = 1)

Create an MIFace attack instance.

Parameters
  • classifier – Target classifier.

  • max_iter (int) – Maximum number of gradient descent iterations for the model inversion.

  • window_length (int) – Length of window for checking whether descent should be aborted.

  • threshold (float) – Threshold for descent stopping criterion.

  • learning_rate (float) – Learning rate of the gradient descent.

  • batch_size (int) – Size of internal batches.

infer(x: Optional[numpy.ndarray], y: Optional[numpy.ndarray] = None, **kwargs) → numpy.ndarray

Reconstruct training samples via gradient-based model inversion of the target classifier.

Parameters
  • x – An array with the initial input to the victim classifier. If None, the initial input is initialized as a zero array.

  • y – Target values (class labels), either one-hot-encoded of shape (nb_samples, nb_classes) or as indices of shape (nb_samples,).

Returns

The inferred training samples.

Return type

ndarray
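The procedure behind `infer` can be pictured as gradient ascent on the target class's confidence, starting from a zero array and stopping once the confidence crosses `threshold` or stops improving over a trailing window. Below is a minimal, self-contained sketch of that loop against a toy linear-softmax model; the names `confidence_and_grad` and `miface_sketch` are illustrative stand-ins (in ART the gradients would come from the classifier's `class_gradient` method), not part of the library API.

```python
import numpy as np

# Toy linear-softmax "classifier": logits = x @ W.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))  # 4 input features, 3 classes


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def confidence_and_grad(x, target):
    """Return p(target | x) and its gradient w.r.t. x.

    Hypothetical stand-in for the class-gradient call a real
    classifier wrapper would provide.
    """
    p = softmax(x @ W)
    # d p_t / d x = p_t * (W[:, t] - W @ p) for a linear-softmax model
    grad = p[target] * (W[:, target] - W @ p)
    return p[target], grad


def miface_sketch(target, max_iter=10000, window_length=100,
                  threshold=0.99, learning_rate=0.1):
    x = np.zeros(4)  # x=None in the API means a zero initial input
    history = []
    for _ in range(max_iter):
        conf, grad = confidence_and_grad(x, target)
        history.append(conf)
        if conf >= threshold:
            break  # stopping criterion: confidence threshold reached
        # abort if the best confidence in the trailing window
        # no longer improves on what came before it
        if (len(history) > window_length
                and max(history[-window_length:]) <= max(history[:-window_length])):
            break
        x = x + learning_rate * grad  # ascend the class confidence
    return x


x_inv = miface_sketch(target=1)
```

The hyperparameters map directly onto the constructor arguments: `max_iter` bounds the descent, `window_length` and `threshold` control the two stopping conditions, and `learning_rate` is the step size.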