art.attacks.inference.membership_inference

Module providing membership inference attacks.

Membership Inference Black-Box

class art.attacks.inference.membership_inference.MembershipInferenceBlackBox(estimator: Union[CLASSIFIER_TYPE, REGRESSOR_TYPE], input_type: str = 'prediction', attack_model_type: str = 'nn', attack_model: Optional[Any] = None)

Implementation of a learned black-box membership inference attack.

The attack model can be trained on the target model's predictions (probabilities or logits) or on its losses, depending on the type of model and the provided configuration.

__init__(estimator: Union[CLASSIFIER_TYPE, REGRESSOR_TYPE], input_type: str = 'prediction', attack_model_type: str = 'nn', attack_model: Optional[Any] = None)

Create a MembershipInferenceBlackBox attack instance.

Parameters
  • estimator – Target estimator.

  • attack_model_type (str) – The type of default attack model to train, optional. Should be one of nn (neural network, default), rf (random forest) or gb (gradient boosting). This option is ignored if attack_model is supplied.

  • input_type (str) – the type of input to train the attack on. Can be one of: ‘prediction’ or ‘loss’. Default is prediction. Predictions can be either probabilities or logits, depending on the return type of the model. If the model is a regressor, only loss can be used.

  • attack_model – The attack model to train, optional. If none is provided, a default model will be created.

fit(x: numpy.ndarray, y: numpy.ndarray, test_x: numpy.ndarray, test_y: numpy.ndarray, **kwargs)

Train the attack model.

Parameters
  • x (ndarray) – Records that were used in training the target estimator.

  • y (ndarray) – True labels for x.

  • test_x (ndarray) – Records that were not used in training the target estimator.

  • test_y (ndarray) – True labels for test_x.
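As a rough illustration of the data that fit works with, the sketch below assembles a binary attack dataset from target-model outputs: records from x are labelled 1 (member) and records from test_x are labelled 0 (non-member). It is plain Python and not ART's internal code. The helper names cross_entropy and build_attack_dataset are hypothetical, and the 'loss' branch assumes a cross-entropy loss over class probabilities.

```python
import math


def cross_entropy(probs, label):
    """Per-sample cross-entropy for one probability vector and an integer label."""
    eps = 1e-12  # guard against log(0)
    return -math.log(max(probs[label], eps))


def build_attack_dataset(member_probs, member_labels,
                         nonmember_probs, nonmember_labels,
                         input_type="prediction"):
    """Return (features, membership) for training a binary attack model.

    Hypothetical helper: features are either the raw prediction vectors
    ('prediction') or a single per-sample loss value ('loss').
    """
    features, membership = [], []
    groups = (
        (member_probs, member_labels, 1),        # seen during training
        (nonmember_probs, nonmember_labels, 0),  # held out
    )
    for probs_set, labels, flag in groups:
        for probs, label in zip(probs_set, labels):
            if input_type == "loss":
                features.append([cross_entropy(probs, label)])
            elif input_type == "prediction":
                features.append(list(probs))
            else:
                raise ValueError("input_type must be 'prediction' or 'loss'")
            membership.append(flag)
    return features, membership
```

A target model often has lower loss on its training records than on unseen ones, and that gap is the signal a learned attack model can pick up.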

infer(x: numpy.ndarray, y: Optional[numpy.ndarray] = None, **kwargs) → numpy.ndarray

Infer membership in the training set of the target estimator.

Return type

ndarray

Parameters
  • x (ndarray) – Input records to attack.

  • y – True labels for x.

  • probabilities – A boolean, passed as a keyword argument, indicating whether to return the predicted probabilities per class or just the predicted class.

Returns

An array holding the inferred membership status (1 indicates a member, 0 a non-member), or class probabilities.

Membership Inference Black-Box Rule-Based

class art.attacks.inference.membership_inference.MembershipInferenceBlackBoxRuleBased(classifier: CLASSIFIER_TYPE)

Implementation of a simple, rule-based black-box membership inference attack.

This implementation uses the simple rule: if the model’s prediction for a sample is correct, then it is a member. Otherwise, it is not a member.
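The rule can be written in a few lines. The sketch below is plain Python operating on integer class labels and is not the library's implementation; infer_rule_based is a hypothetical name.

```python
def infer_rule_based(predicted_labels, true_labels):
    """Return 1 (member) where the prediction matches the true label, else 0."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("predicted_labels and true_labels must have equal length")
    return [1 if pred == true else 0
            for pred, true in zip(predicted_labels, true_labels)]


# The second record is misclassified, so it is treated as a non-member.
membership = infer_rule_based([0, 1, 2], [0, 2, 2])
```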

__init__(classifier: CLASSIFIER_TYPE)

Create a MembershipInferenceBlackBoxRuleBased attack instance.

Parameters

classifier – Target classifier.

infer(x: numpy.ndarray, y: Optional[numpy.ndarray] = None, **kwargs) → numpy.ndarray

Infer membership in the training set of the target estimator.

Return type

ndarray

Parameters
  • x (ndarray) – Input records to attack.

  • y – True labels for x.

  • probabilities – a boolean indicating whether to return the predicted probabilities per class, or just the predicted class.

Returns

An array holding the inferred membership status (1 indicates a member, 0 a non-member), or class probabilities.

Membership Inference Label-Only - Decision Boundary

class art.attacks.inference.membership_inference.LabelOnlyDecisionBoundary(estimator: CLASSIFIER_TYPE, distance_threshold_tau: Optional[float] = None)

Implementation of Label-Only Inference Attack based on Decision Boundary.

Paper links: https://arxiv.org/abs/2007.14321 (Choquette-Choo et al.) and https://arxiv.org/abs/2007.15528 (Li and Zhang)

You only need to call ONE of the calibrate methods, depending on which attack you want to launch.

__init__(estimator: CLASSIFIER_TYPE, distance_threshold_tau: Optional[float] = None)

Create a LabelOnlyDecisionBoundary instance for Label-Only Inference Attack based on Decision Boundary.

Parameters
  • estimator – A trained classification estimator.

  • distance_threshold_tau – Threshold distance for decision boundary. Samples with boundary distances larger than threshold are considered members of the training dataset.

calibrate_distance_threshold(x_train: numpy.ndarray, y_train: numpy.ndarray, x_test: numpy.ndarray, y_test: numpy.ndarray, **kwargs)

Calibrate the distance threshold to maximise membership inference accuracy on x_train and x_test.

Parameters
  • x_train (ndarray) – Training data.

  • y_train (ndarray) – Labels of training data x_train.

  • x_test (ndarray) – Test data.

  • y_test (ndarray) – Labels of test data x_test.

Keyword Arguments for HopSkipJump
  • norm: Order of the norm. Possible values: “inf”, np.inf or 2.

  • max_iter: Maximum number of iterations.

  • max_eval: Maximum number of evaluations for estimating gradient.

  • init_eval: Initial number of evaluations for estimating gradient.

  • init_size: Maximum number of trials for initial generation of adversarial examples.

  • verbose: Show progress bars.
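One way to picture this calibration: given a decision-boundary distance for every record, try candidate thresholds and keep the one with the highest membership accuracy, where a record counts as a member if its distance is larger than the threshold. The sketch below assumes the distances have already been computed and searches only over the observed distance values. Both function names are hypothetical and this is not ART's code.

```python
def infer_by_distance(distances, tau):
    """Label a record as a member (1) if its boundary distance exceeds tau."""
    return [1 if d > tau else 0 for d in distances]


def calibrate_threshold(train_distances, test_distances):
    """Pick the tau that best separates training records from test records.

    Returns (best_tau, best_accuracy). Ties keep the smallest candidate.
    """
    total = len(train_distances) + len(test_distances)
    if total == 0:
        raise ValueError("at least one distance is required")
    best_tau, best_acc = None, -1.0
    for tau in sorted(set(train_distances) | set(test_distances)):
        # Training records should be flagged as members, test records should not.
        correct = sum(infer_by_distance(train_distances, tau))
        correct += len(test_distances) - sum(infer_by_distance(test_distances, tau))
        acc = correct / total
        if acc > best_acc:
            best_tau, best_acc = tau, acc
    return best_tau, best_acc


tau, acc = calibrate_threshold([3.0, 4.0, 5.0], [1.0, 2.0, 2.5])
```

In this toy data the two groups are perfectly separable, so the search lands on the largest test distance. Real distance distributions usually overlap, and the chosen threshold then trades false positives against false negatives.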

calibrate_distance_threshold_unsupervised(top_t: int = 50, num_samples: int = 100, max_queries: int = 1, **kwargs)

Calibrate distance threshold on randomly generated samples, choosing the top-t percentile of the noise needed to change the classifier’s initial prediction. This method requires the model’s clip_values to be set.

Parameters
  • top_t (int) – Top-t percentile.

  • num_samples (int) – Number of random samples to generate.

  • max_queries (int) – Maximum number of queries. The maximum number of HopSkipJump (HSJ) iterations on a single sample is max_queries * max_iter.

Keyword Arguments for HopSkipJump
  • norm: Order of the norm. Possible values: “inf”, np.inf or 2.

  • max_iter: Maximum number of iterations.

  • max_eval: Maximum number of evaluations for estimating gradient.

  • init_eval: Initial number of evaluations for estimating gradient.

  • init_size: Maximum number of trials for initial generation of adversarial examples.

  • verbose: Show progress bars.
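The final step of this calibration, picking a percentile of the collected noise magnitudes, might look like the following. This is a sketch only: it assumes the per-sample distances are already available, it uses linear interpolation between ranks (an assumption not confirmed by this page), and the names percentile and calibrate_unsupervised are hypothetical.

```python
import math


def percentile(values, q):
    """q-th percentile (0 to 100) with linear interpolation between ranks."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= q <= 100:
        raise ValueError("q must be between 0 and 100")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (rank - lower) * (ordered[upper] - ordered[lower])


def calibrate_unsupervised(noise_distances, top_t=50):
    """Use the top_t percentile of the noise distances as the threshold."""
    return percentile(noise_distances, top_t)


threshold = calibrate_unsupervised([4.0, 1.0, 3.0, 2.0], top_t=50)
```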

infer(x: numpy.ndarray, y: Optional[numpy.ndarray] = None, **kwargs) → numpy.ndarray

Infer membership of input x in estimator’s training data.

Return type

ndarray

Parameters
  • x (ndarray) – Input data.

  • y – True labels for x.

  • probabilities – A boolean, passed as a keyword argument, indicating whether to return the predicted probabilities per class or just the predicted class.

Keyword Arguments for HopSkipJump
  • norm: Order of the norm. Possible values: “inf”, np.inf or 2.

  • max_iter: Maximum number of iterations.

  • max_eval: Maximum number of evaluations for estimating gradient.

  • init_eval: Initial number of evaluations for estimating gradient.

  • init_size: Maximum number of trials for initial generation of adversarial examples.

  • verbose: Show progress bars.

Returns

An array holding the inferred membership status (1 indicates a member, 0 a non-member), or class probabilities.

Membership Inference Label-Only - Gap Attack

art.attacks.inference.membership_inference.LabelOnlyGapAttack

alias of art.attacks.inference.membership_inference.black_box_rule_based.MembershipInferenceBlackBoxRuleBased