art.attacks.inference.membership_inference

Module providing membership inference attacks.

Membership Inference Black-Box

class art.attacks.inference.membership_inference.MembershipInferenceBlackBox(classifier: CLASSIFIER_TYPE, input_type: str = 'prediction', attack_model_type: str = 'nn', attack_model: Optional[Any] = None)

Implementation of a learned black-box membership inference attack.

The attack model can be trained on the target model's predictions (probabilities or logits) or on its losses, depending on the type of model and the provided configuration.

__init__(classifier: CLASSIFIER_TYPE, input_type: str = 'prediction', attack_model_type: str = 'nn', attack_model: Optional[Any] = None)

Create a MembershipInferenceBlackBox attack instance.

Parameters
  • classifier – Target classifier.

  • attack_model_type (str) – The type of default attack model to train, optional. Should be one of nn (neural network, default), rf (random forest) or gb (gradient boosting). Ignored if attack_model is supplied.

  • input_type (str) – The type of input to train the attack on. One of ‘prediction’ or ‘loss’; the default is ‘prediction’. Predictions can be either probabilities or logits, depending on the return type of the model.

  • attack_model – The attack model to train, optional. If none is provided, a default model will be created.

fit(x: numpy.ndarray, y: numpy.ndarray, test_x: numpy.ndarray, test_y: numpy.ndarray, **kwargs)

Train the attack model on records whose membership status is known.

Parameters
  • x (ndarray) – Records that were used in training the target model.

  • y (ndarray) – True labels for x.

  • test_x (ndarray) – Records that were not used in training the target model.

  • test_y (ndarray) – True labels for test_x.

Returns

None. The trained attack model is kept on the attack instance and used by infer.
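The role of the four fit arguments can be illustrated with a minimal sketch of how a training set for an attack model might be assembled: records from x, y are labelled as members (1) and records from test_x, test_y as non-members (0). This is not ART's internal code. The helpers attack_features and build_attack_dataset are hypothetical, plain Python lists stand in for ndarray, and appending the true label as a raw extra feature is an assumption made for brevity.

```python
import math
from typing import List, Sequence, Tuple

Record = Tuple[Sequence[float], int]  # (target-model probabilities, true label)


def attack_features(probs: Sequence[float], label: int, input_type: str = "prediction") -> List[float]:
    """Build one attack-model input row from the target model's output.

    Assumption: 'prediction' uses the probability vector as-is, while
    'loss' uses the cross-entropy loss of the true class. The true label
    is appended as an extra feature in both cases.
    """
    if input_type == "prediction":
        features = list(probs)
    elif input_type == "loss":
        # Clamp to avoid log(0) when the true class has zero probability.
        features = [-math.log(max(probs[label], 1e-12))]
    else:
        raise ValueError("input_type must be 'prediction' or 'loss'")
    return features + [float(label)]


def build_attack_dataset(
    members: Sequence[Record],
    non_members: Sequence[Record],
    input_type: str = "prediction",
) -> Tuple[List[List[float]], List[int]]:
    """Return (rows, targets), where targets are 1 for members and 0 otherwise."""
    rows: List[List[float]] = []
    targets: List[int] = []
    for status, records in ((1, members), (0, non_members)):
        for probs, label in records:
            rows.append(attack_features(probs, label, input_type))
            targets.append(status)
    return rows, targets


# Made-up target-model outputs for two training records and one held-out record.
members = [([0.9, 0.1], 0), ([0.2, 0.8], 1)]
non_members = [([0.6, 0.4], 1)]

rows, targets = build_attack_dataset(members, non_members)
loss_rows, _ = build_attack_dataset(members, non_members, input_type="loss")
print(rows)
print(targets)
```

Any binary classifier could then be trained on rows and targets; the nn, rf and gb options of attack_model_type select which kind of model the library trains by default.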

infer(x: numpy.ndarray, y: Optional[numpy.ndarray] = None, **kwargs) → numpy.ndarray

Infer membership in the training set of the target estimator.

Return type

ndarray

Parameters
  • x (ndarray) – Input records to attack.

  • y – True labels for x.

Returns

An array holding the inferred membership status: 1 indicates a member and 0 indicates a non-member.
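When the true membership of the attacked records is known, the 0/1 array returned by infer can be scored like any binary classification result. The sketch below is one way to do that; membership_metrics is a hypothetical helper, not part of the library, and the sample values are made up for illustration.

```python
from typing import Dict, Sequence


def membership_metrics(inferred: Sequence[int], actual: Sequence[int]) -> Dict[str, float]:
    """Score inferred membership (1 = member, 0 = non-member) against ground truth."""
    if len(inferred) != len(actual) or len(actual) == 0:
        raise ValueError("inferred and actual must be non-empty and of equal length")
    pairs = list(zip(inferred, actual))
    tp = sum(1 for i, a in pairs if i == 1 and a == 1)  # members correctly flagged
    fp = sum(1 for i, a in pairs if i == 1 and a == 0)  # non-members flagged as members
    fn = sum(1 for i, a in pairs if i == 0 and a == 1)  # members that were missed
    correct = sum(1 for i, a in pairs if i == a)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up attack output and ground truth for six records.
inferred = [1, 1, 0, 0, 1, 0]
actual = [1, 0, 0, 1, 1, 0]
metrics = membership_metrics(inferred, actual)
print(metrics)
```

An accuracy near 0.5 on a balanced set of members and non-members suggests the attack does no better than guessing.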

Membership Inference Black-Box Rule-Based

class art.attacks.inference.membership_inference.MembershipInferenceBlackBoxRuleBased(classifier: CLASSIFIER_TYPE)

Implementation of a simple, rule-based black-box membership inference attack.

This implementation uses the simple rule: if the model’s prediction for a sample is correct, then it is a member. Otherwise, it is not a member.
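The rule can be sketched in a few lines. This is an illustration of the rule only, not the library's implementation: rule_based_membership is a hypothetical helper, and it assumes the target model's predictions have already been reduced to class labels.

```python
from typing import List, Sequence


def rule_based_membership(predicted: Sequence[int], true: Sequence[int]) -> List[int]:
    """Return 1 where the predicted label equals the true label, else 0."""
    if len(predicted) != len(true):
        raise ValueError("predicted and true must have the same length")
    return [1 if p == t else 0 for p, t in zip(predicted, true)]


# Class labels predicted by some target model, and the ground truth.
predicted_labels = [2, 0, 1, 1]
true_labels = [2, 1, 1, 0]
membership = rule_based_membership(predicted_labels, true_labels)
print(membership)  # [1, 0, 1, 0]
```

Because the rule needs the true labels to decide whether a prediction is correct, it has no attack model to train and therefore no fit step.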

__init__(classifier: CLASSIFIER_TYPE)

Create a MembershipInferenceBlackBoxRuleBased attack instance.

Parameters
  • classifier – Target classifier.

infer(x: numpy.ndarray, y: Optional[numpy.ndarray] = None, **kwargs) → numpy.ndarray

Infer membership in the training set of the target estimator.

Return type

ndarray

Parameters
  • x (ndarray) – Input records to attack.

  • y – True labels for x.

Returns

An array holding the inferred membership status: 1 indicates a member and 0 indicates a non-member.