art.estimators.object_detection

Module containing estimators for object detection.

Mixin Base Class Object Detector

class art.estimators.object_detection.ObjectDetectorMixin

Mix-in Base class for ART object detectors.

Object Detector PyTorch Faster-RCNN

class art.estimators.object_detection.PyTorchFasterRCNN(**kwargs)

This class implements a model-specific object detector using Faster-RCNN and PyTorch.

__init__(model: Optional[torchvision.models.detection.fasterrcnn_resnet50_fpn] = None, clip_values: Optional[CLIP_VALUES_TYPE] = None, channel_index=<art.utils._Deprecated object>, channels_first: Optional[bool] = None, preprocessing_defences: Optional[Union[Preprocessor, List[Preprocessor]]] = None, postprocessing_defences: Optional[Union[Postprocessor, List[Postprocessor]]] = None, preprocessing: PREPROCESSING_TYPE = None, attack_losses: Tuple[str, ...] = ('loss_classifier', 'loss_box_reg', 'loss_objectness', 'loss_rpn_box_reg'), device_type: str = 'gpu')

Initialization.

Parameters
  • model

    Faster-RCNN model. The output of the model is List[Dict[Tensor]], one for each input image. The fields of the Dict are as follows:

    • boxes (FloatTensor[N, 4]): the predicted boxes in [x1, y1, x2, y2] format, with 0 <= x1 < x2 <= W and 0 <= y1 < y2 <= H

    • labels (Int64Tensor[N]): the predicted labels for each image

    • scores (Tensor[N]): the scores of each prediction

  • clip_values – Tuple of the form (min, max) of floats or np.ndarray representing the minimum and maximum values allowed for features. If floats are provided, these will be used as the range of all features. If arrays are provided, each value will be considered the bound for a feature, thus the shape of clip values needs to match the total number of features.

  • channel_index (int) – Index of the axis in data containing the color channels or features. Deprecated; use channels_first instead.

  • channels_first – Set channels first or last. If True, the channel axis is placed first (NCHW); if False, last (NHWC).

  • preprocessing_defences – Preprocessing defence(s) to be applied by the classifier.

  • postprocessing_defences – Postprocessing defence(s) to be applied by the classifier.

  • preprocessing – Tuple of the form (subtractor, divider) of floats or np.ndarray of values to be used for data preprocessing. The first value will be subtracted from the input. The input will then be divided by the second one.

  • attack_losses (Tuple) – Tuple of any combination of strings of loss components: ‘loss_classifier’, ‘loss_box_reg’, ‘loss_objectness’, and ‘loss_rpn_box_reg’.

  • device_type (str) – Type of device to be used for the model and tensors: if cpu, run on CPU; if gpu, run on GPU when available, otherwise fall back to CPU.

property channel_index
Returns

Index of the axis containing the color channels in the samples x.

property channels_first
Returns

Boolean indicating whether the color channels are placed first in the samples x.

property clip_values

Return the clip values of the input samples.

Returns

Clip values (min, max).

fit(x: numpy.ndarray, y, batch_size: int = 128, nb_epochs: int = 20, **kwargs) → None

Fit the model of the estimator on the training data x and y.

Parameters
  • x (ndarray) – Samples of shape (nb_samples, nb_features) or (nb_samples, nb_pixels_1, nb_pixels_2, nb_channels) or (nb_samples, nb_channels, nb_pixels_1, nb_pixels_2).

  • y (Format as expected by the model) – Target values.

  • batch_size (int) – Batch size.

  • nb_epochs (int) – Number of training epochs.

fit_generator(generator: DataGenerator, nb_epochs: int = 20, **kwargs) → None

Fit the estimator using a generator yielding training batches. Implementations can provide framework-specific versions of this function to speed up computation.

Parameters
  • generator – Batch generator providing (x, y) for each epoch.

  • nb_epochs (int) – Number of training epochs.

get_activations(x: numpy.ndarray, layer: Union[int, str], batch_size: int, framework: bool = False) → numpy.ndarray

Return the output of a specific layer for samples x, where layer is either the index of the layer between 0 and nb_layers - 1 or the name of the layer. The number of layers can be determined by counting the results returned by layer_names.

Return type

ndarray

Parameters
  • x (ndarray) – Samples.

  • layer – Index or name of the layer.

  • batch_size (int) – Batch size.

  • framework (bool) – If true, return the intermediate tensor representation of the activation.

Returns

The output of layer, where the first dimension is the batch size corresponding to x.

get_params() → Dict[str, Any]

Get all parameters and their values of this estimator.

Returns

A dictionary of string parameter names to their value.

property input_shape

Return the shape of one input sample.

Returns

Shape of one input sample.

property layer_names

Return the names of the hidden layers in the model, if applicable.

Returns

The names of the hidden layers in the model; input and output layers are ignored.

Warning

layer_names tries to infer the internal structure of the model. This feature comes with no guarantees on the correctness of the result. The intended order of the layers tries to match their order in the model, but this is not guaranteed either.

property learning_phase

The learning phase set by the user. Possible values are True for training or False for prediction and None if it has not been set by the library. In the latter case, the library does not do any explicit learning phase manipulation and the current value of the backend framework is used. If a value has been set by the user for this property, it will impact all following computations for model fitting, prediction and gradients.

Returns

Learning phase.

loss_gradient(x: numpy.ndarray, y: numpy.ndarray, **kwargs) → numpy.ndarray

Compute the gradient of the loss function w.r.t. x.

Return type

ndarray

Parameters
  • x (ndarray) – Samples of shape (nb_samples, height, width, nb_channels).

  • y (ndarray) –

    Target values of format List[Dict[Tensor]], one for each input image. The fields of the Dict are as follows:

    • boxes (FloatTensor[N, 4]): the target boxes in [x1, y1, x2, y2] format, with 0 <= x1 < x2 <= W and 0 <= y1 < y2 <= H

    • labels (Int64Tensor[N]): the target label for each box

    • scores (Tensor[N]): the score of each box.

Returns

Loss gradients of the same shape as x.
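The target dictionaries follow torchvision's detection convention. A small sketch constructing a one-image y in NumPy (hedged: the field names and layout come from the list above; depending on the ART and torchvision versions in use, the arrays may need to be converted to torch tensors before use):

```python
import numpy as np

# One target dict per image: N boxes in [x1, y1, x2, y2] pixel format,
# one int64 class label per box, and one score per box.
target = {
    "boxes": np.array([[10.0, 20.0, 50.0, 60.0],
                       [30.0, 30.0, 64.0, 64.0]], dtype=np.float32),
    "labels": np.array([1, 2], dtype=np.int64),
    "scores": np.ones(2, dtype=np.float32),
}
y = [target]  # List[Dict], one entry per input image

# Sanity checks on the format: x1 < x2, y1 < y2, one label per box.
assert (target["boxes"][:, 0] < target["boxes"][:, 2]).all()
assert (target["boxes"][:, 1] < target["boxes"][:, 3]).all()
assert len(target["labels"]) == len(target["boxes"])
```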

property model

Return the model.

Returns

The model.

predict(x: numpy.ndarray, batch_size: int = 128, **kwargs) → numpy.ndarray

Perform prediction for a batch of inputs.

Return type

ndarray

Parameters
  • x (ndarray) – Samples of shape (nb_samples, height, width, nb_channels).

  • batch_size (int) – Batch size.

Returns

Predictions of format List[Dict[Tensor]], one for each input image. The fields of the Dict are as follows:

  • boxes (FloatTensor[N, 4]): the predicted boxes in [x1, y1, x2, y2] format, with 0 <= x1 < x2 <= W and 0 <= y1 < y2 <= H

  • labels (Int64Tensor[N]): the predicted labels for each image

  • scores (Tensor[N]): the scores of each prediction.

set_learning_phase(train: bool) → None

Set the learning phase for the backend framework.

Parameters

train (bool) – True if the learning phase is training, otherwise False.

set_params(**kwargs) → None

Take a dictionary of parameters and apply checks before setting them as attributes.

Parameters

kwargs – A dictionary of attributes.