art.estimators.object_detection
¶
Module containing estimators for object detection.
Mixin Base Class Object Detector¶

class
art.estimators.object_detection.
ObjectDetectorMixin
¶ Mixin Base class for ART object detectors.
Object Detector PyTorch FasterRCNN¶

class
art.estimators.object_detection.
PyTorchFasterRCNN
(model: Optional[torchvision.models.detection.fasterrcnn_resnet50_fpn] = None, clip_values: Optional[CLIP_VALUES_TYPE] = None, channels_first: Optional[bool] = None, preprocessing_defences: Optional[Union[Preprocessor, List[Preprocessor]]] = None, postprocessing_defences: Optional[Union[Postprocessor, List[Postprocessor]]] = None, preprocessing: PREPROCESSING_TYPE = None, attack_losses: Tuple[str, …] = ('loss_classifier', 'loss_box_reg', 'loss_objectness', 'loss_rpn_box_reg'), device_type: str = 'gpu')¶ This class implements a model-specific object detector using FasterRCNN and PyTorch.

__init__
(model: Optional[torchvision.models.detection.fasterrcnn_resnet50_fpn] = None, clip_values: Optional[CLIP_VALUES_TYPE] = None, channels_first: Optional[bool] = None, preprocessing_defences: Optional[Union[Preprocessor, List[Preprocessor]]] = None, postprocessing_defences: Optional[Union[Postprocessor, List[Postprocessor]]] = None, preprocessing: PREPROCESSING_TYPE = None, attack_losses: Tuple[str, …] = ('loss_classifier', 'loss_box_reg', 'loss_objectness', 'loss_rpn_box_reg'), device_type: str = 'gpu')¶ Initialization.
 Parameters
model –
FasterRCNN model. The output of the model is List[Dict[Tensor]], one for each input image. The fields of the Dict are as follows:
boxes (FloatTensor[N, 4]): the predicted boxes in [x1, y1, x2, y2] format, with values between 0 and H and 0 and W
labels (Int64Tensor[N]): the predicted labels for each image
scores (Tensor[N]): the scores of each prediction
clip_values – Tuple of the form (min, max) of floats or np.ndarray representing the minimum and maximum values allowed for features. If floats are provided, these will be used as the range of all features. If arrays are provided, each value will be considered the bound for a feature, thus the shape of clip values needs to match the total number of features.
channels_first – Set channels first or last.
preprocessing_defences – Preprocessing defence(s) to be applied by the classifier.
postprocessing_defences – Postprocessing defence(s) to be applied by the classifier.
preprocessing – Tuple of the form (subtrahend, divisor) of floats or np.ndarray of values to be used for data preprocessing. The first value will be subtracted from the input. The input will then be divided by the second one.
attack_losses (Tuple) – Tuple of any combination of strings of loss components: ‘loss_classifier’, ‘loss_box_reg’, ‘loss_objectness’, and ‘loss_rpn_box_reg’.
device_type (str) – Type of device to be used for model and tensors: if cpu, run on CPU; if gpu, run on GPU if available, otherwise run on CPU.
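As a rough illustration of how the clip_values and preprocessing tuples described above are interpreted, the following sketch applies them with plain NumPy. The helper names apply_clip_values and apply_preprocessing are hypothetical and are not part of ART; the estimator is expected to perform equivalent steps internally.

```python
import numpy as np


def apply_clip_values(x, clip_values):
    """Clip x to (min, max); each bound is a float or an array matching the features."""
    minimum, maximum = clip_values
    return np.clip(x, minimum, maximum)


def apply_preprocessing(x, preprocessing):
    """Subtract the first value from x, then divide the result by the second."""
    subtrahend, divisor = preprocessing
    return (x - subtrahend) / divisor


# Pixel values in [0, 255], with one out-of-range entry.
x = np.array([[0.0, 127.5, 255.0, 300.0]], dtype=np.float32)
x_clipped = apply_clip_values(x, (0.0, 255.0))
x_scaled = apply_preprocessing(x_clipped, (0.0, 255.0))
```

With these inputs the out-of-range value is clipped to 255 before the data is rescaled to the range [0, 1].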

property
channels_first
¶  Returns
Boolean to indicate index of the color channels in the sample x.
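The two layouts that channels_first distinguishes can be sketched with NumPy as follows. The array sizes are arbitrary example values chosen for illustration.

```python
import numpy as np

# A batch of 2 RGB images, 4 pixels high and 5 pixels wide, channels last (NHWC).
x_nhwc = np.zeros((2, 4, 5, 3), dtype=np.float32)

# Move the channel axis to position 1 to obtain channels first (NCHW).
x_nchw = np.transpose(x_nhwc, (0, 3, 1, 2))

# The inverse permutation restores the channels-last layout.
x_back = np.transpose(x_nchw, (0, 2, 3, 1))
```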

property
clip_values
¶ Return the clip values of the input samples.
 Returns
Clip values (min, max).

compute_loss
(x: numpy.ndarray, y: numpy.ndarray, **kwargs) → numpy.ndarray¶ Compute the loss of the neural network for samples x.
 Parameters
x (ndarray) – Samples of shape (nb_samples, nb_features) or (nb_samples, nb_pixels_1, nb_pixels_2, nb_channels) or (nb_samples, nb_channels, nb_pixels_1, nb_pixels_2).
y (ndarray) – Target values (class labels) one-hot encoded of shape (nb_samples, nb_classes) or indices of shape (nb_samples,).
 Returns
Loss values.
 Return type
Format as expected by the model
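The two label encodings accepted for y (one-hot rows or class indices) carry the same information. The sketch below converts either form to indices; the helper to_class_indices is a hypothetical name used only for illustration.

```python
import numpy as np


def to_class_indices(y):
    """Return class indices for labels given as indices or as one-hot rows."""
    y = np.asarray(y)
    if y.ndim == 2:
        # One-hot rows of shape (nb_samples, nb_classes): take the hot position.
        return np.argmax(y, axis=1)
    # Already indices of shape (nb_samples,).
    return y


y_one_hot = np.array([[0, 1, 0], [0, 0, 1]])
y_indices = np.array([1, 2])
```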

property
device
¶ Get the device currently in use.
 Returns
The device currently in use.

property
device_type
¶ Return the type of device on which the estimator is run.
 Returns
Type of device on which the estimator is run, either gpu or cpu.
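The fallback rule documented for the device_type constructor parameter (gpu falls back to cpu when no GPU is available) can be sketched in plain Python. The function resolve_device and its gpu_available argument are hypothetical; in a PyTorch setting the availability flag would typically come from torch.cuda.is_available().

```python
def resolve_device(device_type, gpu_available):
    """Return 'gpu' only when it is requested and available, otherwise 'cpu'."""
    if device_type not in ("cpu", "gpu"):
        raise ValueError(f"Unsupported device type: {device_type!r}")
    if device_type == "gpu" and gpu_available:
        return "gpu"
    return "cpu"
```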

fit
(x: numpy.ndarray, y, batch_size: int = 128, nb_epochs: int = 20, **kwargs) → None¶ Fit the model of the estimator on the training data x and y.
 Parameters
x (ndarray) – Samples of shape (nb_samples, nb_features) or (nb_samples, nb_pixels_1, nb_pixels_2, nb_channels) or (nb_samples, nb_channels, nb_pixels_1, nb_pixels_2).
y (Format as expected by the model) – Target values.
batch_size (int) – Batch size.
nb_epochs (int) – Number of training epochs.

fit_generator
(generator: DataGenerator, nb_epochs: int = 20, **kwargs) → None¶ Fit the estimator using a generator yielding training batches. Implementations can provide framework-specific versions of this function to speed up computation.
 Parameters
generator – Batch generator providing (x, y) for each epoch.
nb_epochs (int) – Number of training epochs.
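To make the idea of a generator providing (x, y) batches concrete, here is a minimal plain-Python sketch. It is not ART's DataGenerator class, whose interface is not described on this page; batch_generator and the toy data are assumptions made for illustration.

```python
def batch_generator(x, y, batch_size):
    """Yield consecutive (x, y) batches that cover the data once (one epoch)."""
    for start in range(0, len(x), batch_size):
        yield x[start:start + batch_size], y[start:start + batch_size]


# Five toy samples with one target dictionary per sample.
x = [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
y = [{"labels": [i]} for i in range(5)]

batch_sizes = [(len(xb), len(yb)) for xb, yb in batch_generator(x, y, batch_size=2)]
```

The last batch is smaller than batch_size when the number of samples is not a multiple of it.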

get_activations
(x: numpy.ndarray, layer: Union[int, str], batch_size: int, framework: bool = False) → numpy.ndarray¶ Return the output of a specific layer for samples x where layer is the index of the layer between 0 and nb_layers - 1 or the name of the layer. The number of layers can be determined by counting the results returned by calling layer_names.
 Return type
ndarray
 Parameters
x (ndarray) – Samples.
layer – Index or name of the layer.
batch_size (int) – Batch size.
framework (bool) – If true, return the intermediate tensor representation of the activation.
 Returns
The output of layer, where the first dimension is the batch size corresponding to x.
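The layer argument accepts either an index between 0 and nb_layers - 1 or a layer name. A minimal sketch of that lookup is shown below; resolve_layer and the example layer names are hypothetical and only mirror the rule stated above.

```python
def resolve_layer(layer, layer_names):
    """Map a layer given by index or by name to its name."""
    if isinstance(layer, str):
        if layer not in layer_names:
            raise ValueError(f"Unknown layer name: {layer}")
        return layer
    if not 0 <= layer < len(layer_names):
        raise ValueError(f"Layer index {layer} is outside 0..{len(layer_names) - 1}")
    return layer_names[layer]


# Example names standing in for the value of the layer_names property.
names = ["conv1", "conv2", "fc"]
```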

get_params
() → Dict[str, Any]¶ Get all parameters and their values of this estimator.
 Returns
A dictionary of string parameter names to their value.

property
input_shape
¶ Return the shape of one input sample.
 Returns
Shape of one input sample.

property
layer_names
¶ Return the names of the hidden layers in the model, if applicable.
 Returns
The names of the hidden layers in the model; input and output layers are ignored.
Warning
layer_names tries to infer the internal structure of the model. This feature comes with no guarantees on the correctness of the result. The intended order of the layers tries to match their order in the model, but this is not guaranteed either.

loss_gradient
(x: numpy.ndarray, y: Union[List[Dict[str, numpy.ndarray]], List[Dict[str, torch.Tensor]]], **kwargs) → numpy.ndarray¶ Compute the gradient of the loss function w.r.t. x.
 Return type
ndarray
 Parameters
x (ndarray) – Samples of shape (nb_samples, height, width, nb_channels).
y –
Target values of format List[Dict[Tensor]], one for each input image. The fields of the Dict are as follows:
boxes (FloatTensor[N, 4]): the predicted boxes in [x1, y1, x2, y2] format, with values between 0 and H and 0 and W
labels (Int64Tensor[N]): the predicted labels for each image
scores (Tensor[N]): the scores of each prediction.
 Returns
Loss gradients of the same shape as x.
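The target format described above (one dictionary per image with boxes, labels, and scores) can be built and sanity-checked with NumPy arrays as sketched below. The helper is_valid_target and the example image size are assumptions made for illustration; ART may apply different or additional checks.

```python
import numpy as np


def is_valid_target(target, height, width):
    """Check array shapes and that [x1, y1, x2, y2] boxes lie inside the image."""
    boxes, labels, scores = target["boxes"], target["labels"], target["scores"]
    if boxes.ndim != 2 or boxes.shape[1] != 4:
        return False
    n = boxes.shape[0]
    if labels.shape != (n,) or scores.shape != (n,):
        return False
    # x coordinates must lie in [0, W], y coordinates in [0, H].
    x_ok = np.all((boxes[:, [0, 2]] >= 0) & (boxes[:, [0, 2]] <= width))
    y_ok = np.all((boxes[:, [1, 3]] >= 0) & (boxes[:, [1, 3]] <= height))
    return bool(x_ok and y_ok)


# One image with a single target box.
y = [
    {
        "boxes": np.array([[10.0, 20.0, 110.0, 220.0]], dtype=np.float32),
        "labels": np.array([3], dtype=np.int64),
        "scores": np.array([1.0], dtype=np.float32),
    }
]
all_valid = all(is_valid_target(t, height=240, width=320) for t in y)
```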

property
model
¶ Return the model.
 Returns
The model.

predict
(x: numpy.ndarray, batch_size: int = 128, **kwargs) → List[Dict[str, numpy.ndarray]]¶ Perform prediction for a batch of inputs.
 Return type
List
 Parameters
x (ndarray) – Samples of shape (nb_samples, height, width, nb_channels).
batch_size (int) – Batch size.
 Returns
Predictions of format List[Dict[str, np.ndarray]], one for each input image. The fields of the Dict are as follows:
boxes [N, 4]: the predicted boxes in [x1, y1, x2, y2] format, with values between 0 and H and 0 and W
labels [N]: the predicted labels for each image
scores [N]: the scores of each prediction.
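A common way to consume the returned list of dictionaries is to keep only confident detections. The sketch below does this with NumPy on hand-written data in the documented format; filter_by_score and the threshold are hypothetical, and in practice predictions would be the value returned by predict.

```python
import numpy as np


def filter_by_score(predictions, threshold):
    """Keep, per image, only detections whose score is at least `threshold`."""
    filtered = []
    for pred in predictions:
        keep = pred["scores"] >= threshold
        # The same boolean mask selects matching rows of boxes, labels and scores.
        filtered.append({key: value[keep] for key, value in pred.items()})
    return filtered


# Stand-in for the output of predict on a single image.
predictions = [
    {
        "boxes": np.array([[0.0, 0.0, 10.0, 10.0], [5.0, 5.0, 20.0, 20.0]]),
        "labels": np.array([1, 7]),
        "scores": np.array([0.9, 0.2]),
    }
]
confident = filter_by_score(predictions, threshold=0.5)
```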

set_batchnorm
(train: bool) → None¶ Set all batch normalization layers into train or eval mode.
 Parameters
train (bool) – False for evaluation mode.

set_dropout
(train: bool) → None¶ Set all dropout layers into train or eval mode.
 Parameters
train (bool) – False for evaluation mode.

set_params
(**kwargs) → None¶ Take a dictionary of parameters and apply checks before setting them as attributes.
 Parameters
kwargs – A dictionary of attributes.

Object Detector TensorFlow FasterRCNN¶

class
art.estimators.object_detection.
TensorFlowFasterRCNN
(images: tf.Tensor, model: Optional[FasterRCNNMetaArch] = None, filename: Optional[str] = None, url: Optional[str] = None, sess: Optional[Session] = None, is_training: bool = False, clip_values: Optional[CLIP_VALUES_TYPE] = None, channels_first: bool = False, preprocessing_defences: Optional[Union[Preprocessor, List[Preprocessor]]] = None, postprocessing_defences: Optional[Union[Postprocessor, List[Postprocessor]]] = None, preprocessing: PREPROCESSING_TYPE = (0.0, 1.0), attack_losses: Tuple[str, …] = ('Loss/RPNLoss/localization_loss', 'Loss/RPNLoss/objectness_loss', 'Loss/BoxClassifierLoss/localization_loss', 'Loss/BoxClassifierLoss/classification_loss'))¶ This class implements a model-specific object detector using FasterRCNN and TensorFlow.

__init__
(images: tf.Tensor, model: Optional[FasterRCNNMetaArch] = None, filename: Optional[str] = None, url: Optional[str] = None, sess: Optional[Session] = None, is_training: bool = False, clip_values: Optional[CLIP_VALUES_TYPE] = None, channels_first: bool = False, preprocessing_defences: Optional[Union[Preprocessor, List[Preprocessor]]] = None, postprocessing_defences: Optional[Union[Postprocessor, List[Postprocessor]]] = None, preprocessing: PREPROCESSING_TYPE = (0.0, 1.0), attack_losses: Tuple[str, …] = ('Loss/RPNLoss/localization_loss', 'Loss/RPNLoss/objectness_loss', 'Loss/BoxClassifierLoss/localization_loss', 'Loss/BoxClassifierLoss/classification_loss'))¶ Initialization of a TensorFlowFasterRCNN instance.
 Parameters
images – Input samples of shape (nb_samples, height, width, nb_channels).
model –
A TensorFlow FasterRCNN model. The output that can be computed from the model includes a tuple of (predictions, losses, detections):
predictions: a dictionary holding “raw” prediction tensors.
losses: a dictionary mapping loss keys (Loss/RPNLoss/localization_loss, Loss/RPNLoss/objectness_loss, Loss/BoxClassifierLoss/localization_loss, Loss/BoxClassifierLoss/classification_loss) to scalar tensors representing corresponding loss values.
detections: a dictionary containing final detection results.
filename – Filename of the detection model without filename extension.
url – URL to download archive of detection model including filename extension.
sess – Computation session.
is_training (bool) – A boolean indicating whether the training version of the computation graph should be constructed.
clip_values – Tuple of the form (min, max) of floats or np.ndarray representing the minimum and maximum values allowed for input image features. If floats are provided, these will be used as the range of all features. If arrays are provided, each value will be considered the bound for a feature, thus the shape of clip values needs to match the total number of features.
channels_first (bool) – Set channels first or last.
preprocessing_defences – Preprocessing defence(s) to be applied by the classifier.
postprocessing_defences – Postprocessing defence(s) to be applied by the classifier.
preprocessing – Tuple of the form (subtrahend, divisor) of floats or np.ndarray of values to be used for data preprocessing. The first value will be subtracted from the input. The input will then be divided by the second one.
attack_losses (Tuple) – Tuple of any combination of strings of the following loss components: ‘Loss/RPNLoss/localization_loss’, ‘Loss/RPNLoss/objectness_loss’, ‘Loss/BoxClassifierLoss/localization_loss’, and ‘Loss/BoxClassifierLoss/classification_loss’.
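The role of attack_losses can be sketched as selecting a subset of the named loss components and combining them. The sketch below uses the loss keys from the constructor signature and assumes the selected components are simply added, which this page does not state explicitly; total_attack_loss and the numeric values are made up for illustration.

```python
def total_attack_loss(losses, attack_losses):
    """Sum the loss components whose keys are listed in attack_losses."""
    missing = [name for name in attack_losses if name not in losses]
    if missing:
        raise KeyError(f"Unknown loss components: {missing}")
    return sum(losses[name] for name in attack_losses)


# Made-up scalar values standing in for the loss tensors.
losses = {
    "Loss/RPNLoss/localization_loss": 0.5,
    "Loss/RPNLoss/objectness_loss": 0.25,
    "Loss/BoxClassifierLoss/localization_loss": 1.0,
    "Loss/BoxClassifierLoss/classification_loss": 2.0,
}

# Use only the region proposal network (RPN) components.
rpn_only = total_attack_loss(
    losses,
    ("Loss/RPNLoss/localization_loss", "Loss/RPNLoss/objectness_loss"),
)
all_components = total_attack_loss(losses, tuple(losses))
```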

property
channels_first
¶  Returns
Boolean to indicate index of the color channels in the sample x.

property
clip_values
¶ Return the clip values of the input samples.
 Returns
Clip values (min, max).

compute_loss
(x: numpy.ndarray, y: numpy.ndarray, **kwargs) → numpy.ndarray¶ Compute the loss of the neural network for samples x.
 Parameters
x (ndarray) – Samples of shape (nb_samples, nb_features) or (nb_samples, nb_pixels_1, nb_pixels_2, nb_channels) or (nb_samples, nb_channels, nb_pixels_1, nb_pixels_2).
y (ndarray) – Target values (class labels) one-hot encoded of shape (nb_samples, nb_classes) or indices of shape (nb_samples,).
 Returns
Loss values.
 Return type
Format as expected by the model

property
detections
¶ Get the _detections attribute.
 Returns
A dictionary containing final detection results.

fit
(x: numpy.ndarray, y, batch_size: int = 128, nb_epochs: int = 20, **kwargs) → None¶ Fit the model of the estimator on the training data x and y.
 Parameters
x (ndarray) – Samples of shape (nb_samples, nb_features) or (nb_samples, nb_pixels_1, nb_pixels_2, nb_channels) or (nb_samples, nb_channels, nb_pixels_1, nb_pixels_2).
y (Format as expected by the model) – Target values.
batch_size (int) – Batch size.
nb_epochs (int) – Number of training epochs.

fit_generator
(generator: DataGenerator, nb_epochs: int = 20, **kwargs) → None¶ Fit the estimator using a generator yielding training batches. Implementations can provide framework-specific versions of this function to speed up computation.
 Parameters
generator – Batch generator providing (x, y) for each epoch.
nb_epochs (int) – Number of training epochs.

get_activations
(x: numpy.ndarray, layer: Union[int, str], batch_size: int, framework: bool = False) → numpy.ndarray¶ Return the output of a specific layer for samples x where layer is the index of the layer between 0 and nb_layers - 1 or the name of the layer. The number of layers can be determined by counting the results returned by calling layer_names.
 Return type
ndarray
 Parameters
x (ndarray) – Samples.
layer – Index or name of the layer.
batch_size (int) – Batch size.
framework (bool) – If true, return the intermediate tensor representation of the activation.
 Returns
The output of layer, where the first dimension is the batch size corresponding to x.

get_params
() → Dict[str, Any]¶ Get all parameters and their values of this estimator.
 Returns
A dictionary of string parameter names to their value.

property
input_images
¶ Get the images attribute.
 Returns
The input image tensor.

property
input_shape
¶ Return the shape of one input sample.
 Returns
Shape of one input sample.

property
layer_names
¶ Return the names of the hidden layers in the model, if applicable.
 Returns
The names of the hidden layers in the model; input and output layers are ignored.
Warning
layer_names tries to infer the internal structure of the model. This feature comes with no guarantees on the correctness of the result. The intended order of the layers tries to match their order in the model, but this is not guaranteed either.

loss_gradient
(x: numpy.ndarray, y: List[Dict[str, numpy.ndarray]], **kwargs) → numpy.ndarray¶ Compute the gradient of the loss function w.r.t. x.
 Return type
ndarray
 Parameters
x (ndarray) – Samples of shape (nb_samples, height, width, nb_channels).
y (List) – A dictionary of target values. The fields of the dictionary are as follows:
boxes: A list of nb_samples size of 2D tf.float32 tensors of shape [num_boxes, 4] containing coordinates of the ground-truth boxes. Ground-truth boxes are provided in [y_min, x_min, y_max, x_max] format and are also assumed to be normalized as well as clipped relative to the image window, with conditions y_min <= y_max and x_min <= x_max.
labels: A list of nb_samples size of 1D tf.float32 tensors of shape [num_boxes] containing the class targets with the zero index assumed to map to the first non-background class.
scores: A list of nb_samples size of 1D tf.float32 tensors of shape [num_boxes] containing weights for ground-truth boxes.
 Returns
Loss gradients of the same shape as x.
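The box layout described for these targets (normalized [y_min, x_min, y_max, x_max]) differs from the pixel-based [x1, y1, x2, y2] layout used elsewhere on this page. The following NumPy sketch converts from the latter to the former. The function name and the example image size are assumptions made for illustration and are not part of ART.

```python
import numpy as np


def pixel_xyxy_to_normalised_yxyx(boxes, height, width):
    """Convert [x1, y1, x2, y2] pixel boxes to normalised [y_min, x_min, y_max, x_max]."""
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    # Divide y coordinates by the image height and x coordinates by the width.
    converted = np.stack([y1 / height, x1 / width, y2 / height, x2 / width], axis=1)
    # Clip to the image window, as the target format assumes.
    return np.clip(converted, 0.0, 1.0)


boxes_px = np.array([[32.0, 24.0, 160.0, 120.0]])
boxes_norm = pixel_xyxy_to_normalised_yxyx(boxes_px, height=240, width=320)
```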

property
losses
¶ Get the _losses attribute.
 Returns
A dictionary mapping loss keys (Loss/RPNLoss/localization_loss, Loss/RPNLoss/objectness_loss, Loss/BoxClassifierLoss/localization_loss, Loss/BoxClassifierLoss/classification_loss) to scalar tensors representing corresponding loss values.

property
model
¶ Return the model.
 Returns
The model.

predict
(x: numpy.ndarray, batch_size: int = 128, standardise_output: bool = False, **kwargs) → List[Dict[str, numpy.ndarray]]¶ Perform prediction for a batch of inputs.
 Return type
List
 Parameters
x (ndarray) – Samples of shape (nb_samples, height, width, nb_channels).
batch_size (int) – Batch size.
standardise_output (bool) – True if output should be standardised. Box coordinates will be normalised to [0, 1] and label index will be decreased by 1 to adhere to COCO categories.
 Returns
Predictions of format List[Dict[str, np.ndarray]], one for each input image. The fields of the Dict are as follows:
boxes [N, 4]: the predicted boxes in [x1, y1, x2, y2] format, with values between 0 and H and 0 and W
labels [N]: the predicted labels for each image
scores [N]: the scores of each prediction.

property
predictions
¶ Get the _predictions attribute.
 Returns
A dictionary holding “raw” prediction tensors.

property
sess
¶ Get current TensorFlow session.
 Returns
The current TensorFlow session.

set_params
(**kwargs) → None¶ Take a dictionary of parameters and apply checks before setting them as attributes.
 Parameters
kwargs – A dictionary of attributes.
