art.estimators.object_tracking¶
Module containing estimators for object tracking.
Mixin Base Class Object Tracker¶
Object Tracker PyTorch GOTURN¶
- class art.estimators.object_tracking.PyTorchGoturn(model, input_shape: Tuple[int, ...], clip_values: Optional[CLIP_VALUES_TYPE] = None, channels_first: Optional[bool] = None, preprocessing_defences: Optional[Union[Preprocessor, List[Preprocessor]]] = None, postprocessing_defences: Optional[Union[Postprocessor, List[Postprocessor]]] = None, preprocessing: PREPROCESSING_TYPE = None, device_type: str = 'gpu')¶
This module implements the task- and model-specific estimator for PyTorch GOTURN (object tracking).
- __init__(model, input_shape: Tuple[int, ...], clip_values: Optional[CLIP_VALUES_TYPE] = None, channels_first: Optional[bool] = None, preprocessing_defences: Optional[Union[Preprocessor, List[Preprocessor]]] = None, postprocessing_defences: Optional[Union[Postprocessor, List[Postprocessor]]] = None, preprocessing: PREPROCESSING_TYPE = None, device_type: str = 'gpu')¶
Initialization.
- Parameters
model – GOTURN model.
input_shape (Tuple) – Shape of one input sample as expected by the model, e.g. input_shape=(3, 227, 227).
clip_values – Tuple of the form (min, max) of floats or np.ndarray representing the minimum and maximum values allowed for features. If floats are provided, these will be used as the range of all features. If arrays are provided, each value will be considered the bound for a feature, thus the shape of clip values needs to match the total number of features.
channels_first – Set channels first or last.
preprocessing_defences – Preprocessing defence(s) to be applied by the classifier.
postprocessing_defences – Postprocessing defence(s) to be applied by the classifier.
preprocessing – Tuple of the form (subtrahend, divisor) of floats or np.ndarray of values to be used for data preprocessing. The first value will be subtracted from the input. The input will then be divided by the second one.
device_type (str) – Type of device to be used for model and tensors: if cpu, run on CPU; if gpu, run on GPU if available, otherwise run on CPU.
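For orientation, a minimal construction sketch follows, assuming goturn_model is an already-loaded GOTURN torch.nn.Module (a hypothetical name; the concrete weights and loader are not specified here):

    # Minimal sketch; `goturn_model` is a hypothetical, already-loaded GOTURN
    # torch.nn.Module.
    from art.estimators.object_tracking import PyTorchGoturn

    tracker = PyTorchGoturn(
        model=goturn_model,
        input_shape=(3, 227, 227),  # one input sample as expected by the model
        clip_values=(0.0, 255.0),   # assumed pixel range of the raw frames
        device_type="gpu",          # runs on GPU if available, otherwise CPU
    )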
- property channels_first: bool¶
- Returns
Boolean indicating the position of the color channels in the samples x.
- property clip_values: Optional[CLIP_VALUES_TYPE]¶
Return the clip values of the input samples.
- Returns
Clip values (min, max).
- clone_for_refitting() → ESTIMATOR_TYPE¶
Clone estimator for refitting.
- compute_loss(x: ndarray, y: List[Dict[str, Union[ndarray, torch.Tensor]]], **kwargs) → ndarray¶
Compute loss.
- Return type
ndarray
- Parameters
x (ndarray) – Samples of shape (nb_samples, nb_frames, height, width, nb_channels).
y (List) – Target values of format List[Dict[str, np.ndarray]], one dictionary for each input video. The keys of the dictionary are:
- boxes [N_FRAMES, 4]: the boxes in [x1, y1, x2, y2] format, with 0 <= x1 < x2 <= W and 0 <= y1 < y2 <= H.
- Returns
Total loss.
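A usage sketch with toy shapes, reusing the tracker constructed above; the box values are hypothetical:

    import numpy as np

    # One clip of 4 frames, 227x227 RGB, channels last, and one target box
    # per frame in [x1, y1, x2, y2] format.
    x = np.random.uniform(0.0, 255.0, size=(1, 4, 227, 227, 3)).astype(np.float32)
    y = [{"boxes": np.array([[30.0, 40.0, 120.0, 150.0]] * 4, dtype=np.float32)}]
    loss = tracker.compute_loss(x=x, y=y)  # total loss as an ndarray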
- compute_loss_from_predictions(pred: ndarray, y: ndarray, **kwargs) → ndarray¶
Compute the loss of the estimator for predictions pred.
- Return type
ndarray
- Parameters
pred (ndarray) – Model predictions.
y (ndarray) – Target values.
- Returns
Loss values.
- compute_losses(x: ndarray, y: List[Dict[str, Union[ndarray, torch.Tensor]]]) → Dict[str, ndarray]¶
Compute losses.
- Return type
Dict
- Parameters
x (ndarray) – Samples of shape (nb_samples, nb_frames, height, width, nb_channels).
y (List) – Target values of format List[Dict[str, np.ndarray]], one dictionary for each input video. The keys of the dictionary are:
- boxes [N_FRAMES, 4]: the boxes in [x1, y1, x2, y2] format, with 0 <= x1 < x2 <= W and 0 <= y1 < y2 <= H.
- Returns
Dictionary of loss components.
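A sketch reusing x and y from the compute_loss example above; the component names depend on the model:

    losses = tracker.compute_losses(x=x, y=y)
    for name, value in losses.items():  # one entry per loss component
        print(name, value)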
- property device: torch.device¶
Get current used device.
- Returns
Current used device.
- property device_type: str¶
Return the type of device on which the estimator is run.
- Returns
Type of device on which the estimator is run, either gpu or cpu.
- fit(x: ndarray, y, batch_size: int = 128, nb_epochs: int = 20, **kwargs) → None¶
Not implemented.
- fit_generator(generator: DataGenerator, nb_epochs: int = 20, **kwargs) → None¶
Fit the estimator using a generator yielding training batches. Implementations can provide framework-specific versions of this function to speed up computation.
- Parameters
generator – Batch generator providing (x, y) for each epoch.
nb_epochs (int) – Number of training epochs.
- get_activations(x: ndarray, layer: Union[int, str], batch_size: int, framework: bool = False) → ndarray¶
Not implemented.
- get_params() → Dict[str, Any]¶
Get all parameters and their values of this estimator.
- Returns
A dictionary of string parameter names to their value.
- init(image: PIL.JpegImagePlugin.JpegImageFile, box: ndarray)¶
Method init for GOT-10k trackers.
- Parameters
image – Current image.
box – Initial box.
- Returns
Predicted box.
- property input_shape: Tuple[int, ...]¶
Return the shape of one input sample.
- Returns
Shape of one input sample.
- property layer_names: Optional[List[str]]¶
Return the names of the hidden layers in the model, if applicable.
- Returns
The names of the hidden layers in the model, input and output layers are ignored.
Warning
layer_names tries to infer the internal structure of the model. This feature comes with no guarantees on the correctness of the result. The intended order of the layers tries to match their order in the model, but this is not guaranteed either.
- loss_gradient(x: ndarray, y: List[Dict[str, Union[ndarray, torch.Tensor]]], **kwargs) → ndarray¶
Compute the gradient of the loss function w.r.t. x.
- Return type
ndarray
- Parameters
x (ndarray) – Samples of shape (nb_samples, height, width, nb_channels).
y (List) – Target values of format List[Dict[Tensor]], one for each input image. The fields of the Dict are as follows:
- boxes (FloatTensor[N, 4]): the boxes in [x1, y1, x2, y2] format, with 0 <= x1 < x2 <= W and 0 <= y1 < y2 <= H.
- labels (Int64Tensor[N]): the labels for each image.
- scores (Tensor[N]): the scores for each prediction.
- Returns
Loss gradients of the same shape as x.
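These gradients are the building block for gradient-based attacks on the tracker. A sketch reusing x and y from above; the sign-step update is illustrative only, not a method prescribed by this API:

    grads = tracker.loss_gradient(x=x, y=y)
    assert grads.shape == x.shape  # one gradient value per input element

    # Illustrative FGSM-style perturbation of the frames (hypothetical step size).
    x_adv = np.clip(x + 0.5 * np.sign(grads), 0.0, 255.0)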
- property model¶
Return the model.
- Returns
The model.
- property native_label_is_pytorch_format: bool¶
Are the native labels in PyTorch format [x1, y1, x2, y2]?
- predict(x: ndarray, batch_size: int = 128, **kwargs) → List[Dict[str, ndarray]]¶
Perform prediction for a batch of inputs.
- Return type
List
- Parameters
x (ndarray) – Samples of shape (nb_samples, nb_frames, height, width, nb_channels).
batch_size (int) – Batch size.
- Keyword Arguments
y_init (np.ndarray) – Initial box around the object to be tracked, in [x1, y1, x2, y2] format, with 0 <= x1 < x2 <= W and 0 <= y1 < y2 <= H.
- Returns
Predictions of format List[Dict[str, np.ndarray]], one dictionary for each input video. The keys of the dictionary are:
- boxes [N_FRAMES, 4]: the boxes in [x1, y1, x2, y2] format, with 0 <= x1 < x2 <= W and 0 <= y1 < y2 <= H.
- labels [N_FRAMES]: the labels for each frame, default 0.
- scores [N_FRAMES]: the scores for each prediction, default 1.
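A prediction sketch; the initial box is passed through the y_init keyword argument (values hypothetical):

    y_init = np.array([[30.0, 40.0, 120.0, 150.0]], dtype=np.float32)  # [x1, y1, x2, y2]
    preds = tracker.predict(x=x, y_init=y_init)
    print(preds[0]["boxes"].shape)  # (nb_frames, 4): one box per frame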
- set_batchnorm(train: bool) → None¶
Set all batch normalization layers into train or eval mode.
- Parameters
train (bool) – False for evaluation mode.
- set_dropout(train: bool) → None¶
Set all dropout layers into train or eval mode.
- Parameters
train (bool) – False for evaluation mode.
- set_multihead_attention(train: bool) → None¶
Set all multi-head attention layers into train or eval mode.
- Parameters
train (bool) – False for evaluation mode.
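The three setters above (set_batchnorm, set_dropout, set_multihead_attention) are typically used together to freeze stochastic layers before computing attack gradients; a brief sketch:

    tracker.set_batchnorm(train=False)            # batch-norm layers to eval mode
    tracker.set_dropout(train=False)              # dropout layers to eval mode
    tracker.set_multihead_attention(train=False)  # attention layers to eval mode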
- set_params(**kwargs) → None¶
Take a dictionary of parameters and apply checks before setting them as attributes.
- Parameters
kwargs – A dictionary of attributes.
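get_params and set_params form a read/checked-write pair; a sketch, where the updated parameter is a hypothetical choice:

    params = tracker.get_params()               # dict: parameter name -> value
    tracker.set_params(clip_values=(0.0, 1.0))  # hypothetical update; checked before being set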
- track(img_files: List[str], box: ndarray, visualize: bool = False) → Tuple[ndarray, ndarray]¶
Method track for GOT-10k toolkit trackers (MIT licence).
- Return type
Tuple
- Parameters
img_files (List) – Image files.
box (ndarray) – Initial boxes.
visualize (bool) – Visualise tracking.
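A GOT-10k-style run over frames on disk (paths hypothetical); following the toolkit convention, the returned tuple is assumed to hold the per-frame boxes and timings:

    import glob

    img_files = sorted(glob.glob("frames/*.jpg"))  # hypothetical frame paths
    box = np.array([30.0, 40.0, 90.0, 110.0])      # initial box
    boxes, times = tracker.track(img_files=img_files, box=box, visualize=False)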
- update(image: ndarray) → ndarray¶
Method update for GOT-10k trackers.
- Return type
ndarray
- Parameters
image (ndarray) – Current image.
- Returns
Predicted box.
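init and update together implement the frame-by-frame GOT-10k protocol that track wraps; a sketch with hypothetical paths, reusing img_files from the track example above:

    from PIL import Image

    first_frame = Image.open("frames/0001.jpg")  # JPEG, per the init signature
    tracker.init(image=first_frame, box=np.array([30.0, 40.0, 90.0, 110.0]))
    for path in img_files[1:]:
        frame = np.asarray(Image.open(path))
        pred_box = tracker.update(image=frame)   # predicted box for this frame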