shok.utils.transforms package

class shok.utils.transforms.ApplyPatch(scale_range: tuple[float, float] = (0.1, 0.4), location_range: tuple[float, float] = (0.0, 1.0), patch_crop_range: tuple[float, float] = (0.8, 1.0), rotation_probs: tuple[float, float, float, float] = (0.25, 0.25, 0.25, 0.25), flip_probability: float = 0.5)

Bases: Module

Module to apply a patch to an image.

forward(x: Tensor, patch: Tensor, y: Tensor = None) → Tensor

Forward method.

The patch is randomly rotated, resized, and placed at a location determined by a distribution. The function ensures the patch fits within the image boundaries and updates the target tensor y if provided.

Args:

x (torch.Tensor): The input image tensor of shape (…, H, W).

patch (torch.Tensor): The patch tensor to be inserted, typically of shape (…, h, w).

y (torch.Tensor, optional): Target tensor containing annotations (e.g., bounding boxes and labels). Defaults to None.

Returns:
Tuple[torch.Tensor, torch.Tensor]:
  • The transformed image tensor with the patch inserted.

  • The updated target tensor (if provided), otherwise None.
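The random placement described above can be sketched as follows. This is a minimal illustration and not the ApplyPatch implementation: the function name paste_random_patch is hypothetical, the patch is assumed to be resized to a square whose side is a fraction of the shorter image side, rotations are assumed to be multiples of 90 degrees drawn from rotation_probs, and location_range, patch_crop_range, and the target update are left out.

```python
import torch
import torch.nn.functional as F


def paste_random_patch(x, patch, scale_range=(0.1, 0.4),
                       rotation_probs=(0.25, 0.25, 0.25, 0.25),
                       flip_probability=0.5):
    """Paste `patch` (C, h, w) into a copy of image `x` (C, H, W) at a random spot.

    Hypothetical helper: the sampling scheme here is an assumption.
    """
    _, H, W = x.shape
    # Sample a scale uniformly and size the patch relative to the shorter side.
    lo, hi = scale_range
    scale = lo + (hi - lo) * torch.rand(()).item()
    side = max(1, int(scale * min(H, W)))
    p = F.interpolate(patch.unsqueeze(0), size=(side, side),
                      mode="bilinear", align_corners=False).squeeze(0)
    # Rotate by k * 90 degrees, with k drawn from rotation_probs.
    k = int(torch.multinomial(torch.tensor(rotation_probs), 1).item())
    p = torch.rot90(p, k, [1, 2])
    # Flip horizontally with the given probability.
    if torch.rand(()).item() < flip_probability:
        p = torch.flip(p, [2])
    # Pick a top-left corner so that the patch fits inside the image.
    top = int(torch.randint(0, H - side + 1, (1,)).item())
    left = int(torch.randint(0, W - side + 1, (1,)).item())
    out = x.clone()
    out[:, top:top + side, left:left + side] = p
    return out, (top, left, side)
```

The helper returns the placement alongside the image so that a caller could adjust bounding boxes for the covered region.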

class shok.utils.transforms.ConvertToTVTensorBBoxes(*args, **kwargs)

Bases: Module

Module to convert bounding boxes to torchvision tensors.

This is a simplified version that does not include transformations.

This is useful because some torchvision transforms require bounding boxes to be of type torchvision.tv_tensors.BoundingBoxes.

forward(x: Tensor, y: Tensor = None) → Tensor

Applies transformation to input tensor x and optionally processes bounding boxes in y.

Args:

x (torch.Tensor): Input tensor, typically representing an image or batch of images.

y (dict, optional): Target dictionary. If provided and it contains a “boxes” key, the bounding boxes are converted to a BoundingBoxes object in “xyxy” format with the same canvas size as x and dtype torch.float32.

Returns:

Tuple[torch.Tensor, dict]: The (possibly transformed) input tensor x and the updated target dictionary y.

class shok.utils.transforms.PassRound(*args, **kwargs)

Bases: Module

A custom torch.nn.Module that applies a soft rounding operation to the input tensor.

Args:

x (torch.Tensor): The input tensor to be rounded.

y (optional): An optional secondary input, passed through unchanged.

Returns:

Tuple[torch.Tensor, Any]: A tuple containing the rounded tensor and the optional secondary input.

Note:

The actual rounding logic is implemented in functions.PassRound.apply.

forward(x: Tensor, y=None) → Tensor

Applies a placeholder soft rounding operation to the input tensor.

Args:

x (torch.Tensor): Input tensor to be processed.

y (optional): Additional input, currently unused.

Returns:

Tuple[torch.Tensor, Any]: A tuple containing the processed tensor and the second input (y).
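The reference to functions.PassRound.apply suggests a custom torch.autograd.Function. One common way to make rounding trainable is a straight-through estimator, which rounds in the forward pass and passes the gradient through unchanged in the backward pass. The sketch below illustrates that technique under this assumption; the class name PassThroughRound is hypothetical, and this is not necessarily what functions.PassRound does.

```python
import torch


class PassThroughRound(torch.autograd.Function):
    """Straight-through rounding (assumed behaviour, for illustration only)."""

    @staticmethod
    def forward(ctx, x):
        # Round to the nearest integer in the forward pass.
        return torch.round(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Treat rounding as the identity when propagating gradients.
        return grad_output
```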

class shok.utils.transforms.ScaleApplyPatch(scale=0.25, preserve_aspect_ratio=True)

Bases: Module

Applies a patch to an image at a scaled size.

This is useful for evaluating patch effectiveness.

forward(x: Tensor, patch: Tensor, y: Tensor = None) → Tensor

Applies a scaled patch to the input tensor x and optionally adjusts target annotations y.

Args:

x (torch.Tensor): The input tensor, typically an image of shape (C, H, W).

patch (torch.Tensor): The patch tensor to be applied to x.

y (dict, optional): Target annotations dictionary containing keys such as “boxes” and “labels”.

Returns:
Tuple[torch.Tensor, Optional[dict]]:
  • Modified input tensor with the patch applied.

  • Modified target annotations dictionary, if provided, with bounding boxes and labels adjusted to fit the new image dimensions.

Notes:
  • The patch is resized according to a fixed scale before being applied.

  • Bounding boxes in y are clamped to ensure they remain within the image boundaries.

  • If “boxes” or “labels” are missing in y, they are initialized as empty tensors.
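The clamping mentioned in the notes can be illustrated with a small helper. This is a sketch that assumes boxes are stored as (x1, y1, x2, y2) rows; the function name clamp_boxes_xyxy is hypothetical.

```python
import torch


def clamp_boxes_xyxy(boxes, height, width):
    """Clamp (N, 4) boxes in (x1, y1, x2, y2) layout to the image bounds."""
    boxes = boxes.clone()
    # Columns 0 and 2 hold x coordinates; columns 1 and 3 hold y coordinates.
    boxes[:, 0::2] = boxes[:, 0::2].clamp(0, width)
    boxes[:, 1::2] = boxes[:, 1::2].clamp(0, height)
    return boxes
```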

class shok.utils.transforms.ScaleGradTransform

Bases: Module

Transform that scales the gradient of the input tensor.

forward(x, y=None)

Scale the gradient of the input tensor.
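Gradient scaling of this kind is often written as an autograd function that is the identity in the forward pass and multiplies the gradient in the backward pass. The sketch below shows that general pattern; the class name ScaleGrad and the factor argument are hypothetical, since the documentation does not say how ScaleGradTransform chooses its scale.

```python
import torch


class ScaleGrad(torch.autograd.Function):
    """Identity in the forward pass; scales the gradient in the backward pass."""

    @staticmethod
    def forward(ctx, x, factor):
        ctx.factor = factor  # hypothetical constant multiplier
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # No gradient is returned for the non-tensor `factor` argument.
        return grad_output * ctx.factor, None
```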

class shok.utils.transforms.ScaleImageValues(min=0, max=255)

Bases: Module

Simple transform that scales the image values to be between 0 and 1.

The torchvision v2 transforms can also do this, but they appear to modify the labels unpredictably. This transform leaves the labels unchanged.

forward(x: Tensor, y=None) → Tensor

Scale the image values to be between 0 and 1.

Args:

x (torch.Tensor): Input image tensor.

y (torch.Tensor, optional): Target tensor, not modified in this transform.

Returns:

torch.Tensor: Scaled image tensor.
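A min-max rescaling consistent with the min and max constructor arguments might look like the following. The exact formula used by ScaleImageValues is not stated here, so this is an assumption, and the function name scale_image_values is hypothetical.

```python
import torch


def scale_image_values(x, min_value=0.0, max_value=255.0):
    """Map values from [min_value, max_value] to [0, 1] (assumed formula)."""
    return (x.to(torch.float32) - min_value) / (max_value - min_value)
```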

class shok.utils.transforms.SoftRound(*args, **kwargs)

Bases: Module

Transform to use the soft round function for adversarial training.

This approach is experimental. Since rounding is not differentiable, additional logic is needed to ensure gradients can flow through the operation.

The transform performs the rounding, then computes the multiplier that the rounding effectively applied to the input. That multiplier is then used to scale the gradient.

forward(x: Tensor, y=None) → Tensor

Applies soft rounding to the input tensor using the SoftRound function.

Args:

x (torch.Tensor): Input tensor to be processed.

y (optional): An optional secondary input, not used in the transformation.

Returns:

Tuple[torch.Tensor, Any]: A tuple containing the transformed tensor and the optional secondary input.
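One possible reading of the class description is sketched below: the forward pass rounds, records the elementwise ratio between the rounded and original values, and the backward pass multiplies the incoming gradient by that ratio. The class name SoftRoundSketch and the handling of zero inputs are assumptions; the actual SoftRound function may differ.

```python
import torch


class SoftRoundSketch(torch.autograd.Function):
    """Round in forward; scale the gradient by rounded / x in backward."""

    @staticmethod
    def forward(ctx, x):
        rounded = torch.round(x)
        # Factor such that rounded == x * factor; use 1 where x is zero
        # to avoid dividing by zero (an assumption of this sketch).
        is_zero = x == 0
        safe_x = torch.where(is_zero, torch.ones_like(x), x)
        factor = torch.where(is_zero, torch.ones_like(x), rounded / safe_x)
        ctx.save_for_backward(factor)
        return rounded

    @staticmethod
    def backward(ctx, grad_output):
        (factor,) = ctx.saved_tensors
        return grad_output * factor
```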

class shok.utils.transforms.TargetInsurance(*args, **kwargs)

Bases: Module

Transform that makes sure object detection targets are always present.

Sometimes the targets are missing from the dataset, which breaks some torchvision transforms.

forward(x: Tensor, y: Tensor) → Tensor

Ensures that the target dictionary y contains the keys “boxes” and “labels”.

If these keys are missing, initializes “boxes” with an empty tensor of shape (0, 4) and dtype float32, and “labels” with an empty tensor of dtype int64.

Args:

x (torch.Tensor): The input tensor.

y (dict): The target dictionary containing annotation data.

Returns:

Tuple[torch.Tensor, dict]: The input tensor and the updated target dictionary.
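The behaviour described above can be sketched in a few lines. The function name ensure_targets is hypothetical, and the (0,) shape for the empty labels tensor is an assumption, since only its dtype is documented.

```python
import torch


def ensure_targets(y):
    """Return a copy of y that always has "boxes" and "labels" entries."""
    y = dict(y) if y is not None else {}
    if "boxes" not in y:
        y["boxes"] = torch.zeros((0, 4), dtype=torch.float32)
    if "labels" not in y:
        # Shape (0,) is assumed; the documentation only fixes the dtype.
        y["labels"] = torch.zeros((0,), dtype=torch.int64)
    return y
```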

Submodules

shok.utils.transforms.apply_patch module

class shok.utils.transforms.apply_patch.ApplyPatch(scale_range: tuple[float, float] = (0.1, 0.4), location_range: tuple[float, float] = (0.0, 1.0), patch_crop_range: tuple[float, float] = (0.8, 1.0), rotation_probs: tuple[float, float, float, float] = (0.25, 0.25, 0.25, 0.25), flip_probability: float = 0.5)

Bases: Module

Module to apply a patch to an image.

forward(x: Tensor, patch: Tensor, y: Tensor = None) → Tensor

Forward method.

The patch is randomly rotated, resized, and placed at a location determined by a distribution. The function ensures the patch fits within the image boundaries and updates the target tensor y if provided.

Args:

x (torch.Tensor): The input image tensor of shape (…, H, W).

patch (torch.Tensor): The patch tensor to be inserted, typically of shape (…, h, w).

y (torch.Tensor, optional): Target tensor containing annotations (e.g., bounding boxes and labels). Defaults to None.

Returns:
Tuple[torch.Tensor, torch.Tensor]:
  • The transformed image tensor with the patch inserted.

  • The updated target tensor (if provided), otherwise None.

shok.utils.transforms.convert_to_tv_tensor_bboxes module

class shok.utils.transforms.convert_to_tv_tensor_bboxes.ConvertToTVTensorBBoxes(*args, **kwargs)

Bases: Module

Module to convert bounding boxes to torchvision tensors.

This is a simplified version that does not include transformations.

This is useful because some torchvision transforms require bounding boxes to be of type torchvision.tv_tensors.BoundingBoxes.

forward(x: Tensor, y: Tensor = None) → Tensor

Applies transformation to input tensor x and optionally processes bounding boxes in y.

Args:

x (torch.Tensor): Input tensor, typically representing an image or batch of images.

y (dict, optional): Target dictionary. If provided and it contains a “boxes” key, the bounding boxes are converted to a BoundingBoxes object in “xyxy” format with the same canvas size as x and dtype torch.float32.

Returns:

Tuple[torch.Tensor, dict]: The (possibly transformed) input tensor x and the updated target dictionary y.

shok.utils.transforms.pass_round module

class shok.utils.transforms.pass_round.PassRound(*args, **kwargs)

Bases: Module

A custom torch.nn.Module that applies a soft rounding operation to the input tensor.

Args:

x (torch.Tensor): The input tensor to be rounded.

y (optional): An optional secondary input, passed through unchanged.

Returns:

Tuple[torch.Tensor, Any]: A tuple containing the rounded tensor and the optional secondary input.

Note:

The actual rounding logic is implemented in functions.PassRound.apply.

forward(x: Tensor, y=None) → Tensor

Applies a placeholder soft rounding operation to the input tensor.

Args:

x (torch.Tensor): Input tensor to be processed.

y (optional): Additional input, currently unused.

Returns:

Tuple[torch.Tensor, Any]: A tuple containing the processed tensor and the second input (y).

shok.utils.transforms.scale_apply_patch module

class shok.utils.transforms.scale_apply_patch.ScaleApplyPatch(scale=0.25, preserve_aspect_ratio=True)

Bases: Module

Applies a patch to an image at a scaled size.

This is useful for evaluating patch effectiveness.

forward(x: Tensor, patch: Tensor, y: Tensor = None) → Tensor

Applies a scaled patch to the input tensor x and optionally adjusts target annotations y.

Args:

x (torch.Tensor): The input tensor, typically an image of shape (C, H, W).

patch (torch.Tensor): The patch tensor to be applied to x.

y (dict, optional): Target annotations dictionary containing keys such as “boxes” and “labels”.

Returns:
Tuple[torch.Tensor, Optional[dict]]:
  • Modified input tensor with the patch applied.

  • Modified target annotations dictionary, if provided, with bounding boxes and labels adjusted to fit the new image dimensions.

Notes:
  • The patch is resized according to a fixed scale before being applied.

  • Bounding boxes in y are clamped to ensure they remain within the image boundaries.

  • If “boxes” or “labels” are missing in y, they are initialized as empty tensors.

shok.utils.transforms.scale_grad_transform module

class shok.utils.transforms.scale_grad_transform.ScaleGradTransform

Bases: Module

Transform that scales the gradient of the input tensor.

forward(x, y=None)

Scale the gradient of the input tensor.

shok.utils.transforms.scale_image_values module

class shok.utils.transforms.scale_image_values.ScaleImageValues(min=0, max=255)

Bases: Module

Simple transform that scales the image values to be between 0 and 1.

The torchvision v2 transforms can also do this, but they appear to modify the labels unpredictably. This transform leaves the labels unchanged.

forward(x: Tensor, y=None) → Tensor

Scale the image values to be between 0 and 1.

Args:

x (torch.Tensor): Input image tensor.

y (torch.Tensor, optional): Target tensor, not modified in this transform.

Returns:

torch.Tensor: Scaled image tensor.

shok.utils.transforms.simple_apply_patch module

class shok.utils.transforms.simple_apply_patch.SimpleApplyPatch

Bases: Module

A very simple patch-applying transformation.

This is used for debugging and testing purposes.

forward(x: Tensor, patch: Tensor, y: Tensor = None) → Tensor

Forwards the input tensor x through the transformation.

Applies a patch to the input tensor x by replacing its leading channels and spatial dimensions with those from the patch tensor. Optionally returns a target tensor y.

Args:

x (torch.Tensor): The input tensor to be modified.

patch (torch.Tensor): The patch tensor to be inserted into x.

y (torch.Tensor, optional): An optional target tensor to be returned.

Returns:

Tuple[torch.Tensor, torch.Tensor]: A tuple containing the modified input tensor and the optional target tensor y.
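Replacing the leading channels and spatial positions, as described above, amounts to a single slice assignment. The sketch below assumes that the patch is written into the top-left corner and that the input is copied rather than modified in place; the function name simple_apply_patch is hypothetical.

```python
import torch


def simple_apply_patch(x, patch, y=None):
    """Overwrite the leading channels/rows/columns of x with `patch`."""
    out = x.clone()  # assumption: leave the caller's tensor untouched
    c, h, w = patch.shape[-3:]
    out[..., :c, :h, :w] = patch
    return out, y
```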

shok.utils.transforms.soft_round module

class shok.utils.transforms.soft_round.SoftRound(*args, **kwargs)

Bases: Module

Transform to use the soft round function for adversarial training.

This approach is experimental. Since rounding is not differentiable, additional logic is needed to ensure gradients can flow through the operation.

The transform performs the rounding, then computes the multiplier that the rounding effectively applied to the input. That multiplier is then used to scale the gradient.

forward(x: Tensor, y=None) → Tensor

Applies soft rounding to the input tensor using the SoftRound function.

Args:

x (torch.Tensor): Input tensor to be processed.

y (optional): An optional secondary input, not used in the transformation.

Returns:

Tuple[torch.Tensor, Any]: A tuple containing the transformed tensor and the optional secondary input.

shok.utils.transforms.target_insurance module

class shok.utils.transforms.target_insurance.TargetInsurance(*args, **kwargs)

Bases: Module

Transform that makes sure object detection targets are always present.

Sometimes the targets are missing from the dataset, which breaks some torchvision transforms.

forward(x: Tensor, y: Tensor) → Tensor

Ensures that the target dictionary y contains the keys “boxes” and “labels”.

If these keys are missing, initializes “boxes” with an empty tensor of shape (0, 4) and dtype float32, and “labels” with an empty tensor of dtype int64.

Args:

x (torch.Tensor): The input tensor.

y (dict): The target dictionary containing annotation data.

Returns:

Tuple[torch.Tensor, dict]: The input tensor and the updated target dictionary.

shok.utils.transforms.utils module

shok.utils.transforms.utils.default_patched_image_mutator()

Default image mutator for patching images.