s6.vision.camera¶

Camera model and image-space transforms for Sense Core.

This module defines a differentiable pinhole camera with optional lens distortion and a suite of utilities to transform points and images between world, camera, and pixel coordinate systems. It is written on top of PyTorch and supports batched inputs, autograd, and GPU execution where applicable.

Key concepts¶

  • Intrinsic matrix K: maps normalized camera coordinates to pixel space.

  • Extrinsic matrix E (world-to-camera): rigid transform of world points.

  • Distortion parameters: radial/tangential or fisheye polynomial models.

  • Coordinate transforms: transform/transform_inv, project, unproject for moving between spaces; warping helpers for points/images.

All point-like APIs accept tensors with an arbitrary leading batch shape and operate on the last dimension, e.g. (..., 3) for 3D and (..., 2) for 2D. Decorators in this module handle reshaping and type guarantees.

s6.vision.camera.transform_operator(input_dims, output_dims)

Decorator to broadcast point transforms over arbitrary batch shapes.

The wrapped function must accept a tensor of shape (N, input_dims) and return a tensor of shape (N, output_dims). This decorator allows callers to pass shapes like (..., input_dims). Inputs are flattened to 2D, passed to the wrapped function, then reshaped back to (..., output_dims).

Parameters:
  • input_dims (int) – Size of the trailing dimension expected by the wrapped function.

  • output_dims (int) – Size of the trailing dimension returned by the wrapped function.
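The flatten-then-reshape pattern described above can be sketched as follows. This is a minimal NumPy stand-in for the module's PyTorch implementation, and `drop_z` is a hypothetical example function used only for illustration:

```python
import numpy as np
from functools import wraps

def transform_operator(input_dims, output_dims):
    """Broadcast a (N, input_dims) -> (N, output_dims) function over (..., input_dims)."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(points):
            batch_shape = points.shape[:-1]        # arbitrary leading dims
            flat = points.reshape(-1, input_dims)  # flatten to (N, input_dims)
            out = fn(flat)                         # call the wrapped 2-D function
            return out.reshape(*batch_shape, output_dims)
        return wrapper
    return decorator

@transform_operator(3, 2)
def drop_z(points):
    # Hypothetical (N, 3) -> (N, 2) transform.
    return points[:, :2]

pts = np.zeros((4, 5, 3))
print(drop_z(pts).shape)  # (4, 5, 2)
```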

s6.vision.camera.ensure_tensor(shape: List[int] | None = None, dtype: dtype = torch.float32)

Decorator to coerce the first non-self argument to a torch.Tensor.

Accepts NumPy arrays, Python lists, or scalars and casts them to the specified dtype. Optionally validates the exact shape. Intended for helper functions that may be called with heterogeneous input types.

Parameters:
  • shape (list[int] | None) – If provided, require the input tensor to match this shape exactly.

  • dtype (torch.dtype) – Target dtype for the coerced tensor (default: torch.float32).

s6.vision.camera.meshgrid(height, width) Tensor

Create an (H, W, 2) grid of pixel centers for sampling operations.

The grid stores subpixel-centered coordinates (x+0.5, y+0.5) suitable for image sampling and warping.

Returns:

A tensor of shape (height, width, 2) containing the pixel coordinates.

Return type:

torch.Tensor
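The pixel-center convention can be illustrated with a small NumPy sketch (the module itself returns a torch.Tensor):

```python
import numpy as np

def meshgrid(height, width):
    # Pixel-center coordinates: (x + 0.5, y + 0.5) for each pixel (x, y).
    ys, xs = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    return np.stack([xs + 0.5, ys + 0.5], axis=-1)  # (H, W, 2), last dim is (x, y)

grid = meshgrid(2, 3)
print(grid.shape)  # (2, 3, 2)
print(grid[1, 2])  # [2.5 1.5] -- x first, then y
```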

class s6.vision.camera.Camera(intrinsic: Tensor | ndarray | None = None, extrinsic: Tensor | None = None, fov: float | None = None, distortion: Tensor | ndarray | None = None, resolution: Tuple[int, int] = (1024, 1024), requires_grad: bool = False, fisheye: bool = False, near: float = 0.01, far: float = 1000.0, dtype: dtype = torch.float32)

Bases: object

Pinhole camera with optional lens distortion and PyTorch ops.

The camera stores intrinsics, extrinsics (world-to-camera), and optional lens distortion parameters. Methods provide common transformations between world, camera, and pixel spaces; image/point warping; and look-at/tilting helpers for virtual camera control.

Most methods accept batched tensors and preserve leading dimensions.

create_intrinsic_matrix(intrinsic_params) Tensor

Create a 3x3 intrinsic matrix from the intrinsic parameters cx, cy, fx, fy, ensuring that the intrinsic matrix remains a valid intrinsic matrix after a gradient update.

property requires_grad: bool
property fisheye
property dtype
property near
property far
property cx: Tensor

Principal point x-coordinate (in pixels).

property cy: Tensor

Principal point y-coordinate (in pixels).

property fx: Tensor

Focal length along x (in pixels).

property fy: Tensor

Focal length along y (in pixels).

property resolution: Tuple[int, int]

Image resolution as (H, W).

property intrinsic: Tensor

3x3 intrinsic matrix K.

property intrinsic_inv: Tensor

Inverse of the 3x3 intrinsic matrix K^{-1}.

property distortion: Tensor

Distortion coefficient vector.

property extrinsic: Tensor

4x4 world-to-camera transform matrix E.

property extrinsic_inv: Tensor

Inverse of the extrinsic matrix (camera-to-world).

property projection_matrix: Tensor

3x4 projection matrix P = K [R|t].

property translation: Tensor

Camera translation vector in world coordinates (from E^{-1}).

property rotation_matrix: Tensor

3x3 rotation part of the extrinsic matrix (world-to-camera).

property vfov: Tensor

Return the vertical field of view in degrees.

property hfov: Tensor

Return the horizontal field of view in degrees.

property forward
classmethod calculate_intrinsic_matrix_fov(fov: float, resolution: Tuple[int, int]) Tuple[float, float, float, float]

Calculate the intrinsic parameters from a field of view and image resolution.

classmethod calculate_intrinsic_matrix_focal_length(focal_length: float, sensor_size: float, resolution: Tuple[int, int], vertical: bool = True) Tuple[float, float, float, float]

Calculate the intrinsic parameters from a focal length, sensor size, and image resolution.
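The underlying relation for the FOV variant can be sketched with the standard pinhole formula f = (size / 2) / tan(fov / 2). The exact conventions (whether the given FOV is vertical, square pixels, centered principal point, parameter order) are assumptions here, not guaranteed by the classmethods:

```python
import math

def intrinsics_from_fov(fov_deg, resolution):
    # Assumes the given FOV is vertical, pixels are square, and the
    # principal point sits at the image center (illustrative conventions).
    h, w = resolution
    f = (h / 2.0) / math.tan(math.radians(fov_deg) / 2.0)
    fx, fy = f, f
    cx, cy = w / 2.0, h / 2.0
    return fx, fy, cx, cy

# 90 degree vertical FOV at 480x640: tan(45 deg) = 1, so fy = 480 / 2 = 240.
fx, fy, cx, cy = intrinsics_from_fov(90.0, (480, 640))
print(fx, fy, cx, cy)  # 240.0 240.0 320.0 240.0
```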

static to_homogeneous(points: Tensor) Tensor

Convert points to homogeneous coordinates using PyTorch.

Parameters:

points – A torch.Tensor of points. The shape is (…, N).

Returns:

Points in homogeneous coordinates with shape (…, N+1).

static from_homogeneous(points_h: Tensor) Tensor

Convert points from homogeneous coordinates to Cartesian coordinates using PyTorch.

Parameters:

points_h – A torch.Tensor of points in homogeneous coordinates. The shape is (…, N+1).

Returns:

Points in Cartesian coordinates with shape (…, N).
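The two conversions are inverses up to the perspective divide; a self-contained NumPy sketch of the same operations:

```python
import numpy as np

def to_homogeneous(points):
    # Append a trailing 1: (..., N) -> (..., N+1).
    ones = np.ones((*points.shape[:-1], 1))
    return np.concatenate([points, ones], axis=-1)

def from_homogeneous(points_h):
    # Divide by the last coordinate and drop it: (..., N+1) -> (..., N).
    return points_h[..., :-1] / points_h[..., -1:]

print(to_homogeneous(np.array([1.0, 2.0])))    # [1. 2. 1.]
print(from_homogeneous(np.array([2.0, 4.0, 2.0])))  # [1. 2.]
```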

transform(points: Tensor) Tensor

Convert points from world coordinates to the camera’s view space using PyTorch tensors.

Parameters:

points – A torch.Tensor of points. The shape is (…, 3).

Returns:

Transformed points in the camera’s view space.

transform_inv(points: Tensor) Tensor

Convert points from the camera’s coordinate system to the world coordinate system.

Parameters:

points – A torch.Tensor of points in the camera’s coordinate system. The shape is (…, 3).

Returns:

Points in the world coordinate system.

project(points: Tensor) Tensor

Project points from the camera’s view space to the camera’s pixel space using PyTorch tensors.

Parameters:

points – A torch.Tensor of points. The shape is (…, 3).

Returns:

Projected points in the camera’s pixel space.

unproject(points: Tensor) Tensor

Unproject 2D points from the image plane to 3D space in normalized homogeneous coordinates using PyTorch tensors.

Parameters:

points – A torch.Tensor of 2D points. The shape is (…, 2).

Returns:

Points in 3D space as normalized homogeneous coordinates.
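The math behind project/unproject can be illustrated with a standalone NumPy sketch: projection applies the perspective divide and then the intrinsic matrix, while unprojection applies K^{-1} to get a ray at z = 1 (depth is not recoverable from a single view). The actual methods additionally handle batching, distortion, and autograd:

```python
import numpy as np

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

def project(K, points):
    # Perspective divide, then apply the intrinsic matrix.
    z = points[..., 2:3]
    uvw = (points / z) @ K.T
    return uvw[..., :2]

def unproject(K, pixels):
    # Back-project pixels to normalized homogeneous rays with z = 1.
    uv1 = np.concatenate([pixels, np.ones((*pixels.shape[:-1], 1))], axis=-1)
    return uv1 @ np.linalg.inv(K).T

p_cam = np.array([1.0, 2.0, 10.0])
uv = project(K, p_cam)
print(uv)               # [370. 340.]
ray = unproject(K, uv)
print(ray * p_cam[2])   # [ 1.  2. 10.] -- original point recovered at depth 10
```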

resize(new_resolution)

Change the camera’s resolution using PyTorch.

Parameters:

new_resolution – New resolution as a tuple (width, height).

Returns:

New Camera instance with updated resolution and adjusted intrinsic matrix.

zoom(new_fov: float, new_resolution: List[float])

Adjust the camera’s field of view using PyTorch.

Parameters:
  • new_fov (float) – New field of view in degrees.

  • new_resolution (list[float]) – New resolution for the zoomed camera.

Returns:

New Camera instance with updated intrinsic matrix.

classmethod from_physical_parameters(resolution, sensor_size, focal_length, **kwargs) Camera
distort_points(points: Tensor)

Apply non-fisheye distortion to 2D points using PyTorch.

undistort_points(points: Tensor, iterations=5)

Iteratively undistort points using the non-fisheye model.
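The iterative scheme is typically a fixed-point iteration: start from the distorted point and repeatedly divide out the distortion factor evaluated at the current estimate. The sketch below uses a radial-only model with hypothetical coefficients; the module's actual distortion model and coefficient layout may differ:

```python
def distort(p, k1, k2):
    # Radial-only sketch: p_d = p * (1 + k1*r^2 + k2*r^4).
    x, y = p
    r2 = x * x + y * y
    s = 1.0 + k1 * r2 + k2 * r2 * r2
    return (x * s, y * s)

def undistort(p_d, k1, k2, iterations=5):
    # Fixed-point iteration: divide the distorted point by the radial
    # factor evaluated at the current undistorted estimate.
    x, y = p_d
    for _ in range(iterations):
        r2 = x * x + y * y
        s = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = p_d[0] / s, p_d[1] / s
    return (x, y)

k1, k2 = -0.1, 0.01            # hypothetical coefficients
p = (0.3, 0.4)                 # normalized image coordinates
p_u = undistort(distort(p, k1, k2), k1, k2)
print(max(abs(p_u[0] - p[0]), abs(p_u[1] - p[1])) < 1e-6)  # True
```

For mild distortion the iteration converges quickly; five iterations are usually ample.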

distort_points_fisheye(points: Tensor) Tensor

Apply fisheye distortion to 2D points using PyTorch.

Parameters:

points (torch.Tensor) – Tensor of shape (…, 2), 2D points.

Returns:

Distorted 2D points.

Return type:

torch.Tensor

undistort_points_fisheye(points: Tensor) Tensor

Fisheye undistort 2D points using PyTorch.

Parameters:

points (torch.Tensor) – Tensor of shape (…, 2), 2D points.

Returns:

Undistorted 2D points.

Return type:

torch.Tensor

classmethod rotation_matrix_from_axis_angle(axis, angle)

Compute the rotation matrix from an axis and an angle (Rodrigues’ rotation formula).
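Rodrigues' formula is R = I + sin(θ) K + (1 − cos(θ)) K², where K is the skew-symmetric cross-product matrix of the unit axis. A self-contained NumPy sketch:

```python
import numpy as np

def rotation_matrix_from_axis_angle(axis, angle):
    # Rodrigues' formula: R = I + sin(t) K + (1 - cos(t)) K @ K,
    # with K the skew-symmetric matrix of the normalized axis.
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    x, y, z = axis
    K = np.array([[0.0, -z, y],
                  [z, 0.0, -x],
                  [-y, x, 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

# A 90 degree rotation about z maps the x-axis onto the y-axis.
R = rotation_matrix_from_axis_angle([0, 0, 1], np.pi / 2)
print(np.round(R @ np.array([1.0, 0.0, 0.0]), 6))  # [0. 1. 0.]
```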

tilt(angle: float) Camera

Tilt the Camera instance by rotating its extrinsic matrix about the local z-axis.

Parameters:

angle (float) – Tilt angle in radians.

Return type:

New Camera instance with the tilted extrinsic matrix.

look_at_uv(uv: Tensor) Camera

Adjust the camera to look at a 2D point or multiple points specified in image coordinates.

Parameters:

uv (torch.Tensor) – Target points of shape (…, 2), supports arbitrary batch shape.

Returns:

A new Camera instance looking at the specified point(s).

Return type:

Camera

eye() Camera

Return an identical copy of this Camera instance with the extrinsic matrix set to the identity.

warpPoints(points: Tensor, dest: Camera) Tensor

Warp points from this camera’s view to the destination camera’s view.

Parameters:
  • points (torch.Tensor) – Tensor of shape (…, 2), 2D points.

  • dest (Camera) – Destination camera with no relative translation.

Returns:

Warped 2D points in the destination camera’s view.

Return type:

torch.Tensor
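When the two cameras share a center (no relative translation, as required here), pixel correspondence reduces to a homography H = K_dst · R_rel · K_src^{-1}. A NumPy sketch of that mapping; `warp_points` is a hypothetical standalone helper, not the method itself:

```python
import numpy as np

def warp_points(points, K_src, K_dst, R_rel):
    # For cameras sharing a center, pixels map through the homography
    # H = K_dst @ R_rel @ inv(K_src).
    H = K_dst @ R_rel @ np.linalg.inv(K_src)
    uv1 = np.concatenate([points, np.ones((*points.shape[:-1], 1))], axis=-1)
    out = uv1 @ H.T
    return out[..., :2] / out[..., 2:3]  # perspective divide back to pixels

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
pts = np.array([[100.0, 50.0], [320.0, 240.0]])
# Identity rotation and equal intrinsics: points are unchanged.
print(warp_points(pts, K, K, np.eye(3)))
```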

warpImage(image: Tensor, dest: Camera, border_mode='zeros') Tensor

Apply a perspective warp from this camera to the destination camera to a batch of images.

Parameters:
  • image (torch.Tensor) – Input image tensor.

  • dest (Camera) – Destination camera with no relative translation.

  • border_mode (str, optional) – Border mode used when sampling outside the source image (default: ‘zeros’).

Returns:

Warped image.

Return type:

torch.Tensor

checkerboard(grid_size)

Generate a checkerboard pattern as a numpy array.

Parameters:
  • grid_size (int) – The size of each square in the grid, in pixels.

Returns:

The checkerboard pattern as a 2D numpy array.

Return type:

numpy.ndarray

checkerboard_with_aruco(grid_size, aruco_dict=None)

Generate a checkerboard pattern with ArUco markers in the black tiles.

Parameters:
  • grid_size (int) – The size of each square in the grid, in pixels.

  • aruco_dict (cv2.aruco_Dictionary) – The dictionary of ArUco markers to use.

Returns:

The checkerboard pattern with ArUco markers as a 2D numpy array.

Return type:

numpy.ndarray

to_dict() Dict[str, Any]

Serialize the Camera instance to a dictionary.

Returns:

A dictionary containing all serializable attributes of the Camera.

Return type:

dict

classmethod from_dict(dict_obj: Dict[str, Any]) Camera

Deserialize a dictionary to create a Camera instance.

Parameters:

dict_obj (dict) – A dictionary containing serialized Camera attributes.

Returns:

A new instance of Camera initialized with the provided attributes.

Return type:

Camera

look_at(target: Tensor, up: Tensor = tensor([0., 1., 0.])) Camera

Create a rotated Camera instance facing the new view target with a specified up direction.

Parameters:
  • target (torch.Tensor) – Target point(s) with batch-compatible shape (…, 3).

  • up (torch.Tensor, optional) – Up direction vector of shape (3,); defaults to (0, 1, 0).

Return type:

New Camera instance

normalize(points: Tensor) Tensor

Normalize pixel coordinates to the [0, 1) range.

Parameters:

points (Tensor) – Tensor of shape (…, 2), where the last dimension represents (x, y) coordinates.

Returns:

Normalized coordinates in the range [0, 1).

Return type:

Tensor

denormalize(points: Tensor) Tensor

Denormalize normalized coordinates back to pixel coordinates.

Parameters:

points (Tensor) – Tensor of shape (…, 2), where the last dimension represents normalized (x, y) coordinates.

Returns:

Pixel coordinates.

Return type:

Tensor