s6.vision.camera
Camera model and image-space transforms for Sense Core.
This module defines a differentiable pinhole camera with optional lens distortion and a suite of utilities to transform points and images between world, camera, and pixel coordinate systems. It is written on top of PyTorch and supports batched inputs, autograd, and GPU execution where applicable.
Key concepts
Intrinsic matrix K: maps normalized camera coordinates to pixel space.
Extrinsic matrix E (world-to-camera): rigid transform of world points.
Distortion parameters: radial/tangential or fisheye polynomial models.
Coordinate transforms: transform/transform_inv, project, unproject for moving between spaces; warping helpers for points/images.
All point-like APIs accept tensors with an arbitrary leading batch shape
and operate on the last dimension, e.g. (..., 3) for 3D and (..., 2)
for 2D. Decorators in this module handle reshaping and type guarantees.
- s6.vision.camera.transform_operator(input_dims, output_dims)
Decorator to broadcast point transforms over arbitrary batch shapes.
The wrapped function must accept a tensor of shape (N, input_dims) and return a tensor of shape (N, output_dims). This decorator allows callers to pass shapes like (..., input_dims). Inputs are flattened to 2D, passed to the wrapped function, then reshaped back to (..., output_dims).
- Parameters:
input_dims (int) – Size of the trailing dimension expected by the wrapped function.
output_dims (int) – Size of the trailing dimension returned by the wrapped function.
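The flatten, call, reshape pattern described above can be sketched as follows. This is a minimal illustration, not the module's implementation: `batched_transform` and `drop_z` are hypothetical names, and the sketch only handles free functions whose first argument is the point tensor.

```python
import functools
import torch


def batched_transform(input_dims, output_dims):
    """Hypothetical stand-in for transform_operator (free functions only)."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(points, *args, **kwargs):
            if points.shape[-1] != input_dims:
                raise ValueError(
                    f"expected trailing dimension {input_dims}, got {points.shape[-1]}"
                )
            batch_shape = points.shape[:-1]
            flat = points.reshape(-1, input_dims)   # (N, input_dims)
            out = fn(flat, *args, **kwargs)         # (N, output_dims)
            return out.reshape(*batch_shape, output_dims)
        return wrapper
    return decorator


@batched_transform(3, 2)
def drop_z(points):
    # Toy transform written for 2D input only: keep x and y, discard z.
    return points[:, :2]


pts = torch.zeros(4, 5, 3)
out = drop_z(pts)  # leading shape (4, 5) is preserved, trailing dim becomes 2
```

The wrapped function only ever sees a 2D tensor, so it can be written without any batch handling of its own.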
- s6.vision.camera.ensure_tensor(shape: List[int] | None = None, dtype: dtype = torch.float32)
Decorator to coerce the first non-self argument to a torch.Tensor.
Accepts NumPy arrays, Python lists, or scalars and converts them to the specified dtype. Optionally validates the exact shape. Intended for helper functions that may be called with heterogeneous input types.
- Parameters:
shape (list[int] | None) – If provided, require the input tensor to match this shape exactly.
dtype (torch.dtype) – Target dtype for the coerced tensor (default: torch.float32).
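A coercing decorator of this kind might look roughly like the sketch below. `coerce_first_arg` and `norm` are hypothetical names; unlike the documented decorator, this version does not skip a leading `self`, so it applies to plain functions only.

```python
import functools
import torch


def coerce_first_arg(shape=None, dtype=torch.float32):
    """Hypothetical stand-in for ensure_tensor (no handling of `self`)."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(value, *args, **kwargs):
            # torch.as_tensor accepts lists, scalars, NumPy arrays and tensors.
            tensor = torch.as_tensor(value, dtype=dtype)
            if shape is not None and list(tensor.shape) != list(shape):
                raise ValueError(
                    f"expected shape {list(shape)}, got {list(tensor.shape)}"
                )
            return fn(tensor, *args, **kwargs)
        return wrapper
    return decorator


@coerce_first_arg(shape=[3])
def norm(vec):
    return torch.linalg.norm(vec)


length = norm([3, 4, 0])  # a Python list of ints is coerced to a float32 tensor
```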
- s6.vision.camera.meshgrid(height, width) Tensor
Create an (H, W, 2) grid of pixel centers for sampling operations.
The grid stores subpixel-centered coordinates (x+0.5, y+0.5) suitable for image sampling and warping.
- Returns:
A tensor of shape (height, width, 2) containing the pixel coordinates.
- Return type:
torch.Tensor
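For illustration, a grid with this layout can be built as shown here. `pixel_center_grid` is a hypothetical name, and the `(x, y)` ordering of the last dimension is taken from the `(x+0.5, y+0.5)` description above.

```python
import torch


def pixel_center_grid(height, width):
    """Return an (H, W, 2) tensor whose last dimension is (x + 0.5, y + 0.5)."""
    ys = torch.arange(height, dtype=torch.float32) + 0.5
    xs = torch.arange(width, dtype=torch.float32) + 0.5
    # indexing="ij" gives row-major (H, W) outputs: rows follow y, columns follow x.
    grid_y, grid_x = torch.meshgrid(ys, xs, indexing="ij")
    return torch.stack([grid_x, grid_y], dim=-1)


grid = pixel_center_grid(2, 3)
top_left = grid[0, 0]       # center of the first pixel
bottom_right = grid[1, 2]   # center of the last pixel in a 2 x 3 image
```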
- class s6.vision.camera.Camera(intrinsic: Tensor | ndarray | None = None, extrinsic: Tensor | None = None, fov: float | None = None, distortion: Tensor | ndarray | None = None, resolution: Tuple[int, int] = (1024, 1024), requires_grad: bool = False, fisheye: bool = False, near: float = 0.01, far: float = 1000.0, dtype: dtype = torch.float32)
Bases: object
Pinhole camera with optional lens distortion and PyTorch ops.
The camera stores intrinsics, extrinsics (world-to-camera), and optional lens distortion parameters. Methods provide common transformations between world, camera, and pixel spaces; image/point warping; and look-at/tilting helpers for virtual camera control.
Most methods accept batched tensors and preserve leading dimensions.
- create_intrinsic_matrix(intrinsic_params) Tensor
Create a 3x3 intrinsic matrix from the intrinsic parameters cx, cy, fx, fy. Rebuilding K from these four scalars ensures that it keeps the structure of an intrinsic matrix after gradient updates.
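The idea of keeping K well-formed under optimization can be sketched like this. `build_intrinsic` is a hypothetical helper, and the `(cx, cy, fx, fy)` parameter order is assumed from the order in which the description lists them.

```python
import torch


def build_intrinsic(params):
    """Assemble K from a 4-vector (cx, cy, fx, fy); the order is an assumption."""
    cx, cy, fx, fy = params
    zero = torch.zeros((), dtype=params.dtype)
    one = torch.ones((), dtype=params.dtype)
    return torch.stack([
        torch.stack([fx, zero, cx]),
        torch.stack([zero, fy, cy]),
        torch.stack([zero, zero, one]),
    ])


params = torch.tensor([512.0, 512.0, 800.0, 800.0], requires_grad=True)
K = build_intrinsic(params)
K.sum().backward()  # gradients reach only the four free parameters
```

Because the zeros and the trailing one are constants rather than learnable entries, an optimizer step on `params` cannot introduce skew or change the bottom row.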
- property requires_grad: bool
- property fisheye
- property dtype
- property near
- property far
- property cx: Tensor
Principal point x-coordinate (in pixels).
- property cy: Tensor
Principal point y-coordinate (in pixels).
- property fx: Tensor
Focal length along x (in pixels).
- property fy: Tensor
Focal length along y (in pixels).
- property resolution: Tuple[int, int]
Image resolution as (H, W).
- property intrinsic: Tensor
3x3 intrinsic matrix K.
- property intrinsic_inv: Tensor
Inverse of the 3x3 intrinsic matrix, K^{-1}.
- property distortion: Tensor
Distortion coefficient vector.
- property extrinsic: Tensor
4x4 world-to-camera transform matrix E.
- property extrinsic_inv: Tensor
Inverse of the extrinsic matrix (camera-to-world).
- property projection_matrix: Tensor
3x4 projection matrix P = K [R|t].
- property translation: Tensor
Camera translation vector in world coordinates (from E^{-1}).
- property rotation_matrix: Tensor
3x3 rotation part of the extrinsic matrix (world-to-camera).
- property vfov: Tensor
Return the vertical field of view in degrees.
- property hfov: Tensor
Return the horizontal field of view in degrees.
- property forward
- classmethod calculate_intrinsic_matrix_fov(fov: float, resolution: Tuple[int, int]) Tuple[float, float, float, float]
Calculate the intrinsic matrix using field of view.
- classmethod calculate_intrinsic_matrix_focal_length(focal_length: float, sensor_size: float, resolution: Tuple[int, int], vertical: bool = True) Tuple[float, float, float, float]
Calculate the intrinsic matrix using focal length and sensor size.
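The two class methods above correspond to standard pinhole relations, sketched below in plain Python. The function names are hypothetical, and several details are assumptions rather than documented behavior: the field of view is treated as vertical, `resolution` is read as `(H, W)`, the principal point is placed at the image center, and the result is returned in `(cx, cy, fx, fy)` order.

```python
import math


def intrinsics_from_fov(fov_deg, resolution):
    """Focal length in pixels from a (assumed vertical) field of view in degrees."""
    height, width = resolution
    focal_px = (height / 2.0) / math.tan(math.radians(fov_deg) / 2.0)
    return width / 2.0, height / 2.0, focal_px, focal_px


def intrinsics_from_focal_length(focal_length, sensor_size, resolution, vertical=True):
    """Focal length in pixels from a physical focal length and sensor size (same units)."""
    height, width = resolution
    pixels = height if vertical else width
    focal_px = focal_length * pixels / sensor_size
    return width / 2.0, height / 2.0, focal_px, focal_px


cx, cy, fx, fy = intrinsics_from_fov(90.0, (1024, 1024))
params = intrinsics_from_focal_length(12.0, 24.0, (1024, 1024))
```

A 90 degree field of view and a 12 mm lens on a 24 mm sensor describe the same geometry, so both calls give a focal length of 512 pixels for a 1024 pixel image height.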
- static to_homogeneous(points: Tensor) Tensor
Convert points to homogeneous coordinates using PyTorch.
- Parameters:
points – A torch.Tensor of points. The shape is (…, N).
- Returns:
Points in homogeneous coordinates with shape (…, N+1).
- static from_homogeneous(points_h: Tensor) Tensor
Convert points from homogeneous coordinates to Cartesian coordinates using PyTorch.
- Parameters:
points_h – A torch.Tensor of points in homogeneous coordinates. The shape is (…, N+1).
- Returns:
Points in Cartesian coordinates with shape (…, N).
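As a point of reference, the two conversions amount to appending a 1 and dividing by the last component. The sketch below uses standalone functions with the same names for readability; it is not the module's code.

```python
import torch


def to_homogeneous(points):
    """(..., N) -> (..., N + 1) by appending a trailing 1."""
    ones = torch.ones_like(points[..., :1])
    return torch.cat([points, ones], dim=-1)


def from_homogeneous(points_h):
    """(..., N + 1) -> (..., N) by dividing by the last component."""
    return points_h[..., :-1] / points_h[..., -1:]


pts = torch.tensor([[2.0, 4.0], [6.0, 8.0]])
# Scaling a homogeneous vector does not change the Cartesian point it represents.
round_trip = from_homogeneous(2.0 * to_homogeneous(pts))
```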
- transform(points: Tensor) Tensor
Convert points from world coordinates to the camera’s view space using PyTorch tensors.
- Parameters:
points – A torch.Tensor of points. The shape is (…, 3).
- Returns:
Transformed points in the camera’s view space.
- transform_inv(points: Tensor) Tensor
Convert points from the camera’s coordinate system to the world coordinate system.
- Parameters:
points – A torch.Tensor of points in the camera’s coordinate system. The shape is (…, 3).
- Returns:
Points in the world coordinate system.
- project(points: Tensor) Tensor
Project points from the camera’s view space to the camera’s pixel space using PyTorch tensors.
- Parameters:
points – A torch.Tensor of points. The shape is (…, 3).
- Returns:
Projected points in the camera’s pixel space.
- unproject(points: Tensor) Tensor
Unproject 2D points from the image plane to 3D space in normalized homogeneous coordinates using PyTorch tensors.
- Parameters:
points – A torch.Tensor of 2D points. The shape is (…, 2).
- Returns:
Points in 3D space as normalized homogeneous coordinates.
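The four methods above chain into a world, camera, pixel round trip. The following sketch shows that chain for an undistorted pinhole model using free functions; the function names and the values of `K` and `E` are illustrative, and "normalized homogeneous coordinates" is assumed to mean rays of the form `(x, y, 1)`.

```python
import torch

K = torch.tensor([[800.0, 0.0, 512.0],
                  [0.0, 800.0, 512.0],
                  [0.0, 0.0, 1.0]])
E = torch.eye(4)
E[2, 3] = 2.0  # world origin sits 2 units in front of the camera


def world_to_camera(points, E):
    # x_cam = R x_world + t, written for row vectors of shape (..., 3).
    return points @ E[:3, :3].T + E[:3, 3]


def camera_to_world(points, E):
    # x_world = R^T (x_cam - t)
    return (points - E[:3, 3]) @ E[:3, :3]


def project(points_cam, K):
    uvw = points_cam @ K.T
    return uvw[..., :2] / uvw[..., 2:3]  # perspective divide


def unproject(pixels, K):
    ones = torch.ones_like(pixels[..., :1])
    return torch.cat([pixels, ones], dim=-1) @ torch.linalg.inv(K).T  # (x, y, 1) rays


world = torch.tensor([[0.0, 0.0, 0.0], [0.5, -0.25, 2.0]])
cam = world_to_camera(world, E)
pix = project(cam, K)
rays = unproject(pix, K)
```

Multiplying each ray by the depth of the original point recovers the camera-space point, which is why unprojection alone cannot restore 3D positions without depth.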
- resize(new_resolution)
Change the camera’s resolution using PyTorch.
- Parameters:
new_resolution – New resolution as a tuple (width, height).
- Returns:
New Camera instance with updated resolution and adjusted intrinsic matrix.
- zoom(new_fov: float, new_resolution: List[float])
Adjust the camera’s field of view using PyTorch.
- Parameters:
new_fov (float) – New field of view in degrees.
new_resolution (List[float]) – New resolution.
- Returns:
New Camera instance with updated intrinsic matrix.
- classmethod from_physical_parameters(resolution, sensor_size, focal_length, **kwargs) Camera
- distort_points(points: Tensor)
Apply non-fisheye distortion to 2D points using PyTorch.
- undistort_points(points: Tensor, iterations=5)
Iteratively undistort points using the non-fisheye model.
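A radial/tangential model and its iterative inverse can be sketched as below. This is an illustration under stated assumptions, not the module's code: it uses the common Brown-Conrady form with coefficients ordered `(k1, k2, p1, p2, k3)`, operates on normalized image coordinates, and inverts the model with a plain fixed-point update.

```python
import torch


def distort(xy, coeffs):
    """Apply radial/tangential distortion to (..., 2) normalized coordinates."""
    k1, k2, p1, p2, k3 = coeffs
    x, y = xy[..., 0], xy[..., 1]
    r2 = x * x + y * y
    radial = 1 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    dx = 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
    dy = p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
    return torch.stack([x * radial + dx, y * radial + dy], dim=-1)


def undistort(xy_distorted, coeffs, iterations=5):
    """Invert distort() by fixed-point iteration, starting from the distorted point."""
    xy = xy_distorted.clone()
    for _ in range(iterations):
        # Move the estimate by the residual between its distortion and the target.
        xy = xy - (distort(xy, coeffs) - xy_distorted)
    return xy


coeffs = (-0.1, 0.01, 0.001, 0.001, 0.0)
pts = torch.tensor([[0.3, -0.2], [0.0, 0.1]])
recovered = undistort(distort(pts, coeffs), coeffs)
```

The iteration converges quickly when the distortion is mild, which is consistent with a small default iteration count; strongly distorted points near the image border may need more steps.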
- distort_points_fisheye(points: Tensor) Tensor
Apply fisheye distortion to 2D points using PyTorch.
- Parameters:
points (torch.Tensor) – Tensor of shape (*batch_shape, …, 2), 2D points.
- Returns:
Distorted 2D points.
- Return type:
torch.Tensor
- undistort_points_fisheye(points: Tensor) Tensor
Fisheye undistort 2D points using PyTorch.
- Parameters:
points (torch.Tensor) – Tensor of shape (*batch_shape, …, 2), 2D points.
- Returns:
Undistorted 2D points.
- Return type:
torch.Tensor
- classmethod rotation_matrix_from_axis_angle(axis, angle)
Compute the rotation matrix from an axis and an angle using Rodrigues’ rotation formula.
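Rodrigues' formula is R = I + sin(θ) S + (1 − cos(θ)) S², where S is the skew-symmetric matrix of the unit axis. A minimal sketch follows; `rotation_from_axis_angle` is a hypothetical name, the angle is assumed to be a Python float in radians, and this version does not track gradients through the axis.

```python
import math
import torch


def rotation_from_axis_angle(axis, angle):
    """Rotation matrix for a rotation of `angle` radians about `axis`."""
    axis = torch.as_tensor(axis, dtype=torch.float32)
    axis = axis / torch.linalg.norm(axis)  # the formula requires a unit axis
    x, y, z = axis.tolist()
    skew = torch.tensor([[0.0, -z, y],
                         [z, 0.0, -x],
                         [-y, x, 0.0]])
    return (torch.eye(3)
            + math.sin(angle) * skew
            + (1.0 - math.cos(angle)) * (skew @ skew))


R = rotation_from_axis_angle([0.0, 0.0, 1.0], math.pi / 2)
rotated = R @ torch.tensor([1.0, 0.0, 0.0])  # a quarter turn about z maps x onto y
```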
- tilt(angle: float) Camera
Tilt the Camera instance by rotating its extrinsic matrix about the local z-axis.
- Parameters:
angle (float) – Tilt angle in radians.
- Returns:
New Camera instance with the tilted extrinsic matrix.
- look_at_uv(uv: Tensor) Camera
Adjust the camera to look at a 2D point or multiple points specified in image coordinates.
- Parameters:
uv (torch.Tensor) – Target points of shape (…, 2), supports arbitrary batch shape.
- Returns:
A new Camera instance looking at the specified point(s).
- Return type:
Camera
- eye() Camera
Return an identical copy of this Camera instance with the extrinsic set to identity.
- warpPoints(points: Tensor, dest: Camera) Tensor
Warp points from this camera’s view to the destination camera’s view.
- Parameters:
points (torch.Tensor) – Tensor of shape (*batch_shape, …, 2), 2D points.
dest (Camera) – Destination camera with no relative translation.
- Returns:
Warped 2D points in the destination camera’s view.
- Return type:
torch.Tensor
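When two cameras share a center, as the "no relative translation" requirement implies, pixels map between them through the homography H = K_dst R_dst R_srcᵀ K_src⁻¹. The sketch below illustrates that relation with a hypothetical free function; it ignores lens distortion, which the real method may handle.

```python
import torch


def warp_pixels(pixels, K_src, R_src, K_dst, R_dst):
    """Map (..., 2) pixels between two cameras that differ only by rotation."""
    H = K_dst @ R_dst @ R_src.T @ torch.linalg.inv(K_src)
    ones = torch.ones_like(pixels[..., :1])
    warped = torch.cat([pixels, ones], dim=-1) @ H.T
    return warped[..., :2] / warped[..., 2:3]


K = torch.tensor([[800.0, 0.0, 512.0],
                  [0.0, 800.0, 512.0],
                  [0.0, 0.0, 1.0]])
R_src = torch.eye(3)
# Destination camera rolled by a quarter turn about the optical axis.
R_dst = torch.tensor([[0.0, -1.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [0.0, 0.0, 1.0]])

src_pixels = torch.tensor([[612.0, 512.0]])
dst_pixels = warp_pixels(src_pixels, K, R_src, K, R_dst)
```

A point 100 pixels to the right of the principal point in the source view lands 100 pixels below it in the rolled view, and the principal point itself stays fixed.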
- warpImage(image: Tensor, dest: Camera, border_mode='zeros') Tensor
Apply a perspective warp from this camera to the destination camera on a batch of images.
- Parameters:
image (torch.Tensor) – Input image tensor.
dest (Camera) – Destination camera with no relative translation.
border_mode (str, optional) – Border mode for interpolation (‘constant’, ‘nearest’, ‘reflect’, or ‘wrap’); defaults to ‘zeros’.
- Returns:
Warped image.
- Return type:
torch.Tensor
- checkerboard(grid_size)
Generate a checkerboard pattern as a numpy array.
- Parameters:
resolution (tuple) – The resolution of the image (height, width).
grid_size (int) – The size of each square in the grid.
- Returns:
The checkerboard pattern as a 2D numpy array.
- Return type:
numpy.ndarray
- checkerboard_with_aruco(grid_size, aruco_dict=None)
Generate a checkerboard pattern with ArUco markers in the black tiles.
- Parameters:
resolution (tuple) – The resolution of the image (height, width).
grid_size (int) – The size of each square in the grid.
aruco_dict (cv2.aruco_Dictionary) – The dictionary of ArUco markers to use.
- Returns:
The checkerboard pattern with ArUco markers as a 2D numpy array.
- Return type:
numpy.ndarray
- to_dict() Dict[str, Any]
Serialize the Camera instance to a dictionary.
- Returns:
A dictionary containing all serializable attributes of the Camera.
- Return type:
dict
- classmethod from_dict(dict_obj: Dict[str, Any]) Camera
Deserialize a dictionary to create a Camera instance.
- Parameters:
dict_obj (dict) – A dictionary containing serialized Camera attributes.
- Returns:
A new instance of Camera initialized with the provided attributes.
- Return type:
Camera
- look_at(target: Tensor, up: Tensor = tensor([0., 1., 0.])) Camera
Create a rotated Camera instance facing the new view target with a specified up direction.
- Parameters:
target (torch.Tensor) – Target points with compatible batch shape (…, 3)
up (torch.Tensor, optional) – Up direction vector with a shape of (3,), defaults to (0, 1, 0)
- Returns:
New Camera instance.
- normalize(points: Tensor) Tensor
Normalize pixel coordinates to the [0, 1) range.
- Parameters:
points (Tensor) – Tensor of shape (…, 2), where the last dimension represents (x, y) coordinates.
- Returns:
Normalized coordinates in the range [0, 1).
- Return type:
Tensor
- denormalize(points: Tensor) Tensor
Denormalize normalized coordinates back to pixel coordinates.
- Parameters:
points (Tensor) – Tensor of shape (…, 2), where the last dimension represents normalized (x, y) coordinates.
- Returns:
Pixel coordinates.
- Return type:
Tensor
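One plausible reading of normalize and denormalize is a per-axis division and multiplication by the image size, sketched below. This is an assumption: the free functions take the resolution explicitly as `(H, W)`, and the real methods may use a different convention (for example, an offset for pixel centers).

```python
import torch


def normalize(points, resolution):
    """Scale (..., 2) pixel coordinates (x, y) by (W, H) into the unit square."""
    height, width = resolution
    return points / torch.tensor([width, height], dtype=points.dtype)


def denormalize(points, resolution):
    """Inverse of normalize(): scale unit-square coordinates back to pixels."""
    height, width = resolution
    return points * torch.tensor([width, height], dtype=points.dtype)


resolution = (480, 640)  # (H, W)
pixels = torch.tensor([[320.0, 240.0], [0.0, 120.0]])
unit = normalize(pixels, resolution)
restored = denormalize(unit, resolution)
```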