s6.vision.solver

Minimal geometric solvers for triangulation, lines, tip estimation, and pose recovery.

Provides pure-geometry routines that operate on Camera and schema primitives such as Vector2D, Vector3D, LineSegment2D, and LineSegment3D. Implementations are built on PyTorch and reuse the camera utilities' device/dtype management.

class s6.vision.solver.Solver

Bases: object

classmethod project_search_region(camera: Camera, center_world: Vector3D, radius_m: float) Tuple[Vector2D, float, float] | None

Project a constant-size 3D spherical search region to the image.

Given a world-space center and a 3D radius in the same world units as the camera extrinsics, computes the 2D pixel center and the local pixel half-extents along image x and y by projecting small offsets in the camera frame using the camera’s projection model (including distortion when enabled).

Parameters:
  • camera (Camera) – Calibrated camera instance.

  • center_world (Vector3D) – Region center in world coordinates.

  • radius_m (float) – Search sphere radius in world units matching the camera extrinsics.

Returns:

Tuple of (center_px, r_px_x, r_px_y). Returns None if the projection fails or depth is invalid.

Return type:

(Vector2D, float, float) | None
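
The half-extent computation can be illustrated with a standalone pinhole sketch. This is plain NumPy with no distortion and identity extrinsics (the region center is given directly in the camera frame); the function name and the simplified Camera-free signature are assumptions for illustration, not the library API:

```python
import numpy as np

def project_search_region_sketch(K, center_cam, radius_m):
    """Illustrative pinhole-only sketch (no distortion, identity extrinsics).

    K          -- 3x3 intrinsic matrix
    center_cam -- (3,) region center expressed in the camera frame
    radius_m   -- sphere radius in the same units as center_cam
    """
    if center_cam[2] <= 0:       # invalid depth: center behind the camera
        return None
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]

    def project(p):
        return np.array([fx * p[0] / p[2] + cx, fy * p[1] / p[2] + cy])

    center_px = project(center_cam)
    # Offset the center by the radius along the camera x/y axes and measure
    # the resulting pixel displacement -> local pixel half-extents.
    r_px_x = abs(project(center_cam + np.array([radius_m, 0.0, 0.0]))[0] - center_px[0])
    r_px_y = abs(project(center_cam + np.array([0.0, radius_m, 0.0]))[1] - center_px[1])
    return center_px, r_px_x, r_px_y
```

For example, with fx = fy = 500 and a sphere of radius 0.1 at depth 2, each half-extent is 500 · 0.1 / 2 = 25 px.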

classmethod triangulate_points(cam_0: Camera, cam_1: Camera, points_0: Tensor, points_1: Tensor, return_valid: bool = False) Tensor | tuple[Tensor, Tensor]

Triangulate batched world points from two calibrated image views.

Parameters:
  • cam_0 (Camera) – First calibrated camera; both cameras must share a common world frame.

  • cam_1 (Camera) – Second calibrated camera; both cameras must share a common world frame.

  • points_0 (Tensor) – Pixel coordinates observed by cam_0, of shape (..., 2).

  • points_1 (Tensor) – Pixel coordinates observed by cam_1, of shape (..., 2).

  • return_valid (bool, optional) – If True, also return a boolean mask of shape (...) indicating non-degenerate ray intersections.

Returns:

Triangulated world points of shape (..., 3) and, optionally, the valid triangulation mask. Degenerate rays produce zero points.

Return type:

Tensor or tuple[Tensor, Tensor]

classmethod triangulate(cam_0: Camera, cam_1: Camera, point_0: Vector2D, point_1: Vector2D) Vector3D

Triangulate a 3D point from two calibrated camera observations.

The method unprojects the observed pixel coordinates into 3D rays through each camera center, transforms the rays into world space, and computes the closest points between the two skew lines. The midpoint of the shortest segment connecting the rays is returned.

Parameters:
  • cam_0 (Camera) – First calibrated camera providing intrinsics and extrinsics.

  • cam_1 (Camera) – Second calibrated camera providing intrinsics and extrinsics.

  • point_0 (Vector2D) – Pixel observation in cam_0's image plane.

  • point_1 (Vector2D) – Pixel observation in cam_1's image plane.

Returns:

Estimated 3D point in world coordinates. If the rays are nearly parallel, a zero vector is returned as a conservative fallback.

Return type:

Vector3D
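
The closest-point/midpoint step can be sketched in closed form. This is a standalone NumPy illustration operating directly on world-space rays (the function name and signature are assumptions; unprojection and the Camera type are omitted):

```python
import numpy as np

def triangulate_midpoint_sketch(p0, d0, p1, d1, eps=1e-9):
    """Midpoint of the shortest segment between two (possibly skew) rays.

    p0, p1 -- ray origins (camera centers in world space)
    d0, d1 -- world-space ray directions through the observed pixels
    Returns the midpoint, or the zero vector for near-parallel rays,
    mirroring the conservative fallback described above.
    """
    w = p0 - p1
    a, b, c = d0 @ d0, d0 @ d1, d1 @ d1
    d, e = d0 @ w, d1 @ w
    denom = a * c - b * b          # ~0 when the rays are near-parallel
    if abs(denom) < eps:
        return np.zeros(3)
    s = (b * e - c * d) / denom    # closest-approach parameter along ray 0
    t = (a * e - b * d) / denom    # closest-approach parameter along ray 1
    return 0.5 * ((p0 + s * d0) + (p1 + t * d1))
```

When the rays actually intersect, the two closest points coincide and the midpoint is the exact intersection.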

classmethod triangulate_line_segment(cam_0: Camera, cam_1: Camera, segment_0: LineSegment2D, segment_1: LineSegment2D) LineSegment3D | None

Triangulate a supported world-space line segment from two image segments.

Each image-space segment defines a plane passing through that camera center. The intersection of those two planes defines the 3D line. The returned finite segment is derived by projecting the four observed endpoint rays onto that line and taking the supported interval.

Parameters:
  • cam_0 (Camera) – First calibrated camera; both cameras must share a common world frame.

  • cam_1 (Camera) – Second calibrated camera; both cameras must share a common world frame.

  • segment_0 (LineSegment2D) – Image-space support segment in cam_0's image plane.

  • segment_1 (LineSegment2D) – Image-space support segment in cam_1's image plane.

Returns:

World-space supported segment, or None when the geometry is degenerate or numerically unstable.

Return type:

LineSegment3D | None
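
The plane-plane construction can be sketched as follows, again as a standalone NumPy illustration on world-space quantities (camera centers and world-space endpoint ray directions; the function name and signature are assumptions, not the library API):

```python
import numpy as np

def triangulate_segment_sketch(c0, rays0, c1, rays1, eps=1e-9):
    """Supported 3D segment from two image-space segments.

    c0, c1       -- camera centers (world frame)
    rays0, rays1 -- pairs of world-space ray directions through each
                    segment's endpoints, e.g. [d_a, d_b]
    Returns (start, end) or None when the planes are near-parallel.
    """
    # Each camera center and its two endpoint rays span a plane n.x = d.
    n0 = np.cross(rays0[0], rays0[1]); d0 = n0 @ c0
    n1 = np.cross(rays1[0], rays1[1]); d1 = n1 @ c1

    # Plane-plane intersection yields the infinite 3D line (p + t*u).
    u = np.cross(n0, n1)
    denom = u @ u                       # ~0 for a degenerate configuration
    if denom < eps:
        return None
    p = np.cross(d0 * n1 - d1 * n0, u) / denom

    def line_param(c, v):
        # Parameter t such that p + t*u is closest to the ray c + s*v.
        w = p - c
        a, b, cc = u @ u, u @ v, v @ v
        d, e = u @ w, v @ w
        return (b * e - cc * d) / (a * cc - b * b)

    # Project the four endpoint rays onto the line; the supported interval
    # runs from the smallest to the largest closest-approach parameter.
    ts = [line_param(c0, rays0[0]), line_param(c0, rays0[1]),
          line_param(c1, rays1[0]), line_param(c1, rays1[1])]
    return p + min(ts) * u, p + max(ts) * u
```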

classmethod recover_pose3d(model_points: Sequence[Vector3D] | Mapping[str, Vector3D], observed_points: Sequence[Vector3D] | Mapping[str, Vector3D]) Pose3DRecoveryResult

Recover a best-fit rigid pose from 3D point correspondences.

The recovered pose maps model-space points into the observed frame: p_observed ~= R @ p_model + t. Rotation is estimated with a least-squares orthogonal-Procrustes/Kabsch solve, and translation is derived from the fitted centroids.

Parameters:
  • model_points (sequence[Vector3D] | mapping[str, Vector3D]) – Model-space 3D points. Must be the same container type as observed_points; mapping inputs must share the same keys.

  • observed_points (sequence[Vector3D] | mapping[str, Vector3D]) – Observed-space 3D points paired with model_points. Must be the same container type as model_points; mapping inputs must share the same keys.

Returns:

Best-fit quaternion/translation plus point residual metrics.

Return type:

Pose3DRecoveryResult

Raises:
  • TypeError – If the inputs are mixed container types or contain non-Vector3D values.

  • ValueError – If correspondences are missing, too few, or geometrically underconstrained.
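
The Procrustes/Kabsch solve described above can be sketched with plain NumPy arrays in place of Vector3D containers (the function name, the (N, 3) array signature, and the rotation-matrix rather than quaternion output are simplifications for illustration):

```python
import numpy as np

def recover_pose3d_sketch(model_pts, observed_pts):
    """Least-squares rigid fit so that observed ~= R @ model + t (Kabsch)."""
    P = np.asarray(model_pts, dtype=float)       # (N, 3) model-space points
    Q = np.asarray(observed_pts, dtype=float)    # (N, 3) observed-space points
    mu_p, mu_q = P.mean(axis=0), Q.mean(axis=0)
    H = (P - mu_p).T @ (Q - mu_q)                # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Sign correction keeps R a proper rotation (det +1, no reflection).
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ S @ U.T
    t = mu_q - R @ mu_p                          # translation from centroids
    residuals = np.linalg.norm((R @ P.T).T + t - Q, axis=1)
    return R, t, residuals
```

Per-point residuals correspond to the point residual metrics mentioned in the return description; converting R to a quaternion is left out of the sketch.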

classmethod solve_tip_point(camera: Camera, point: Vector2D, endpoint: Vector3D, length: float) Vector3D

Intersect a camera ray with a sphere to estimate an instrument tip.

Given a 2D detection of the tip in the image, a known 3D endpoint of the instrument, and the instrument length, the method computes the intersection(s) of the camera viewing ray with the sphere centered at endpoint with radius length, and selects the nearest valid intersection in front of the camera.

Parameters:
  • camera (Camera) – Calibrated camera used to form the viewing ray.

  • point (Vector2D) – Pixel coordinate of the tip projection.

  • endpoint (Vector3D) – Known 3D endpoint of the instrument (world frame).

  • length (float) – Distance from the endpoint to the tip.

Returns:

Estimated 3D tip position. Returns a zero vector if no valid intersection lies in front of the camera.

Return type:

Vector3D
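
The ray-sphere step reduces to a quadratic in the ray parameter. A standalone NumPy sketch operating on a world-space ray (unprojection and the Camera type omitted; the function name and signature are assumptions):

```python
import numpy as np

def solve_tip_sketch(origin, direction, endpoint, length):
    """Nearest forward intersection of the ray origin + s*direction with
    the sphere |x - endpoint| = length; zero vector if none exists."""
    m = origin - endpoint
    # |m + s*d|^2 = length^2  ->  a*s^2 + b*s + c = 0
    a = direction @ direction
    b = 2.0 * (direction @ m)
    c = m @ m - length * length
    disc = b * b - 4.0 * a * c
    if disc < 0:                       # ray misses the sphere entirely
        return np.zeros(3)
    sqrt_disc = np.sqrt(disc)
    for s in sorted([(-b - sqrt_disc) / (2.0 * a), (-b + sqrt_disc) / (2.0 * a)]):
        if s > 0:                      # nearest intersection in front of the camera
            return origin + s * direction
    return np.zeros(3)                 # both intersections behind the camera
```

For a camera at the origin looking along +z, an endpoint at depth 5, and length 2, the ray meets the sphere at depths 3 and 7; the nearer point at depth 3 is returned.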