s6.vision.detectors

Classical image-space detectors used by the tracking pipeline.

This module contains OpenCV/Numpy-based routines for component detection, circle/rim localization, contour processing, and tip endpoint detection. Each function is designed to be fast and reasonably robust without requiring learned models. Many functions are annotated with Profiler.trace_function to integrate with the project’s lightweight performance tracing.

s6.vision.detectors.detect_components(image: ndarray, area_thresholds: Tuple[float, float] = (600, 5000), square_ratio: float = 1.2) → List[Vector2D]

Detect roughly circular components (those whose extent is close to square) in an image using connected components.

Parameters:

image (np.ndarray) – The input (grayscale or single‐channel) image.
area_thresholds (Tuple[float, float], optional) – (min, max) area thresholds. If both are in [0, 1], they’re taken as fractions of the total image area; otherwise as absolute pixel‐areas.
square_ratio (float, optional) – Maximum allowed width/height (or height/width) ratio. E.g. 1.2 means up to 20% rectangular distortion is allowed (default: 1.2).

Returns:

The centers of the fitted circles for all components passing filters.

Return type:

List[Vector2D]
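The two filters described above can be sketched as follows. This is a minimal illustration, not the module's code: the helper names resolve_area_thresholds and is_squareish are hypothetical, and the exact boundary handling (inclusive comparisons) is an assumption.

```python
from typing import Tuple


def resolve_area_thresholds(
    area_thresholds: Tuple[float, float], image_shape: Tuple[int, int]
) -> Tuple[float, float]:
    """Return (min_area, max_area) in pixels (hypothetical helper)."""
    lo, hi = area_thresholds
    # Both values in [0, 1]: interpret them as fractions of the image area.
    if 0.0 <= lo <= 1.0 and 0.0 <= hi <= 1.0:
        total = image_shape[0] * image_shape[1]
        return lo * total, hi * total
    # Otherwise they are already absolute pixel areas.
    return float(lo), float(hi)


def is_squareish(width: int, height: int, square_ratio: float = 1.2) -> bool:
    """True if max(w/h, h/w) does not exceed square_ratio (hypothetical helper)."""
    if width <= 0 or height <= 0:
        return False
    return max(width / height, height / width) <= square_ratio


# Fractions of a 4x4 image versus absolute pixel areas.
fractional = resolve_area_thresholds((0.25, 0.5), (4, 4))
absolute = resolve_area_thresholds((600, 5000), (480, 640))
```

With the default square_ratio of 1.2, a 50x45 component passes the shape test while a 50x30 component does not.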

s6.vision.detectors.fit_circle_to_component(image: ndarray, stat: ComponentStat) → Vector2D

Fits a circle to a given component in an image.

This function extracts a component from the provided image using the given component statistics. It then checks intensity conditions, resizes the extracted component, finds contours, and computes a minimum enclosing circle of the largest contour. The center of this circle is adjusted based on the original position of the component in the image.

Parameters:

image (np.ndarray) – The input image from which the component will be extracted.
stat (ComponentStat) – A dataclass containing the x, y coordinates, width, and height of the component.

Returns:

A dataclass containing the x, y coordinates of the circle’s center if a circle is successfully fitted; None otherwise.

Return type:

Vector2D or None

s6.vision.detectors.circle_model(params, x, y)

Circle model for fitting.

The residual for each data point is sqrt((x - x0)**2 + (y - y0)**2) - r, the signed distance from the point to the circle. Minimizing the sum of squared residuals yields the circle parameters (x0, y0, r).

Parameters:

params (array-like) – Circle parameters [x0, y0, r].
x (np.ndarray) – X-coordinates of the data points.
y (np.ndarray) – Y-coordinates of the data points.

Returns:

Array of residuals for each data point.

Return type:

np.ndarray
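A minimal NumPy sketch of the residual described above. The name circle_residuals is illustrative and is not claimed to be the module's implementation.

```python
import numpy as np


def circle_residuals(params, x, y):
    """Signed distance of each point (x, y) from the circle (x0, y0, r)."""
    x0, y0, r = params
    return np.sqrt((x - x0) ** 2 + (y - y0) ** 2) - r


# Points sampled exactly on a circle centered at (3, -1) with radius 2.
t = np.linspace(0.0, 2.0 * np.pi, 16, endpoint=False)
x = 3.0 + 2.0 * np.cos(t)
y = -1.0 + 2.0 * np.sin(t)

on_circle = circle_residuals([3.0, -1.0, 2.0], x, y)   # all close to 0
too_small = circle_residuals([3.0, -1.0, 1.5], x, y)   # all close to +0.5
```

A positive residual means the point lies outside the candidate circle; a negative one means it lies inside.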

s6.vision.detectors.fit_circle_to_contour(contour)

Fits a circle to the provided contour using least squares optimization.

Parameters:

contour (np.ndarray) – Contour points in the format (N, 1, 2).

Returns:

The optimized circle center (x, y) and radius.

Return type:

Tuple[float, float, float]
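The optimizer used by the module is not shown on this page. As a stand-in, the sketch below fits a circle with an algebraic linear least-squares formulation (the Kåsa fit), which is a different technique from iterative minimization of the geometric residual but accepts the same (N, 1, 2) contour layout. The function name fit_circle_algebraic is hypothetical.

```python
import numpy as np


def fit_circle_algebraic(contour):
    """Algebraic (Kasa) circle fit; returns (x0, y0, r)."""
    pts = np.asarray(contour, dtype=float).reshape(-1, 2)
    x, y = pts[:, 0], pts[:, 1]
    # x^2 + y^2 = 2*x0*x + 2*y0*y + c, with c = r^2 - x0^2 - y0^2,
    # which is linear in the unknowns (x0, y0, c).
    A = np.column_stack([2.0 * x, 2.0 * y, np.ones_like(x)])
    b = x ** 2 + y ** 2
    (x0, y0, c), *_ = np.linalg.lstsq(A, b, rcond=None)
    r = np.sqrt(c + x0 ** 2 + y0 ** 2)
    return float(x0), float(y0), float(r)


# Synthetic contour: 40 points on a circle centered at (10, 20), radius 5.
t = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
contour = np.stack([10.0 + 5.0 * np.cos(t), 20.0 + 5.0 * np.sin(t)], axis=1)
contour = contour.reshape(-1, 1, 2)
cx, cy, radius = fit_circle_algebraic(contour)
```

An algebraic fit of this kind is often used to seed an iterative geometric fit, since it needs no initial guess.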

s6.vision.detectors.detect_centroid(image)

Detects the centroid of the largest blob in the binary-thresholded image by fitting a circle to its contour.

Parameters:

image (np.ndarray) – Input image to be processed.

Returns:

Coordinates of the blob’s centroid.

Return type:

Vector2D

s6.vision.detectors.detect_dominant_mask_line_segment(mask: ndarray) → MaskLineSegment2D | None

Detect the dominant straight line segment supported by a binary mask.

Parameters:

mask (np.ndarray) – Two-dimensional binary-like mask. Any nonzero value is treated as foreground.

Returns:

The fitted image-space line clipped to the input image rectangle, together with the centroid of the dominant connected component, or None when the mask is empty, degenerate, or cannot support a stable line estimate. If multiple components are present, only the largest connected component is considered.

Return type:

Optional[MaskLineSegment2D]

Raises:

ValueError – If mask is not two-dimensional.
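One common way to obtain a dominant line from a mask is a principal-axis (PCA) fit of the foreground pixel coordinates. The sketch below illustrates that idea only; whether the module uses PCA is an assumption, and the sketch skips the largest-component selection and the clipping to the image rectangle. The name fit_mask_line is hypothetical.

```python
import numpy as np


def fit_mask_line(mask):
    """Return (centroid_xy, unit_direction_xy) of the foreground, or None."""
    mask = np.asarray(mask)
    if mask.ndim != 2:
        raise ValueError("mask must be two-dimensional")
    ys, xs = np.nonzero(mask)          # any nonzero value counts as foreground
    if xs.size < 2:
        return None                    # empty or single-pixel mask: no line
    pts = np.column_stack([xs, ys]).astype(float)
    centroid = pts.mean(axis=0)
    # The first right singular vector is the direction of largest variance.
    _, _, vt = np.linalg.svd(pts - centroid, full_matrices=False)
    return centroid, vt[0]


# A diagonal stroke from (2, 2) to (17, 17) in a 20x20 mask.
demo_mask = np.zeros((20, 20), dtype=np.uint8)
for i in range(2, 18):
    demo_mask[i, i] = 1
centroid, direction = fit_mask_line(demo_mask)
```

The sign of the returned direction is arbitrary, so callers comparing directions should compare up to sign.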

s6.vision.detectors.fit_polynomial_to_contour(contour, degree=3)

Fits a polynomial curve to a contour to generate a smoothed 2D curve.

Parameters:

contour (np.ndarray) – The input contour of shape (N, 1, 2).
degree (int, optional) – Degree of the polynomial used for fitting, by default 3.

Returns:

The smoothed contour points of shape (N, 1, 2).

Return type:

np.ndarray
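A minimal sketch of polynomial contour smoothing, assuming the contour is parameterized by a normalized point index t and that x(t) and y(t) are fitted independently. The parameterization and the name smooth_contour_polyfit are assumptions, not details confirmed by the module.

```python
import numpy as np


def smooth_contour_polyfit(contour, degree=3):
    """Fit x(t) and y(t) with polynomials and resample at the same t values."""
    pts = np.asarray(contour, dtype=float).reshape(-1, 2)
    t = np.linspace(0.0, 1.0, len(pts))        # normalized point index
    coeff_x = np.polyfit(t, pts[:, 0], degree)
    coeff_y = np.polyfit(t, pts[:, 1], degree)
    smooth = np.column_stack([np.polyval(coeff_x, t), np.polyval(coeff_y, t)])
    return smooth.reshape(-1, 1, 2)            # same (N, 1, 2) layout as input


# An exact cubic is reproduced unchanged by a degree-3 fit.
t_demo = np.linspace(0.0, 1.0, 20)
cubic = np.column_stack([10.0 * t_demo, t_demo ** 3]).reshape(-1, 1, 2)
smoothed = smooth_contour_polyfit(cubic, degree=3)
```

A low degree suits open, gently curved contours; a closed contour would generally need a different parameterization.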

s6.vision.detectors.find_tip_points(image: ndarray, threshold: int = 120) → Tuple[Vector2D, Vector2D]

Identifies the two tip points of the largest contour in the image within a specified circle.

Parameters:

image (np.ndarray) – The input image for processing.
threshold (int, optional) – The threshold value for binary thresholding, by default 120.

Returns:

The two farthest tip points as Vector2D objects.

Return type:

Tuple[Vector2D, Vector2D]
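Selecting the two farthest points of a contour can be sketched with a brute-force pairwise distance search. This is illustrative only: the thresholding and contour extraction steps are omitted, and farthest_pair is a hypothetical name.

```python
import numpy as np


def farthest_pair(points):
    """Return the two points with the largest mutual distance (O(N^2))."""
    pts = np.asarray(points, dtype=float).reshape(-1, 2)
    diff = pts[:, None, :] - pts[None, :, :]
    dist_sq = np.einsum("ijk,ijk->ij", diff, diff)   # squared pairwise distances
    i, j = np.unravel_index(np.argmax(dist_sq), dist_sq.shape)
    return pts[i], pts[j]


demo_points = [(0, 0), (1, 0), (5, 5), (2, 1), (-3, -4)]
tip_a, tip_b = farthest_pair(demo_points)
```

The quadratic memory cost is acceptable for contours of a few thousand points; for larger inputs, restricting the search to the convex hull is a common optimization.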

s6.vision.detectors.detect_prong_tips_filtered(img: ndarray, mask: ndarray | None = None, suppression_radius: int = 10, top_k: int = 2, far_point: Tuple[float, float] | None = None) → List[Vector2D]

Detect instrument prong tip points in the image, filter out points that are too close to each other, and return the top_k remaining points. Points are sorted by distance to far_point (furthest first) when it is provided; when far_point is None, they are sorted by distance to the image center (closest first).

Parameters:

img (np.ndarray) – Grayscale image input.
mask (Optional[np.ndarray]) – Binary mask (1=active) specifying region to process. Detection runs only within masked region if provided.
suppression_radius (int) – Minimum distance (in pixels) between accepted points.
top_k (int) – Number of final tip points to return (sorted by distance to far_point if provided, otherwise closest to image center).
far_point (Optional[Tuple[float, float]]) – Reference (x, y) point for sorting. Endpoints are sorted by furthest distance to this point if provided.

Returns:

Filtered tip points.

Return type:

List[Vector2D]
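The filtering stage described above can be sketched as greedy distance suppression followed by a sort. The sketch assumes candidates arrive in priority order (earlier candidates win when two are closer than suppression_radius), which the page does not state; filter_tips and its plain-tuple output are hypothetical.

```python
import numpy as np


def filter_tips(candidates, image_shape, suppression_radius=10, top_k=2,
                far_point=None):
    """Suppress near-duplicate points, then sort and keep top_k."""
    kept = []
    for cand in candidates:
        p = np.asarray(cand, dtype=float)
        # Keep p only if it is far enough from every point accepted so far.
        if all(np.linalg.norm(p - q) >= suppression_radius for q in kept):
            kept.append(p)
    if far_point is not None:
        ref = np.asarray(far_point, dtype=float)
        kept.sort(key=lambda p: -np.linalg.norm(p - ref))   # furthest first
    else:
        h, w = image_shape[:2]
        ref = np.array([w / 2.0, h / 2.0])
        kept.sort(key=lambda p: np.linalg.norm(p - ref))    # closest first
    return [(float(p[0]), float(p[1])) for p in kept[:top_k]]


demo = [(10, 10), (12, 11), (90, 90), (50, 50)]
by_far_point = filter_tips(demo, (100, 100), far_point=(0, 0))
by_center = filter_tips(demo, (100, 100), top_k=1)
```

In this example (12, 11) is dropped because it lies within 10 pixels of (10, 10).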

s6.vision.detectors.detect_outer_rim(image, min_radius_ratio=0.4, dp=1.2, canny_thresh1=30, canny_thresh2=120, hough_param1=50, hough_param2=30, fallback_thresh=230, fallback_kernel=9)

Detect the outer bright rim in image, returning (x, y, r) only if the found radius r >= min_radius_ratio * image_height.

Tries Hough Circle first, then falls back to threshold/morphology.

Parameters:

image – BGR or grayscale cv2 image.
min_radius_ratio – Minimum allowed circle radius as a fraction of the image height.
dp – Inverse ratio of the Hough accumulator resolution to the image resolution.
canny_thresh1, canny_thresh2 – Canny edge thresholds.
hough_param1 – Higher Canny threshold for Hough.
hough_param2 – Accumulator threshold for Hough.
fallback_thresh – Brightness threshold for the fallback.
fallback_kernel – Morphology kernel size for the fallback.

Returns:

(x, y, r) or None.

s6.vision.detectors.detect_outer_rim_v2(image, min_radius_ratio=0.4, dp=1.2, canny_thresh1=30, canny_thresh2=120, hough_param1=50, hough_param2=30, downscale: int = 8)

Performance-oriented reimplementation of detect_outer_rim() with identical output semantics. It is meant to produce the same detection outcome at lower computational cost.

Returns (Vector2D(x, y), min_r) or None, matching v1.

class s6.vision.detectors.MaskUtils

Bases: object

Utility class for generating and caching circular masks and erasing pixels beyond boundaries.

static det_mask(center: Vector2D, size: Tuple[int, int], radius: int) → ndarray

Return a binary mask of shape size with ones inside the circle defined by center and radius. Masks are cached per parameter set.

static erase_beyond_boundary(image: ndarray, center: Vector2D, radius: float) → None

Zero out pixels outside the circular boundary defined by center and radius.

static erase_projected_sphere(image: ndarray, camera: Camera, center_world: Vector3D, radius_world: float, roi: BoundingBox2D | None = None) → None

Zero out the projection of a 3D sphere on a 2D mask.

Parameters:

image (np.ndarray) – Two-dimensional mask image to edit in place.
camera (Camera) – Camera used to project the sphere into the current view.
center_world (Vector3D) – Sphere center in world coordinates.
radius_world (float) – Sphere radius in the same world units as camera extrinsics.
roi (BoundingBox2D, optional) – ROI describing how to convert global image coordinates into the local image coordinates.
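The caching and erasing behavior of det_mask and erase_beyond_boundary can be sketched with functools.lru_cache and NumPy. This is an assumption-laden illustration: the class's actual cache mechanism is not documented here, the center is passed as plain floats rather than a Vector2D, and circular_mask and erase_beyond are hypothetical names.

```python
from functools import lru_cache

import numpy as np


@lru_cache(maxsize=32)
def circular_mask(cx, cy, height, width, radius):
    """Binary (height, width) mask with ones inside the circle; cached."""
    yy, xx = np.ogrid[:height, :width]
    mask = ((xx - cx) ** 2 + (yy - cy) ** 2 <= radius ** 2).astype(np.uint8)
    mask.setflags(write=False)   # cached arrays are shared, so keep them read-only
    return mask


def erase_beyond(image, cx, cy, radius):
    """Zero out, in place, every pixel outside the circle."""
    h, w = image.shape[:2]
    yy, xx = np.ogrid[:h, :w]
    outside = (xx - cx) ** 2 + (yy - cy) ** 2 > radius ** 2
    image[outside] = 0


mask = circular_mask(5.0, 5.0, 11, 11, 3)
mask_again = circular_mask(5.0, 5.0, 11, 11, 3)   # served from the cache
frame = np.ones((11, 11), dtype=np.uint8)
erase_beyond(frame, 5.0, 5.0, 3)
```

Marking the cached array read-only guards against a caller accidentally modifying a mask that later calls will reuse.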

s6.vision.detectors.estimate_tilt_from_rim(edge_pts: ndarray, K: ndarray, base_Z_mm: float, stick_l_mm: float, radius_mm: float, refine_iters: int = 4) → Tuple[float, float]

Estimate plate tilt (phi, theta) from ring edge points.

Solves the tilt of a gimbal‑mounted plane (no roll about its normal) from a single observed ring by fitting an ellipse in the image and refining a physically‑based projection model.

Parameters:

edge_pts (ndarray) – Sampled ring edge points of shape (N, 2) in pixel coordinates.
K (ndarray) – Camera intrinsics of shape (3, 3) with fx, fy, cx, cy in pixels.
base_Z_mm (float) – Distance from camera origin to the gimbal pivot along camera +Z (mm).
stick_l_mm (float) – Offset from the pivot to the ring plane center along the plane normal (mm).
radius_mm (float) – Physical ring radius on the plane (mm).
refine_iters (int, optional) – Gauss–Newton refinement iterations, by default 4.

Returns:

(phi_deg, theta_deg) in degrees. phi is the tilt magnitude from camera +Z; theta is the azimuth direction of tilt in [0, 360).

Return type:

Tuple[float, float]

Notes

The observed ellipse is fitted as a conic \(Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0\) and represented by the symmetric matrix \(C_{\text{obs}}\).
Initialization uses ellipse axis ratio and angle: \(\phi_0 \approx \arccos(b/a)\), \(\theta_0 \approx \alpha \pm 90^\circ\), with \(a \ge b\) and \(\alpha\) the ellipse rotation.
Circle samples on the tilted plane are \(\mathbf{P}(t) = \mathbf{P}_0 + r\,(\cos t\,\mathbf{u} + \sin t\,\mathbf{v})\), where \(\mathbf{P}_0 = [0,0,\text{base}_Z]^\top + R[0,0,\text{stick}_\ell]^\top\). Pixels follow the pinhole model \(u = f_x X/Z + c_x,\; v = f_y Y/Z + c_y\).
We minimize the algebraic conic residual via a few Gauss–Newton steps:

\[L(\phi, \theta) = \frac{1}{N} \sum_i \left( \mathbf{x}_i^\top C_{\text{obs}}\, \mathbf{x}_i \right)^2,\]

with \(\mathbf{x}_i = [u_i, v_i, 1]^\top\) from projecting \(\mathbf{P}(t_i)\) using the no‑roll tilt rotation \(R\) that maps \(\mathbf{z}\) to \(\mathbf{d}=[\sin\phi\cos\theta,\,\sin\phi\sin\theta,\,\cos\phi]^\top\).
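The projection model and loss in the notes above can be sketched numerically as follows. Several choices are assumptions rather than documented behavior: the no-roll rotation is built with Rodrigues' formula about the in-plane axis perpendicular to the tilt direction, the in-plane basis vectors u and v are taken as the first two columns of R, and the function names and example numbers are illustrative.

```python
import numpy as np


def tilt_rotation(phi, theta):
    """Rotation by phi about k = (-sin(theta), cos(theta), 0); maps z to d."""
    k = np.array([-np.sin(theta), np.cos(theta), 0.0])
    k_cross = np.array([[0.0, -k[2], k[1]],
                        [k[2], 0.0, -k[0]],
                        [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(phi) * k_cross + (1.0 - np.cos(phi)) * (k_cross @ k_cross)


def project_ring(phi, theta, K, base_Z, stick_l, radius, n=64):
    """Sample P(t) on the tilted ring and project with the pinhole model."""
    R = tilt_rotation(phi, theta)
    u_axis, v_axis = R[:, 0], R[:, 1]          # assumed in-plane basis
    P0 = np.array([0.0, 0.0, base_Z]) + R @ np.array([0.0, 0.0, stick_l])
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    P = P0 + radius * (np.outer(np.cos(t), u_axis) + np.outer(np.sin(t), v_axis))
    u = K[0, 0] * P[:, 0] / P[:, 2] + K[0, 2]
    v = K[1, 1] * P[:, 1] / P[:, 2] + K[1, 2]
    return np.column_stack([u, v])


def conic_loss(pixels, C_obs):
    """Mean squared algebraic residual x^T C_obs x over homogeneous pixels."""
    x = np.column_stack([pixels, np.ones(len(pixels))])
    residual = np.einsum("ni,ij,nj->n", x, C_obs, x)
    return float(np.mean(residual ** 2))


# Illustrative intrinsics; the "observed" conic is the circle an untilted ring projects to.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
cx, cy = K[0, 2], K[1, 2]
rho = K[0, 0] * 1.5 / (120.0 + 8.0)            # projected radius in pixels
C_obs = np.array([[1.0, 0.0, -cx],
                  [0.0, 1.0, -cy],
                  [-cx, -cy, cx ** 2 + cy ** 2 - rho ** 2]])
loss_flat = conic_loss(project_ring(0.0, 0.0, K, 120.0, 8.0, 1.5), C_obs)
loss_tilted = conic_loss(project_ring(0.3, 0.5, K, 120.0, 8.0, 1.5), C_obs)
```

Here loss_flat is numerically zero because the hypothesis matches the observed conic, while loss_tilted is clearly positive; a Gauss–Newton loop would adjust (phi, theta) to drive this loss down.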

s6.vision.detectors.detect_rim_and_estimate_tilt(image: ndarray, K: ndarray, base_Z_mm: float = 120.0, stick_l_mm: float = 8.0, radius_mm: float = 1.5, min_radius_ratio: float = 0.4, dp: float = 1.2, canny_thresh1: int = 30, canny_thresh2: int = 120, hough_param1: int = 50, hough_param2: int = 30, downscale: int = 8, edge_band_px: float = 20.0, max_edge_pts: int = 400) → Tuple[Tuple[Vector2D, float] | None, Tuple[float, float] | None]

Detect the outer rim and estimate tilt directly from an input image.

Parameters:

image (np.ndarray) – Grayscale or BGR image.
K (np.ndarray) – 3x3 camera intrinsic matrix (fx, fy, cx, cy in pixels).
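base_Z_mm (float) – Physical setup constant for tilt estimation: distance from the camera origin to the gimbal pivot along camera +Z (mm), as in estimate_tilt_from_rim.
stick_l_mm (float) – Physical setup constant for tilt estimation: offset from the pivot to the ring plane center along the plane normal (mm), as in estimate_tilt_from_rim.
radius_mm (float) – Physical setup constant for tilt estimation: physical ring radius on the plane (mm), as in estimate_tilt_from_rim.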
min_radius_ratio, dp, canny_thresh1, canny_thresh2, hough_param1, hough_param2, downscale – Parameters forwarded to detect_outer_rim_v2; defaults mirror tuned values.
edge_band_px (float) – Pixel half-width of the annulus around the detected rim used to sample edge points.
max_edge_pts (int) – Cap on the number of edge points (uniform subsample if exceeded).

Returns:

boundary – (Vector2D center, float radius), or None if the rim is not found.
tilt – (phi_deg, theta_deg), or None if there are too few edge points or the solve fails.

Return type:

(boundary, tilt)
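The roles of edge_band_px and max_edge_pts can be sketched as an annulus filter followed by a capped subsample. The sketch assumes "uniform subsample" means evenly spaced indices (it could also mean uniform random sampling), and sample_rim_edge_points is a hypothetical name; edge detection itself is omitted.

```python
import numpy as np


def sample_rim_edge_points(edge_xy, center, radius, edge_band_px=20.0,
                           max_edge_pts=400):
    """Keep edge points within edge_band_px of the rim, capped at max_edge_pts."""
    pts = np.asarray(edge_xy, dtype=float).reshape(-1, 2)
    dist = np.hypot(pts[:, 0] - center[0], pts[:, 1] - center[1])
    band = pts[np.abs(dist - radius) <= edge_band_px]   # annulus around the rim
    if len(band) > max_edge_pts:
        # Evenly spaced indices give a deterministic uniform subsample.
        idx = np.linspace(0, len(band) - 1, max_edge_pts).astype(int)
        band = band[idx]
    return band


# 1000 points on a rim of radius 100 plus 50 outliers at radius 200.
t = np.linspace(0.0, 2.0 * np.pi, 1000, endpoint=False)
rim = np.column_stack([100.0 * np.cos(t), 100.0 * np.sin(t)])
outliers = np.column_stack([200.0 * np.cos(t[:50]), 200.0 * np.sin(t[:50])])
candidates = np.vstack([rim, outliers])
sampled = sample_rim_edge_points(candidates, (0.0, 0.0), 100.0)
```

Capping the point count bounds the cost of the subsequent tilt solve regardless of image resolution.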