# System Architecture Overview
This page summarizes the current Sense Core runtime architecture as implemented
in `src/s6/app/track.py`, `src/s6/app/_contextgenerators.py`, and
`src/s6/app/pipeline/`.
## Overview

The runtime is organized into four layers:

- **Entry layer**: `s6.app.main` is the public CLI loader. `s6 track` dispatches into `s6.app.track`, which owns argument parsing, config resolution, and mode selection.
- **Context layer**: `ContextGenerator` implementations acquire frames from live GStreamer sources or replay them from a `StructuredDataset`.
- **Pipeline layer**: `PipelineLoader` validates config, resolves `pipeline_name`, and instantiates a concrete `BasePipeline` subclass. Pipeline implementations treat input context keys as read-only and publish only `context["export"]` plus `context["debug"]`.
- **Interface layer**: Headless mode runs in the foreground process. UI mode uses `s6.app._gui.MainWindow` plus a spawned `TrackRuntime` worker process. Uplink mode forwards `{"export": ...}` snapshots to a WebSocket.
## Execution Flow

1. `s6.app.main` discovers the `track` command and dispatches to `s6.app.track.main()`.
2. `track` parses CLI flags such as `--input`, `--config`, `--ui`, `--record-only`, `--uplink`, `--repeat`, `--realtime-playback`, and `--run-level`.
3. `PipelineLoader.load()` reads the caller-supplied config path or config object, validates shared fields with `PipelineConfigBase`, resolves `pipeline_name`, and parses the pipeline-specific model. `s6 track` supplies `configs/pipeline.config.yaml` by default when `--config` is omitted. The loader can also override `run_level` from the CLI.
4. `track` builds one of these context generators:
    - `RemoteGSTContextGenerator` for `--input gst`
    - `LocalGSTContextGenerator` for `--input gst-local`
    - `LocalGSTContextGeneratorV2` for `--input gst-local-v2`
    - `DatasetContextGenerator` for replaying a dataset directory
    - `db:` inputs are rejected; the old database-backed streamer path is retired.
5. Each generator yields a rolling `contexts` list where `contexts[0]` is the newest frame and later entries are prior history.
6. Unless `--record-only` is set, the selected pipeline runs on that rolling context list.
7. Depending on flags, the newest context is displayed in the Qt/VisPy UI, written to an output dataset, forwarded to a telemetry uplink, or processed headlessly.
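The input-selection step can be sketched as a simple dispatch table. This is an illustrative stub, not the actual `s6.app.track` source: the generator classes below are empty stand-ins for the real ones in `s6.app._contextgenerators`, and the fallback-to-dataset rule is an assumption based on the flow above.

```python
# Stand-in stubs for the real context-generator classes.
class RemoteGSTContextGenerator: ...
class LocalGSTContextGenerator: ...
class LocalGSTContextGeneratorV2: ...
class DatasetContextGenerator: ...

INPUT_DISPATCH = {
    "gst": RemoteGSTContextGenerator,
    "gst-local": LocalGSTContextGenerator,
    "gst-local-v2": LocalGSTContextGeneratorV2,
}

def build_generator(input_arg: str):
    """Select a generator class for an --input value."""
    if input_arg.startswith("db:"):
        # The database-backed streamer path is retired.
        raise ValueError("db: inputs are no longer supported")
    if input_arg in INPUT_DISPATCH:
        return INPUT_DISPATCH[input_arg]
    # Anything else is treated as a dataset directory to replay.
    return DatasetContextGenerator
```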
## Context Generators

### ContextGenerator

Base class that:

- yields a rolling history up to `MAX_HISTORY_LENGTH`;
- normalizes timestamps and per-frame metadata such as `frame_serial`, `frame_length`, and `flags`;
- optionally records contexts to `StructuredDataset`;
- drains command-queue messages for replay controls and recording toggles;
- exports profiler output on shutdown.
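The newest-first rolling history can be sketched with a bounded deque. This is a minimal illustration of the contract (`contexts[0]` is newest, history capped), not the real base class; the `MAX_HISTORY_LENGTH` value here is a placeholder.

```python
from collections import deque

MAX_HISTORY_LENGTH = 8  # placeholder; the real limit lives in s6

def rolling_contexts(frames):
    """Yield a list per frame with the newest context first (contexts[0])."""
    history = deque(maxlen=MAX_HISTORY_LENGTH)
    for serial, frame in enumerate(frames):
        context = {"frame_serial": serial, "frame": frame}
        history.appendleft(context)  # newest at index 0
        yield list(history)          # snapshot of the rolling window
```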
### RemoteGSTContextGenerator

- Uses `platform.gstreamer.client`.
- Builds remote `pygst.client.Client` sources.
- Reorders frames through `platform.order_frames(...)` on the selected platform implementation.
### LocalGSTContextGenerator

- Uses `platform.gstreamer.local`.
- Builds local `pygst.client.Pipeline` capture sources.
- Uses `MultiCameraCapture` and records the synchronized release timestamp as `context["timestamp"]`.
### LocalGSTContextGeneratorV2

- Uses `platform.gstreamer.local`.
- Builds one `pygst.client.CombinedPipeline` source that horizontally combines the configured local cameras.
- Reads that combined stream through one direct Gst appsink capture, splits the frame back into per-camera images, and timestamps each release with host time.
- Does not use `MultiCameraCapture`; sync-only local capture settings are largely irrelevant in this mode because there is only one capture stream.
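Splitting a horizontally combined frame back into per-camera images reduces to column slicing. A minimal sketch, assuming equal-width camera slices and a frame represented as rows of pixels (the real implementation works on image buffers):

```python
def split_combined_frame(frame, num_cameras):
    """Split a horizontally combined frame (a list of pixel rows) into
    equal-width per-camera frames."""
    total_width = len(frame[0])
    if total_width % num_cameras:
        raise ValueError("combined width is not divisible by camera count")
    w = total_width // num_cameras
    return [
        [row[i * w:(i + 1) * w] for row in frame]
        for i in range(num_cameras)
    ]
```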
### DatasetContextGenerator

- Replays a `StructuredDataset` directory from disk.
- Supports `--repeat` for looping.
- Supports `--realtime-playback` to pace replay using stored timestamp deltas.
- Provides random-access frame reads to `TrackRuntime`, which owns interactive play, pause, stop, forward/backward, seek, and dataset switching semantics.
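The `--realtime-playback` pacing idea, sleeping by the stored timestamp delta between consecutive frames, can be sketched as follows. This is an assumption-level illustration, not the `DatasetContextGenerator` source; the `records`/`timestamp` shapes are hypothetical.

```python
import time

def paced_replay(records):
    """Yield stored frames, sleeping by the recorded timestamp delta so
    replay approximates the original capture rate."""
    prev_ts = None
    for record in records:
        if prev_ts is not None:
            delay = record["timestamp"] - prev_ts
            if delay > 0:
                time.sleep(delay)
        prev_ts = record["timestamp"]
        yield record
```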
## Pipeline Architecture
Pipelines are stateful Python classes, not declarative DAG definitions.
All concrete pipelines inherit `BasePipeline`:

- Heavy initialization is deferred until first use via `_ensure_initialized()`.
- `load_calibrations()` populates `self.cameras`.
- `load_models()` loads the runtime detector models.
- `_process_frame(contexts)` performs one frame of work and returns `PipelineFrameOutput`.
- `_setup_views()` and `viewport()` provide UI bindings when a pipeline has a visual layout.
- `run_level` comes from config by default and can be overridden from `s6 track` with `--run-level`. Lower run levels reduce preview and overlay work, while `debug` enables the heaviest diagnostics.

Shared geometry helpers live in `s6.vision.solver`, including calibrated triangulation, tip solving, and rigid model-to-observed pose recovery from matched 3D correspondences.
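The deferred-initialization pattern above can be sketched as a small skeleton. Method names mirror the doc; the bodies are placeholders, not s6 code.

```python
class BasePipelineSketch:
    """Minimal sketch of deferred heavy initialization on first frame."""

    def __init__(self, config):
        self.config = config
        self.cameras = None
        self._initialized = False

    def _ensure_initialized(self):
        if self._initialized:
            return
        self.load_calibrations()  # populates self.cameras
        self.load_models()        # loads detector models
        self._initialized = True

    def load_calibrations(self):
        self.cameras = {"LL": object(), "LR": object()}

    def load_models(self):
        self.models = ["detector"]  # placeholder

    def process(self, contexts):
        self._ensure_initialized()  # heavy work happens on the first frame
        return self._process_frame(contexts)

    def _process_frame(self, contexts):
        return {"export": {}, "debug": {}}
```

Deferring `load_calibrations()`/`load_models()` keeps construction cheap, which matters when the pipeline object is created in one process but first used in a spawned worker.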
## Concrete Pipeline

### PipelineT1
Camera roles: `LL`, `LR`.

Calibration frame: `LL` is the world-reference camera (identity extrinsic), and `LR` extrinsics are expressed relative to that shared `LL` frame.

Per-frame sequence:

1. Prepare the current input frame.
2. Build typed `T1CameraFrame`, `T1RoiInput`, and `T1RoiDetection` stage data.
3. Run one LL/LR ROI detector batch.
4. Either use the triplet fast path for detector models with at least three keypoints, or fit typed `T1LineFit` support-line results.
5. Solve typed `T1SolveResult` geometry and pose state.
6. Publish optional derived three-keypoint plus mask training targets under `context["debug"]["training_targets"]`.
7. Render overlays into the typed camera buffers while stages publish debug values into the frame-scoped `self.debug_context` accumulator.
8. Build export previews.
The original input keys, including camera images and replay metadata, are preserved for dataset replay. Prepared `image`/`bgr_image` buffers remain under each camera key, while ROI data, diagnostic masks, and solver state are published under `context["debug"]`.

T1 stage data contracts live in `src/s6/app/pipeline/t1_contracts.py`. T1 stage helpers should use typed inputs, return typed results, and draw directly into typed camera buffers instead of rebuilding context-shaped dictionaries.

The typed ROI-preparation stage optionally predicts each LL/LR ROI by projecting the pipeline's persistent world-space `TrajectoryV2` spherical frame into the current camera when `tracking.enable_prediction` is enabled. When `tracking.roi_prediction_use_extrapolation` is `false`, it instead projects the last accepted `TrajectoryV2.last_frame` sphere. For T1, each accepted tracker frame encloses the full observed `tip_point`/`turn_point`/`end_point` triplet only; frames missing any one of those points are treated as invalid tracker updates instead of refreshing the sphere. The accepted triplet gets `tracking.tracking_volume_radius` as extra world-space padding, and T1 projects that sphere so its projected major axis becomes the square ROI size. ROI following is driven only by that persistent sphere state, not by output-pose gate status. T1 tries the requested sphere frame first, then falls back through older sphere-derived frames, and resets both ROIs to the default LL/LR search windows only after the configured `tracking.roi_prediction_recenter_timeout_sec` interval without a valid projectable sphere, or after `tracking.roi_prediction_recenter_invalid_frames` consecutive invalid tracker updates when that optional limit is set. During tolerated invalid updates, `TrajectoryV2` holds the last predicted sphere steady instead of extrapolating it farther. The ROI tracking box turns green only when extrapolation is enabled and the EMA-smoothed prediction path in `TrajectoryV2` is active.

The typed solve stage has two detector-output paths. If the loaded detector exposes at least three keypoints, T1 requests no mask output, triangulates the predicted `tip`/`turn`/`end` triplet directly, and skips tip erasure, mask-line fitting, and support-line triangulation. `tip.triplet_tip_source` selects the model tip by default, or the existing intensity-refined tip when set to `refined`; invalid triplet frames stay on the fast path and rely on the trajectory/output gate instead of falling back to mask solving. One-point detector models keep the legacy path: triangulate the T1 tip first, then optionally use that 3D tip to erase a projected sphere from each LL/LR ROI mask via `solver.tip_mask_erase_radius`, then fit the resulting image lines as LL/LR diagnostic overlays before support-line triangulation.

At `run_level=normal`, the stage also draws projected 2D connectors for `tip_point -> turn_point` and `turn_point -> end_point`, plus a small projected pose-axis indicator recovered by aligning the fixed instrument model points `P1`/`P2`/`P3` to the solved `tip_point`/`turn_point`/`end_point` world points, and projects the transformed model points back into both camera views. At `run_level=dev`, it also adds a compact motion/stats box and projected tip velocity arrows. At `run_level=debug`, it projects the transformed `instrument.obj` vertices onto the `LL` frame in the solved `LL`/world pose.

When a solve yields the full observed `tip_point`/`turn_point`/`end_point` triplet, the typed solve stage updates the pipeline-owned persistent world-space trajectory from that triplet; otherwise the frame is treated as an invalid tracker update while the instrument-tip debug payload keeps the frame-local solved geometry and validity fields. If a newly solved tracked-point set would cause an acceleration spike beyond `tracking.acceleration_rejection_threshold`, `TrajectoryV2` rejects that sample, `context["debug"]["targets"]["instrument_tip"]["tip_point_raw"]` preserves the raw triangulation, `context["debug"]["targets"]["instrument_tip"]["tip_tracking_filtered"]` is set, and downstream solving/export reuse the predicted fallback tracked points instead. Low-velocity EMA smoothing still applies only to the persistent trajectory's prediction path; the accepted motion fields remain raw there. The persistent `TrajectoryV2` sphere now owns ROI follow/recenter policy, while `OutputPoseGate` remains export-only and `TrajectoryV2` remains the 3D tracking-volume prediction and rejection helper.

When a stereo `line_world` is available, the typed solve stage also stores an extended display `LineSegment3D` in `context["debug"]["targets"]["instrument_tip"]["line_world"]`, plus the triangulated tip as `tip_point`, the closest-point `turn_point`, a 3 mm along-line `end_point` marker selected by minimizing the summed LL/LR projection angle against the per-view `tip_point -> mask_line_segment_global.centroid` direction, and the recovered rigid model pose as `instrument_pose` when all three observed points are available, with `pose_solve_valid` set only when that full triplet yields a recovered pose. The stage keeps its persistent output-pose tracker in the native virtual camera-`B` basis, then exports `midpoint_3d` and `instrument_pose_quaternion` in the visualizer world basis derived from the visualizer helper's known `B` camera placement (`midpoint_3d` in meters and already in scene/world coordinates; quaternion order `[x, y, z, w]`, ready to apply directly to the original Three.js instrument asset). Whenever preview generation is enabled, it emits base64-encoded quarter-resolution `LL` and `LR` previews as `context["export"]["bgr_image_ll_base64"]` and `context["export"]["bgr_image_lr_base64"]` after the frame's buffered marker rendering is flushed. It also converts the fixed instrument model points from their Three.js/model-local basis into the native Sense/OpenCV basis before pose recovery, and gates both translation and rotation through a persistent output-pose tracker that enters after 3 stable frames, holds the last stable pose across up to 2 dropped frames, and leaves tracking after the configured unstable-frame budget.
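The "projected major axis becomes the square ROI size" rule can be illustrated with a pinhole small-sphere approximation: a sphere of radius `r` at depth `z` projects to a radius of roughly `f * r / z` pixels. This is an illustrative formula sketch under that assumption, not the s6 projection code, and the function signature is hypothetical.

```python
def sphere_to_square_roi(center_xyz, radius, fx, fy, cx, cy):
    """Project a world-space sphere with a pinhole model and return a square
    ROI (x, y, side) whose side matches the projected diameter.
    Small-sphere approximation: projected radius ~= f * r / z."""
    x, y, z = center_xyz
    if z <= radius:
        raise ValueError("sphere touches or crosses the image plane")
    u = fx * x / z + cx  # projected sphere center (pixels)
    v = fy * y / z + cy
    side = 2.0 * max(fx, fy) * radius / z
    return (u - side / 2.0, v - side / 2.0, side)
```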
Tracking state for PipelineT1 lives on the pipeline instance itself through its persistent `TrajectoryV2` and output-pose gate; frame-local solver output is published under `context["debug"]["targets"]["instrument_tip"]`.
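The output-pose gate hysteresis described above (enter after 3 stable frames, hold across up to 2 dropped frames, leave after the unstable budget) can be modeled as a tiny state machine. This is a toy sketch of that behavior, not the `OutputPoseGate` API; parameter and method names are illustrative.

```python
class OutputPoseGateSketch:
    """Toy hysteresis gate over per-frame pose solves."""

    def __init__(self, enter_after=3, hold_frames=2):
        self.enter_after = enter_after    # stable frames needed to enter
        self.hold_frames = hold_frames    # dropped frames tolerated while held
        self.stable_run = 0
        self.unstable_run = 0
        self.tracking = False
        self.last_pose = None

    def update(self, pose):
        """Feed one frame's pose (None when the solve failed); return the
        gated pose, or None while not tracking."""
        if pose is not None:
            self.stable_run += 1
            self.unstable_run = 0
            self.last_pose = pose
            if self.stable_run >= self.enter_after:
                self.tracking = True
        else:
            self.stable_run = 0
            self.unstable_run += 1
            if self.unstable_run > self.hold_frames:
                self.tracking = False   # unstable budget exhausted
                self.last_pose = None
        return self.last_pose if self.tracking else None
```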
## Runtime Modes

### Headless mode

- Default mode when `--ui` is not set.
- Runs capture and optional inference in the foreground process.
### UI mode

- Enabled with `s6 track --ui`.
- Starts a spawned worker process for capture and inference because CUDA and TensorRT initialization are not reliable in a forked child.
- The main process hosts the Qt/VisPy UI and consumes processed contexts over a multiprocessing queue.
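The spawned-worker pattern can be sketched with the standard library: requesting the `spawn` start method gives the child a fresh interpreter, avoiding the inherited CUDA state that makes forked children unreliable. A minimal sketch with a placeholder worker body, not the `TrackRuntime` implementation:

```python
import multiprocessing as mp

def track_worker(queue):
    """Worker-side entry point: initialize CUDA/TensorRT here, then push
    processed contexts to the UI process. (Placeholder body.)"""
    queue.put({"export": {}, "debug": {}})

def start_worker():
    """Start the worker with the 'spawn' start method."""
    ctx = mp.get_context("spawn")
    queue = ctx.Queue()
    proc = ctx.Process(target=track_worker, args=(queue,), daemon=True)
    proc.start()
    return proc, queue

if __name__ == "__main__":
    proc, queue = start_worker()
    print(queue.get())  # first processed context from the worker
    proc.join()
```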
### Uplink mode

- Enabled with `s6 track --uplink [WS_URL]`.
- Runs headlessly and forwards `{"export": context["export"]}` to the given WebSocket endpoint.
- Intentionally incompatible with `--ui`.
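The forwarded payload shape can be sketched as a small serializer. This illustrates only the `{"export": ...}` snapshot contract; the transport loop shown in the comment is a hypothetical example using the third-party `websockets` package, not the s6 uplink code.

```python
import json

def make_uplink_snapshot(context):
    """Build the JSON payload forwarded on the uplink: only the export
    section of the newest context is sent."""
    return json.dumps({"export": context.get("export", {})})

# A hypothetical transport loop (third-party `websockets` package):
#
#   async with websockets.connect(ws_url) as ws:
#       await ws.send(make_uplink_snapshot(context))
```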
## Retired Components

The old streamer-dependent runtime is retired:

- `s6 stream`
- `s6 id`
- `s6 data collect`
- `s6 track -i network`

See retired_streamer.md for migration context.