Recipe: Dataset Capture and Replay Loop

Record live GStreamer data, replay it deterministically, and refine samples in place.

This recipe covers a practical loop for pipeline work:

  • Capture a dataset from live GStreamer cameras.

  • Replay that dataset while iterating on pipeline code.

  • Inspect or prune samples before retraining or exporting a cleaned set.


0) Prerequisites

  • s6 installed and runnable (pip install -e .).

  • A valid GStreamer client configuration for the live capture source.

  • A pipeline config under configs/ if you want repeatable replay runs.


1) Record from live cameras

Use track in record-only mode so frames are saved without running inference.

# Record from the live gst source into a dataset directory
s6 track -i gst -o ./temp/run_net_01 -r

# Add logs if you want timing and trace output for the run
s6 track -i gst -o ./temp/run_net_01 -r -x

Notes

  • -i gst uses the configured remote GStreamer camera source.

  • -o sets the dataset output directory and supports path placeholders.

  • -r/--record-only skips inference and is the fastest way to collect data.

  • -x/--output-log writes logs/runs/<timestamp>/metrics.json and perf.log.json.
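Conceptually, record-only mode is a capture loop that persists frames and their timestamps while skipping the inference stage entirely. A minimal sketch of that idea in Python; every name here is illustrative, not the actual s6 internals:

```python
import json
import time
from pathlib import Path

def record_frames(frame_source, out_dir, max_frames=None):
    """Save raw frames plus a timestamp index; no inference is run.

    frame_source: iterable yielding (frame_id, frame_bytes) pairs.
    All names are hypothetical, for illustration only.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    index = []
    for i, (frame_id, frame_bytes) in enumerate(frame_source):
        if max_frames is not None and i >= max_frames:
            break
        # Write the raw frame untouched; record when it arrived.
        (out / f"{frame_id}.bin").write_bytes(frame_bytes)
        index.append({"id": frame_id, "ts": time.monotonic()})
    # The timestamp index is what makes deterministic replay possible later.
    (out / "index.json").write_text(json.dumps(index))
    return len(index)
```

The key point is that the timestamps saved alongside the frames are what later allows --realtime-playback to reproduce the original spacing.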


2) Replay the dataset while you iterate

Point track back at the saved dataset directory. Use --repeat to loop the playback for quick iteration, and --realtime-playback when you want recorded timestamp spacing instead of maximum-speed replay.

# Headless replay, loop forever, and write logs for profiling
s6 track -i ./temp/run_net_01 --repeat -x

# Use the UI while iterating on overlays or pipeline behavior
s6 track -i ./temp/run_net_01 --repeat -v

# Keep the pipeline config pinned and replay at recorded timing
s6 track -i ./temp/run_net_01 --repeat --realtime-playback \
  --config ./configs/pipeline.config.yaml
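The difference between maximum-speed replay and --realtime-playback comes down to whether the player sleeps for each inter-frame gap recorded at capture time. A sketch of that pacing logic, assuming timestamps were saved with the frames (this is an illustration, not the s6 implementation):

```python
import time

def playback_delays(timestamps):
    """Inter-frame sleep intervals that reproduce the recorded spacing.

    timestamps: recorded capture times in seconds, in arrival order.
    Non-positive gaps are clamped to zero so playback never goes backwards.
    """
    return [max(0.0, b - a) for a, b in zip(timestamps, timestamps[1:])]

def replay(frames, timestamps, handle, realtime=False, sleep=time.sleep):
    """Feed each frame to `handle`; honor recorded spacing when realtime=True."""
    delays = playback_delays(timestamps) if realtime else []
    for i, frame in enumerate(frames):
        handle(frame)
        if realtime and i < len(delays):
            sleep(delays[i])  # wait out the recorded gap before the next frame
```

Without realtime pacing the loop pushes frames as fast as the pipeline consumes them, which is why non-realtime replay is the faster choice while iterating on code.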

Tips

  • Keep the same --config and --run-level when you want comparable runs.

  • Edit the relevant module under src/s6/app/pipeline/, then re-run the same replay command to validate changes.

  • s6 perf-stats summarizes metrics.json for one run or compares two runs.


3) Inspect or prune bad samples

Use the dataset filter UI to clean up the recorded data before training.

s6 data filter ./temp/run_net_01

If your training config already describes the dataset layout, you can point the filter at the config file instead:

s6 data filter --config ./cfg/ds_keypoint.json

Controls

  • a and d move to the previous or next entry in the current dataset.

  • w and s switch to the previous or next non-empty dataset when multiple dataset roots are configured.

  • x deletes the current entry.

  • q quits.
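The controls above amount to a key-dispatch loop over a cursor into the dataset. A small sketch of that structure; the state layout and function are assumptions for illustration, not the real data filter code, and only a/d/x/q are modeled (w/s would act on a second, dataset-level cursor):

```python
def handle_key(key, state):
    """Advance a browsing state for one keypress; return False to quit.

    state: {"entries": list, "pos": int, "deleted": list} -- hypothetical.
    """
    entries, pos = state["entries"], state["pos"]
    if key == "a" and pos > 0:
        state["pos"] = pos - 1                        # previous entry
    elif key == "d" and pos < len(entries) - 1:
        state["pos"] = pos + 1                        # next entry
    elif key == "x" and entries:
        state["deleted"].append(entries.pop(pos))     # delete current entry
        state["pos"] = min(pos, len(entries) - 1)     # stay in bounds
    elif key == "q":
        return False                                  # quit the session
    return True
```

Keeping deletion as a list of removed entries (rather than destroying data immediately) is a common safety choice for this kind of pruning tool.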


4) Summarize the run

If you recorded logs with -x, use the perf-stats tool to compare timing across runs or summarize the latest capture.

# Show stats for the latest run under ./logs
s6 perf-stats

# Compare two recorded runs
s6 perf-stats ./logs/runs/2024_10_01 ./logs/runs/2024_10_02
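A two-run comparison like this boils down to diffing per-stage timing summaries taken from each run's metrics.json. The schema below (stage name mapped to a list of millisecond samples) is an assumption made for illustration, not the documented s6 format:

```python
import statistics

def summarize(metrics):
    """Mean latency per stage for one run's {stage: [ms, ...]} mapping."""
    return {stage: statistics.mean(samples) for stage, samples in metrics.items()}

def compare(run_a, run_b):
    """Mean delta (run_b - run_a) per stage present in both runs.

    Negative values mean run_b got faster for that stage.
    """
    a, b = summarize(run_a), summarize(run_b)
    return {stage: b[stage] - a[stage] for stage in a.keys() & b.keys()}
```

Comparing means per stage is the simplest useful view; a real tool would likely also report percentiles, which matter more when frame times are spiky.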

Troubleshooting

  • Recording is slow or drops frames: keep --record-only and close other heavy apps during collection.

  • Replay feels too fast or too slow: add --realtime-playback to honor the recorded timestamp spacing.

  • Results differ between runs: pin --config and --run-level, then verify the calibration file referenced by that config.


See also

  • Tracking app usage: application/track.md

  • Dataset browser: application/data-filter.md

  • Dataset management: application/dataset.md

  • Chrome trace workflow: recipes/pipeline_chrome_trace.md

  • Stats comparison tool: application/perf-stats.md