# Recipe: Dataset Capture + Dev Loop

_Collect a dataset from the network source and use it to iterate on pipeline code._

This recipe covers two workflows:

- Record a dataset from live network cameras (data collection).
- Replay that dataset in a tight loop to develop and test `_pipeline.py` deterministically.

---

## 0) Prerequisites

- `s6` installed and runnable (`pip install -e .`).
- Start the capture server in another terminal:

  ```bash
  s6 stream
  ```

- Optional but recommended: author camera ordering/ROIs once so frames are consistent:

  ```bash
  s6 id
  ```

- If you have a pipeline config (JSON/YAML), keep it under `configs/` for repeatability.

---

## 1) Record from the network source

Use `track` in record-only mode so frames are saved without running inference.

```bash
# Record from the live network stream into a dataset directory
s6 track -i network -o ./datasets/run_net_01 -r

# Optionally also write logs for environment + performance context
s6 track -i network -o ./datasets/run_net_01 -r -x
```

Notes

- `-i network` attaches to the running `s6 stream` server.
- `-o` sets the dataset output directory (created if missing).
- `-r/--record-only` skips inference for maximal throughput while collecting data.
- With `-x`, a run folder is created under `logs/runs//` with `metrics.json` and a Chrome trace (`perf.log.json`).

---

## 2) Develop and test the pipeline on the dataset

Re-run `track` against your saved dataset. Use `--repeat` to loop playback for quick iteration.

```bash
# Headless, repeat playback, and write logs for profiling
s6 track ./datasets/run_net_01 --repeat -x

# With a UI to visualize overlays while you iterate on code
s6 track ./datasets/run_net_01 --repeat -v

# Pin a specific pipeline config to keep runs consistent
s6 track ./datasets/run_net_01 --repeat -x \
  --config ./configs/pipeline.config.yaml
```

Tips

- Edit `src/s6/app/_pipeline.py`, then re-run the same command to validate changes deterministically.
- For fast performance investigation, open `logs/runs//perf.log.json` in `chrome://tracing` or Perfetto (ui.perfetto.dev).
- Use `s6 perf-stats` to summarize timing from `metrics.json` and compare runs.

---

## 3) Optional: augment or refine the dataset

- If needed, run the filter UI to prune bad samples:

  ```bash
  s6 data filter ./datasets/run_net_01
  ```

- To extend coverage, record additional sessions and merge their directories (the format is append-friendly).

---

## Troubleshooting

- Recording is slow or drops frames: keep `--record-only` and close other heavy apps; consider disabling the UI during collection.
- Replay is too fast or too slow: dataset frames play back as fast as the pipeline processes them; use the profiling logs (`-x`) to spot bottlenecks.
- Results differ between runs: pin your `--config` and environment; ensure calibrations are up to date (`s6 id` and `configs/`).

---

See also

- Tracking app usage: `application/track.md`
- Chrome trace workflow: `recipes/pipeline_chrome_trace.md`
- Stats comparison tool: `application/perf-stats.md`
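
---

Appendix: merging recorded sessions

Section 3 mentions merging additional recording sessions into one dataset directory. A minimal sketch of that merge, demonstrated with throwaway directories rather than real datasets; the frame filenames here are illustrative assumptions, and `cp -n` keeps existing files from being overwritten. Verify on a copy before merging real data.

```bash
# Demo with throwaway directories; substitute your real dataset paths.
# Assumes frame files are uniquely named across sessions (names are illustrative).
a=$(mktemp -d)   # stands in for ./datasets/run_net_01
b=$(mktemp -d)   # stands in for a second session, e.g. ./datasets/run_net_02
touch "$a/frame_0001.png" "$b/frame_0002.png"

# Merge session b into session a; -n never overwrites existing frames.
cp -rn "$b"/. "$a"/
ls "$a"
```

If filenames could collide across sessions, check for overlaps first (for example, compare sorted `ls` output of both directories) before relying on `-n` to silently skip duplicates.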