s6.app.dataset

Manage dataset directories locally and in R2.

This master command provides subcommands to list, upload, download, and delete dataset directories. A “dataset directory” is any folder that contains a data.jsonl file (see the examples under ./temp). By default, the command operates on the local root ./temp and stores each remote dataset under the R2 prefix datasets/<name>/ in the configured bucket.
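The "contains data.jsonl" rule can be illustrated with a short sketch. The helper name find_dataset_dirs is hypothetical and not part of s6, and the sketch assumes that only immediate children of the root are inspected; the actual command may search differently.

```python
from pathlib import Path


def find_dataset_dirs(root: Path) -> list[str]:
    """Return sorted names of immediate subfolders of root that hold data.jsonl.

    Hypothetical helper: scanning only one level deep is an assumption.
    """
    if not root.is_dir():
        # A missing local root simply means there are no datasets.
        return []
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir() and (child / "data.jsonl").is_file()
    )
```

For example, find_dataset_dirs(Path("./temp")) would return the names of the folders under ./temp that qualify as datasets.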

Defaults

  • Local root: ./temp (override with --root)

  • Remote base prefix: datasets/ (override with --remote-prefix)

  • R2 connection defaults (bucket/endpoint) match the R2 CLI defaults and can be overridden with flags or environment variables.

  • Upload and download use multiple threads and show progress by default.
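As a rough illustration of how a local dataset name could map onto the remote layout datasets/<name>/, the sketch below builds object keys from a base prefix. The function names remote_dataset_prefix and object_key are hypothetical, and the slash normalization is an assumption rather than documented behavior.

```python
def remote_dataset_prefix(name: str, base_prefix: str = "datasets/") -> str:
    """Build the remote prefix for one dataset, e.g. 'datasets/diverse_2/'.

    Hypothetical helper: stray slashes are trimmed so that exactly one
    separator sits between the base prefix and the dataset name.
    """
    parts = [p for p in (base_prefix.strip("/"), name.strip("/")) if p]
    return "/".join(parts) + "/"


def object_key(name: str, relative_path: str, base_prefix: str = "datasets/") -> str:
    """Build the object key for a file located inside a dataset directory."""
    # Normalize Windows-style separators so keys always use forward slashes.
    rel = relative_path.replace("\\", "/").lstrip("/")
    return remote_dataset_prefix(name, base_prefix) + rel
```

Under these assumptions, ./temp/diverse_2/data.jsonl would be stored as datasets/diverse_2/data.jsonl, and passing a different base_prefix mirrors the effect of --remote-prefix.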

Examples

  • List local datasets: s6 dataset list

  • List remote datasets: s6 dataset list --remote

  • Upload local dataset ./temp/diverse_2 to R2: s6 dataset upload diverse_2

  • Download remote dataset into ./temp/fail_case: s6 dataset download fail_case

  • Delete a remote dataset: s6 dataset delete fail_case --yes

s6.app.dataset.cmd_list(args: Namespace) → None
s6.app.dataset.cmd_upload(args: Namespace) → None
s6.app.dataset.cmd_download(args: Namespace) → None
s6.app.dataset.cmd_delete(args: Namespace) → None
s6.app.dataset.main() → None
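These signatures are consistent with the common argparse pattern in which each subcommand registers a handler that receives the parsed Namespace. The sketch below shows that pattern only; it is not the module's actual code. The handlers are stubs, build_parser and run are hypothetical names, and placing --root and --remote-prefix on the top-level parser is an assumption.

```python
import argparse
from typing import Optional, Sequence


def cmd_list(args: argparse.Namespace) -> None:
    print(f"list (remote={args.remote}) under {args.root}")


def cmd_upload(args: argparse.Namespace) -> None:
    print(f"upload {args.name} to {args.remote_prefix}{args.name}/")


def cmd_download(args: argparse.Namespace) -> None:
    print(f"download {args.name} into {args.root}/{args.name}")


def cmd_delete(args: argparse.Namespace) -> None:
    print(f"delete {args.name} (confirmed={args.yes})")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser wiring; flag placement is an assumption."""
    parser = argparse.ArgumentParser(prog="s6 dataset")
    parser.add_argument("--root", default="./temp")
    parser.add_argument("--remote-prefix", default="datasets/")
    sub = parser.add_subparsers(dest="command", required=True)

    p_list = sub.add_parser("list")
    p_list.add_argument("--remote", action="store_true")
    p_list.set_defaults(func=cmd_list)

    # upload and download each take a single dataset name.
    for name, handler in (("upload", cmd_upload), ("download", cmd_download)):
        p = sub.add_parser(name)
        p.add_argument("name")
        p.set_defaults(func=handler)

    p_delete = sub.add_parser("delete")
    p_delete.add_argument("name")
    p_delete.add_argument("--yes", action="store_true")
    p_delete.set_defaults(func=cmd_delete)
    return parser


def run(argv: Optional[Sequence[str]] = None) -> None:
    """Parse argv and dispatch to the handler chosen by the subcommand."""
    args = build_parser().parse_args(argv)
    args.func(args)
```

With this wiring, run(["delete", "fail_case", "--yes"]) would call cmd_delete with args.name set to "fail_case" and args.yes set to True.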