s6.app.dataset
Manage dataset directories locally and in R2.
This master command provides subcommands to list, upload, download, and
delete dataset directories. A “dataset directory” is any folder that contains
data.jsonl (see examples under ./temp). By default, the command works
under the local root ./temp and stores remote datasets under the R2 prefix
datasets/<name>/ in the configured bucket.
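The two conventions above (a dataset directory is a folder containing data.jsonl, and its remote location is datasets/<name>/) can be sketched in a few lines. This is only an illustration, not the module's actual code: the helper names `find_dataset_dirs` and `remote_prefix` are hypothetical, and the sketch assumes that only immediate subdirectories of the local root are scanned.

```python
from pathlib import Path


def find_dataset_dirs(root: Path) -> list[str]:
    """Return names of immediate subdirectories of root that contain data.jsonl.

    Assumption: datasets sit directly under the root; nested folders are not searched.
    """
    if not root.is_dir():
        return []
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir() and (child / "data.jsonl").is_file()
    )


def remote_prefix(name: str, base: str = "datasets/") -> str:
    """Build the remote key prefix for a dataset, i.e. <base>/<name>/."""
    return f"{base.rstrip('/')}/{name.strip('/')}/"
```

With the default base, `remote_prefix("fail_case")` yields `datasets/fail_case/`, which matches the layout described above.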
Defaults
- Local root: ./temp (override with --root)
- Remote base prefix: datasets/ (override with --remote-prefix)
- R2 connection defaults (bucket/endpoint) match the R2 CLI defaults and can be overridden with flags or environment variables.
- Upload/download use multiple threads and always show progress by default.
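As a rough illustration of multi-threaded transfer with progress reporting, the sketch below runs a per-file callback on a thread pool and prints a running count. It is not taken from the module: `transfer_all`, `transfer_one`, `max_workers`, and `show_progress` are hypothetical names, and the real command may batch work or render progress differently.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterable


def transfer_all(
    items: Iterable[str],
    transfer_one: Callable[[str], None],
    max_workers: int = 8,
    show_progress: bool = True,
) -> int:
    """Run transfer_one for every item on a thread pool; return the number completed."""
    items = list(items)
    done = 0
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(transfer_one, item) for item in items]
        for future in as_completed(futures):
            future.result()  # re-raise any exception from the worker thread
            done += 1
            if show_progress:
                # Overwrite the same terminal line with the running count.
                print(f"\r{done}/{len(items)} files", end="", flush=True)
    if show_progress and items:
        print()  # finish the progress line
    return done
```

In this sketch `transfer_one` would wrap whatever call uploads or downloads a single file; the first worker exception propagates to the caller once its future completes.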
Examples
List local datasets:
    s6 dataset list
List remote datasets:
    s6 dataset list --remote
Upload local dataset ./temp/diverse_2 to R2:
    s6 dataset upload diverse_2
Download remote dataset into ./temp/fail_case:
    s6 dataset download fail_case
Delete a remote dataset:
    s6 dataset delete fail_case --yes
- s6.app.dataset.cmd_list(args: Namespace) -> None
- s6.app.dataset.cmd_upload(args: Namespace) -> None
- s6.app.dataset.cmd_download(args: Namespace) -> None
- s6.app.dataset.cmd_delete(args: Namespace) -> None
- s6.app.dataset.main() -> None
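The signatures above suggest the common argparse pattern in which each subcommand registers a handler that receives the parsed Namespace. The following is a minimal sketch of that pattern, not the module's implementation: the handler bodies are placeholders, `build_parser` is a hypothetical helper, only two subcommands are wired up, and `main` takes an optional `argv` here purely to make the sketch easy to exercise (the documented `main()` takes no arguments).

```python
import argparse
from typing import Optional, Sequence


def cmd_list(args: argparse.Namespace) -> None:
    # Placeholder body: the real handler is not shown on this page.
    where = "remote" if args.remote else "local"
    print(f"would list {where} datasets")


def cmd_upload(args: argparse.Namespace) -> None:
    # Placeholder body: the real handler is not shown on this page.
    print(f"would upload dataset {args.name!r}")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical helper: one sub-parser per subcommand, each bound to a handler."""
    parser = argparse.ArgumentParser(prog="s6 dataset")
    sub = parser.add_subparsers(dest="command", required=True)

    p_list = sub.add_parser("list", help="List dataset directories")
    p_list.add_argument("--remote", action="store_true")
    p_list.set_defaults(func=cmd_list)

    p_upload = sub.add_parser("upload", help="Upload a local dataset")
    p_upload.add_argument("name")
    p_upload.set_defaults(func=cmd_upload)
    return parser


def main(argv: Optional[Sequence[str]] = None) -> None:
    args = build_parser().parse_args(argv)
    args.func(args)  # dispatch to the handler chosen by the subcommand
```

Calling `main(["list", "--remote"])` in this sketch dispatches to `cmd_list` with `args.remote` set to True.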