# CLI Reference

## Main Commands

### Project Creation

#### Create Data Product

Creates a data product repository (data_{name}).

#### Create Project

Creates a project repository (prj_{name}).

#### Track Code Repository

Tracks a code-only repository (library, package, tool) by dropping a metadata.json for governance, ownership, and mirroring. No directory scaffold is created.

#### Create Enclave Workspace

Creates a secure data enclave workspace (enclave_{name}).

Enclave-specific options:
| Option | Description |
|---|---|
| --registry-url TEXT | Data Product Catalog GitHub URL |
#### Create from Custom Template

### Common Options
| Option | Description |
|---|---|
| -n, --name TEXT | Project name (required) |
| -p, --path PATH | Output directory (default: current) |
| --lang TEXT | Primary programming language (python\|r\|stata), required for data/project |
| --no-git | Skip Git initialization |
| --no-dvc | Skip DVC initialization |
| --bucket TEXT | Custom DVC bucket name |
| --register | Register project with Data Product Catalog |
| --use-current-repo | Use current directory as project root |
### Governance Options

These options are available for the data, project, and code commands:
| Option | Description |
|---|---|
| --public | Mark as public data |
| --private | Mark as private/lab data (default) |
| --contract TEXT | Mark as contract data (provide contract slug) |
| --contract-info TEXT | Description or link to contract |
| --team TEXT | Owning team slug |
| --admin-team TEXT | Override default admin team |
| --researcher-team TEXT | Override default researcher team |
## Configuration

```
mintd config show                           # Show current config
mintd config setup                          # Interactive setup
mintd config setup --set KEY VALUE          # Set specific value
mintd config setup --set-credentials        # Set storage credentials
mintd config setup --from lab-config.yaml   # Import config from YAML file
mintd config validate                       # Test S3 connection
```
## Data Management

```
mintd data add <targets>                 # Track files/directories with DVC
mintd data list                          # List available data products
mintd data list --imported               # List imported dependencies
mintd data pull                          # Pull DVC data in current project
mintd data pull <product>                # Clone product repo + pull primary data
mintd data pull <product> --all          # Clone product repo + pull all data
mintd data pull <product> --rev v2.0     # Pull a specific version
mintd data import <product>              # Import data/final/ from product (default)
mintd data import <product> --all        # Import entire data/ directory
mintd data import <product> --stage raw  # Import specific stage
mintd data push                          # Push all DVC-tracked data to project remote
mintd data push <targets>                # Push specific .dvc files or stages
mintd data verify                        # Verify all .dvc files have valid hashes
mintd data verify <targets>              # Verify specific .dvc files
mintd data update                        # Update all DVC imports to latest version
mintd data update <path>                 # Update specific .dvc file
mintd data remove <import>               # Remove a data import from the project
```
### Data Add Options

mintd data add wraps dvc add with a guard that warns when tracking large directories (which can cause pulls to hang when the remote uses version_aware=true).
| Option | Description |
|---|---|
| TARGETS | Files or directories to track (required) |
| --no-commit | Don't put files/directories into cache |
| --glob | Allow targets with shell-style wildcards |
| -o, --out TEXT | Destination path to put files to |
| --to-remote | Upload directly to remote storage |
| -r, --remote TEXT | Remote storage to use |
| --remote-jobs INT | Number of parallel upload jobs |
| -f, --force | Override local file or folder if exists |
| --no-relink | Don't recreate links from cache to workspace |
### Data Pull Options

mintd data pull has two modes:

- Inside a mintd project (no arguments): Fetches DVC-tracked data for the current project. Uses a fast S3 sync path that bypasses DVC's per-file HEAD requests when possible (version_aware remotes), falling back to dvc pull -r <remote> for targets that cannot be fast-synced. Files-format directory targets (cloud-versioned directories with inline files: lists) are retried automatically on transient S3 errors and are never sent to the dvc pull fallback (which is broken for cloud-versioned directories). Import .dvc files (created by dvc import) are automatically detected and pulled from their source repositories rather than the project's own remote.
- With a product name: Clones the product's GitHub repo and pulls its primary data from S3 via DVC. The primary data path is read from the catalog's data_products.primary field, falling back to data/final/ for older projects. Pipeline outputs using wdir in dvc.yaml are correctly resolved.
| Option | Description |
|---|---|
| PRODUCT_NAME | Product to clone and pull (optional) |
| -d, --dest TEXT | Destination directory (default: ./<product-name>/) |
| --rev TEXT | Git tag or ref (default: latest) |
| --all | Pull all DVC data, not just the primary product |
| -j, --jobs INT | Number of parallel download jobs |
| -p, --project-path PATH | Path to project directory (for local pull) |
Examples:

```
# Pull DVC data inside your current project
mintd data pull

# Clone a product repo and pull its primary data
mintd data pull aha-annual-survey

# Clone and pull all data (not just primary)
mintd data pull aha-annual-survey --all

# Clone to a specific directory
mintd data pull aha-annual-survey --dest ~/Desktop/aha-data

# Pull a specific version
mintd data pull aha-annual-survey --rev v2.0
```
### Data Import Options
By default, mintd data import imports only data/final/ (the validated output) from the source data product. If data/final/ is not found, you are prompted to choose from available directories.
| Option | Description |
|---|---|
| --stage TEXT | Pipeline stage to import (raw, intermediate, final) |
| --source-path TEXT | Specific path to import from the product |
| --all | Import the entire data/ directory |
| --dest TEXT | Local destination path (default: data/imports/<product>/) |
| --rev TEXT | Specific git revision to import from |
| -p, --project-path PATH | Path to project directory |
--stage, --source-path, and --all are mutually exclusive.
### Data Push Options
| Option | Description |
|---|---|
| TARGETS | Specific .dvc files or pipeline stages to push (optional) |
| -j, --jobs INT | Number of parallel upload jobs |
| -p, --project-path PATH | Path to project directory |
The push command reads the DVC remote name from metadata.json so data is always pushed to the correct S3 location. There is no need to specify the remote manually.
Before pushing, mintd data push checks all targets for .dvc files with stripped md5 hashes. Hash-missing targets are excluded from the push, reported with fix guidance, and the command returns a non-zero exit code. Use mintd data verify to check integrity without pushing.
### Data Verify Options
mintd data verify checks that all .dvc files have valid md5 hashes. Files with stripped hashes (e.g. hash: md5 but no actual md5: value) are reported as errors. Exits non-zero when problems are found, making it usable in CI.
| Option | Description |
|---|---|
| TARGETS | Specific .dvc files to verify (optional — defaults to all) |
| -p, --project-path PATH | Path to project directory |
Examples:

```
# Verify all .dvc files in the current project
mintd data verify

# Verify specific files
mintd data verify resources/fromdropbox/raw.dvc data/raw/nber/impact2024.dta.dvc

# Verify a different project
mintd data verify --project-path /projects/data_my_analysis
```
When hash-missing files are found, the output includes fix guidance:

```
2 .dvc file(s) have missing md5 hashes:
  resources/fromdropbox/raw.dvc
  resources/fromdropbox/clean.dvc

To fix, run: mintd data add <path>
Or restore hashes from git history: git log -p -- <file>.dvc
```
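The stripped-hash condition that verify looks for can be sketched as a small predicate. This is a heuristic illustration working on raw text, not mintd's actual parser:

```python
def has_stripped_hash(dvc_text):
    """Sketch of the verify check: an out that declares `hash: md5`
    but carries no actual `md5:` value has been stripped and cannot
    be pushed or pulled."""
    declares_algo = False
    has_value = False
    for line in dvc_text.splitlines():
        stripped = line.strip().lstrip("- ")
        if stripped.startswith("hash: md5"):
            declares_algo = True
        elif stripped.startswith("md5:") and stripped.split(":", 1)[1].strip():
            has_value = True
    return declares_algo and not has_value
```

A file failing this predicate is exactly the kind of target that mintd data push excludes and reports before uploading anything.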
### Data Remove Options
| Option | Description |
|---|---|
| -f, --force | Remove even if dvc.yaml still has references |
| -p, --project-path PATH | Path to project directory |
### Data Update Options
| Option | Description |
|---|---|
| --rev TEXT | Specific git revision to update to |
| --dry-run | Show what would be updated without making changes |
| -p, --project-path PATH | Path to project directory |
## Registry Management

```
mintd registry register --path <path>           # Register existing project
mintd registry status <project_name>            # Check registration status
mintd registry sync                             # Process pending registrations
mintd registry update [project_name] [OPTIONS]  # Update project metadata in registry
```
### Registry Update Options
| Option | Description |
|---|---|
| -p, --path PATH | Path to project directory |
| --dry-run | Show changes without creating PR |
## Enclave Management

```
mintd enclave add <product>     # Add data product to approved list
mintd enclave list              # List approved/transferred products
mintd enclave pull              # Pull data products from registry
mintd enclave package           # Package data for transfer
mintd enclave unpack <archive>  # Unpack a transfer archive
mintd enclave verify            # Verify transfer and update manifest
mintd enclave clean             # Prune old versions, clean staging
```
## Validation

```
mintd check               # Check metadata.json vs DVC config consistency
mintd check --validate    # Also validate metadata completeness
mintd check --path <dir>  # Check a specific project directory
```
### Check Options
| Option | Description |
|---|---|
| -p, --path PATH | Path to project directory |
| --validate | Run metadata completeness validation in addition to DVC consistency check |
The check command compares the DVC remote name and URL in metadata.json against the actual .dvc/config file. With --validate, it also checks that all required metadata fields are populated and consistent with your mintd configuration.
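A minimal sketch of that consistency check is below. The metadata.json field names (dvc_remote, dvc_remote_url) are assumptions for illustration; the real schema may use different keys. DVC's .dvc/config is INI-style, so the stdlib configparser can read it.

```python
import configparser
import json

def check_remote_consistency(metadata_path, dvc_config_path):
    """Sketch: compare the remote name/URL recorded in metadata.json
    against the project's .dvc/config. Returns None when consistent,
    otherwise a human-readable description of the mismatch."""
    with open(metadata_path) as f:
        meta = json.load(f)
    cfg = configparser.ConfigParser()
    cfg.read(dvc_config_path)
    name = meta.get("dvc_remote")
    section = f'remote "{name}"'   # DVC stores remotes as [remote "<name>"]
    if section not in cfg:
        return f"remote '{name}' not found in .dvc/config"
    actual_url = cfg[section].get("url")
    if actual_url != meta.get("dvc_remote_url"):
        return f"url mismatch: {actual_url} != {meta.get('dvc_remote_url')}"
    return None  # consistent
```

Returning a message rather than raising keeps the sketch close to how a check command would collect and report all findings before exiting non-zero.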
## Template Listing

## Update Commands

```
mintd update hooks              # Add or update pre-commit hooks
mintd update utils              # Update mintd utility scripts
mintd update metadata           # Update metadata.json to latest schema
mintd update storage            # Update DVC storage configuration
mintd update schema             # Add Frictionless Table Schema support
mintd update schema --generate  # Auto-generate schema from data files
mintd update schema --force     # Overwrite existing schema.json
```
### Update Metadata Options
| Option | Description |
|---|---|
| -p, --path PATH | Path to project directory |
| --classification TEXT | Data classification (public\|private\|contract) |
| --mirror-url TEXT | Mirror repository URL |
| --migrate | Migrate v1 metadata to v1.1 schema |
### Update Storage Options
| Option | Description |
|---|---|
| -p, --path PATH | Path to project directory |
| -y, --yes | Skip confirmation |
### Update Schema Options
| Option | Description |
|---|---|
| -p, --path PATH | Path to project directory |
| -g, --generate | Auto-generate schema from data files |
| -f, --force | Overwrite existing schema.json |
## Schema Generation

Generates a Frictionless Table Schema from data files. This command now lives under the data group. Stata data projects use it as a standalone DVC pipeline stage (since Stata cannot call Python directly); Python and R projects use the scaffolded schemas/generate_schema.py script instead.

```
mintd data schema generate                            # Auto-detect project root
mintd data schema generate --project-dir /path        # Specify project root
mintd data schema generate --data-dir /custom/data    # Override data directory
mintd data schema generate --output /custom/out.json  # Override output path
```
### Schema Generate Options
| Option | Description |
|---|---|
| --project-dir PATH | Project root directory. Auto-detected via metadata.json if not specified. |
| --data-dir PATH | Directory containing data files. Default: <project>/data/final/ |
| --output PATH | Output schema file path. Default: <project>/schemas/v1/schema.json |
The output follows the Frictionless Table Schema specification and is deterministic (no timestamps) for DVC caching compatibility.
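To illustrate why determinism matters for DVC caching, here is a hedged sketch of a generator whose output depends only on its input: field order follows the header, keys are sorted, and no timestamps are written, so identical data always yields byte-identical schema files. The CSV handling and type inference are far cruder than the real tool's.

```python
import csv
import json

def infer_type(values):
    """Very rough type inference over a column's values."""
    def all_match(cast):
        try:
            for v in values:
                cast(v)
            return True
        except ValueError:
            return False
    if all_match(int):
        return "integer"
    if all_match(float):
        return "number"
    return "string"

def generate_schema(csv_path):
    """Sketch of a deterministic Frictionless-style Table Schema:
    no timestamps or absolute paths, stable field order and key order,
    so identical inputs produce byte-identical output."""
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)
    fields = [
        {"name": col, "type": infer_type([r[i] for r in rows])}
        for i, col in enumerate(header)
    ]
    return json.dumps({"fields": fields}, indent=2, sort_keys=True)
```

Because the output is a pure function of the data, DVC can cache the schema stage and skip regeneration whenever the input files are unchanged.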