
CLI Reference

Main Commands

mintd --help                    # Show help
mintd --version                 # Show version

Project Creation

Create Data Product

mintd create data --name <name> --lang <language> [OPTIONS]

Creates a data product repository (data_{name}).

Create Project

mintd create project --name <name> --lang <language> [OPTIONS]

Creates a project repository (prj_{name}).

Track Code Repository

mintd create code --name <name> --lang <language> [OPTIONS]

Tracks a code-only repository (library, package, tool) by writing a metadata.json file that records governance, ownership, and mirroring information. No directory scaffold is created.
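
The actual metadata.json schema is defined by mintd; the sketch below is illustrative only, and its field names (classification, team, mirror_url) are assumptions inferred from the CLI options documented on this page.

```python
import json

# Illustrative sketch only: the real metadata.json schema is defined by
# mintd. Field names here are assumptions based on the CLI options in
# this reference, not the tool's actual schema.
def build_code_metadata(name, lang, classification="private",
                        team=None, mirror_url=None):
    """Assemble a minimal governance record for a code-only repository."""
    return {
        "name": name,
        "type": "code",
        "lang": lang,
        "classification": classification,  # private is the documented default
        "team": team,
        "mirror_url": mirror_url,
    }

record = build_code_metadata("parsetools", "python", team="econ-lab")
print(json.dumps(record, indent=2, sort_keys=True))
```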

Create Enclave Workspace

mintd create enclave --name <name> [OPTIONS]

Creates a secure data enclave workspace (enclave_{name}).

Enclave-specific options:

| Option | Description |
| --- | --- |
| --registry-url TEXT | Data Product Catalog GitHub URL |

Create from Custom Template

mintd create custom <template_name> --name <name> [OPTIONS]

Creates a repository from the named custom template.

Common Options

| Option | Description |
| --- | --- |
| -n, --name TEXT | Project name (required) |
| -p, --path PATH | Output directory (default: current) |
| --lang TEXT | Primary programming language (python\|r\|stata); required for data and project |
| --no-git | Skip Git initialization |
| --no-dvc | Skip DVC initialization |
| --bucket TEXT | Custom DVC bucket name |
| --register | Register project with Data Product Catalog |
| --use-current-repo | Use current directory as project root |

Governance Options

These options are available for data, project, and code commands:

| Option | Description |
| --- | --- |
| --public | Mark as public data |
| --private | Mark as private/lab data (default) |
| --contract TEXT | Mark as contract data (provide contract slug) |
| --contract-info TEXT | Description of or link to the contract |
| --team TEXT | Owning team slug |
| --admin-team TEXT | Override default admin team |
| --researcher-team TEXT | Override default researcher team |

Configuration

mintd config show                         # Show current config
mintd config setup                        # Interactive setup
mintd config setup --set KEY VALUE        # Set specific value
mintd config setup --set-credentials      # Set storage credentials
mintd config setup --from lab-config.yaml # Import config from YAML file
mintd config validate                     # Test S3 connection

Data Management

mintd data add <targets>              # Track files/directories with DVC
mintd data list                       # List available data products
mintd data list --imported            # List imported dependencies
mintd data pull                       # Pull DVC data in current project
mintd data pull <product>             # Clone product repo + pull primary data
mintd data pull <product> --all       # Clone product repo + pull all data
mintd data pull <product> --rev v2.0  # Pull a specific version
mintd data import <product>           # Import data/final/ from product (default)
mintd data import <product> --all     # Import entire data/ directory
mintd data import <product> --stage raw  # Import specific stage
mintd data push                       # Push all DVC-tracked data to project remote
mintd data push <targets>             # Push specific .dvc files or stages
mintd data verify                     # Verify all .dvc files have valid hashes
mintd data verify <targets>           # Verify specific .dvc files
mintd data update                     # Update all DVC imports to latest version
mintd data update <path>              # Update specific .dvc file
mintd data remove <import>            # Remove a data import from the project

Data Add Options

mintd data add wraps dvc add with a guard that warns when tracking large directories, which can cause pulls to hang when the remote uses version_aware=true.

| Option | Description |
| --- | --- |
| TARGETS | Files or directories to track (required) |
| --no-commit | Don't put files/directories into cache |
| --glob | Allow targets with shell-style wildcards |
| -o, --out TEXT | Destination path to put files to |
| --to-remote | Upload directly to remote storage |
| -r, --remote TEXT | Remote storage to use |
| --remote-jobs INT | Number of parallel upload jobs |
| -f, --force | Override local file or folder if it exists |
| --no-relink | Don't recreate links from cache to workspace |
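
The large-directory guard described above can be sketched as follows. This is a simplified illustration, not mintd's implementation, and the 1000-file threshold is an assumption.

```python
import os

# Sketch of the guard described above: warn before DVC-tracking a large
# directory, since version_aware remotes issue per-file requests and big
# directory targets can make pulls hang. Threshold is an assumed value.
def count_files(path):
    """Count all files beneath a directory."""
    total = 0
    for _, _, files in os.walk(path):
        total += len(files)
    return total

def warn_if_large(path, threshold=1000):
    """Return a warning string when a directory exceeds the threshold."""
    if os.path.isdir(path) and count_files(path) > threshold:
        return (f"{path} contains more than {threshold} files; "
                "pulls from a version_aware remote may be very slow")
    return None
```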

Data Pull Options

mintd data pull has two modes:

Inside a mintd project (no arguments): Fetches DVC-tracked data for the current project. Uses a fast S3 sync path that bypasses DVC's per-file HEAD requests when possible (version_aware remotes), falling back to dvc pull -r <remote> for targets that cannot be fast-synced. Files-format directory targets (cloud-versioned directories with inline files: lists) are retried automatically on transient S3 errors and are never sent to the dvc pull fallback (which is broken for cloud-versioned directories). Import .dvc files (created by dvc import) are automatically detected and pulled from their source repositories rather than the project's own remote.

With a product name: Clones the product's GitHub repo and pulls its primary data from S3 via DVC. The primary data path is read from the catalog's data_products.primary field, falling back to data/final/ for older projects. Pipeline outputs using wdir in dvc.yaml are correctly resolved.
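
One detail of the local-pull mode above, distinguishing import .dvc files from ordinary tracking files, can be sketched as follows. Real .dvc files are YAML; here the parsed content is represented as plain dicts, and dvc import does record the source repository under a deps[].repo key.

```python
# Sketch of distinguishing an import .dvc file (created by `dvc import`)
# from an ordinary tracking .dvc file. An import records its source
# repository under deps[].repo, so it must be pulled from that repo's
# remote rather than the current project's remote.
def is_import_dvc(dvc_data):
    """Return True when any dependency carries a source repo section."""
    return any("repo" in dep for dep in dvc_data.get("deps", []))

plain = {"outs": [{"md5": "abc123", "path": "raw.csv"}]}
imported = {
    "deps": [{"repo": {"url": "git@github.com:lab/data_aha.git"},
              "path": "data/final"}],
    "outs": [{"path": "data/imports/aha"}],
}
```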

| Option | Description |
| --- | --- |
| PRODUCT_NAME | Product to clone and pull (optional) |
| -d, --dest TEXT | Destination directory (default: ./<product-name>/) |
| --rev TEXT | Git tag or ref (default: latest) |
| --all | Pull all DVC data, not just the primary product |
| -j, --jobs INT | Number of parallel download jobs |
| -p, --project-path PATH | Path to project directory (for local pull) |

Examples:

# Pull DVC data inside your current project
mintd data pull

# Clone a product repo and pull its primary data
mintd data pull aha-annual-survey

# Clone and pull all data (not just primary)
mintd data pull aha-annual-survey --all

# Clone to a specific directory
mintd data pull aha-annual-survey --dest ~/Desktop/aha-data

# Pull a specific version
mintd data pull aha-annual-survey --rev v2.0

Data Import Options

By default, mintd data import imports only data/final/ (the validated output) from the source data product. If data/final/ is not found, you are prompted to choose from available directories.

| Option | Description |
| --- | --- |
| --stage TEXT | Pipeline stage to import (raw, intermediate, final) |
| --source-path TEXT | Specific path to import from the product |
| --all | Import the entire data/ directory |
| --dest TEXT | Local destination path (default: data/imports/<product>/) |
| --rev TEXT | Specific git revision to import from |
| -p, --project-path PATH | Path to project directory |

--stage, --source-path, and --all are mutually exclusive.

Data Push Options

| Option | Description |
| --- | --- |
| TARGETS | Specific .dvc files or pipeline stages to push (optional) |
| -j, --jobs INT | Number of parallel upload jobs |
| -p, --project-path PATH | Path to project directory |

The push command reads the DVC remote name from metadata.json so data is always pushed to the correct S3 location. There is no need to specify the remote manually.

Before pushing, mintd data push checks all targets for .dvc files with stripped md5 hashes. Hash-missing targets are excluded from the push, reported with fix guidance, and the command returns a non-zero exit code. Use mintd data verify to check integrity without pushing.

Data Verify Options

mintd data verify checks that all .dvc files have valid md5 hashes. Files with stripped hashes (e.g. hash: md5 but no actual md5: value) are reported as errors. Exits non-zero when problems are found, making it usable in CI.

| Option | Description |
| --- | --- |
| TARGETS | Specific .dvc files to verify (optional; defaults to all) |
| -p, --project-path PATH | Path to project directory |

Examples:

# Verify all .dvc files in the current project
mintd data verify

# Verify specific files
mintd data verify resources/fromdropbox/raw.dvc data/raw/nber/impact2024.dta.dvc

# Verify a different project
mintd data verify --project-path /projects/data_my_analysis

When hash-missing files are found, the output includes fix guidance:

2 .dvc file(s) have missing md5 hashes:
  resources/fromdropbox/raw.dvc
  resources/fromdropbox/clean.dvc

To fix, run:  mintd data add <path>
Or restore hashes from git history:  git log -p -- <file>.dvc
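
The stripped-hash check described above can be sketched like this. It is a simplified illustration: real .dvc files are YAML, represented here as already-parsed dicts.

```python
# Sketch of the integrity check: an out entry that declares `hash: md5`
# but carries no md5 value has had its hash stripped, which would make a
# push silently incomplete. mintd's actual check parses the YAML files.
def find_missing_hashes(dvc_files):
    """Return paths of .dvc files whose outs lack an md5 value."""
    bad = []
    for path, data in dvc_files.items():
        for out in data.get("outs", []):
            if not out.get("md5"):
                bad.append(path)
                break  # one stripped out is enough to flag the file
    return bad

sample = {
    "resources/raw.dvc": {"outs": [{"hash": "md5", "path": "raw.dta"}]},
    "resources/good.dvc": {"outs": [{"md5": "0cc1...", "path": "good.dta"}]},
}
```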

Data Remove Options

| Option | Description |
| --- | --- |
| -f, --force | Remove even if dvc.yaml still has references |
| -p, --project-path PATH | Path to project directory |

Data Update Options

| Option | Description |
| --- | --- |
| --rev TEXT | Specific git revision to update to |
| --dry-run | Show what would be updated without making changes |
| -p, --project-path PATH | Path to project directory |

Registry Management

mintd registry register --path <path>                # Register existing project
mintd registry status <project_name>                 # Check registration status
mintd registry sync                                  # Process pending registrations
mintd registry update [project_name] [OPTIONS]       # Update project metadata in registry

Registry Update Options

| Option | Description |
| --- | --- |
| -p, --path PATH | Path to project directory |
| --dry-run | Show changes without creating PR |

Enclave Management

mintd enclave add <product>           # Add data product to approved list
mintd enclave list                    # List approved/transferred products
mintd enclave pull                    # Pull data products from registry
mintd enclave package                 # Package data for transfer
mintd enclave unpack <archive>        # Unpack a transfer archive
mintd enclave verify                  # Verify transfer and update manifest
mintd enclave clean                   # Prune old versions, clean staging

Validation

mintd check                           # Check metadata.json vs DVC config consistency
mintd check --validate                # Also validate metadata completeness
mintd check --path <dir>              # Check a specific project directory

Check Options

| Option | Description |
| --- | --- |
| -p, --path PATH | Path to project directory |
| --validate | Run metadata completeness validation in addition to DVC consistency check |

The check command compares the DVC remote name and URL in metadata.json against the actual .dvc/config file. With --validate, it also checks that all required metadata fields are populated and consistent with your mintd configuration.
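
The comparison can be sketched as follows. The metadata field names (dvc_remote, dvc_remote_url) are assumptions; .dvc/config genuinely uses an INI-style format with remote "<name>" sections, so Python's configparser suffices for the sketch.

```python
import configparser

# Sketch of the consistency check: compare the remote recorded in
# metadata.json against the .dvc/config file. Metadata field names are
# assumed for illustration; they are not mintd's documented schema.
def check_remote(metadata, dvc_config_text):
    """Return an error string on mismatch, or None when consistent."""
    cfg = configparser.ConfigParser()
    cfg.read_string(dvc_config_text)
    name = metadata.get("dvc_remote")
    section = f'remote "{name}"'
    if section not in cfg:
        return f"remote {name!r} missing from .dvc/config"
    if cfg[section].get("url") != metadata.get("dvc_remote_url"):
        return f"remote {name!r} URL differs between metadata.json and .dvc/config"
    return None

config_text = """
[core]
remote = lab-s3
[remote "lab-s3"]
url = s3://lab-bucket/data_example
"""
meta = {"dvc_remote": "lab-s3",
        "dvc_remote_url": "s3://lab-bucket/data_example"}
```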

Template Listing

mintd create --list-templates         # List available project templates
mintd create -l                       # Short form

Update Commands

mintd update hooks                    # Add or update pre-commit hooks
mintd update utils                    # Update mintd utility scripts
mintd update metadata                 # Update metadata.json to latest schema
mintd update storage                  # Update DVC storage configuration
mintd update schema                   # Add Frictionless Table Schema support
mintd update schema --generate        # Auto-generate schema from data files
mintd update schema --force           # Overwrite existing schema.json

Update Metadata Options

| Option | Description |
| --- | --- |
| -p, --path PATH | Path to project directory |
| --classification TEXT | Data classification (public\|private\|contract) |
| --mirror-url TEXT | Mirror repository URL |
| --migrate | Migrate v1 metadata to v1.1 schema |

Update Storage Options

| Option | Description |
| --- | --- |
| -p, --path PATH | Path to project directory |
| -y, --yes | Skip confirmation |

Update Schema Options

| Option | Description |
| --- | --- |
| -p, --path PATH | Path to project directory |
| -g, --generate | Auto-generate schema from data files |
| -f, --force | Overwrite existing schema.json |

Schema Generation

Generates a Frictionless Table Schema from data files. The command now lives under the data group. Stata data projects use it as a standalone DVC pipeline stage (since Stata cannot call Python directly); Python and R projects use the scaffolded schemas/generate_schema.py script instead.

mintd data schema generate                           # Auto-detect project root
mintd data schema generate --project-dir /path       # Specify project root
mintd data schema generate --data-dir /custom/data   # Override data directory
mintd data schema generate --output /custom/out.json # Override output path

Schema Generate Options

| Option | Description |
| --- | --- |
| --project-dir PATH | Project root directory. Auto-detected via metadata.json if not specified. |
| --data-dir PATH | Directory containing data files. Default: <project>/data/final/ |
| --output PATH | Output schema file path. Default: <project>/schemas/v1/schema.json |

The output follows the Frictionless Table Schema specification and is deterministic (no timestamps) for DVC caching compatibility.
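
A minimal sketch of deterministic, Frictionless-style type inference: field types are inferred from sample rows and the output carries no timestamps, so re-running on unchanged data yields byte-identical JSON. mintd's actual generator handles many more types and data formats.

```python
import json

# Illustrative sketch of deterministic Frictionless-style schema
# inference. No timestamps or random ordering appear in the output, so
# unchanged data produces identical JSON (friendly to DVC caching).
def infer_type(values):
    """Infer integer, number, or string from a column of raw strings."""
    def is_int(v):
        try:
            int(v)
            return True
        except ValueError:
            return False
    def is_num(v):
        try:
            float(v)
            return True
        except ValueError:
            return False
    if all(is_int(v) for v in values):
        return "integer"
    if all(is_num(v) for v in values):
        return "number"
    return "string"

def infer_schema(header, rows):
    """Build a Frictionless-style fields list from header and rows."""
    cols = list(zip(*rows)) if rows else [()] * len(header)
    return {"fields": [{"name": h, "type": infer_type(col)}
                       for h, col in zip(header, cols)]}

schema = infer_schema(["id", "rate"], [["1", "0.5"], ["2", "0.7"]])
print(json.dumps(schema, sort_keys=True))
```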