
mintd: A Lightweight Data Product Framework for Research Labs

mintd helps social science researchers build reproducible, governed data products and research projects. It manages the full lifecycle, from creation through cataloging and reuse, so you can focus on the research.

The Data Product Lifecycle

Create --> Build --> Validate --> Push --> Catalog --> Reuse
  • Create a data product or research project with mintd create
  • Build reproducible pipelines (ingest, clean, validate) with DVC
  • Validate metadata and configuration with mintd check
  • Push data to S3-compatible cloud storage with mintd data push
  • Catalog your work in the Data Product Catalog with mintd registry register
  • Reuse data products as tracked dependencies with mintd data import
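
As a rough sketch, the steps above could be chained from a small Python driver. The command strings are copied from this page; only mintd create and mintd data import are shown here with arguments, so the bare mintd check, mintd data push, and mintd registry register calls may need arguments in practice. Using dvc repro for the build step is an assumption (it is DVC's standard command for reproducing a pipeline), and run_lifecycle and LIFECYCLE are hypothetical names. Working-directory handling is left out: the later steps would run inside the created project, and the import step inside a consuming project.

```python
import shlex
import subprocess

# Command strings copied from the lifecycle list. "dvc repro" is an assumed
# choice for the build step; the bare mintd commands may need arguments.
LIFECYCLE = [
    ("create", "mintd create data --name my-research-project --lang python"),
    ("build", "dvc repro"),
    ("validate", "mintd check"),
    ("push", "mintd data push"),
    ("catalog", "mintd registry register"),
    ("reuse", "mintd data import aha-annual-survey"),
]


def run_lifecycle(steps=LIFECYCLE, dry_run=True):
    """Return the planned (step name, argv) pairs, optionally running them.

    With dry_run=True (the default) nothing is executed, so the plan can be
    reviewed first. With dry_run=False each command runs in order and the
    first failure stops the sequence.
    """
    planned = []
    for name, command in steps:
        argv = shlex.split(command)
        planned.append((name, argv))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(argv, check=True)
    return planned


if __name__ == "__main__":
    for name, argv in run_lifecycle():
        print(f"{name:>8}: {' '.join(argv)}")
```

Defaulting to a dry run keeps the sketch safe to execute on a machine where mintd or DVC is not installed.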

Why use mintd?

Built for Research Reproducibility

mintd automatically initializes version control for both your code (Git) and your data (DVC). Every data product has a versioned pipeline, a machine-readable schema, and governance metadata -- ensuring your results can be audited and replicated.

Data Products as First-Class Citizens

A data product is a versioned, validated, governed dataset with clear ownership. mintd scaffolds the pipeline (ingest -> clean -> validate), generates a Frictionless Table Schema, and tracks who produces and consumes each product.
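
To make the schema idea concrete, here is a minimal sketch of what a Table Schema descriptor can look like and how rows might be checked against it. The descriptor layout (fields, name, type, constraints, primaryKey) follows the Frictionless Table Schema specification, but the field names are invented for illustration and the checker is a hand-rolled, standard-library-only stand-in. It is not the schema mintd generates or the validator it runs.

```python
import json

# Hypothetical descriptor: field names are illustrative only.
SCHEMA = {
    "fields": [
        {"name": "hospital_id", "type": "string", "constraints": {"required": True}},
        {"name": "year", "type": "integer", "constraints": {"required": True, "minimum": 1900}},
        {"name": "bed_count", "type": "integer"},
    ],
    "primaryKey": ["hospital_id", "year"],
}


def check_rows(rows, schema=SCHEMA):
    """Return a list of problems; an empty list means the rows passed.

    Only a small subset of Table Schema is covered: required fields,
    integer typing, the minimum constraint, and primary-key uniqueness.
    """
    problems = []
    primary_key = schema.get("primaryKey", [])
    seen_keys = set()
    for index, row in enumerate(rows):
        for field in schema["fields"]:
            name = field["name"]
            constraints = field.get("constraints", {})
            value = row.get(name)
            if value is None:
                if constraints.get("required"):
                    problems.append(f"row {index}: '{name}' is required")
                continue
            if field["type"] == "integer":
                # bool is a subclass of int in Python, so exclude it explicitly.
                if isinstance(value, bool) or not isinstance(value, int):
                    problems.append(f"row {index}: '{name}' is not an integer")
                    continue
                minimum = constraints.get("minimum")
                if minimum is not None and value < minimum:
                    problems.append(f"row {index}: '{name}' is below {minimum}")
        if primary_key:
            key = tuple(row.get(k) for k in primary_key)
            if key in seen_keys:
                problems.append(f"row {index}: duplicate primary key {key}")
            seen_keys.add(key)
    return problems


if __name__ == "__main__":
    print(json.dumps(SCHEMA, indent=2))
    print(check_rows([{"hospital_id": "A1", "year": 2020, "bed_count": 120}]))
```

Because the descriptor is plain JSON-compatible data, it can be versioned alongside the pipeline and read by any tool that understands Table Schema.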

Multi-Tool Compatibility

mintd supports Stata, R, and Python. It generates language-specific templates and utilities, including native Stata commands and automated logging, so your workflow stays consistent across tools.

Data Product Catalog

Register your data products in the lab's catalog for discoverability and access control. mintd uses a tokenless GitOps architecture -- no personal access tokens to manage, just SSH keys and the GitHub CLI.
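
Since this setup relies on locally installed tools rather than tokens, a quick preflight check can confirm they are on PATH before registering anything. The sketch below is an assumption about which executables matter, based on the tools named on this page (Git, DVC, SSH, the GitHub CLI, whose binary is gh, and mintd itself). REQUIRED_TOOLS and missing_tools are hypothetical names, and the check only looks for executables; it does not verify that SSH keys are configured or that gh is authenticated.

```python
import shutil

# Executable names for the tools this page mentions (assumed list).
REQUIRED_TOOLS = {
    "git": "version control for code",
    "dvc": "version control for data",
    "ssh": "key-based authentication",
    "gh": "GitHub CLI",
    "mintd": "mintd itself",
}


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the sorted names of tools that are not found on PATH.

    The lookup function is injectable so the check can be tested without
    depending on what is installed locally.
    """
    return sorted(name for name in tools if which(name) is None)


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        for name in missing:
            print(f"missing: {name} ({REQUIRED_TOOLS[name]})")
    else:
        print("all required tools found on PATH")
```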


Get Started in Seconds

# Install mintd
uv tool install git+https://github.com/health-care-affordability-lab/mintd.git

# Create a data product
mintd create data --name my-research-project --lang python

# Import an existing data product into your research project
mintd data import aha-annual-survey

Next: Installation Guide | Quick Start | GitHub


Design Notes

The notes/ directory in the repository contains design documents and decision records for significant changes:

  • notes/plan-metadata-cleanup.md — Rationale for the v1.0 → v1.1 metadata schema cleanup (10 redundancies removed)
  • notes/plan-version-aware-lineage.md — Design for version-aware data lineage tracking
  • notes/centralize-credential-injection.md — Migration from keyring to AWS profiles
  • notes/dvc-remote-config-consolidation.md — DVC remote configuration simplification
  • notes/guard-against-large-directories.md — Large directory detection for data add