
Getting Started with the Daco CLI

By Daco Team

The Daco CLI is the fastest way to start building data products with OpenDPI. In this guide, we will walk through installation, initializing a project, defining connections and ports, and translating schemas: everything you need to go from zero to a working data product definition.

Installation

Install with Homebrew (macOS/Linux):

brew install dacolabs/tap/daco

Or with Scoop (Windows):

scoop bucket add dacolabs https://github.com/dacolabs/scoop-bucket.git
scoop install daco

Or install directly with Go:

go install github.com/dacolabs/cli/cmd/daco@latest

Verify the installation:

daco version

Initialize Your First Data Product

Create a new directory and initialize a data product:

mkdir my-data-product && cd my-data-product
daco init

This creates an opendpi.yaml file, the core of your data product definition. It starts with basic metadata that you can customize:

opendpi: "1.0.0"
 
info:
  title: My Data Product
  version: "1.0.0"
  description: A short description of what this data product provides

Add a Connection

Connections describe where your data lives. Add one with the connections add command:

daco connections add

The CLI will guide you through selecting a connection type (PostgreSQL, Kafka, S3, BigQuery, and more) and configuring the required variables.

For example, a PostgreSQL connection might look like this in your opendpi.yaml:

connections:
  my_database:
    type: postgresql
    host: db.example.com

Define a Port

Ports describe the datasets your data product exposes. Add a port with:

daco ports add

The CLI creates a schema file under schemas/ and references it from your opendpi.yaml. For example, after adding a user_events port, your project structure looks like this:

my-data-product/
├── opendpi.yaml
└── schemas/
    └── user_events.schema.yaml

The port in opendpi.yaml references the schema file:

ports:
  user_events:
    description: Raw user interaction events
    connections:
      - connection: "#/connections/my_database"
        location: user_events
    schema:
      $ref: "./schemas/user_events.schema.yaml"
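These "#/..." references follow a JSON-Pointer-style fragment convention, the same one OpenAPI uses. Assuming OpenDPI resolves fragments the same way, here is a minimal sketch of how such a reference maps back to a node in the parsed document; the dict below stands in for an already-loaded opendpi.yaml:

```python
# Sketch: resolving a "#/connections/my_database" fragment reference.
# Assumes OpenDPI uses JSON-Pointer-style fragments, as OpenAPI does.
# The dict stands in for an already-parsed opendpi.yaml.

document = {
    "connections": {
        "my_database": {"type": "postgresql", "host": "db.example.com"},
    },
    "ports": {
        "user_events": {
            "connections": [
                {"connection": "#/connections/my_database",
                 "location": "user_events"},
            ],
        },
    },
}

def resolve(doc, ref):
    """Walk a '#/a/b' fragment down through nested mappings."""
    node = doc
    for part in ref.lstrip("#/").split("/"):
        node = node[part]
    return node

ref = document["ports"]["user_events"]["connections"][0]["connection"]
print(resolve(document, ref))  # → {'type': 'postgresql', 'host': 'db.example.com'}
```

Keeping references local like this means a single opendpi.yaml fully describes which connection backs which port, with no external lookup needed.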

The generated schemas/user_events.schema.yaml starts empty. Open it and define the shape of your data using JSON Schema:

type: object
properties:
  event_id:
    type: string
  user_id:
    type: string
  event_type:
    type: string
  timestamp:
    type: string
    format: date-time
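Because the schema is plain JSON Schema, any off-the-shelf validator can check records against it. As a quick stdlib-only illustration (a real validator such as the jsonschema package also handles formats and nesting), here is a sanity check that mirrors the top-level property types:

```python
# A minimal, stdlib-only sanity check mirroring the user_events schema above.
# It only verifies the types of top-level properties that are present;
# missing keys and string formats like date-time are not checked here.
import json

SCHEMA = {
    "type": "object",
    "properties": {
        "event_id": {"type": "string"},
        "user_id": {"type": "string"},
        "event_type": {"type": "string"},
        "timestamp": {"type": "string", "format": "date-time"},
    },
}

PY_TYPES = {"string": str, "object": dict, "number": (int, float), "boolean": bool}

def check(record):
    """Return a list of type errors for the top-level properties."""
    errors = []
    for name, spec in SCHEMA["properties"].items():
        if name in record and not isinstance(record[name], PY_TYPES[spec["type"]]):
            errors.append(f"{name}: expected {spec['type']}")
    return errors

event = json.loads('{"event_id": "e1", "user_id": "u42", '
                   '"event_type": "click", "timestamp": "2024-01-01T00:00:00Z"}')
print(check(event))  # → [] (a valid event produces no errors)
```

Checks like this are handy in tests to catch records that drift from the schema before they reach a pipeline.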

This separation keeps your opendpi.yaml clean and makes schemas easier to manage as your data product grows.

Translate Schemas

One of the most powerful features of the Daco CLI is schema translation. From your port definitions you can generate typed code and schemas in more than 12 formats.

Run the translate command with your desired format:

daco ports translate --format pyspark

The CLI will prompt you to select which ports you want to translate. You can also specify the port directly with the --name flag:

daco ports translate --format pyspark --name user_events

This generates PySpark code from your schema:

from pyspark.sql.types import *
 
user_events = StructType([
    StructField("event_id", StringType(), True),
    StructField("user_id", StringType(), True),
    StructField("event_type", StringType(), True),
    StructField("timestamp", StringType(), True),
])
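Note that the timestamp field comes out as StringType rather than a timestamp type: in JSON Schema, date-time is a format applied to strings, so parsing into a real timestamp is presumably left to downstream code. With Python's standard library, for example:

```python
# JSON Schema "date-time" values are ISO-8601 strings; downstream code
# parses them into real timestamps before doing time-based logic.
from datetime import datetime

ts = "2024-01-01T00:00:00+00:00"
parsed = datetime.fromisoformat(ts)
print(parsed.year)  # → 2024
```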

The same works for any supported format — Avro, Protobuf, Go types, Pydantic, Markdown, and more:

daco ports translate --format avro --name user_events
daco ports translate --format go --name user_events
daco ports translate --format markdown --name user_events
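For a sense of what these translations produce, an Avro record for user_events might look roughly like this. This is a sketch based on the Avro specification, not the CLI's exact output; the nullable union types mirror the nullable PySpark fields shown earlier:

```json
{
  "type": "record",
  "name": "user_events",
  "fields": [
    {"name": "event_id", "type": ["null", "string"], "default": null},
    {"name": "user_id", "type": ["null", "string"], "default": null},
    {"name": "event_type", "type": ["null", "string"], "default": null},
    {"name": "timestamp", "type": ["null", "string"], "default": null}
  ]
}
```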

Your opendpi.yaml becomes the single source of truth, and you generate everything else from it.

What's Next

You now have a complete data product definition that is machine-readable, versionable, and ready for tooling. From here you can:

  • Add more ports to expose additional datasets
  • Use schema translation to generate code for your pipelines
  • Share your opendpi.yaml with consumers so they know exactly what your data product provides

Explore the full list of commands with:

daco --help

From CLI to Catalog

Everything we have built so far lives in your repository: versioned, reviewable, and portable. But a data product is only as valuable as it is discoverable.

Daco Studio bridges that gap. Connect your repository and it picks up your opendpi.yaml files automatically, turning them into a code-first data catalog that the rest of your organization can browse and search. No manual entry, no sync scripts — just push your changes and the catalog stays up to date.


Visit dacolabs.com to explore the OpenDPI specification, try the Daco CLI, and join our community.
