dbt (data build tool) is an open-source command-line tool that lets analysts and engineers transform data inside a cloud data warehouse using SQL, with version control, automated testing, documentation, and metric definitions built in.
Why dbt (Data Build Tool) Matters
dbt has become the de facto standard for data transformation in the modern data stack. Where ETL tools historically transformed data on intermediate servers before loading it into a warehouse, dbt enables ELT: load raw data first, then transform inside the warehouse with versioned SQL models.
The dbt project structure (models/, macros/, tests/, seeds/, snapshots/) plus dbt Cloud or dbt Core gives data teams the engineering rigour they had been missing: Git-versioned transformations, automated tests, auto-generated documentation, and a metric layer (the dbt Semantic Layer) that exposes governed metrics to BI tools and AI agents.
How dbt (Data Build Tool) Works
A dbt project consists of:
- Sources: YAML descriptions of raw tables loaded into your warehouse by extract-load tools like Fivetran, Airbyte, or Stitch.
- Models: SQL files (one per table or view) that transform raw sources into analytics-ready tables. dbt compiles these into runnable SQL specific to your warehouse (Snowflake, BigQuery, Databricks, Redshift, Postgres).
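- Tests: Schema and data tests that run against each model once it has been built (for example with dbt build or dbt test). They catch nulls, duplicates, and referential integrity issues automatically.
- Documentation: dbt auto-generates a documentation site showing lineage, descriptions, and freshness. In dbt Core the lineage graph is model-level; column-level lineage is offered in dbt Cloud.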
- The dbt Semantic Layer: A separate feature that exposes dbt-defined metrics for BI tools to consume. The same metric definition is used by every dashboard, AI agent, and reverse ETL workflow.
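The compile step described for models can be illustrated with a small sketch. dbt models reference each other through the ref() function, and dbt uses those references to work out a build order and to substitute warehouse-specific relation names. The Python below is only an approximation of that idea, not dbt's implementation: the model names, the SQL, and the analytics schema are hypothetical, and a regular expression stands in for dbt's Jinja rendering.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical models: name -> SQL containing dbt-style {{ ref('...') }} calls.
MODELS = {
    "stg_users": "select id as user_id from raw.users",
    "stg_events": "select user_id, occurred_at from raw.events",
    "active_users": (
        "select distinct u.user_id "
        "from {{ ref('stg_users') }} u "
        "join {{ ref('stg_events') }} e on e.user_id = u.user_id"
    ),
}

# Simplified stand-in for Jinja parsing: matches {{ ref('model_name') }}.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"](\w+)['\"]\s*\)\s*\}\}")


def dependencies(sql: str) -> set[str]:
    """Return the model names a SQL string refers to via ref()."""
    return set(REF_PATTERN.findall(sql))


def compile_sql(sql: str, schema: str = "analytics") -> str:
    """Replace each ref() call with a schema-qualified relation name."""
    return REF_PATTERN.sub(lambda m: f"{schema}.{m.group(1)}", sql)


def build_order(models: dict[str, str]) -> list[str]:
    """Order models so each one comes after everything it depends on."""
    graph = {name: dependencies(sql) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for name in build_order(MODELS):
        print(name, "->", compile_sql(MODELS[name]))
```

In this sketch the two staging models can run in either order, but active_users always comes last because it references both of them.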
Modern BI tools like Analytify integrate natively with dbt, reading dbt models directly without needing a translation layer.
Real-World Example
A data team writes a dbt model models/marts/active_users.sql that defines active_users as users with at least one event in the past 30 days. They add a test that asserts active_users is never null and never higher than total_users. They document each column. dbt compiles the SQL for Snowflake, materialises the table, runs the tests against it, and generates docs. Every BI tool, dashboard, and AI agent now queries active_users from the dbt-managed table.
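The build-then-test flow in this example can be approximated outside dbt. The sketch below uses Python's built-in sqlite3 module as a stand-in for the warehouse. The table names raw_users and raw_events, the column names, the sample rows, and the fixed reference date are all assumptions made for illustration. The test follows the convention dbt data tests use: the query selects rows that violate the rule, and zero rows means the test passes.

```python
import sqlite3

AS_OF = "2026-01-31"  # fixed reference date so the example is reproducible

# Hypothetical model: one row holding the active and total user counts.
ACTIVE_USERS_MODEL = f"""
create table active_users as
select
    (select count(distinct user_id)
       from raw_events
      where occurred_at >= date('{AS_OF}', '-30 days')) as active_users,
    (select count(*) from raw_users) as total_users
"""

# Test query: returns the rows that break the rule; no rows means pass.
ACTIVE_USERS_TEST = """
select * from active_users
where active_users is null or active_users > total_users
"""


def load_raw(conn: sqlite3.Connection) -> None:
    """Stand-in for the extract-load step: insert raw rows unchanged."""
    conn.execute("create table raw_users (user_id integer primary key)")
    conn.execute("create table raw_events (user_id integer, occurred_at text)")
    conn.executemany("insert into raw_users values (?)", [(1,), (2,), (3,)])
    conn.executemany(
        "insert into raw_events values (?, ?)",
        [(1, "2026-01-20"), (1, "2026-01-25"), (2, "2025-11-01")],
    )


def build_and_test(conn: sqlite3.Connection) -> tuple:
    """Materialise the model, then fail loudly if the test finds bad rows."""
    conn.execute(ACTIVE_USERS_MODEL)
    failures = conn.execute(ACTIVE_USERS_TEST).fetchall()
    if failures:
        raise AssertionError(f"active_users test failed: {failures}")
    return conn.execute(
        "select active_users, total_users from active_users"
    ).fetchone()


def main() -> tuple:
    conn = sqlite3.connect(":memory:")
    load_raw(conn)
    return build_and_test(conn)


if __name__ == "__main__":
    print(main())
```

With the sample rows above, only user 1 has an event inside the 30-day window, so the table holds one active user out of three.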
Common dbt (Data Build Tool) Tools and Platforms in 2026
dbt ecosystem in 2026:
- dbt Core: The open-source CLI. Free. Run from your laptop, GitHub Actions, or any orchestrator.
- dbt Cloud: The managed service from dbt Labs, with a free Developer tier and paid plans. IDE, scheduling, hosted docs, semantic layer.
- dbt Semantic Layer: Metric layer exposing dbt-defined metrics for BI tool consumption.
- Lightdash: Open-source BI tool that reads dbt models directly as the semantic layer.
- Analytify: Open-source GenBI platform with dbt-compatible semantic layer for SaaS embedded analytics.
- Cube: Headless BI / semantic layer that complements dbt for API access to metrics.
Frequently Asked Questions About dbt (Data Build Tool)
What does dbt stand for?
dbt stands for “data build tool.” The company behind it, dbt Labs, maintains the open-source project and offers a paid cloud version.
Is dbt free?
dbt Core is open-source and free. dbt Cloud (the managed service) has a free Developer tier and paid Team / Enterprise tiers starting around $100/user/month.
Is dbt a semantic layer?
dbt Core is a transformation tool. The dbt Semantic Layer (powered by MetricFlow) is a separate feature that exposes dbt metrics to BI tools and AI agents.
What is the difference between dbt and Airflow?
Airflow is a general-purpose workflow orchestrator. dbt is a SQL transformation tool. They are complementary: Airflow can schedule and orchestrate dbt runs alongside other tasks (extracts, loads, ML training).
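To make the division of labour concrete, here is a minimal sketch of an orchestrator treating dbt as one step among several. It is plain Python rather than Airflow, and the script names extract_load.py and train_model.py are hypothetical placeholders; the only real command is dbt build, which runs and tests the project's models. In Airflow, each step would typically become its own task.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], None]


def shell_runner(command: Sequence[str]) -> None:
    """Run a command and raise if it exits with a non-zero status."""
    subprocess.run(list(command), check=True)


# Hypothetical pipeline: the two Python scripts are placeholders.
PIPELINE: list[list[str]] = [
    ["python", "extract_load.py"],  # placeholder extract-load step
    ["dbt", "build"],               # dbt builds and tests the models
    ["python", "train_model.py"],   # placeholder downstream ML step
]


def run_pipeline(
    steps: Sequence[Sequence[str]], runner: Runner = shell_runner
) -> list[list[str]]:
    """Run steps in order; an exception from the runner stops the pipeline."""
    completed = []
    for command in steps:
        runner(command)
        completed.append(list(command))
    return completed
```

Passing the runner as a parameter keeps the ordering logic separate from how each command is executed, which mirrors the split between the orchestrator and dbt described above.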
Do BI tools natively support dbt?
Modern BI tools like Analytify, Lightdash, Mode, Hex, and Preset read dbt models directly. Older BI tools typically need the dbt Semantic Layer or a separate adapter.
What is the difference between ETL and dbt-style ELT?
ETL transforms data before loading it into the warehouse, often on a separate server. dbt-style ELT loads raw data first and transforms inside the warehouse using SQL. dbt is the dominant tool for the T in modern ELT.
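The contrast can be sketched in a few lines, with Python's built-in sqlite3 module standing in for the warehouse. The orders data and the table names are made up for illustration. Both functions produce the same customer_totals table; the difference is where the transformation runs and whether the raw rows remain queryable afterwards.

```python
import sqlite3

# Hypothetical raw extract: (customer, amount as text).
RAW_ORDERS = [("a", "12.50"), ("b", "7.25"), ("a", "3.00")]


def etl(conn: sqlite3.Connection) -> None:
    """ETL: transform in application code, then load only the result."""
    sums: dict[str, float] = {}
    for customer, amount in RAW_ORDERS:
        sums[customer] = sums.get(customer, 0.0) + float(amount)
    conn.execute("create table customer_totals (customer text, total real)")
    conn.executemany(
        "insert into customer_totals values (?, ?)", sorted(sums.items())
    )


def elt(conn: sqlite3.Connection) -> None:
    """ELT: load raw rows untouched, then transform with SQL in place."""
    conn.execute("create table raw_orders (customer text, amount text)")
    conn.executemany("insert into raw_orders values (?, ?)", RAW_ORDERS)
    conn.execute(
        "create table customer_totals as "
        "select customer, sum(cast(amount as real)) as total "
        "from raw_orders group by customer"
    )


def read_totals(conn: sqlite3.Connection) -> list:
    """Return the transformed rows in a stable order."""
    return conn.execute(
        "select customer, total from customer_totals order by customer"
    ).fetchall()


if __name__ == "__main__":
    etl_conn = sqlite3.connect(":memory:")
    elt_conn = sqlite3.connect(":memory:")
    etl(etl_conn)
    elt(elt_conn)
    print(read_totals(etl_conn) == read_totals(elt_conn))
```

In the ELT version the raw_orders table stays in the database, so the transformation can be changed and re-run later without extracting the data again.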
Can dbt replace stored procedures?
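Often, yes. Many modern data teams replace warehouse stored procedures with dbt models because dbt provides version control, testing, and documentation that stored procedures lack. Because a dbt model is a SELECT statement, heavily procedural logic may still need a stored procedure or another tool.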