dbt Testing

Purpose

Transform AI agents into experts on dbt testing strategies, providing guidance on implementing comprehensive data quality checks with database-enforced constraints, generic tests, and custom singular tests to ensure data integrity across all layers.

When to Use This Skill

Activate this skill when users ask about:

Implementing data quality tests
Adding primary key and foreign key constraints
Using dbt_constraints package for database-level enforcement
Creating generic (reusable) tests
Writing singular (one-off) tests
Testing strategies by layer (bronze/silver/gold)
Debugging test failures
Configuring test severity levels
Storing test failures for analysis

Official dbt Documentation: Testing

Testing Philosophy

Implement tests in this order for maximum data quality:

Primary Keys - Every dimension must have one
Foreign Keys - All fact relationships
Unique Keys - Business key constraints
Business Rules - Domain-specific validations
Data Quality - Completeness, accuracy, consistency

Why Use dbt_constraints?

The dbt_constraints package provides database-level enforcement (not just dbt tests):

✅ Database Enforcement - Creates actual constraints in the data warehouse ✅ Performance - Database-level constraints improve query optimization ✅ Data Integrity - Prevents invalid data at all access points (not just dbt) ✅ Documentation - Constraints visible in database metadata and BI tools ✅ Query Optimization - Database can use constraints for better execution plans

Standard dbt tests only validate during dbt test runs. dbt_constraints creates real database constraints that are enforced 24/7.

Official dbt_constraints Documentation: GitHub - Snowflake-Labs/dbt_constraints

Package Installation

# packages.yml
packages:
  - package: Snowflake-Labs/dbt_constraints
    version: [">=0.8.0", "<1.0.0"]

  - package: dbt-labs/dbt_utils
    version: [">=1.0.0", "<2.0.0"]

Install packages:

dbt deps

Official dbt Docs: Package Management

Primary Key Testing

Simple Primary Key (dbt_constraints)

Required for every dimension:

# models/gold/_models.yml
models:
  - name: dim_customers
    columns:
      - name: customer_id
        tests:
          - dbt_constraints.primary_key

Composite Primary Key

When primary key spans multiple columns:

models:
  - name: fct_order_lines
    tests:
      - dbt_constraints.primary_key:
          column_names:
            - order_id
            - line_number

Alternative: Built-in dbt Tests

Not recommended - no database enforcement:

columns:
  - name: product_id
    tests:
      - not_null
      - unique

Limitation: Only validates during dbt test runs, doesn't prevent bad data from other sources.

Foreign Key Testing

Simple Foreign Key (dbt_constraints)

Ensures referential integrity:

models:
  - name: fct_orders
    columns:
      - name: customer_id
        tests:
          - dbt_constraints.foreign_key:
              pk_table_name: ref('dim_customers')
              pk_column_name: customer_id

Multiple Foreign Keys

For facts with multiple dimension relationships:

models:
  - name: fct_order_lines
    columns:
      - name: order_id
        tests:
          - dbt_constraints.foreign_key:
              pk_table_name: ref('fct_orders')
              pk_column_name: order_id

      - name: product_id
        tests:
          - dbt_constraints.foreign_key:
              pk_table_name: ref('dim_products')
              pk_column_name: product_id

      - name: customer_id
        tests:
          - dbt_constraints.foreign_key:
              pk_table_name: ref('dim_customers')
              pk_column_name: customer_id

Alternative: Built-in dbt Relationships Test

Not recommended - no database enforcement:

columns:
  - name: customer_id
    tests:
      - relationships:
          to: ref('dim_customers')
          field: customer_id

Unique Key Testing

Simple Unique Key (dbt_constraints)

For business keys (non-primary keys that must be unique):

columns:
  - name: customer_email
    tests:
      - dbt_constraints.unique_key

Composite Unique Key

When uniqueness spans multiple columns:

models:
  - name: stg_orders
    tests:
      - dbt_constraints.unique_key:
          column_names:
            - order_number
            - order_source

Generic Tests (Reusable)

Built-in dbt Tests

columns:
  - name: order_status
    tests:
      - not_null
      - accepted_values:
          values: ["pending", "processing", "shipped", "delivered", "cancelled"]

  - name: order_amount
    tests:
      - not_null

dbt_utils Tests

Powerful generic tests from dbt_utils package:

columns:
  - name: customer_email
    tests:
      - dbt_utils.not_null_proportion:
          at_least: 0.95 # 95% of rows must have email

  - name: order_amount
    tests:
      - dbt_utils.accepted_range:
          min_value: 0
          max_value: 1000000

  - name: customer_status
    tests:
      - dbt_utils.not_empty_string

Official dbt_utils Documentation: dbt_utils - Generic Tests

Custom Generic Tests

Create reusable test for common patterns:

-- tests/generic/test_positive_values.sql
{% test positive_values(model, column_name) %}

select count(*)
from {{ model }}
where {{ column_name }} <= 0

{% endtest %}

Usage:

columns:
  - name: order_total
    tests:
      - positive_values

  - name: quantity
    tests:
      - positive_values

Another Example: Date Range Test

-- tests/generic/test_recent_data.sql
{% test recent_data(model, column_name, days_ago=30) %}

select count(*)
from {{ model }}
where {{ column_name }} < dateadd(day, -{{ days_ago }}, current_date())

{% endtest %}

Usage:

columns:
  - name: order_date
    tests:
      - recent_data:
          days_ago: 7 # Alert if no orders in last 7 days

Singular Tests (One-Off)

For complex business logic that doesn't fit generic tests:

-- tests/singular/test_order_dates_sequential.sql
with date_validation as (
    select
        o.order_id,
        o.order_date,
        c.signup_date
    from {{ ref('fct_orders') }} o
    join {{ ref('dim_customers') }} c
        on o.customer_id = c.customer_id
    where o.order_date < c.signup_date  -- Order before signup = invalid
)

select * from date_validation

Test fails if ANY rows are returned.

More Singular Test Examples

Revenue Reconciliation:

-- tests/singular/test_revenue_reconciliation.sql
-- Ensure fact table revenue matches source system
with fact_revenue as (
    select sum(order_amount) as total_revenue
    from {{ ref('fct_orders') }}
    where order_date = current_date() - 1
),

source_revenue as (
    select sum(amount) as total_revenue
    from {{ source('erp', 'orders') }}
    where order_date = current_date() - 1
),

comparison as (
    select
        f.total_revenue as fact_revenue,
        s.total_revenue as source_revenue,
        abs(f.total_revenue - s.total_revenue) as difference
    from fact_revenue f
    cross join source_revenue s
)

select * from comparison
where difference > 0.01  -- Tolerance of 1 cent

Referential Integrity Check:

-- tests/singular/test_orphaned_orders.sql
-- Find orders with invalid customer_id (not in dim_customers)
select
    o.order_id,
    o.customer_id
from {{ ref('fct_orders') }} o
left join {{ ref('dim_customers') }} c
    on o.customer_id = c.customer_id
where c.customer_id is null
  and o.customer_id != -1  -- Exclude ghost key

Official dbt Documentation: Singular Tests

Testing by Layer

Bronze Layer (Staging)

Focus: Basic data quality at source

models:
  - name: stg_tpc_h__customers
    columns:
      - name: customer_id
        tests:
          - dbt_constraints.primary_key

      - name: customer_email
        tests:
          - not_null

Keep it simple - just verify source data integrity.

Silver Layer (Intermediate)

Focus: Business rule validation, calculated fields

models:
  - name: int_customers__with_orders
    columns:
      - name: customer_id
        tests:
          - dbt_constraints.primary_key

      - name: lifetime_orders
        tests:
          - not_null
          - dbt_utils.accepted_range:
              min_value: 0

      - name: lifetime_value
        tests:
          - not_null
          - dbt_utils.accepted_range:
              min_value: 0

Add business logic validation - ensure calculated fields make sense.

Gold Layer (Marts)

Focus: Comprehensive constraint enforcement with dbt_constraints

models:
  - name: dim_customers
    description: "Customer dimension with full history and metrics"
    columns:
      - name: customer_id
        description: "Unique customer identifier"
        tests:
          - dbt_constraints.primary_key

      - name: customer_tier
        description: "Customer value classification"
        tests:
          - accepted_values:
              values: ["bronze", "silver", "gold", "platinum"]

      - name: customer_email
        tests:
          - dbt_constraints.unique_key

  - name: fct_orders
    description: "Order transactions fact table"
    columns:
      - name: order_id
        tests:
          - dbt_constraints.primary_key

      - name: customer_id
        tests:
          - dbt_constraints.foreign_key:
              pk_table_name: ref('dim_customers')
              pk_column_name: customer_id

      - name: product_id
        tests:
          - dbt_constraints.foreign_key:
              pk_table_name: ref('dim_products')
              pk_column_name: product_id

      - name: order_amount
        tests:
          - not_null
          - dbt_utils.accepted_range:
              min_value: 0

Maximum enforcement - use all constraint types to ensure production data quality.

Test Configuration

Store Test Failures

Analyze failed test records:

dbt test --store-failures

# dbt_project.yml
tests:
  +store_failures: true
  +schema: dbt_test_failures

Query failures:

select * from dbt_test_failures.not_null_dim_customers_customer_email

Test Severity Levels

Warn vs Error:

columns:
  - name: customer_email
    tests:
      - dbt_constraints.unique_key:
          config:
            severity: warn # or 'error' (default)

Severity Behavior:

error: Test failure stops dbt execution (exit code 1)
warn: Test failure logs warning but continues (exit code 0)

Use warn for:

Data quality checks that shouldn't block deployment
Known edge cases during migration
Monitoring tests

Limit Test Execution

Test specific model:

dbt test --select dim_customers

Test by type:

dbt test --select test_type:generic    # All generic tests
dbt test --select test_type:singular   # All singular tests

Test with dependencies:

dbt test --select +dim_customers+  # Test model and all dependencies

Official dbt Documentation: Test Selection

Running Tests

# Run all tests
dbt test

# Build models and test together (recommended)
dbt build  # Runs models, then tests

# Test specific model
dbt test --select dim_customers

# Test specific column
dbt test --select dim_customers,column:customer_id

# Test by layer
dbt test --select tag:gold

# Test with failures stored
dbt test --store-failures --select fct_orders

Best Practice: Use dbt build instead of dbt run + dbt test separately.

Testing Best Practices

1. Test Early and Often

Add tests as you build models, not after deployment.

2. Layer-Appropriate Testing

Bronze: Basic not_null and primary key tests
Silver: Business rule validation, range checks
Gold: Comprehensive constraint enforcement with dbt_constraints

3. Use dbt_constraints for Production

Database-level constraints provide:

24/7 enforcement (not just during dbt runs)
Performance optimization
Better integration with BI tools

4. Document Test Purpose

columns:
  - name: customer_tier
    description: "Customer segmentation based on lifetime value"
    tests:
      - accepted_values:
          values: ["bronze", "silver", "gold", "platinum"]
          config:
            severity: error

5. Balance Coverage vs Performance

Don't over-test trivial columns
Focus on business-critical fields
Use sampling for very large tables if needed

Testing Checklist

Before moving to production:

All dimensions have primary key tests
All facts have foreign key tests to dimensions
Business rules are validated with tests
Data quality tests are in place (not_null, accepted_values)
Tests run successfully in CI/CD pipeline
dbt_constraints enabled for all production marts
Test failures configured to store in database
Singular tests created for complex business logic

Helping Users with Testing

Strategy for Assisting Users

When users ask about testing:

Identify model type: Dimension? Fact? Intermediate?
Recommend appropriate tests: By layer and purpose
Prioritize constraints: Primary keys → Foreign keys → Business rules
Provide complete examples: Working YAML configurations
Explain benefits: Why dbt_constraints over standard tests
Show how to run: Commands and debugging approaches

Common User Questions

"What tests should I add?"

Start with dbt_constraints for primary/foreign keys
Add not_null for required fields
Use accepted_values for enums
Create singular tests for complex business logic

"Why use dbt_constraints instead of regular tests?"

Database-level enforcement (24/7, not just during dbt runs)
Better query performance
Prevents bad data from any source
Visible in database metadata

"How do I debug test failures?"

Use --store-failures to save failing records
Query the test failure table
Review actual data that failed the test
Add more specific tests to isolate issue

name	dbt-testing
description	dbt testing strategies using dbt_constraints for database-level enforcement, generic tests, and singular tests. Use this skill when implementing data quality checks, adding primary/foreign key constraints, creating custom tests, or establishing comprehensive testing frameworks across bronze/silver/gold layers.

dbt-testing

About dbt-testing