Model Materialization is a key concept in dbt that determines how models are stored and executed in the data warehouse. Choosing the right materialization strategy can significantly impact query performance, data freshness, and storage efficiency.
In this lesson, you’ll learn:
- What materialization is and why it matters in dbt.
- The different types of built-in materialization strategies (view, table, incremental, ephemeral, and materialized view).
- How to configure materialization in dbt using
dbt_project.yml
, model files, and properties files. - Best practices for selecting the right materialization strategy based on your data volume, update frequency, and performance needs.
By the end of this lesson, you’ll have a solid understanding of how dbt materialization works and how to apply it effectively in your projects.
Model Materialization Strategies
Materialization in dbt determines how models are stored in the data warehouse. dbt provides five built-in materialization types, each with different benefits and trade-offs.
Materialization | Definition | Pros | Cons |
---|---|---|---|
view |
CREATE VIEW AS statement. |
– No additional storage is required. – Views always reflect the latest records from the source data. |
– Slow query performance for complex transformations or when stacking multiple views. |
table |
dbt rebuilds the model as a table on each run using a |
– Provide fast query performance. | – Rebuilding tables can be time-consuming, especially for complex transformations. – New records in the source data do not update automatically in the table. |
incremental | Enables dbt to insert or update records in a table only since the last time the model was executed. | – Faster build times by processing only new records instead of the entire dataset. | – Requires additional configuration and is considered an advanced dbt feature. |
ephemeral | Models are not stored in the database. Instead, dbt inlines their code into dependent models using a Common Table Expression (CTE). |
– Supports reusable logic while keeping the data warehouse clean by reducing clutter. |
– Cannot be queried directly or referenced in dbt run-operation macros. – Overuse can make debugging queries difficult. – Does not support model contracts. |
materialized view | Enables the creation and management of materialized views in the target database. These views combine the benefits of tables and views, making them useful for incremental-style use cases. | – Faster query performance like tables, while maintaining data freshness like views. – Automatically refreshes on a schedule (depending on the database), unlike incremental models that require manual dbt runs. |
– Limited configuration options, as support varies by database platform. – Not universally supported across all databases. |
Configure Materialization
By default, dbt materializes models as views. However, you can customize this behavior in three ways:
1. Configuration in dbt_project.yml
You can define materialization settings in the dbt_project.yml
file. The example below specifies that:
- Models under
models/orders/
will be materialized as views. - Models under
models/consumers/
will be materialized as tables.
{{ config(materialized='table') }} SELECT order_id, customer_id, order_date, total_amount FROM raw.orders WHERE status = 'completed'
3. Configuration in the properties.yml
File
Materialization settings can also be defined in the model’s properties.yml
file. The example below materializes not_completed_orders
as a view:
version: 2 models: - name: not_completed_orders config: materialized: view
models: demo_project: orders: +materialized: view consumers: +materialized: table
2. Configuration in the Model File
Materialization can also be set directly within a model’s .sql
file using the config()
macro. The example below materializes the completed_orders
model as a table:
{{ config(materialized='table') }} SELECT order_id, customer_id, order_date, total_amount FROM raw.orders WHERE status = 'completed'
3. Configuration in the properties.yml
File
Materialization settings can also be defined in the model’s properties.yml
file. The example below materializes not_completed_orders
as a view:
version: 2 models: - name: not_completed_orders config: materialized: view