[04] – Sources in dbt

kerrache.massipssa

In this tutorial, you will learn how to define and use sources in dbt. Sources allow you to reference raw data in your data warehouse in a structured and maintainable way, improving data lineage, documentation, and quality control.

This tutorial will guide you through step-by-step examples, making it easy to integrate sources into your dbt workflow. Whether you are new to dbt or looking to enhance your existing projects, this tutorial will provide the knowledge you need to organize and manage your raw data effectively.

What are sources in dbt?

Sources allow you to define and document the data that your Extract and Load (EL) tools bring into your data warehouse. By declaring tables as sources in dbt, you can:

  • Reference source tables in your models using the {{ source() }} function, making data lineage clear.
  • Validate assumptions about your source data through tests.
  • Monitor data freshness to ensure timely and accurate transformations.

Create a source

To create a source, add a YAML file (any name ending in .yml or .yaml) to the models directory, where your dbt models are defined, and declare the source under a top-level sources key.

In the example below, we define two sources: dev and prod.

  • The dev source declares two tables, orders and customers, in the demo schema of the dbt-demo database.
  • The prod source declares one table, payments. Because no schema is given, dbt looks for it in a schema named prod.

version: 2

sources:
  - name: dev
    database: dbt-demo
    schema: demo
    tables:
      - name: orders
      - name: customers

  - name: prod
    tables:
      - name: payments

ℹ️ By default, the schema matches the source name. Include the schema only if the source name differs from the actual schema in your database.
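To make this default concrete, here is a minimal sketch of the prod source from the example above, written with an explicit schema key. Under the rule just described, it should behave the same as the shorter declaration, so the extra line is optional here.

```yaml
version: 2

sources:
  - name: prod
    # Optional in this case: when schema is omitted,
    # dbt falls back to the source name ("prod").
    schema: prod
    tables:
      - name: payments
```

The explicit form is only needed when the two differ, as with the dev source, whose tables live in the demo schema.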

Using a source

A source in dbt serves three main purposes:

  1. Selecting data within models from a source.
  2. Testing data quality of the source.
  3. Checking the freshness of the source data.

In this tutorial, we’ll focus only on using sources to select data. Dedicated chapters will cover data quality testing and source freshness checks.
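As a brief preview of the other two purposes, the sketch below extends the dev source with tests and a freshness check. It is illustrative only: the _loaded_at column is a hypothetical timestamp written by your loading tool, the 12 and 24 hour thresholds are arbitrary, and the exact placement of these keys can vary between dbt versions.

```yaml
version: 2

sources:
  - name: dev
    database: dbt-demo
    schema: demo
    # Hypothetical column: replace with the load timestamp your EL tool writes.
    loaded_at_field: _loaded_at
    freshness:
      # Arbitrary example thresholds.
      warn_after: {count: 12, period: hour}
      error_after: {count: 24, period: hour}
    tables:
      - name: orders
        columns:
          - name: order_id
            tests:  # newer dbt versions also accept the key data_tests
              - unique
              - not_null
      - name: customers
```

With a configuration like this, dbt test --select source:dev would run the two column tests, and dbt source freshness would compare the latest _loaded_at value against the thresholds.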

In the previous tutorial on dbt models, we created the completed_orders model by selecting data directly from demo.orders:

SELECT
    order_id,
    customer_id,
    order_date,
    total_amount
FROM demo.orders
WHERE status = 'Completed'

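Now, let’s use the dev source we defined earlier. In the code below, the FROM clause references the source instead of naming the table directly. When dbt compiles the model, it replaces the source() call with the table's qualified name and records the dependency in the lineage graph.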

SELECT
    order_id,
    customer_id,
    order_date,
    total_amount
FROM {{ source('dev', 'orders') }}
WHERE status = 'Completed'

Updated on March 2, 2025
Copyright © 2025 MasterData
