
[05] – Seeds in dbt

kerrache.massipssa

In this lesson, you'll learn about seeds in dbt.

What Are Seeds?

Seeds in dbt are CSV files (typically stored in the seeds directory) that can be loaded into a data warehouse using the dbt seed command. They can be referenced in models using the ref function, just like other dbt models. Since they reside in the dbt repository, they are version-controlled and subject to code review.

When to Use Seeds

Best suited for:

  • Static data that rarely changes, such as country code mappings, test email lists, or employee account IDs.

Not suitable for:

  • Large raw data exports.
  • Sensitive production data (e.g., PII or passwords).

Steps to Create a Seed

The typical steps to create a seed are as follows:

  1. Create a seeds/ folder in your dbt project.
  2. Place CSV files inside (e.g., seeds/product_categories.csv).
  3. Run the dbt seed command to load the seed into the data warehouse.
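
As a rough sketch, the three steps above could look like this in a shell, assuming you run it from the root of your dbt project (the folder containing dbt_project.yml). The file name country_codes.csv and its rows are hypothetical illustration data, not part of this course's project.

```shell
# Step 1: create the seeds/ folder (no-op if it already exists).
mkdir -p seeds

# Step 2: place a small CSV inside it.
# Hypothetical lookup data; replace it with your own reference file.
cat > seeds/country_codes.csv <<'EOF'
country_code,country_name
FR,France
DE,Germany
JP,Japan
EOF

# Step 3: load every CSV under seeds/ into the warehouse.
# Guarded so the load is only attempted when dbt is installed
# and the current folder really is a dbt project.
if command -v dbt >/dev/null 2>&1 && [ -f dbt_project.yml ]; then
  dbt seed
fi
```

dbt names the resulting table after the file, so this seed would be loaded as country_codes.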

Seed Example

Now let's put into practice what we learned above. Suppose the orders table contains only a product_id, but we want to enrich it with product categories. Instead of joining against an external database, we can load a small CSV file as a seed that maps each product_id to its category.

product_id,category
101,Electronics
102,Clothing
103,Home & Kitchen
104,Beauty

Once you have added the file seeds/product_categories.csv, run the command below to load the CSV into the data warehouse as a table.

dbt seed

Once the command completes, a table named product_categories should exist in the warehouse. We can now use the ref function to bring the product_categories table into our models and enrich the data coming from stg_orders.
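
One way this could look is sketched below. It assumes that stg_orders exposes a product_id column; the model name orders_enriched is hypothetical and chosen only for illustration, so adapt it to your own project.

```shell
# Run from the root of the dbt project.
mkdir -p models

# Write a hypothetical model that joins orders to the seed through ref().
# The heredoc is quoted so the shell leaves the Jinja braces untouched.
cat > models/orders_enriched.sql <<'EOF'
-- Hypothetical model: adds a category to each order using the seed.
select
    o.*,
    c.category
from {{ ref('stg_orders') }} o
left join {{ ref('product_categories') }} c
    on o.product_id = c.product_id
EOF

# Build only this model, after dbt seed has loaded the CSV.
# Guarded so it is only attempted inside a dbt project with dbt installed.
if command -v dbt >/dev/null 2>&1 && [ -f dbt_project.yml ]; then
  dbt run --select orders_enriched
fi
```

The left join keeps orders whose product_id has no match in the seed; their category is simply null.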

When to Use Seeds vs. Source Tables?

| Feature | Use a Seed (CSV) | Use a Source Table |
| Data Changes Frequently? | ❌ No | ✅ Yes |
| Data Comes from an External System? | ❌ No | ✅ Yes |
| Static Reference Data? | ✅ Yes | ❌ No |
| Need to be Manually Updated? | ✅ Yes | ❌ No |

Updated on March 2, 2025