Overview
The VDL serves as the foundation for your Squirrels project’s data layer. When you run sqrl build, Squirrels:- Loads seed data from CSV files
- Extracts and stores source data (when
load_to_vdl: trueis configured) - Executes build models and materializes them as tables or views
- Stores all of this data in the VDL
vdl in DuckDB during realtime queries, making all your materialized models accessible for querying by federate models.
Architecture
The VDL uses DuckLake which consists of two main components:Catalog database
The catalog database (vdl_catalog.duckdb by default) stores metadata about the VDL, including table schemas, view definitions, and other structural information.
The catalog database can be configured via the SQRL_VDL__CATALOG_DB_PATH environment variable (see the Configuration section below for more details).
Data storage
The data storage location (target/vdl_data/ by default) contains the actual data files for the VDL. These files store the physical data for materialized tables.
The location can be configured via the SQRL_VDL__DATA_PATH environment variable (see the Configuration section below for more details).
Configuration
The catalog database and data location can be configured using environment variables in your.env or .env.local file:
Path to the DuckLake catalog database for the VDL. Supports the
{project_path} placeholder. Using a Postgres database for the catalog database is possible.For more information, see the DuckLake documentation on choosing a catalog database.Storage path for the DuckLake data files for the VDL. Supports the
{project_path} placeholder. Using a cloud storage system like AWS S3 is also possible.For more information, see the DuckLake documentation on choosing storage.Both the catalog database path and data storage path support the
{project_path} placeholder, allowing you to use relative paths that resolve to your project directory.Building the VDL
The VDL is created and updated when you run sqrl build:- Seeds are loaded from CSV files in the
seeds/directory - Sources with
load_to_vdl: trueare extracted and stored - Build models are executed in dependency order and materialized
- All data is stored in the VDL
Accessing VDL data
Once built, data in the VDL can be accessed in several ways:In data models
Build models and federate models in SQL can reference VDL tables using theref() macro:
sqrl.ref() method:
Direct SQL queries
You can also query the VDL directly using DuckDB SQL. When usingsqrl duckdb or sqrl duckdb --ui, the VDL is automatically attached as the vdl schema (in READ ONLY mode). This schema is in use by default such that you don’t need to run USE vdl; to use it.
Related pages
- Build models - Create materialized data models in the VDL
- Sources - Configure source tables and load them to the VDL
- Seeds - Load CSV data into the VDL
- Environment variables - Configure VDL paths and settings
- sqrl build - CLI reference for building the VDL
- sqrl duckdb - CLI reference for running DuckDB SQL queries against the VDL