Semantic Layer CI/CD with Dremio and dbt
implications for materialized data downstream. On the other hand, the creation of reflections
can be a very expensive job, which may be more economical to govern by a separate ETL-style
batch workflow rather than having it slow down the frequent, continuous deployment pipeline.
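One way to keep the two workflows apart is to split the dbt invocations by materialization type. The following is a minimal sketch, not a definitive setup: it assumes that reflection models use the connector's reflection materialization, so that dbt's `config.materialized` selector can match them, and the function name and the target names are hypothetical.

```python
import shlex

# Selector matching models configured with materialized='reflection'.
# Assumption: every reflection model uses the connector's materialization type.
REFLECTION_SELECTOR = "config.materialized:reflection"


def dbt_commands(target: str) -> dict[str, list[str]]:
    """Return the dbt commands for the frequent deployment and the batch job."""
    return {
        # Continuous deployment pipeline: build everything except reflections.
        "deploy": ["dbt", "run", "--target", target, "--exclude", REFLECTION_SELECTOR],
        # Separate batch workflow: build only the (expensive) reflections.
        "reflections": ["dbt", "run", "--target", target, "--select", REFLECTION_SELECTOR],
    }


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    for name, command in dbt_commands("dev").items():
        print(f"{name}: {shlex.join(command)}")
```

The deployment pipeline would run the first command on every change, while a scheduler would run the second one at whatever interval the reflection refresh cost allows.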
If we do include reflections in our main dbt project, we should avoid running raw Dremio SQL
via post-hooks and instead use the Dremio dbt connector's built-in reflection
materialization type.
To enable reflections in dbt, we must set the following variable in the dbt_project.yml file:
vars:
  dremio:reflections_enabled: true
This allows us to create a nyc_taxi_trips_refl.sql file, which builds on our previous view via
the reference {{ ref('nyc_taxi_trips') }}:
{{ config(
    materialized='reflection',
    reflection_type='aggregate',
    dimensions=['passenger_count'],
    measures=['trip_distance_mi'],
    computations=['COUNT,SUM']
) }}
-- depends_on: {{ ref('nyc_taxi_trips') }}
A detailed guide on defining reflections in dbt can be found here.
Sources
Currently, it is not possible to create sources via dbt, since Dremio sources can only be
created in the UI or via a REST API call. More importantly, we do not recommend keeping the
creation of data sources in the same workflow as regular tables and views: sources
usually require authentication secrets, which should always be stored securely, with access
restricted to a small group of administrators.
Since most use cases involve only a handful of sources, we recommend setting up a separate
workflow to create and migrate those source configurations between environments, either via
REST API or manually.
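As an illustration of the REST API route, the sketch below prepares a request that creates a source through Dremio's catalog endpoint. It rests on several assumptions: that the target environment exposes `POST /api/v3/catalog`, that a personal access token is accepted as a bearer token, and that the source configuration and secrets are injected through environment variables. The function name and all `DREMIO_*` variable names are hypothetical.

```python
import json
import os
import urllib.request


def build_source_request(base_url, name, source_type, config, token):
    """Prepare (but do not send) the request that creates a Dremio source."""
    payload = {
        "entityType": "source",
        "name": name,
        "type": source_type,  # for example "S3"; depends on the source
        "config": config,     # source-specific settings, including secrets
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v3/catalog",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: token-based bearer authentication is enabled.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Secrets come from the environment (for example a CI secret store),
    # never from the dbt repository. All variable names are hypothetical.
    request = build_source_request(
        base_url=os.environ["DREMIO_URL"],
        name=os.environ["DREMIO_SOURCE_NAME"],
        source_type=os.environ["DREMIO_SOURCE_TYPE"],
        config=json.loads(os.environ["DREMIO_SOURCE_CONFIG"]),
        token=os.environ["DREMIO_TOKEN"],
    )
    with urllib.request.urlopen(request) as response:
        print(response.status)
```

Keeping the request construction separate from the network call makes it possible to review or test the payload per environment without touching a live Dremio instance.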
User-defined functions (UDFs)
User-defined functions can be defined and referenced in multiple ways inside a dbt model.