Customize dbt models database, schema, and alias
- 1 Introduction
- 2 How to customize this behavior
- 3 Example use cases
- 4 What not to do
- 5 Tips and tricks
Example use cases
Here are some typical examples we've encountered with dbt users leveraging those 3 macros and different logic.
Note that the following examples are not comprehensive and do not cover all the available options. These examples are meant to be templates for you to develop your own behaviors.
- Use custom schema without concatenating target schema in production
- Add developer identities to tables
- Use branch name as schema prefix
- Use a static schema for CI
1. Custom schemas without target schema concatenation in production
The most common use case is using the custom schema without concatenating it with the default schema name when in production.
To do so, you can create a new file called generate_schema_name.sql
under your macros folder with the following code:
{% macro generate_schema_name(custom_schema_name, node) -%}
{%- set default_schema = target.schema -%}
{%- if custom_schema_name is none -%}
{{ default_schema }}
{%- elif env_var('DBT_ENV_TYPE','DEV') == 'PROD' -%}
{{ custom_schema_name | trim }}
{%- else -%}
{{ default_schema }}_{{ custom_schema_name | trim }}
{%- endif -%}
{%- endmacro %}
This will generate the following outputs for a model called my_model
with a custom schema of marketing
, preventing any overlap of objects between dbt runs from different contexts.
Context | Target database | Target schema | Resulting object |
---|---|---|---|
Developer 1 | dev | dbt_dev1 | dev.dbt_dev1_marketing.my_model |
Developer 2 | dev | dbt_dev2 | dev.dbt_dev2_marketing.my_model |
CI PR 123 | ci | dbt_pr_123 | ci.dbt_pr_123_marketing.my_model |
CI PR 234 | ci | dbt_pr_234 | ci.dbt_pr_234_marketing.my_model |
Production | prod | analytics | prod.marketing.my_model |
We added logic to check if the current dbt run is happening in production or not. This is important, and we explain why in the What not to do section.
2. Static schemas: Add developer identities to tables
Occasionally, we run into instances where the security posture of the organization prevents developers from creating schemas and all developers have to develop in a single schema.
In this case, we can:
-
Create a new file called generate_schema_name.sql under your macros folder with the following code:
-
Change
generate_schema_name()
to use a single schema for all developers, even if a custom schema is set. -
Update
generate_alias_name()
to append the developer alias and the custom schema to the front of the table name in the dev environment.- This method is not ideal, as it can cause long table names, but it will let developers see in which schema the model will be created in production.
{% macro generate_schema_name(custom_schema_name, node) -%}
{%- set default_schema = target.schema -%}
{%- if custom_schema_name is none -%}
{{ default_schema }}
{%- elif env_var('DBT_ENV_TYPE','DEV') != 'CI' -%}
{{ custom_schema_name | trim }}
{%- else -%}
{{ default_schema }}_{{ custom_schema_name | trim }}
{%- endif -%}
{%- endmacro %}
{% macro generate_alias_name(custom_alias_name=none, node=none) -%}
{%- if env_var('DBT_ENV_TYPE','DEV') == 'DEV' -%}
{%- if custom_alias_name -%}
{{ target.schema }}__{{ custom_alias_name | trim }}
{%- elif node.version -%}
{{ target.schema }}__{{ node.name ~ "_v" ~ (node.version | replace(".", "_")) }}
{%- else -%}
{{ target.schema }}__{{ node.name }}
{%- endif -%}
{%- else -%}
{%- if custom_alias_name -%}
{{ custom_alias_name | trim }}
{%- elif node.version -%}
{{ return(node.name ~ "_v" ~ (node.version | replace(".", "_"))) }}
{%- else -%}
{{ node.name }}
{%- endif -%}
{%- endif -%}
{%- endmacro %}
This will generate the following outputs for a model called my_model
with a custom schema of marketing
, preventing any overlap of objects between dbt runs from different contexts.
Context | Target database | Target schema | Resulting object |
---|---|---|---|
Developer 1 | dev | dbt_dev1 | dev.marketing.dbt_dev1_my_model |
Developer 2 | dev | dbt_dev2 | dev.marketing.dbt_dev2_my_model |
CI PR 123 | ci | dbt_pr_123 | ci.dbt_pr_123_marketing.my_model |
CI PR 234 | ci | dbt_pr_234 | ci.dbt_pr_234_marketing.my_model |
Production | prod | analytics | prod.marketing.my_model |
3. Use branch name as schema prefix
For teams who prefer to isolate work based on the feature branch, you may want to take advantage of the DBT_CLOUD_GIT_BRANCH
special environment variable. Please note that developers will write to the exact same schema when they are on the same feature branch.
The DBT_CLOUD_GIT_BRANCH
variable is only available within the dbt Cloud IDE and not the Cloud CLI.
We’ve also seen some organizations prefer to organize their dev databases by branch name. This requires implementing similar logic in generate_database_name()
instead of the generate_schema_name()
macro. By default, dbt will not automatically create the databases.
Refer to the Tips and tricks section to learn more.
{% macro generate_schema_name(custom_schema_name, node) -%}
{%- set default_schema = target.schema -%}
{%- if env_var('DBT_ENV_TYPE','DEV') == 'DEV' -%}
{#- we replace characters not allowed in the schema names by "_" -#}
{%- set re = modules.re -%}
{%- set cleaned_branch = re.sub("\W", "_", env_var('DBT_CLOUD_GIT_BRANCH')) -%}
{%- if custom_schema_name is none -%}
{{ cleaned_branch }}
{%- else -%}
{{ cleaned_branch }}_{{ custom_schema_name | trim }}
{%- endif -%}
{%- else -%}
{{ default_schema }}_{{ custom_schema_name | trim }}
{%- endif -%}
{%- endmacro %}
This will generate the following outputs for a model called my_model
with a custom schema of marketing
, preventing any overlap of objects between dbt runs from different contexts.
Context | Branch | Target database | Target schema | Resulting object |
---|---|---|---|---|
Developer 1 | featureABC | dev | dbt_dev1 | dev.featureABC_marketing.my_model |
Developer 2 | featureABC | dev | dbt_dev2 | dev.featureABC_marketing.my_model |
Developer 1 | feature123 | dev | dbt_dev1 | dev.feature123_marketing.my_model |
CI PR 123 | ci | dbt_pr_123 | ci.dbt_pr_123_marketing.my_model | |
CI PR 234 | ci | dbt_pr_234 | ci.dbt_pr_234_marketing.my_model | |
Production | prod | analytics | prod.marketing.my_model |
When developer 1 and developer 2 are checked out on the same branch, they will generate the same object in the data warehouse. This shouldn't be a problem as being on the same branch means the model's code will be the same for both developers.
4. Use a static schema for CI
Some organizations prefer to write their CI jobs to a single schema with the PR identifier prefixed to the front of the table name. It's important to note that this will result in long table names.
To do so, you can create a new file called generate_schema_name.sql
under your macros folder with the following code:
{% macro generate_schema_name(custom_schema_name=none, node=none) -%}
{%- set default_schema = target.schema -%}
{# If the CI Job does not exist in its own environment, use the target.name variable inside the job instead #}
{# {%- if target.name == 'CI' -%} #}
{%- if env_var('DBT_ENV_TYPE','DEV') == 'CI' -%}
ci_schema
{%- elif custom_schema_name is none -%}
{{ default_schema }}
{%- else -%}
{{ default_schema }}_{{ custom_schema_name | trim }}
{%- endif -%}
{%- endmacro %}
{% macro generate_alias_name(custom_alias_name=none, node=none) -%}
{# If the CI Job does not exist in its own environment, use the target.name variable inside the job instead #}
{# {%- if target.name == 'CI' -%} #}
{%- if env_var('DBT_ENV_TYPE','DEV') == 'CI' -%}
{%- if custom_alias_name -%}
{{ target.schema }}__{{ node.config.schema }}__{{ custom_alias_name | trim }}
{%- elif node.version -%}
{{ target.schema }}__{{ node.config.schema }}__{{ node.name ~ "_v" ~ (node.version | replace(".", "_")) }}
{%- else -%}
{{ target.schema }}__{{ node.config.schema }}__{{ node.name }}
{%- endif -%}
{%- else -%}
{%- if custom_alias_name -%}
{{ custom_alias_name | trim }}
{%- elif node.version -%}
{{ return(node.name ~ "_v" ~ (node.version | replace(".", "_"))) }}
{%- else -%}
{{ node.name }}
{%- endif -%}
{%- endif -%}
{%- endmacro %}
This will generate the following outputs for a model called my_model
with a custom schema of marketing
, preventing any overlap of objects between dbt runs from different contexts.
Context | Target database | Target schema | Resulting object |
---|---|---|---|
Developer 1 | dev | dbt_dev1 | dev.dbt_dev1_marketing.my_model |
Developer 2 | dev | dbt_dev2 | dev.dbt_dev2_marketing.my_model |
CI PR 123 | ci | dbt_pr_123 | ci.ci_schema.dbt_pr_123_marketing_my_model |
CI PR 234 | ci | dbt_pr_234 | ci.ci_schema.dbt_pr_234_marketing_my_model |
Production | prod | analytics | prod.marketing.my_model |