
Configuration

This page covers both adapter types

This adapter supports two Microsoft Fabric compute engines: Data Warehouse (type: fabric) and Lakehouse (type: fabricspark). Configuration options that are specific to one adapter type are marked accordingly. For a comprehensive guide on using the Lakehouse adapter, see the Lakehouse guide.

You'll need to create a profile in the profiles.yml file to connect to Microsoft Fabric. The adapter offers several configuration options, so you can set up the connection in the way that fits your environment.

Example profiles.yml

Data Warehouse (type: fabric):

default:
  target: dev
  outputs:
    dev:
      type: fabric
      workspace: your workspace name
      database: name_of_your_data_warehouse
      schema: schema_to_build_models_in

Lakehouse (type: fabricspark):

default:
  target: dev
  outputs:
    dev:
      type: fabricspark
      workspace: your workspace name
      database: name_of_your_lakehouse
      schema: schema_in_your_lakehouse
Use environment variables anywhere

You can use environment variables in any configuration value in your profiles.yml file.

default:
  target: dev
  outputs:
    dev:
      type: fabric
      ...
      client_id: "{{ env_var('AZURE_CLIENT_ID', 'an optional default value') }}"
      client_secret: "{{ env_var('AZURE_CLIENT_SECRET') }}"

Make sure to surround the Jinja block with quotes.

All configuration options

type

Required configuration option.

Possible values: fabric, fabricspark

Value        Compute engine          SQL dialect
fabric       Fabric Data Warehouse   T-SQL
fabricspark  Fabric Lakehouse        Spark SQL

host

Alias: server
Example value: abc-123.datawarehouse.fabric.microsoft.com

The server part of your connection string. This is unique per Workspace in Fabric.

You can leave this empty and let the adapter find it automatically by providing information about your Workspace. See workspace_name and workspace_id.

Not used for FabricSpark

This option is not used when type is fabricspark. The Lakehouse adapter connects via the Fabric Livy API, not TDS. The Livy endpoint is resolved automatically from workspace_name or workspace_id.

database

Required configuration option.

For type: fabric
Example value: gold_dwh

The name of your Data Warehouse in Fabric.

For type: fabricspark
Example value: bronze_lakehouse

The name of your Lakehouse in Fabric. The adapter uses this as the target for the Livy API connection.

This IS the lakehouse

For type: fabricspark, the database field is how you specify which Lakehouse to connect to. There is no separate lakehouse or lakehouse_name option for this adapter type.

It's recommended to avoid using spaces in the name, although it's supported.

schema

Required configuration option.

Example value: dbt

The schema where dbt will build models. You must have write access to this schema. It's recommended to avoid using spaces in the schema name, although it's supported.

Override per model

The schema can be overridden per model/seed/test/folder/... using the schema config.
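For example, a per-folder override in dbt_project.yml could look like this (the project and folder names below are placeholders):

```yaml
# dbt_project.yml -- hypothetical project and folder names
models:
  my_project:
    staging:
      # by default, dbt appends the custom schema to the target schema,
      # e.g. dbt_staging when the profile's schema is dbt
      +schema: staging
```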

Further customization

You can even completely customize how dbt generates the schema name using the generate_schema_name macro.
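As a sketch, an override placed in your project's macros folder might look like this. By default, dbt concatenates the target schema and the custom schema; this illustrative variant uses the custom schema name as-is:

```sql
-- macros/generate_schema_name.sql (illustrative override)
{% macro generate_schema_name(custom_schema_name, node) -%}
    {%- if custom_schema_name is none -%}
        {{ target.schema }}
    {%- else -%}
        {{ custom_schema_name | trim }}
    {%- endif -%}
{%- endmacro %}
```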

workspace_name

Alias: workspace
Example value: My Workspace

The name of your Fabric Workspace.

  • For type: fabric: used to automatically find the host value. Not required if host is provided (except for Python models).
  • For type: fabricspark: required (unless workspace_id is provided). The Lakehouse adapter always needs the workspace to resolve the Livy API endpoint.

Not used if workspace_id is also provided.

Python models (Data Warehouse)

If you are using Python models with type: fabric, either workspace_name or workspace_id must be provided.

auth: ActiveDirectoryServicePrincipal

When using this option together with authentication set to ActiveDirectoryServicePrincipal, you also need to provide the tenant_id option.

Behind the scenes, the adapter will do an API call to first find the Workspace ID, and then use that to find the server name.
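Putting this together, a profile that lets the adapter resolve the host from the Workspace could look like this (all values are placeholders):

```yaml
dev:
  type: fabric
  workspace_name: My Workspace   # the adapter looks up the host for you
  database: name_of_your_data_warehouse
  schema: dbt
```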

workspace_id

Example value: 7275c94d-9280-438b-bd67-ffeb8c305c9b

The ID of your Fabric Workspace. Can be used instead of workspace_name.

  • For type: fabric: used to automatically find the host value. Not required if host is provided (except for Python models).
  • For type: fabricspark: required (unless workspace_name is provided). The Lakehouse adapter always needs the workspace.

Python models (Data Warehouse)

If you are using Python models with type: fabric, either workspace_name or workspace_id must be provided.

auth: ActiveDirectoryServicePrincipal

When using this option together with authentication set to ActiveDirectoryServicePrincipal, you also need to provide the tenant_id option.

Behind the scenes, the adapter will do an API call to find the server name.

authentication

Alias: auth
Possible values (case insensitive): ActiveDirectoryIntegrated, ActiveDirectoryPassword, ActiveDirectoryServicePrincipal, ActiveDirectoryInteractive, ActiveDirectoryMsi, auto, CLI, environment, notebookutils, workload_identity, token_credential. Each method is described below.

The adapter supports an authentication method for every use case. The default is auto, which will try to use the best available method depending on your environment.

If you can't find a suitable method for your use case, please open an issue.

ActiveDirectoryIntegrated

Authenticate with a Windows credential federated through Microsoft Entra ID with integrated authentication. This works on domain-joined machines.

Workspace info and Python models

This is not compatible with the workspace_name or workspace_id options or with Python models. In this case, it's recommended to look at the auto or CLI options as alternatives.

ActiveDirectoryPassword

Authenticate with a Microsoft Entra ID username and password. You must provide the username and password options.

Workspace info and Python models

This is not compatible with the workspace_name or workspace_id options or with Python models. In this case, it's recommended to look at the auto or CLI options as alternatives.

ActiveDirectoryServicePrincipal

Authenticate with a Microsoft Entra ID service principal using a client ID and client secret. You must provide the client_id and client_secret options.

Tenant ID required for Workspace info or Python models

If you are using workspace_name or workspace_id, you also need to provide the tenant_id option.
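A minimal service principal profile, with the host provided directly, might look like this (all values are placeholders):

```yaml
dev:
  type: fabric
  host: abc-123.datawarehouse.fabric.microsoft.com
  database: name_of_your_data_warehouse
  schema: dbt
  authentication: ActiveDirectoryServicePrincipal
  client_id: "{{ env_var('AZURE_CLIENT_ID') }}"
  client_secret: "{{ env_var('AZURE_CLIENT_SECRET') }}"
```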

ActiveDirectoryInteractive

Authenticate with a Microsoft Entra ID username and password using an interactive prompt. You must provide the username option.

Workspace info and Python models

This is not compatible with the workspace_name or workspace_id options or with Python models. In this case, it's recommended to look at the auto or CLI options as alternatives.

ActiveDirectoryMsi

Authenticate with a managed identity configured in your environment. This is typically used when running in Azure.

Workspace info and Python models

This is not compatible with the workspace_name or workspace_id options or with Python models. In this case, it's recommended to look at the auto or CLI options as alternatives.

auto

Default authentication method.

This will try to authenticate using the best available method depending on your environment. It can automatically pick up configurations for managed identities, service principals, Azure CLI/PowerShell users, and more. The full list and order of methods is described on Microsoft Learn.

CLI

Authenticate using the credentials from the Azure CLI. You must be logged in using az login. There have been reports of issues when using an outdated version of the Azure CLI, so make sure to use the latest version. Your account does not need to have access to any Azure subscriptions or resources and the selected Azure subscription does not matter.

Since the Azure CLI supports a variety of authentication methods, this is a flexible option that works in many scenarios.

environment

Authenticate using environment variables. This works similarly to the auto method, but only uses environment variables. See Microsoft Learn for the list of supported environment variables.

notebookutils

This authentication method works inside a Fabric notebook. It uses NotebookUtils to get an access token for the current user.

Currently broken

This method is not working at the moment because Microsoft's Runtime in the Notebooks returns a credential with a scope that is not allowed to access Data Warehouses and SQL Endpoints. Use environment or ActiveDirectoryServicePrincipal inside notebooks instead.

workload_identity

Authenticate with Workload Identity Federation using a federated OIDC token. No client secret needed. Works with GitHub Actions, Kubernetes, and any OIDC provider. See the authentication guide for examples.

Requires tenant_id, client_id, and exactly one of federated_token_url or federated_token_file.
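As a sketch for GitHub Actions, using GitHub's built-in OIDC environment variables (depending on your setup, you may need to append an audience query parameter to the request URL):

```yaml
dev:
  type: fabric
  ...
  authentication: workload_identity
  tenant_id: "{{ env_var('AZURE_TENANT_ID') }}"
  client_id: "{{ env_var('AZURE_CLIENT_ID') }}"
  federated_token_url: "{{ env_var('ACTIONS_ID_TOKEN_REQUEST_URL') }}"
  federated_token_header: "bearer {{ env_var('ACTIONS_ID_TOKEN_REQUEST_TOKEN') }}"
```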

token_credential

Load any azure.core.credentials.TokenCredential implementation by its dotted import path. This is useful when the built-in methods don't cover your scenario, for example a custom OAuth flow, a token broker, or Workload Identity Federation with a non-standard setup. See the authentication guide for a full walkthrough.

Requires credential_class. Optionally accepts credential_kwargs.

username

Aliases: UID, user
Example value: satya.nadella@microsoft.com

The username to use for authentication. This is required if you are using the ActiveDirectoryPassword or ActiveDirectoryInteractive authentication methods.

password

Aliases: PWD, pass
Example value: IL0veC0p!lot!

The password to use for authentication. This is required if you are using the ActiveDirectoryPassword authentication method.

It's not recommended to hardcode this in your profiles.yml file. Instead, use an environment variable.

client_id

Alias: app_id
Example value: 123e4567-e89b-12d3-a456-426614174000

The client ID of the Microsoft Entra ID application (service principal) to use for authentication. This is required if you are using the ActiveDirectoryServicePrincipal authentication method.

client_secret

Alias: app_secret
Example value: 0123456789abcdef

The client secret of the Microsoft Entra ID application (service principal) to use for authentication. This is required if you are using the ActiveDirectoryServicePrincipal authentication method.

It's not recommended to hardcode this in your profiles.yml file. Instead, use an environment variable.

tenant_id

Example value: 72f988bf-86f1-41af-91ab-2d7cd011db47

When authentication is set to ActiveDirectoryServicePrincipal, the adapter needs to know your Microsoft Entra ID tenant ID to be able to authenticate. This is required if you are using workspace_name or workspace_id or if you are using Python models.

access_token

This option overrides all other authentication methods and directly uses the provided access token to authenticate. This can be useful if you want to fully manage the authentication yourself.

Token lifetime

This is not a recommended way to authenticate and is only meant for advanced use cases, because it requires you to manage the access token yourself. In normal scenarios, the adapter manages the token's lifetime for you and automatically refreshes it when needed; with access_token set, you have to handle expiry and refresh yourself.

Token scope

Microsoft accepts multiple token scopes for Fabric. However, if you are using the workspace_name or workspace_id options or if you are using Python models, the token must have the https://analysis.windows.net/powerbi/api/.default scope.
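If you do manage the token yourself, pass it in via an environment variable rather than hardcoding it (the variable name below is a placeholder):

```yaml
dev:
  type: fabric
  ...
  access_token: "{{ env_var('FABRIC_ACCESS_TOKEN') }}"
```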

token_scope

Example values:

  • https://analysis.windows.net/powerbi/api/.default
  • https://database.windows.net/.default
  • pbi
  • DW

Depending on the authentication method you are using, the adapter will request an access token for a specific scope. This scope will be automatically determined based on your configuration. However, if you need to override the scope for some reason, you can use this option to set a custom scope.

credential_class

Example value: my_pkg.auth.MyCredential

The fully qualified dotted import path to a Python class that implements azure.core.credentials.TokenCredential. This is required when authentication is set to token_credential, and must not be set for any other authentication method.

The path must be a valid dotted Python identifier (e.g. my_pkg.sub.MyCredential). The class must be importable from the Python environment where dbt runs.
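A minimal sketch of such a class: a real implementation would implement azure.core.credentials.TokenCredential (get_token returning an AccessToken) and fetch tokens from your identity provider. The namedtuple stand-in below only keeps the example self-contained without the azure-core package, and the static token is purely illustrative:

```python
import time
from collections import namedtuple

# Stand-in for azure.core.credentials.AccessToken, which is a
# (token, expires_on) tuple; use the real class in an actual package.
AccessToken = namedtuple("AccessToken", ["token", "expires_on"])

class StaticTokenCredential:
    """Hypothetical credential that returns a fixed token (illustration only)."""

    def __init__(self, token: str, lifetime_seconds: int = 3600):
        self._token = token
        self._lifetime = lifetime_seconds

    def get_token(self, *scopes, **kwargs):
        # The adapter calls this with the scope it needs and expects a
        # token plus an expiry timestamp in epoch seconds.
        return AccessToken(self._token, int(time.time()) + self._lifetime)

cred = StaticTokenCredential("example-token")
print(cred.get_token("https://database.windows.net/.default").token)
```

You would then reference the class from your profile, e.g. credential_class: my_pkg.auth.MyCredential, with constructor arguments supplied via credential_kwargs.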

credential_kwargs

Example value:

credential_kwargs:
  token_url: "{{ env_var('TOKEN_URL') }}"
  audience: "https://my-api.example.com"

A dictionary of keyword arguments passed to the constructor of the class specified in credential_class. This is optional and can only be used when authentication is set to token_credential.

federated_token_url

Example value: https://token.actions.githubusercontent.com

The URL to fetch a federated OIDC token from. The adapter performs a GET request to this URL and reads the token from the value field of the JSON response. Can only be used when authentication is set to workload_identity.

Mutually exclusive with federated_token_file — exactly one must be set.

federated_token_header

Example value: bearer ghs_xxxxxxxxxxxxxxxxxxxx

The value for the Authorization header when fetching the federated token from federated_token_url. Can only be used together with federated_token_url, not with federated_token_file.

federated_token_file

Example value: /var/run/secrets/azure/tokens/azure-identity-token

Path to a file containing a federated OIDC token. The adapter re-reads this file each time it needs a fresh token, so external processes (like kubelet) can refresh it. Can only be used when authentication is set to workload_identity.

Mutually exclusive with federated_token_url — exactly one must be set.
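On Kubernetes with Azure Workload Identity, for example, the webhook projects a token file whose path you can reference directly (values are placeholders; the file path matches the common projected-token location):

```yaml
dev:
  type: fabric
  ...
  authentication: workload_identity
  tenant_id: "{{ env_var('AZURE_TENANT_ID') }}"
  client_id: "{{ env_var('AZURE_CLIENT_ID') }}"
  federated_token_file: /var/run/secrets/azure/tokens/azure-identity-token
```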

schema_auth

Alias: schema_authorization
Example value: some_group_or_user

If your dbt project uses a schema that does not exist yet, dbt will create it for you. Use this configuration option to set the owner of the schema after creation; this can be a user or a group.

lakehouse

Alias: lakehouse_name
Example value: My Lakehouse

Data Warehouse only

This option only applies to type: fabric. For type: fabricspark, use database instead — that field specifies your Lakehouse directly.

The name of the Lakehouse in Fabric you wish to use for running Python models. This is only relevant for Data Warehouse projects that need a Lakehouse as a Spark execution environment for Python models.

When using this option together with authentication set to ActiveDirectoryServicePrincipal, you also need to provide the tenant_id option.

encrypt

Possible values: true, false
Default: true

Whether to use encryption for the connection. It's recommended to leave this enabled. This could be disabled for advanced networking scenarios.

Data Warehouse only

This option only applies to type: fabric. The FabricSpark adapter connects via HTTPS (Livy API), which is always encrypted.

trust_cert

Alias: TrustServerCertificate
Possible values: true, false
Default: false

Whether to trust the server certificate without validation. It's recommended to leave this disabled. This could be enabled for advanced networking scenarios.

Data Warehouse only

This option only applies to type: fabric. The FabricSpark adapter connects via HTTPS (Livy API).

retries

Possible values: any integer
Default: 3

The number of times to retry a failed connection before failing. This will not rerun a failed query, but will only be used for intermittent connection issues.

login_timeout

Possible values: any integer (seconds) ⏲

The timeout for establishing a connection to the server. This can be useful if you are receiving the Login timeout expired error; a value of 30 seconds can improve connection resiliency. The adapter has no default value and will use the driver's default if not set.

Data Warehouse only

This option only applies to type: fabric. For FabricSpark, see spark_session_timeout.

query_timeout

Possible values: any integer (seconds) ⏲

The timeout for executing a query.

  • For type: fabric: this can be useful if you are receiving the Query timeout expired error. Default: 86400 seconds (24 hours).
  • For type: fabricspark: controls how long the adapter waits for a Livy statement to complete. Default: 86400 seconds (24 hours).

spark_session_timeout

Possible values: any integer (seconds) ⏲
Default: 900 (15 minutes)

The maximum time to wait for the Livy Spark session to become idle (ready to accept statements). This is relevant during the first statement of a dbt run, when a new session may need to be created.

FabricSpark and Python models

This option applies to type: fabricspark for all Livy session management, and to type: fabric when running Python models (which also use Livy sessions). For Data Warehouse SQL connection timeouts, see login_timeout.

livy_session_name

Default: dbt-fabric-samdebruyn

The name of the Livy session. Sessions are reused across statements within a dbt run. If an existing session with this name is found in an idle, starting, running, or busy state, it will be reused instead of creating a new one.

Used by both adapter types

This option is used by type: fabricspark for all SQL execution, and by type: fabric for Python model execution.

purview_endpoint

Alias: purview
Example value: https://your-account.purview.azure.com

The endpoint URL of your Microsoft Purview account. This is required to use the Purview integration.

You can find this in the Azure portal under your Purview account's Properties page (labeled "Atlas endpoint") or in the Purview governance portal settings.

Your authentication identity must have Data Curator and Data Reader roles in the Purview account's root collection.