Configuration¶
This page covers both adapter types
This adapter supports two Microsoft Fabric compute engines: Data Warehouse (type: fabric) and Lakehouse (type: fabricspark). Configuration options that are specific to one adapter type are marked accordingly. For a comprehensive guide on using the Lakehouse adapter, see the Lakehouse guide.
You'll need to create a profile in the profiles.yml file to connect to Microsoft Fabric. The adapter offers several ways to configure the connection to fit your needs.
Example profiles.yml¶
Use environment variables anywhere
You can use environment variables in any configuration value in your profiles.yml file.
```yaml
default:
  target: dev
  outputs:
    dev:
      type: fabric
      ...
      client_id: "{{ env_var('AZURE_CLIENT_ID', 'an optional default value') }}"
      client_secret: "{{ env_var('AZURE_CLIENT_SECRET') }}"
```
Make sure to surround the Jinja block with quotes.
All configuration options¶
type¶
Required configuration option.
Possible values: fabric, fabricspark
| Value | Compute engine | SQL dialect |
|---|---|---|
| `fabric` | Fabric Data Warehouse | T-SQL |
| `fabricspark` | Fabric Lakehouse | Spark SQL |
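As a minimal sketch, a single profile can define one output per compute engine (profile, workspace, and database names below are placeholders taken from the examples on this page):

```yaml
default:
  target: warehouse
  outputs:
    warehouse:
      type: fabric                 # Fabric Data Warehouse, T-SQL
      host: abc-123.datawarehouse.fabric.microsoft.com
      database: gold_dwh
      schema: dbt
    lakehouse:
      type: fabricspark            # Fabric Lakehouse, Spark SQL
      workspace_name: My Workspace # used to resolve the Livy API endpoint
      database: bronze_lakehouse   # the name of the Lakehouse itself
      schema: dbt
```

Switch between the two with `dbt run --target warehouse` or `dbt run --target lakehouse`.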
host¶
Alias: server
Example value: abc-123.datawarehouse.fabric.microsoft.com
The server part of your connection string. This is unique per Workspace in Fabric.
You can leave this empty and let the adapter find it automatically by providing information about your Workspace. See workspace_name and workspace_id.
Not used for FabricSpark
This option is not used when type is fabricspark. The Lakehouse adapter connects via the Fabric Livy API, not TDS. The Livy endpoint is resolved automatically from workspace_name or workspace_id.
database¶
Required configuration option.
- For `type: fabric`, example value `gold_dwh`: the name of your Data Warehouse in Fabric.
- For `type: fabricspark`, example value `bronze_lakehouse`: the name of your Lakehouse in Fabric. The adapter uses this as the target for the Livy API connection.
This IS the lakehouse
For type: fabricspark, the database field is how you specify which Lakehouse to connect to. There is no separate lakehouse or lakehouse_name option for this adapter type.
It's recommended to avoid using spaces in the name, although it's supported.
schema¶
Required configuration option.
Example value: dbt
The schema where dbt will build models. You must have write access to this schema. It's recommended to avoid using spaces in the schema name, although it's supported.
Override per model
The schema can be overridden per model/seed/test/folder/... using the schema config.
Further customization
You can even completely customize how dbt generates the schema name using the generate_schema_name macro.
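For instance, the per-folder override mentioned above can be set in dbt_project.yml (the project and folder names here are hypothetical):

```yaml
models:
  my_project:
    staging:
      +schema: staging   # models in this folder get a custom schema
```

Note that by default dbt appends the custom schema to the target schema from your profile (e.g. dbt_staging); overriding the generate_schema_name macro changes that behavior.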
workspace_name¶
Alias: workspace
Example value: My Workspace
The name of your Fabric Workspace.
- For `type: fabric`: used to automatically find the `host` value. Not required if `host` is provided (except for Python models).
- For `type: fabricspark`: required (unless `workspace_id` is provided). The Lakehouse adapter always needs the workspace to resolve the Livy API endpoint.
Not used if workspace_id is also provided.
Python models (Data Warehouse)
If you are using Python models with type: fabric, either workspace_name or workspace_id must be provided.
auth: ActiveDirectoryServicePrincipal
When using this option together with authentication set to ActiveDirectoryServicePrincipal, you also need to provide the tenant_id option.
Behind the scenes, the adapter will do an API call to first find the Workspace ID, and then use that to find the server name.
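A sketch of a Data Warehouse profile that omits host and lets the adapter resolve it from the workspace (all values are placeholders):

```yaml
# inside outputs: in profiles.yml
dev:
  type: fabric
  workspace_name: My Workspace  # adapter looks up the Workspace ID, then the server name
  database: gold_dwh
  schema: dbt
```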
workspace_id¶
Example value: 7275c94d-9280-438b-bd67-ffeb8c305c9b
The ID of your Fabric Workspace. Can be used instead of workspace_name.
- For `type: fabric`: used to automatically find the `host` value. Not required if `host` is provided (except for Python models).
- For `type: fabricspark`: required (unless `workspace_name` is provided). The Lakehouse adapter always needs the workspace.
Python models (Data Warehouse)
If you are using Python models with type: fabric, either workspace_name or workspace_id must be provided.
auth: ActiveDirectoryServicePrincipal
When using this option together with authentication set to ActiveDirectoryServicePrincipal, you also need to provide the tenant_id option.
Behind the scenes, the adapter will do an API call to first find the server name.
authentication ¶
Alias: auth
Possible values (case insensitive):
- ActiveDirectoryIntegrated
- ActiveDirectoryPassword
- ActiveDirectoryServicePrincipal
- ActiveDirectoryInteractive
- ActiveDirectoryMsi
- auto (default)
- CLI
- environment
- notebookutils
- token_credential
- workload_identity
The adapter supports an authentication method for every use case. The default is auto, which will try to use the best available method depending on your environment.
If you can't find a suitable method for your use case, please open an issue.
ActiveDirectoryIntegrated¶
Authenticate with a Windows credential federated through Microsoft Entra ID with integrated authentication. This works on domain-joined machines.
Workspace info and Python models
This is not compatible with the workspace_name or workspace_id options or with Python models. In this case, it's recommended to look at the auto or CLI options as alternatives.
ActiveDirectoryPassword¶
Authenticate with a Microsoft Entra ID username and password. You must provide the username and password options.
Workspace info and Python models
This is not compatible with the workspace_name or workspace_id options or with Python models. In this case, it's recommended to look at the auto or CLI options as alternatives.
ActiveDirectoryServicePrincipal¶
Authenticate with a Microsoft Entra ID service principal using a client ID and client secret. You must provide the client_id and client_secret options.
Tenant ID required for Workspace info or Python models
If you are using workspace_name or workspace_id, you also need to provide the tenant_id option.
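A sketch of a service principal profile, with the secrets read from environment variables rather than hardcoded (host and database values are placeholders):

```yaml
# inside outputs: in profiles.yml
dev:
  type: fabric
  host: abc-123.datawarehouse.fabric.microsoft.com
  database: gold_dwh
  schema: dbt
  authentication: ActiveDirectoryServicePrincipal
  tenant_id: "{{ env_var('AZURE_TENANT_ID') }}"     # required with workspace info or Python models
  client_id: "{{ env_var('AZURE_CLIENT_ID') }}"
  client_secret: "{{ env_var('AZURE_CLIENT_SECRET') }}"
```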
ActiveDirectoryInteractive¶
Authenticate with a Microsoft Entra ID username and password using an interactive prompt. You must provide the username option.
Workspace info and Python models
This is not compatible with the workspace_name or workspace_id options or with Python models. In this case, it's recommended to look at the auto or CLI options as alternatives.
ActiveDirectoryMsi¶
Authenticate with a managed identity configured in your environment. This is typically used when running in Azure.
Workspace info and Python models
This is not compatible with the workspace_name or workspace_id options or with Python models. In this case, it's recommended to look at the auto or CLI options as alternatives.
auto¶
Default authentication method.
This will try to authenticate using the best available method depending on your environment. It can automatically pick up configurations for managed identities, service principals, Azure CLI/PowerShell users, and more. The full list and order of methods is described on Microsoft Learn.
CLI¶
Authenticate using the credentials from the Azure CLI. You must be logged in using az login. There have been reports of issues when using an outdated version of the Azure CLI, so make sure to use the latest version. Your account does not need to have access to any Azure subscriptions or resources and the selected Azure subscription does not matter.
Since the Azure CLI supports a variety of authentication methods, this is a flexible option that works in many scenarios.
environment¶
Authenticate using environment variables. This works similarly to the auto method, but only uses environment variables. See Microsoft Learn for the list of supported environment variables.
notebookutils¶
This authentication method works inside a Fabric notebook. It uses NotebookUtils to get an access token for the current user.
Currently broken
This method is not working at the moment because Microsoft's Runtime in the Notebooks returns a credential with a scope that is not allowed to access Data Warehouses and SQL Endpoints. Use environment or ActiveDirectoryServicePrincipal inside notebooks instead.
workload_identity¶
Authenticate with Workload Identity Federation using a federated OIDC token. No client secret needed. Works with GitHub Actions, Kubernetes, and any OIDC provider. See the authentication guide for examples.
Requires tenant_id, client_id, and exactly one of federated_token_url or federated_token_file.
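As a sketch for GitHub Actions, where the runner exposes the OIDC token request URL and token via standard environment variables (assuming the workflow has `id-token: write` permission; host and database are placeholders):

```yaml
# inside outputs: in profiles.yml
dev:
  type: fabric
  host: abc-123.datawarehouse.fabric.microsoft.com
  database: gold_dwh
  schema: dbt
  authentication: workload_identity
  tenant_id: "{{ env_var('AZURE_TENANT_ID') }}"
  client_id: "{{ env_var('AZURE_CLIENT_ID') }}"
  federated_token_url: "{{ env_var('ACTIONS_ID_TOKEN_REQUEST_URL') }}"
  federated_token_header: "bearer {{ env_var('ACTIONS_ID_TOKEN_REQUEST_TOKEN') }}"
```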
token_credential¶
Load any azure.core.credentials.TokenCredential implementation by its dotted import path. This is useful when the built-in methods don't cover your scenario, for example when using a custom OAuth flow, a token broker, or Workload Identity Federation with a non-standard setup. See the authentication guide for a full walkthrough.
Requires credential_class. Optionally accepts credential_kwargs.
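A sketch of what this looks like in the profile; the class path and keyword arguments below are hypothetical and stand in for your own TokenCredential implementation:

```yaml
# inside outputs: in profiles.yml
dev:
  type: fabric
  host: abc-123.datawarehouse.fabric.microsoft.com
  database: gold_dwh
  schema: dbt
  authentication: token_credential
  credential_class: my_pkg.auth.MyCredential  # your class implementing TokenCredential
  credential_kwargs:                          # passed to MyCredential(...) as keyword arguments
    region: westeurope                        # hypothetical constructor argument
```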
username ¶
Aliases: UID, user
Example value: satya.nadella@microsoft.com
The username to use for authentication. This is required if you are using the ActiveDirectoryPassword or ActiveDirectoryInteractive authentication methods.
password ¶
Aliases: PWD, pass
Example value: IL0veC0p!lot!
The password to use for authentication. This is required if you are using the ActiveDirectoryPassword authentication method.
It's not recommended to hardcode this in your profiles.yml file. Instead, use an environment variable.
client_id ¶
Alias: app_id
Example value: 123e4567-e89b-12d3-a456-426614174000
The client ID of the Microsoft Entra ID application (service principal) to use for authentication. This is required if you are using the ActiveDirectoryServicePrincipal authentication method.
client_secret ¶
Alias: app_secret
Example value: 0123456789abcdef
The client secret of the Microsoft Entra ID application (service principal) to use for authentication. This is required if you are using the ActiveDirectoryServicePrincipal authentication method.
It's not recommended to hardcode this in your profiles.yml file. Instead, use an environment variable.
tenant_id ¶
Example value: 72f988bf-86f1-41af-91ab-2d7cd011db47
When authentication is set to ActiveDirectoryServicePrincipal, the adapter needs to know your Microsoft Entra ID tenant ID to be able to authenticate. This is required if you are using workspace_name or workspace_id or if you are using Python models.
access_token ¶
This option overrides all other authentication methods and directly uses the provided access token to authenticate. This can be useful if you want to fully manage the authentication yourself.
Token lifetime
This is not a recommended way to authenticate and is only meant for advanced use cases. In normal scenarios, the adapter manages the lifetime of the token for you and automatically refreshes it when needed; with access_token, you need to handle acquisition and refreshing yourself.
Token scope
Microsoft accepts multiple token scopes for Fabric. However, if you are using the workspace_name or workspace_id options or if you are using Python models, the token must have the https://analysis.windows.net/powerbi/api/.default scope.
token_scope ¶
Example values:
- https://analysis.windows.net/powerbi/api/.default
- https://database.windows.net/.default
- pbiDW
Depending on the authentication method you are using, the adapter will request an access token for a specific scope. This scope will be automatically determined based on your configuration. However, if you need to override the scope for some reason, you can use this option to set a custom scope.
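If you do need to override it, it's a single key in the profile (a sketch with placeholder connection values; whether your scenario requires a custom scope depends on your setup):

```yaml
# inside outputs: in profiles.yml
dev:
  type: fabric
  host: abc-123.datawarehouse.fabric.microsoft.com
  database: gold_dwh
  schema: dbt
  token_scope: https://analysis.windows.net/powerbi/api/.default
```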
credential_class¶
Example value: my_pkg.auth.MyCredential
The fully qualified dotted import path to a Python class that implements azure.core.credentials.TokenCredential. This is required when authentication is set to token_credential, and must not be set for any other authentication method.
The path must be a valid dotted Python identifier (e.g. my_pkg.sub.MyCredential). The class must be importable from the Python environment where dbt runs.
credential_kwargs¶
A dictionary of keyword arguments passed to the constructor of the class specified in credential_class. This is optional and can only be used when authentication is set to token_credential.
federated_token_url¶
Example value: https://token.actions.githubusercontent.com
The URL to fetch a federated OIDC token from. The adapter performs a GET request to this URL and reads the token from the value field of the JSON response. Can only be used when authentication is set to workload_identity.
Mutually exclusive with federated_token_file — exactly one must be set.
federated_token_header¶
Example value: bearer ghs_xxxxxxxxxxxxxxxxxxxx
The value for the Authorization header when fetching the federated token from federated_token_url. Can only be used together with federated_token_url, not with federated_token_file.
federated_token_file¶
Example value: /var/run/secrets/azure/tokens/azure-identity-token
Path to a file containing a federated OIDC token. The adapter re-reads this file each time it needs a fresh token, so external processes (like kubelet) can refresh it. Can only be used when authentication is set to workload_identity.
Mutually exclusive with federated_token_url — exactly one must be set.
schema_auth¶
Alias: schema_authorization
Example value: some_group_or_user
If your dbt project is using a schema which does not exist yet, dbt will create it for you. Use this configuration option to set the owner of the schema after creation. This can be a user or a group.
lakehouse¶
Alias: lakehouse_name
Example value: My Lakehouse
Data Warehouse only
This option only applies to type: fabric. For type: fabricspark, use database instead — that field specifies your Lakehouse directly.
The name of the Lakehouse in Fabric you wish to use for running Python models. This is only relevant for Data Warehouse projects that need a Lakehouse as a Spark execution environment for Python models.
When using this option together with authentication set to ActiveDirectoryServicePrincipal, you also need to provide the tenant_id option.
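A sketch of a Data Warehouse profile set up for Python models, which need both the workspace info and a Lakehouse as the Spark execution environment (all values are placeholders):

```yaml
# inside outputs: in profiles.yml
dev:
  type: fabric
  workspace_name: My Workspace  # required for Python models
  lakehouse: My Lakehouse       # Spark execution environment for Python models
  database: gold_dwh
  schema: dbt
```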
encrypt¶
Possible values: true, false
Default: true
Whether to use encryption for the connection. It's recommended to leave this enabled. This could be disabled for advanced networking scenarios.
Data Warehouse only
This option only applies to type: fabric. The FabricSpark adapter connects via HTTPS (Livy API), which is always encrypted.
trust_cert¶
Alias: TrustServerCertificate
Possible values: true, false
Default: false
Whether to trust the server certificate without validation. It's recommended to leave this disabled. This could be enabled for advanced networking scenarios.
Data Warehouse only
This option only applies to type: fabric. The FabricSpark adapter connects via HTTPS (Livy API).
retries¶
Possible values: any integer
Default: 3
The number of times to retry a failed connection before failing. This will not rerun a failed query, but will only be used for intermittent connection issues.
login_timeout¶
Possible values: any integer (seconds)
The timeout for establishing a connection to the server. This can be useful if you are receiving the Login timeout expired error. A value of 30 seconds could improve the connection resiliency. The adapter has no default value and will use the driver's default if not set.
Data Warehouse only
This option only applies to type: fabric. For FabricSpark, see spark_session_timeout.
query_timeout¶
Possible values: any integer (seconds)
The timeout for executing a query.
- For `type: fabric`: this can be useful if you are receiving the `Query timeout expired` error. Default: 86400 seconds (24 hours).
- For `type: fabricspark`: controls how long the adapter waits for a Livy statement to complete. Default: 86400 seconds (24 hours).
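The resiliency options above can be combined in one profile; the numbers here are illustrative, not recommendations:

```yaml
# inside outputs: in profiles.yml
dev:
  type: fabric
  host: abc-123.datawarehouse.fabric.microsoft.com
  database: gold_dwh
  schema: dbt
  retries: 5           # retry intermittent connection failures (default: 3)
  login_timeout: 30    # seconds to wait when establishing the connection
  query_timeout: 3600  # fail queries that run longer than an hour
```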
spark_session_timeout¶
Possible values: any integer (seconds)
Default: 900 (15 minutes)
The maximum time to wait for the Livy Spark session to become idle (ready to accept statements). This is relevant during the first statement of a dbt run, when a new session may need to be created.
FabricSpark and Python models
This option applies to type: fabricspark for all Livy session management, and to type: fabric when running Python models (which also use Livy sessions). For Data Warehouse SQL connection timeouts, see login_timeout.
livy_session_name¶
Default: dbt-fabric-samdebruyn
The name of the Livy session. Sessions are reused across statements within a dbt run. If an existing session with this name is found in an idle, starting, running, or busy state, it will be reused instead of creating a new one.
Used by both adapter types
This option is used by type: fabricspark for all SQL execution, and by type: fabric for Python model execution.
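A sketch of a Lakehouse profile tuning the Livy session options; the session name is hypothetical:

```yaml
# inside outputs: in profiles.yml
dev:
  type: fabricspark
  workspace_name: My Workspace
  database: bronze_lakehouse
  schema: dbt
  livy_session_name: dbt-nightly-run  # reused across statements within a run
  spark_session_timeout: 1200         # wait up to 20 minutes for the session to become idle
```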
purview_endpoint¶
Alias: purview
Example value: https://your-account.purview.azure.com
The endpoint URL of your Microsoft Purview account. This is required to use the Purview integration.
You can find this in the Azure portal under your Purview account's Properties page (labeled "Atlas endpoint") or in the Purview governance portal settings.
Your authentication identity must have Data Curator and Data Reader roles in the Purview account's root collection.
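Enabling the integration is a single key in the profile (a sketch with placeholder values; the account name comes from your own Purview resource):

```yaml
# inside outputs: in profiles.yml
dev:
  type: fabric
  host: abc-123.datawarehouse.fabric.microsoft.com
  database: gold_dwh
  schema: dbt
  purview_endpoint: https://your-account.purview.azure.com
```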