Configuration¶
This page covers both adapter types
This adapter supports two Microsoft Fabric compute engines: Data Warehouse (type: fabric) and Lakehouse (type: fabricspark). Configuration options that are specific to one adapter type are marked accordingly. For a comprehensive guide on using the Lakehouse adapter, see the Lakehouse guide.
You'll need to create a profile in the profiles.yml file to connect to Microsoft Fabric. The adapter offers several ways to configure the connection to fit your needs.
Example profiles.yml¶
Use environment variables anywhere
You can use environment variables in any configuration value in your profiles.yml file.
```yaml
default:
  target: dev
  outputs:
    dev:
      type: fabric
      ...
      client_id: "{{ env_var('AZURE_CLIENT_ID', 'an optional default value') }}"
      client_secret: "{{ env_var('AZURE_CLIENT_SECRET') }}"
```
Make sure to surround the Jinja block with quotes.
All configuration options¶
type¶
Required configuration option.
Possible values: fabric, fabricspark
| Value | Compute engine | SQL dialect |
|---|---|---|
| `fabric` | Fabric Data Warehouse | T-SQL |
| `fabricspark` | Fabric Lakehouse | Spark SQL |
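As a minimal sketch, a single profile can define one output per compute engine (profile, workspace, and database names below are placeholders taken from the examples on this page):

```yaml
default:
  target: warehouse
  outputs:
    warehouse:
      type: fabric                 # Fabric Data Warehouse, T-SQL
      host: abc-123.datawarehouse.fabric.microsoft.com
      database: gold_dwh
      schema: dbt
    lakehouse:
      type: fabricspark            # Fabric Lakehouse, Spark SQL
      workspace_name: My Workspace # used to resolve the Livy API endpoint
      database: bronze_lakehouse   # the name of the Lakehouse itself
      schema: dbt
```

Switch between the two with `dbt run --target warehouse` or `dbt run --target lakehouse`.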
host¶
Alias: server
Example value: abc-123.datawarehouse.fabric.microsoft.com
The server part of your connection string. This is unique per Workspace in Fabric.
You can leave this empty and let the adapter find it automatically by providing information about your Workspace. See workspace_name and workspace_id.
Not used for FabricSpark
This option is not used when type is fabricspark. The Lakehouse adapter connects via the Fabric Livy API, not TDS. The Livy endpoint is resolved automatically from workspace_name or workspace_id.
database¶
Required configuration option.
- For `type: fabric`, example value `gold_dwh`: the name of your Data Warehouse in Fabric.
- For `type: fabricspark`, example value `bronze_lakehouse`: the name of your Lakehouse in Fabric. The adapter uses this as the target for the Livy API connection.
This IS the lakehouse
For type: fabricspark, the database field is how you specify which Lakehouse to connect to. There is no separate lakehouse or lakehouse_name option for this adapter type.
It's recommended to avoid using spaces in the name, although it's supported.
schema¶
Required configuration option.
Example value: dbt
The schema where dbt will build models. You must have write access to this schema. It's recommended to avoid using spaces in the schema name, although it's supported.
Override per model
The schema can be overridden per model/seed/test/folder/... using the schema config.
Further customization
You can even completely customize how dbt generates the schema name using the generate_schema_name macro.
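For instance, the per-folder override mentioned above can be set in dbt_project.yml (the project and folder names here are hypothetical):

```yaml
models:
  my_project:
    staging:
      +schema: staging   # models in this folder get a custom schema
```

Note that by default dbt appends the custom schema to the target schema from your profile (e.g. dbt_staging); overriding the generate_schema_name macro changes that behavior.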
workspace_name¶
Alias: workspace
Example value: My Workspace
The name of your Fabric Workspace.
- For `type: fabric`: used to automatically find the `host` value. Not required if `host` is provided (except for Python models).
- For `type: fabricspark`: required (unless `workspace_id` is provided). The Lakehouse adapter always needs the workspace to resolve the Livy API endpoint.
Not used if workspace_id is also provided.
Python models (Data Warehouse)
If you are using Python models with type: fabric, either workspace_name or workspace_id must be provided.
auth: ActiveDirectoryServicePrincipal
When using this option together with authentication set to ActiveDirectoryServicePrincipal, you also need to provide the tenant_id option.
Behind the scenes, the adapter will do an API call to first find the Workspace ID, and then use that to find the server name.
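A sketch of a Data Warehouse profile that omits host and lets the adapter resolve it from the workspace (all values are placeholders):

```yaml
# inside outputs: in profiles.yml
dev:
  type: fabric
  workspace_name: My Workspace  # adapter looks up the Workspace ID, then the server name
  database: gold_dwh
  schema: dbt
```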
workspace_id¶
Example value: 7275c94d-9280-438b-bd67-ffeb8c305c9b
The ID of your Fabric Workspace. Can be used instead of workspace_name.
- For `type: fabric`: used to automatically find the `host` value. Not required if `host` is provided (except for Python models).
- For `type: fabricspark`: required (unless `workspace_name` is provided). The Lakehouse adapter always needs the workspace.
Python models (Data Warehouse)
If you are using Python models with type: fabric, either workspace_name or workspace_id must be provided.
auth: ActiveDirectoryServicePrincipal
When using this option together with authentication set to ActiveDirectoryServicePrincipal, you also need to provide the tenant_id option.
Behind the scenes, the adapter will do an API call to first find the server name.
authentication ¶
Alias: auth
Possible values (case insensitive):
- ActiveDirectoryIntegrated
- ActiveDirectoryPassword
- ActiveDirectoryServicePrincipal
- ActiveDirectoryInteractive
- ActiveDirectoryMsi
- auto (default)
- CLI
- environment
- notebookutils
- token_credential
- workload_identity
The adapter supports an authentication method for every use case. The default is auto, which will try to use the best available method depending on your environment.
If you can't find a suitable method for your use case, please open an issue.
ActiveDirectoryIntegrated¶
Authenticate with a Windows credential federated through Microsoft Entra ID with integrated authentication. This works on domain-joined machines.
Workspace info and Python models
This is not compatible with the workspace_name or workspace_id options or with Python models. In this case, it's recommended to look at the auto or CLI options as alternatives.
ActiveDirectoryPassword¶
Authenticate with a Microsoft Entra ID username and password. You must provide the username and password options.
Workspace info and Python models
This is not compatible with the workspace_name or workspace_id options or with Python models. In this case, it's recommended to look at the auto or CLI options as alternatives.
ActiveDirectoryServicePrincipal¶
Authenticate with a Microsoft Entra ID service principal using a client ID and client secret. You must provide the client_id and client_secret options.
Tenant ID required for Workspace info or Python models
If you are using workspace_name or workspace_id, you also need to provide the tenant_id option.
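A sketch of a service principal profile, with the secrets read from environment variables rather than hardcoded (host and database values are placeholders):

```yaml
# inside outputs: in profiles.yml
dev:
  type: fabric
  host: abc-123.datawarehouse.fabric.microsoft.com
  database: gold_dwh
  schema: dbt
  authentication: ActiveDirectoryServicePrincipal
  tenant_id: "{{ env_var('AZURE_TENANT_ID') }}"     # required with workspace info or Python models
  client_id: "{{ env_var('AZURE_CLIENT_ID') }}"
  client_secret: "{{ env_var('AZURE_CLIENT_SECRET') }}"
```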
ActiveDirectoryInteractive¶
Authenticate with a Microsoft Entra ID username and password using an interactive prompt. You must provide the username option.
Workspace info and Python models
This is not compatible with the workspace_name or workspace_id options or with Python models. In this case, it's recommended to look at the auto or CLI options as alternatives.
ActiveDirectoryMsi¶
Authenticate with a managed identity configured in your environment. This is typically used when running in Azure.
Workspace info and Python models
This is not compatible with the workspace_name or workspace_id options or with Python models. In this case, it's recommended to look at the auto or CLI options as alternatives.
auto¶
Default authentication method.
This will try to authenticate using the best available method depending on your environment. It can automatically pick up configurations for managed identities, service principals, Azure CLI/PowerShell users, and more. The full list and order of methods is described on Microsoft Learn.
CLI¶
Authenticate using the credentials from the Azure CLI. You must be logged in using az login. There have been reports of issues when using an outdated version of the Azure CLI, so make sure to use the latest version. Your account does not need to have access to any Azure subscriptions or resources and the selected Azure subscription does not matter.
Since the Azure CLI supports a variety of authentication methods, this is a flexible option that works in many scenarios.
environment¶
Authenticate using environment variables. This works similarly to the auto method, but only uses environment variables. See Microsoft Learn for the list of supported environment variables.
notebookutils¶
This authentication method works inside a Fabric notebook. It uses NotebookUtils to get an access token for the current user.
Currently broken
This method is not working at the moment because Microsoft's Runtime in the Notebooks returns a credential with a scope that is not allowed to access Data Warehouses and SQL Endpoints. Use environment or ActiveDirectoryServicePrincipal inside notebooks instead.
workload_identity¶
Authenticate with Workload Identity Federation using a federated OIDC token. No client secret needed. Works with GitHub Actions, Kubernetes, and any OIDC provider. See the authentication guide for examples.
Requires tenant_id, client_id, and exactly one of federated_token_url or federated_token_file.
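As a sketch for GitHub Actions, where the runner exposes the OIDC token request URL and token via standard environment variables (assuming the workflow has `id-token: write` permission; host and database are placeholders):

```yaml
# inside outputs: in profiles.yml
dev:
  type: fabric
  host: abc-123.datawarehouse.fabric.microsoft.com
  database: gold_dwh
  schema: dbt
  authentication: workload_identity
  tenant_id: "{{ env_var('AZURE_TENANT_ID') }}"
  client_id: "{{ env_var('AZURE_CLIENT_ID') }}"
  federated_token_url: "{{ env_var('ACTIONS_ID_TOKEN_REQUEST_URL') }}"
  federated_token_header: "bearer {{ env_var('ACTIONS_ID_TOKEN_REQUEST_TOKEN') }}"
```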
token_credential¶
Load any azure.core.credentials.TokenCredential implementation by its dotted import path. This is useful when the built-in methods don't cover your scenario, for example when using a custom OAuth flow, a token broker, or Workload Identity Federation with a non-standard setup. See the authentication guide for a full walkthrough.
Requires credential_class. Optionally accepts credential_kwargs.
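A sketch of what this looks like in the profile; the class path and keyword arguments below are hypothetical and stand in for your own TokenCredential implementation:

```yaml
# inside outputs: in profiles.yml
dev:
  type: fabric
  host: abc-123.datawarehouse.fabric.microsoft.com
  database: gold_dwh
  schema: dbt
  authentication: token_credential
  credential_class: my_pkg.auth.MyCredential  # your class implementing TokenCredential
  credential_kwargs:                          # passed to MyCredential(...) as keyword arguments
    region: westeurope                        # hypothetical constructor argument
```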
username ¶
Aliases: UID, user
Example value: satya.nadella@microsoft.com
The username to use for authentication. This is required if you are using the ActiveDirectoryPassword or ActiveDirectoryInteractive authentication methods.
password ¶
Aliases: PWD, pass
Example value: IL0veC0p!lot!
The password to use for authentication. This is required if you are using the ActiveDirectoryPassword authentication method.
It's not recommended to hardcode this in your profiles.yml file. Instead, use an environment variable.
client_id ¶
Alias: app_id
Example value: 123e4567-e89b-12d3-a456-426614174000
The client ID of the Microsoft Entra ID application (service principal) to use for authentication. This is required if you are using the ActiveDirectoryServicePrincipal authentication method.
client_secret ¶
Alias: app_secret
Example value: 0123456789abcdef
The client secret of the Microsoft Entra ID application (service principal) to use for authentication. This is required if you are using the ActiveDirectoryServicePrincipal authentication method.
It's not recommended to hardcode this in your profiles.yml file. Instead, use an environment variable.
tenant_id ¶
Example value: 72f988bf-86f1-41af-91ab-2d7cd011db47
When authentication is set to ActiveDirectoryServicePrincipal, the adapter needs to know your Microsoft Entra ID tenant ID to be able to authenticate. This is required if you are using workspace_name or workspace_id or if you are using Python models.
access_token ¶
This option overrides all other authentication methods and directly uses the provided access token to authenticate. This can be useful if you want to fully manage the authentication yourself.
Token lifetime
This is not a recommended way to authenticate and is only meant for advanced use cases. In normal scenarios, the adapter manages the lifetime of the token for you and automatically refreshes it when needed; with access_token, you need to handle acquisition and refreshing yourself.
Token scope
Microsoft accepts multiple token scopes for Fabric. However, if you are using the workspace_name or workspace_id options or if you are using Python models, the token must have the https://analysis.windows.net/powerbi/api/.default scope.
token_scope ¶
Example values:
- https://analysis.windows.net/powerbi/api/.default
- https://database.windows.net/.default
- pbiDW
Depending on the authentication method you are using, the adapter will request an access token for a specific scope. This scope will be automatically determined based on your configuration. However, if you need to override the scope for some reason, you can use this option to set a custom scope.
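If you do need to override it, it's a single key in the profile (a sketch with placeholder connection values; whether your scenario requires a custom scope depends on your setup):

```yaml
# inside outputs: in profiles.yml
dev:
  type: fabric
  host: abc-123.datawarehouse.fabric.microsoft.com
  database: gold_dwh
  schema: dbt
  token_scope: https://analysis.windows.net/powerbi/api/.default
```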
credential_class¶
Example value: my_pkg.auth.MyCredential
The fully qualified dotted import path to a Python class that implements azure.core.credentials.TokenCredential. This is required when authentication is set to token_credential, and must not be set for any other authentication method.
The path must be a valid dotted Python identifier (e.g. my_pkg.sub.MyCredential). The class must be importable from the Python environment where dbt runs.
credential_kwargs¶
A dictionary of keyword arguments passed to the constructor of the class specified in credential_class. This is optional and can only be used when authentication is set to token_credential.
federated_token_url¶
Example value: https://token.actions.githubusercontent.com
The URL to fetch a federated OIDC token from. The adapter performs a GET request to this URL and reads the token from the value field of the JSON response. Can only be used when authentication is set to workload_identity.
Mutually exclusive with federated_token_file — exactly one must be set.
federated_token_header¶
Example value: bearer ghs_xxxxxxxxxxxxxxxxxxxx
The value for the Authorization header when fetching the federated token from federated_token_url. Can only be used together with federated_token_url, not with federated_token_file.
federated_token_file¶
Example value: /var/run/secrets/azure/tokens/azure-identity-token
Path to a file containing a federated OIDC token. The adapter re-reads this file each time it needs a fresh token, so external processes (like kubelet) can refresh it. Can only be used when authentication is set to workload_identity.
Mutually exclusive with federated_token_url — exactly one must be set.
schema_auth¶
Alias: schema_authorization
Example value: some_group_or_user
If your dbt project is using a schema which does not exist yet, dbt will create it for you. Use this configuration option to set the owner of the schema after creation. This can be a user or a group.
lakehouse¶
Alias: lakehouse_name
Example value: My Lakehouse
Data Warehouse only
This option only applies to type: fabric. For type: fabricspark, use database instead — that field specifies your Lakehouse directly.
The name of the Lakehouse in Fabric you wish to use for running Python models. This is only relevant for Data Warehouse projects that need a Lakehouse as a Spark execution environment for Python models.
When using this option together with authentication set to ActiveDirectoryServicePrincipal, you also need to provide the tenant_id option.
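A sketch of a Data Warehouse profile set up for Python models, which need both the workspace info and a Lakehouse as the Spark execution environment (all values are placeholders):

```yaml
# inside outputs: in profiles.yml
dev:
  type: fabric
  workspace_name: My Workspace  # required for Python models
  lakehouse: My Lakehouse       # Spark execution environment for Python models
  database: gold_dwh
  schema: dbt
```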
encrypt¶
Possible values: true, false
Default: true
Whether to use encryption for the connection. It's recommended to leave this enabled. This could be disabled for advanced networking scenarios.
Data Warehouse only
This option only applies to type: fabric. The FabricSpark adapter connects via HTTPS (Livy API), which is always encrypted.
trust_cert¶
Alias: TrustServerCertificate
Possible values: true, false
Default: false
Whether to trust the server certificate without validation. It's recommended to leave this disabled. This could be enabled for advanced networking scenarios.
Data Warehouse only
This option only applies to type: fabric. The FabricSpark adapter connects via HTTPS (Livy API).
retries¶
Possible values: any integer
Default: 3
The number of times to retry a failed connection before failing. This will not rerun a failed query, but will only be used for intermittent connection issues.
login_timeout¶
Possible values: any integer (seconds)
The timeout for establishing a connection to the server. This can be useful if you are receiving the Login timeout expired error. A value of 30 seconds could improve the connection resiliency. The adapter has no default value and will use the driver's default if not set.
Data Warehouse only
This option only applies to type: fabric. For FabricSpark, see spark_session_timeout.
query_timeout¶
Possible values: any integer (seconds)
The timeout for executing a query.
- For `type: fabric`: this can be useful if you are receiving the `Query timeout expired` error. Default: 86400 seconds (24 hours).
- For `type: fabricspark`: controls how long the adapter waits for a Livy statement to complete. Default: 86400 seconds (24 hours).
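The resiliency options above can be combined in one profile; the numbers here are illustrative, not recommendations:

```yaml
# inside outputs: in profiles.yml
dev:
  type: fabric
  host: abc-123.datawarehouse.fabric.microsoft.com
  database: gold_dwh
  schema: dbt
  retries: 5           # retry intermittent connection failures (default: 3)
  login_timeout: 30    # seconds to wait when establishing the connection
  query_timeout: 3600  # fail queries that run longer than an hour
```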
spark_session_timeout¶
Possible values: any integer (seconds)
Default: 900 (15 minutes)
The maximum time to wait for the Livy Spark session to become idle (ready to accept statements). This is relevant during the first statement of a dbt run, when a new session may need to be created.
FabricSpark and Python models
This option applies to type: fabricspark for all Livy session management, and to type: fabric when running Python models (which also use Livy sessions). For Data Warehouse SQL connection timeouts, see login_timeout.
livy_session_name¶
Default: dbt-fabric-samdebruyn
The name of the Livy session. Sessions are reused across statements within a dbt run. If an existing session with this name is found in an idle, starting, running, or busy state, it will be reused instead of creating a new one.
Used by both adapter types
This option is used by type: fabricspark for all SQL execution, and by type: fabric for Python model execution.
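A sketch of a Lakehouse profile tuning the Livy session options; the session name is hypothetical:

```yaml
# inside outputs: in profiles.yml
dev:
  type: fabricspark
  workspace_name: My Workspace
  database: bronze_lakehouse
  schema: dbt
  livy_session_name: dbt-nightly-run  # reused across statements within a run
  spark_session_timeout: 1200         # wait up to 20 minutes for the session to become idle
```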
purview_endpoint¶
Alias: purview
Example value: https://your-account.purview.azure.com
The endpoint URL of your Microsoft Purview account. This is required to use the Purview integration.
You can find this in the Azure portal under your Purview account's Properties page (labeled "Atlas endpoint") or in the Purview governance portal settings.
Your authentication identity must have Data Curator and Data Reader roles in the Purview account's root collection.
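Enabling the integration is a single key in the profile (a sketch with placeholder values; the account name comes from your own Purview resource):

```yaml
# inside outputs: in profiles.yml
dev:
  type: fabric
  host: abc-123.datawarehouse.fabric.microsoft.com
  database: gold_dwh
  schema: dbt
  purview_endpoint: https://your-account.purview.azure.com
```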