dbt-external-tables¶
Tested version: 0.11.0 | Integration tested: Yes
dbt-external-tables lets you define external data sources that dbt can reference via `{{ source() }}`. This adapter overrides the package's Fabric plugin to use `OPENROWSET(BULK ...)` instead of the Synapse-style `CREATE EXTERNAL TABLE` syntax, which Fabric Data Warehouse does not support.
For full setup instructions, source configuration examples, and format-specific options, see the external tables guide.
Dispatch configuration¶
Note
Unlike other packages, you do not need your project name in the search order for dbt-external-tables. The dbt entry is sufficient to route dispatch to the adapter's built-in overrides.
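For reference, the corresponding `dispatch` block in `dbt_project.yml` might look like the following. This is a minimal sketch reflecting the note above; the exact search order is an assumption to adapt to your project.

```yaml
# dbt_project.yml -- minimal sketch of the dispatch entry (order shown is an assumption)
dispatch:
  - macro_namespace: dbt_external_tables
    search_order: ['dbt', 'dbt_external_tables']
```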
Macro compatibility¶
Legend: ✅ = supported on Fabric, ❌ = not supported on Fabric
Macros marked with (override) have a Fabric-specific implementation in this adapter. All other supported macros work without any adapter-specific override.
| Macro | Status | Notes |
|---|---|---|
| `stage_external_sources` | ✅ | Entry point run-operation |
| `create_external_table` | ✅ (override) | Creates a view wrapping `OPENROWSET(BULK ...)` instead of `CREATE EXTERNAL TABLE` |
| `refresh_external_table` | ✅ (override) | No-op (`OPENROWSET` reads live data on every query) |
| `get_external_build_plan` | ✅ | |
| `create_external_schema` | ✅ | |
| `dropif` | ✅ (override) | Drops the `OPENROWSET` view with `DROP VIEW IF EXISTS` |
| `exit_transaction` | ✅ | |
| `update_external_table_columns` | ❌ | Not implemented for Fabric |
| `recover_partitions` | ❌ | Hive-style partitions not supported |
How it works¶
When you run `dbt run-operation stage_external_sources`, the overridden macros create views that wrap `OPENROWSET` queries (a sketch of such a view follows this list). This means:

- External sources are queryable as regular views via `{{ source('my_external', 'my_table') }}`
- Data is always fresh: `OPENROWSET` reads directly from the file on each query
- No refresh is needed (the `refresh_external_table` macro is a no-op)
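For illustration, the view created for a Parquet source might look roughly like the sketch below; the exact DDL emitted by the override may differ, and the schema, view name, and file path are hypothetical.

```sql
-- Hypothetical shape of the view created by the overridden create_external_table macro
CREATE VIEW [dbo].[sales] AS
SELECT *
FROM OPENROWSET(
    BULK 'https://onelake.dfs.fabric.microsoft.com/<workspace-id>/<lakehouse-id>/Files/data/sales.parquet',
    FORMAT = 'PARQUET'
) AS [data];
```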
Supported file formats¶
| Format | Extension detection | Notes |
|---|---|---|
| Parquet | `.parquet` | Column schema inferred automatically if columns are not defined |
| CSV | `.csv`, `.tsv` | Supports `header_row`, `fieldterminator`, `parser_version` options |
| JSONL | `.jsonl`, `.ldjson`, `.ndjson` | Line-delimited JSON |
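To illustrate the CSV options above, a source entry might look like this sketch; the nesting of the options block is an assumption, so check the external tables guide for the exact keys.

```yaml
# sources.yml -- hypothetical CSV source; option nesting is an assumption
sources:
  - name: my_external
    schema: dbo
    tables:
      - name: orders_csv
        external:
          location: "https://onelake.dfs.fabric.microsoft.com/<workspace-id>/<lakehouse-id>/Files/data/orders/*.csv"
          file_format: csv
          options:
            header_row: true
            fieldterminator: ","
            parser_version: "2.0"
```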
Supported storage locations¶
- Fabric OneLake: `https://onelake.dfs.fabric.microsoft.com/<workspace-id>/<lakehouse-id>/Files/<path>`
- Azure Data Lake Storage Gen2: `https://<account>.dfs.core.windows.net/<container>/<path>`
- Azure Blob Storage: `https://<account>.blob.core.windows.net/<container>/<path>`

Wildcards are supported (e.g., `*.parquet` to read all Parquet files in a folder).
Quick example¶
```yaml
# sources.yml
sources:
  - name: my_external
    schema: dbo
    tables:
      - name: sales
        external:
          location: "https://onelake.dfs.fabric.microsoft.com/<workspace-id>/<lakehouse-id>/Files/data/sales.parquet"
          file_format: parquet
        columns:
          - name: id
            data_type: int
          - name: amount
            data_type: "decimal(10,2)"
```
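After running `dbt run-operation stage_external_sources`, the staged source can be referenced from a model like any other source. A minimal sketch, with a hypothetical model name:

```sql
-- models/staging/stg_sales.sql -- hypothetical model referencing the external source
select
    id,
    amount
from {{ source('my_external', 'sales') }}
```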
Limitations¶
- Fabric Data Warehouse does not support the `DELTA` format in `OPENROWSET`. For Delta Lake tables, use cross-database queries to a Lakehouse instead (see the sketch after this list).
- `OPENROWSET` queries may be slower than querying data stored directly in the warehouse. For frequently accessed data, consider ingesting it into warehouse tables.
- Authentication to external storage is handled by Fabric. The files must be accessible from your Fabric workspace.
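For the Delta Lake limitation above, a cross-database query from the warehouse to a Lakehouse might look roughly like this sketch; the Lakehouse, schema, and table names are hypothetical.

```sql
-- Hypothetical cross-database query to a Delta table exposed by a Lakehouse in the same workspace
SELECT id, amount
FROM [MyLakehouse].[dbo].[sales_delta];
```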