Skip to content

Foreign Data Wrappers

import { Aside } from ‘@astrojs/starlight/components’;

Foreign Data Wrappers (FDW) let a Postgres or Supabase database query another system as if it were a local table. SQLAnvil’s wrapper() action declares the whole bridge — extension, wrapper, server, and ref()-able foreign tables — in a single call, so you can join data that lives in BigQuery (or any FDW-backed source) with your operational tables and never leave SQL.

The cleanest way to read a foreign source is to define a named connection in workflow_settings.yaml and tag a declaration with it. SQLAnvil generates the FDW bridge for you, and the declared table becomes ref()-able like any other source.

# workflow_settings.yaml (committed)
warehouse: my_supabase
connections:
my_supabase:
platform: supabase
defaultSchema: public
bigquery_public:
platform: bigquery
project: bigquery-public-data
dataset: geo_us_boundaries
saKeyId: <vault-secret-id> # non-secret Vault pointer
-- definitions/sources/zip_codes.sqlx
config {
type: "declaration",
connection: "bigquery_public",
name: "zip_codes",
columnTypes: { zip_code: "text", internal_point_lat: "float8", internal_point_lon: "float8" }
}

Now ${ref("zip_codes")} resolves to a foreign table SQLAnvil created behind the scenes (in a bigquery_public_ext schema), and you join it with your Supabase tables in plain SQL. Connection definitions (non-secret: platform, project, dataset, saKeyId, host/port/db) live in workflow_settings.yaml; the actual secrets stay in the gitignored .df-credentials.json, keyed by connection name. connection: is valid only on declarations — tables and views always build into your one read/write warehouse.

columnTypes is required on a connection-tagged declaration (the FDW foreign table needs column types) — but you rarely hand-write it; see below.

sqlanvil introspect reads a source table’s columns and writes the declaration for you, with columnTypes (and any column descriptions) filled in:

Terminal window
sqlanvil introspect bigquery_public geo_us_boundaries.zip_codes \
--output definitions/sources/zip_codes.sqlx

It connects directly to the source with that connection’s credentials, maps the source types to your warehouse dialect (BigQuery → Postgres; Postgres → Postgres), and prints the declaration (or writes it with --output). It’s a dev-time command — compile and run never touch the network for schema.

For full control (or sources without a connection preset), wrapper() is the lower-level JavaScript-API action that emits the FDW setup directly — put it in a .js file under definitions/:

definitions/sources/bigquery_zip_codes.js
wrapper({
name: "bq_setup",
provider: "bigquery", // preset: wrappers ext + bigquery handler/validator
server: "bq_geo_server",
serverOptions: {
project_id: "bigquery-public-data",
dataset_id: "geo_us_boundaries"
},
credential: {
// A Vault secret id — a non-secret pointer. The service-account key JSON
// is stored in Vault, never in your repo. (See "Credentials" below.)
saKeyId: sqlanvil.projectConfig.vars.bq_sa_key_id
},
foreignTables: [
{
name: "zip_codes", // → ref("zip_codes") works downstream
schema: "bq_ext",
options: { table: "zip_codes", location: "US" },
columns: {
zip_code: "text",
internal_point_lat: "float8",
internal_point_lon: "float8"
}
}
]
});

This compiles to the full setup, in dependency order:

create extension if not exists "wrappers" cascade;
-- idempotent foreign data wrapper creation
do $$ begin
if not exists (select 1 from pg_foreign_data_wrapper where fdwname = 'bigquery_wrapper') then
create foreign data wrapper bigquery_wrapper
handler big_query_fdw_handler validator big_query_fdw_validator;
end if;
end $$;
drop server if exists "bq_geo_server" cascade;
create server "bq_geo_server" foreign data wrapper "bigquery_wrapper"
options (project_id 'bigquery-public-data', dataset_id 'geo_us_boundaries', sa_key_id '<vault-id>');
-- one per foreignTables[] entry:
drop foreign table if exists "bq_ext"."zip_codes";
create foreign table "bq_ext"."zip_codes" ("zip_code" text, "internal_point_lat" float8, "internal_point_lon" float8)
server "bq_geo_server" options (table 'zip_codes', location 'US');

Each entry in foreignTables[] becomes a ref()-able target that depends on the server setup, so downstream models consume it like any other source:

-- definitions/staging/stg_zip_codes.sqlx
config { type: "view", schema: "bq_ext" }
SELECT zip_code, internal_point_lat AS lat, internal_point_lon AS lon
FROM ${ref("zip_codes")}

A common pattern is to materialize the foreign data into a real table once, so downstream joins don’t re-hit the remote warehouse on every query:

-- definitions/staging/zip_codes_cache.sqlx
config { type: "table", schema: "public" }
SELECT * FROM ${ref("stg_zip_codes")}

wrapper() never handles your service-account key. Store it once in Supabase Vault and reference its id:

-- Run once in the Supabase SQL editor:
select vault.create_secret('<paste service-account JSON>', 'bigquery_sa');
select id from vault.secrets where name = 'bigquery_sa';

Pass the returned id as credential.saKeyId (e.g. via a var so it stays out of committed config).

wrapper() config:

FieldRequiredDescription
nameName of the server-setup action
providerPreset that infers the extension + handler/validator (e.g. "bigquery")
wrapper / handler / validatorExplicit FDW for generic Postgres FDWs (required when provider is omitted)
serverForeign server name
serverOptionsMap of server options (e.g. project_id, dataset_id)
credential.saKeyIdVault secret id passed to the server as sa_key_id (Supabase)
foreignTables[]Foreign tables to expose (name, schema, options, columns) — each is ref()-able

The supabase_bigquery_mailing_list example builds a proximity mailing list — customers who purchased recently and live within a radius of a target ZIP — by joining Supabase operational data with Google’s public ZIP geo data over a live BigQuery FDW, with PostGIS distance math.