Integration Guide

LayerExchange for Databricks

LayerExchange is a no-code enterprise application platform. This guide describes how it connects to the Databricks Data Intelligence Platform: how data moves between the two systems, how schema is discovered, how geometry is encoded, how operational write-back works, and how Databricks AI agents can invoke LayerExchange tools.

The integration is bidirectional. LayerExchange reads from Delta tables, writes to Delta tables, and exposes its own operational data and actions back to Databricks notebooks, jobs, and AI agents. There is no separate sync engine to install — every behavior described here is configuration on the LayerExchange side.

1. Overview

LayerExchange brings three things to a Databricks deployment:

  • Spatial-aware ingestion. Geographic feature classes from any GIS of record — Esri ArcGIS, QGIS server, MapInfo, GeoServer, or any OGC-compliant source — flow into Bronze Delta tables with geometry encoded in an open, queryable format and with full lineage in Unity Catalog.
  • Governed write-back. Analytics outputs computed in Databricks (risk scores, predictions, classifications) are pushed back to source systems — GIS feature attributes, work-order systems, EAM, SCADA — through configurable governance gates.
  • Operational applications on top of the lakehouse. Forms, dashboards, workflows, and map-centric UIs built in LayerExchange read directly from Gold Delta tables. End users act on lakehouse data without leaving their operational tools.

The integration uses public, supported Databricks interfaces only: SQL Warehouse JDBC for reads and small writes, the Statement Execution API and Volumes API for bulk writes, Job webhooks for event-driven flows, and the Model Context Protocol for AI agent interoperability.

2. Architecture

   Source systems of record                          Databricks
   ─────────────────────────                         ──────────
   GIS (any vendor)              ┌──── LX ────┐      Unity Catalog
   Work-order / EAM / MIS  ◀───▶ │ Integration│ ◀──▶ Bronze ▶ Silver ▶ Gold
   SCADA / IoT historians        │     Hub    │              (Delta tables)
   Documents / files             └──────┬─────┘                    │
                                        │                          │
                                        ▼                          ▼
                                  Forms · Dashboards · Map UI · AI Agent
                                  (end-user surfaces, no code)

Four operational components run in parallel:

ComponentDirectionPurpose
A — Spatial IngestionSource → Delta BronzeLand authoritative source data into the lakehouse with geometry preserved
B — GovernanceIn-flightQA/QC rules gate records before they reach Bronze; quarantine path is observable
C — Read-backDelta Gold → LXLX consumes analytics outputs and renders them in operational apps
D — Write-backLX → Source systemsAnalytics outputs land on the systems users already operate from
E — AI Agent BridgeBidirectionalDatabricks AI agents call LX tools; LX agents call Databricks Model Serving

Each component is configured through declarative templates. None of them require code in the LayerExchange runtime.

3. Connectivity

3.1 Databricks SQL Warehouse (JDBC)

The primary read path and the simplest write path use the Databricks SQL Warehouse JDBC interface. A single connection profile in LayerExchange holds:

  • JDBC URL (jdbc:databricks://:443;httpPath=/sql/1.0/warehouses/;transportMode=http;ssl=1;AuthMech=3)
  • Authentication credential (OAuth M2M service principal recommended; Personal Access Token supported for development)
  • Default Unity Catalog and schema
  • Connection pool sizing

The connection profile is reused across every integration that talks to that workspace. Schema discovery uses standard JDBC DatabaseMetaData — Unity Catalog catalogs and schemas appear as a browsable tree, columns are introspected with their Delta types, and a type map translates them to LayerExchange's internal type system (STRING, INTEGER, LONG, BIG_DECIMAL, DATE, DATETIME, BOOLEAN; complex types are stored as JSON strings).

User-Agent telemetry is emitted on every JDBC call so that activity attributable to LayerExchange is identifiable in Databricks audit logs.

3.2 Statement Execution API (REST)

For operational writes — feedback events, state changes, audit signals — LayerExchange uses the Databricks SQL Statement Execution API. This avoids the small-file problem that pure row-by-row JDBC inserts create on Delta tables: writes are batched and submitted as a single MERGE statement.

3.3 Volumes API + COPY INTO (Bulk)

For high-volume ingestion (tens of thousands of records per batch — typical for the initial load of a feature class), LayerExchange writes Parquet to a Unity Catalog Volume and issues COPY INTO against the target Delta table. This is the same pattern professional ELT tools use against Delta Lake, and it is idempotent — re-running a load does not duplicate rows.

3.4 Job Webhooks

Databricks Workflows can be configured to send on_success and on_failure notifications to a LayerExchange webhook endpoint. This enables event-driven downstream flows — for example, "when the nightly risk-scoring job succeeds, trigger an inbound sync of the new Gold table."

4. Spatial Data

4.1 Geometry Encoding

Delta Lake has no native geometry type, so geometries are stored in an open binary or text format. LayerExchange supports four encodings, selected per connection:

  • WKB (Well-Known Binary) — default, OGC standard, lossless, queryable with Apache Sedona's spatial SQL functions.
  • WKT (Well-Known Text) — human-readable, useful for debugging and lightweight tooling.
  • H3 — hexagonal grid index, recommended as a secondary column alongside WKB for accelerated point-in-region queries.
  • GeoJSON — for downstream tooling that expects it.

On the way in, source geometries (Esri JSON, OGC GML, Shapefile records, WKT from PostGIS) are normalized to the configured encoding by LayerExchange's field-mapping transforms — no notebook code required.

On the way out, the same transforms convert Delta-stored geometries back to the format the destination system expects (Esri JSON for ArcGIS feature attributes, WKT for PostGIS write-back, and so on).

4.2 Lineage and Audit Columns

Every Bronze table written by LayerExchange carries four audit columns added automatically:

  • lx_ingest_ts — when the row entered Bronze
  • lx_source_system — symbolic identifier of the system of record
  • lx_validated_status — outcome of the governance gate (PASS, QUARANTINE, PROMOTED)
  • lx_record_hash — content hash for change detection

These columns are populated by the integration runtime and are visible in Unity Catalog with descriptions, so lineage queries return meaningful results without additional configuration.

5. Configuration Model

Everything in this guide is configured through three declarative artifacts. None of them require a code deployment.

Connection. Connects to one Databricks workspace. Holds JDBC URL, authentication, default catalog/schema, pool size, and geometry preference.

Integration. Binds a source dataset to a Delta table (or vice-versa). Holds the target table (fully qualified catalog.schema.table), a direction (INBOUND or OUTBOUND), an optional WHERE filter, conflict-key columns for upserts, an incremental watermark column, and a field map between source and target columns.

Connector template. A JSON document describing the operations the system supports (query_table, query_custom, insert_records, upsert_records), type mappings, schema-discovery rules, and authentication. The Databricks connector template ships with the platform; new connectors are added as new JSON files, never as code.

Adding a new source system on either side of the integration is a configuration task. Adding a new GIS vendor, a new EAM, or a new IoT historian follows the same pattern as the existing ones.

6. Data Flow Patterns

6.1 Inbound — Delta to LayerExchange

A scheduled job pulls rows from a Delta table using the connection's SQL Warehouse, applies a watermark filter (WHERE > ), maps columns through the integration's field map, and upserts into the LayerExchange data structure that backs the operational app. The watermark advances atomically with the batch.

The schedule is driven by a cron expression on the integration. Full reloads are triggered manually or by webhook.

6.2 Outbound — LayerExchange to Delta

For bulk loads, rows are batched in memory, serialized to Parquet, uploaded to a staging Volume, and consumed by COPY INTO. For incremental writes, the same path is used in smaller batches. For single-record operational writes (a user transitions a work order to COMPLETED), the Statement Execution API runs a MERGE statement.

6.3 Outbound — LayerExchange to a System of Record

The same integration primitives target non-Databricks systems. A typical pattern: a Databricks notebook computes a risk score per asset, writes it to a Gold Delta table; LayerExchange reads the Gold table, evaluates a governance rule (only push records with confidence above a threshold), and pushes the resulting attribute updates to the GIS that owns the feature class. The user sees the updated attribute the next time they open the map — in their existing GIS client, with no migration.

7. AI Agents

LayerExchange exposes its operational tools — search, record creation, record updates, state transitions, attachment management, BPM task operations — through a Model Context Protocol (MCP) server. Databricks Mosaic AI agents can be configured to call this server as an external tool source through the Databricks AI Gateway.

This enables prompts of the form "create work orders for every asset in district 4 whose predicted failure risk is above 80%, assign them to the standby team, and attach the latest inspection photo" to execute end-to-end: the agent retrieves predictions from Delta, calls LayerExchange MCP tools to create the work orders and attachments, and returns confirmation. The same MCP server can be used from any compliant agent runtime — it is not tied to Databricks.

In the reverse direction, the LayerExchange AI agent can call Databricks Model Serving endpoints for inference (running a deployed model on demand from inside a form or workflow).

Authentication on the bridge is OAuth M2M via the AI Gateway. Permissions on what an agent can do in LayerExchange are governed by the same security groups and data-structure-level ACLs that govern human users.

8. Security and Governance

  • Authentication. OAuth M2M (service principal) is the recommended path for all connections. Personal Access Tokens are supported for development but should not be used in production.
  • Authorization. Each integration runs as a configured service identity. Unity Catalog permissions apply normally — LayerExchange cannot read or write what its identity is not permitted to.
  • Audit. Every JDBC and REST call carries a LayerExchange User-Agent header for attribution. Every integration run produces an execution record in LayerExchange with row counts, timing, and error details. Bronze tables carry the audit columns described in §4.2.
  • QA/QC gates. The Business Rules Engine evaluates configurable rules before any record reaches a Bronze table. Failing records go to a quarantine path with the failing rule recorded; they remain visible to operators and can be corrected and re-promoted without manual reprocessing.
  • PII and sensitive fields. Connection credentials are stored encrypted. Sensitive columns can be masked in LayerExchange UI by data-structure-level column policies independent of Unity Catalog masking.

9. Operations

  • Scheduling. Integrations run on cron schedules, on demand from the UI, or in response to webhooks. Run history is queryable.
  • Watermarking. Each integration tracks its own watermark; advancing it atomically prevents missed or duplicated rows across restarts.
  • Idempotency. All write operations are designed to be safely re-runnable. Upserts use the configured conflict key. Bulk loads via COPY INTO are tracked by Databricks at the file level.
  • Monitoring. Execution status, row counts, durations, and error messages are visible per integration. Failed runs can be inspected and replayed from the UI without redeploying.
  • Schema evolution. Re-running schema discovery shows new columns as a delta. The administrator chooses which new columns to add to the bound data structure; existing data is unaffected.

10. Deployment Footprint

The Databricks side requires only:

  • A SQL Warehouse (any size; Serverless is supported).
  • A Unity Catalog catalog and schema for Bronze tables.
  • A Unity Catalog Volume for bulk staging.
  • A service principal with USE CATALOG, USE SCHEMA, and MODIFY on the target schema and WRITE VOLUME on the staging Volume.
  • (For AI agent integration) An AI Gateway external tool registration pointing at the LayerExchange MCP endpoint.

The LayerExchange side requires only the standard platform deployment plus the Databricks connector template (shipped) and the Databricks JDBC driver (Maven dependency, registered automatically).

There is no agent, no sidecar, and no notebook code to install or maintain. Every integration described in this guide is configuration.

LayerExchange · Integration Guide · Databricks Data Intelligence Platform