Databricks SQL Warehouse
Databricks SQL is the lakehouse doing its best warehouse impression: a high-performance SQL engine (Photon) and BI surface built specifically to compete with Snowflake.
A Databricks SQL Warehouse is a cluster (or serverless pool) that runs SQL queries against Delta Lake tables, optimized for the BI workloads that Databricks historically lost to Snowflake. It is what you point Tableau, Power BI, Looker, dbt, or a JDBC client at when you want to query the lakehouse without using a notebook.
The simplest way to think about it: Databricks SQL is a warehouse-shaped UI and execution mode bolted onto a Spark-and-Delta-Lake platform. Underneath, it's still the lakehouse architecture — Parquet files in object storage, Delta transactions, Unity Catalog governance — but the user experience and the engine (Photon) are tuned for low-latency SQL the way Snowflake is.
For most of Databricks' history (2013–2020), it was a notebook company. Customers used PySpark in a notebook, ran ETL jobs, did data science, and trained ML models. The BI team — the analysts who lived in Tableau and Looker — did not use Databricks. They used Snowflake. This was an enormous structural problem. The BI workload is where the recurring revenue lives, where the seat counts are highest, and where the C-suite forms its opinion of which platform is "the data platform." Databricks was being shut out of it.
Databricks SQL was the response. It was previewed in November 2020 as SQL Analytics, rebranded Databricks SQL in 2021, and reached general availability later that year. The launch had three coupled pieces, all of which had to work for the strategy to make sense: a warehouse-shaped surface (a SQL-first UI plus a JDBC endpoint that Tableau, Power BI, and Looker could point at directly); Photon, the vectorized engine that made low-latency SQL viable on lake storage; and the open substrate underneath (Delta Lake tables in object storage, with Unity Catalog for governance).
The reason Databricks SQL exists is the mirror image of why Snowpark exists: Databricks needed a warehouse story to fight Snowflake. Just as Snowflake couldn't credibly sell to data engineers without a Python story, Databricks couldn't credibly sell to BI buyers without a low-latency SQL story. Databricks SQL is that story.
A SQL Warehouse is a managed compute resource sized in T-shirt terms (X-Small through 4X-Large). You pick a size, optionally make it serverless, and connect a BI tool to its JDBC endpoint. Behind the scenes it is still the lakehouse: Photon executes your SQL against Delta Lake tables stored as Parquet in object storage, with Unity Catalog handling governance.
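As a concrete sketch, the official databricks-sql-connector Python package speaks to the same endpoint a JDBC client uses. The hostname, warehouse ID, and token below are placeholders, and the /sql/1.0/warehouses/&lt;id&gt; HTTP-path pattern is the current Databricks convention (confirm against your warehouse's Connection Details tab) — treat this as a hedged illustration, not a canonical setup.

```python
def warehouse_http_path(warehouse_id: str) -> str:
    """Build the HTTP path a SQL Warehouse exposes to SQL clients.

    The /sql/1.0/warehouses/<id> pattern is the current Databricks
    convention; your warehouse's Connection Details tab shows the
    authoritative value.
    """
    return f"/sql/1.0/warehouses/{warehouse_id}"


def query_warehouse(hostname: str, warehouse_id: str, token: str, query: str):
    """Run one query against a SQL Warehouse and return all rows.

    Requires `pip install databricks-sql-connector`; the import is
    deferred so the pure helper above works without the package.
    """
    from databricks import sql  # third-party connector, not stdlib

    with sql.connect(
        server_hostname=hostname,  # e.g. "dbc-xxxx.cloud.databricks.com"
        http_path=warehouse_http_path(warehouse_id),
        access_token=token,        # a Databricks personal access token
    ) as conn:
        with conn.cursor() as cur:
            cur.execute(query)
            return cur.fetchall()
```

The same http_path value is what you would paste into a Tableau, Power BI, or dbt connection dialog; only the client differs.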
Databricks SQL is the most strategically important product Databricks has ever shipped, and it's the product that most clearly proves the lakehouse thesis. For years, "lakehouse" was a slide in a deck. Databricks SQL plus Photon plus Unity Catalog plus Delta Lake is the actual implementation — a system that does warehouse things on lake storage, well enough that you can run Tableau against it without apology.
The convergence story is now impossible to deny. Snowflake added Iceberg tables on customer-managed object storage; Databricks added Photon and a serverless SQL endpoint. Both companies are building the same thing from opposite directions. A 2026 Snowflake deployment with Iceberg looks shockingly similar to a 2026 Databricks deployment with SQL Warehouses: open table format on cloud storage, fast vectorized SQL engine, governance layer, BI surface. The differences are real but increasingly secondary.
The honest assessment of where Databricks SQL still trails Snowflake: developer experience polish, the "credit-card-and-go" simplicity of provisioning, and the smoothness of small-cluster cold-start. Where it leads: it doesn't lock your data in proprietary storage, it shares one metadata layer with your ML and Spark workloads, and it's typically cheaper at the high end on heavy ELT jobs. Which one wins for a given customer comes down to organizational gravity (BI team vs. ML team) more than it does to product capability.
TextQL Ana connects to Databricks via the SQL Warehouse JDBC endpoint and treats it as a first-class target alongside Snowflake. The Unity Catalog metadata layer is particularly valuable for an AI analyst — column descriptions, table tags, and lineage all come through in a structured way that helps TextQL generate correct queries on the first attempt. For Databricks customers running BI workloads, TextQL is best deployed against a dedicated SQL Warehouse rather than a general-purpose cluster, both for performance isolation and for cost transparency.
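The metadata described above is readable by any SQL client through Unity Catalog's information schema. A minimal sketch of pulling column descriptions (catalog, schema, and table names are placeholders; `information_schema.columns` and its `comment` column follow the standard information-schema shape that Unity Catalog implements):

```python
def column_comments_query(catalog: str, schema: str, table: str) -> str:
    """SQL that lists each column and its comment from Unity Catalog's
    information schema -- the descriptions an AI analyst can lean on.

    Illustration only: the identifiers are interpolated directly, so
    pass trusted values (or parameterize in real code).
    """
    return (
        "SELECT column_name, comment "
        f"FROM {catalog}.information_schema.columns "
        f"WHERE table_schema = '{schema}' AND table_name = '{table}' "
        "ORDER BY ordinal_position"
    )
```

Running this through the SQL Warehouse endpoint returns the same column descriptions that surface in the Catalog Explorer UI.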
See TextQL in action