dbt Labs
dbt Labs (formerly Fishtown Analytics) is the company behind dbt -- the SQL-based transformation tool that defined the analytics engineering category. Founded in 2016 by Tristan Handy, dbt Labs dominates transformation mindshare while struggling with the classic open-core monetization problem.
dbt Labs is the company behind dbt — the SQL-based transformation tool that, more than any other single product, defined the "modern data stack" and the role of the analytics engineer. dbt Labs was founded in 2016 by Tristan Handy as Fishtown Analytics — a Philadelphia-based data consultancy that built dbt as an internal tool and then open-sourced it. The consultancy quietly turned into a venture-backed software company, the open-source project quietly took over the data world, and the company was eventually renamed dbt Labs in 2021 to reflect the obvious.
In plain English: dbt is the tool you use to write the SQL that transforms raw data inside your warehouse into the clean, modeled tables your dashboards and ML models actually consume. Before dbt, that work happened in a tangled mess of stored procedures, Airflow tasks, and Jinja-templated SQL files maintained by whichever data engineer hadn't quit yet. dbt made it look like software engineering: version-controlled, tested, documented, modular, with dependencies between models declared explicitly. That's the entire trick, and it changed how a generation of data teams works.
dbt Labs has one of the more interesting origin stories in the modern data stack. In 2016, Tristan Handy was a consultant at RJMetrics (an early Looker competitor). RJMetrics was acquired by Magento that year, Handy left, and along with co-founders Drew Banin and Connor McArthur, he started Fishtown Analytics — a Philadelphia data consultancy that helped early-stage companies set up Redshift and BigQuery warehouses.
The consultancy needed a tool. Specifically, they needed a way to write transformation SQL across many client engagements without each engagement turning into a unique snowflake (no pun intended) of stored procedures and ETL scripts. Handy and the team built dbt ("data build tool") as that internal tool: a command-line application that took a folder of SQL files containing {{ ref() }} calls, worked out the dependency graph between them, and ran them in the right order against a warehouse. Tests, documentation, and reusable macros came along for the ride. They open-sourced dbt the same year.
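The core idea in that description (scan SQL files for {{ ref() }} calls, derive a dependency graph, run models in dependency order) can be sketched in a few lines of Python. This is a minimal illustration, not dbt's actual implementation: the model names, the in-memory MODELS dictionary, and the helper functions find_refs and build_order are all hypothetical, and real dbt renders full Jinja rather than matching a regular expression.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical in-memory "project": model name -> SQL text.
# A real project would read these from .sql files on disk.
MODELS = {
    "stg_orders": "select * from raw.orders",
    "stg_customers": "select * from raw.customers",
    "orders_enriched": (
        "select o.*, c.region "
        "from {{ ref('stg_orders') }} o "
        "join {{ ref('stg_customers') }} c using (customer_id)"
    ),
    "revenue_by_region": (
        "select region, sum(amount) as revenue "
        "from {{ ref('orders_enriched') }} group by region"
    ),
}

# Matches {{ ref('model_name') }} with single or double quotes.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def find_refs(sql):
    """Return the set of model names referenced via ref() in one SQL string."""
    return set(REF_PATTERN.findall(sql))


def build_order(models):
    """Return model names ordered so every model comes after its dependencies."""
    graph = {name: find_refs(sql) for name, sql in models.items()}
    unknown = {dep for deps in graph.values() for dep in deps} - set(models)
    if unknown:
        raise ValueError(f"ref() to undefined model(s): {sorted(unknown)}")
    # TopologicalSorter takes node -> predecessors and raises CycleError on cycles.
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for model in build_order(MODELS):
        print("run", model)
```

Because dependencies are declared inside the SQL itself, adding a model never requires editing a separate scheduling file: the run order falls out of the references.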
The thing they did not initially understand, but figured out fast, was that dbt was not just a tool. It defined a new role: the analytics engineer, a hybrid between a data engineer and a data analyst, whose job is to model data inside the warehouse using SQL and software engineering practices. Tristan Handy began writing about this in his newsletter ("The Analytics Dispatch") around 2017-2018, and the term spread organically through the early modern-data-stack community. By 2019-2020, "analytics engineer" was a standard job title at thousands of companies, and dbt was the canonical tool for the role.
The financial story tracks the cultural one. Fishtown raised a Series A in 2020 (~$12M, led by a16z), a Series B later in 2020 (~$29M, led by Sequoia), a Series C in June 2021 ($150M at a $1.5B valuation), and a Series D in February 2022: approximately $222M at a $4.2 billion valuation, in the last big round before the rate-rise market reset. The company was renamed dbt Labs in 2021 to reflect the fact that the consultancy was no longer the business; the software was.
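dbt Labs has three core product lines, plus a few adjacent acquisitions: the open-source dbt Core, the hosted dbt Cloud, and the dbt Semantic Layer that came out of the Transform acquisition.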
In addition, dbt Labs has made smaller acquisitions and product expansions, including the dbt Mesh features for multi-project orchestration (announced 2023, GA 2024) and the dbt Explorer / dbt catalog features that bring lineage and documentation closer to a lightweight catalog product. The strategic direction is clear: turn dbt Cloud from "a hosted IDE for dbt Core" into "a full transformation, semantic layer, lineage, and governance platform."
dbt Labs has a strategic problem that most open-source companies face and very few solve well: the open-source product is so good that the commercial upsell is harder than it should be. Vanilla dbt Core, run from a developer laptop or scheduled in Airflow, is enough for a substantial fraction of data teams. dbt Cloud is genuinely better — the IDE is nice, the scheduling is convenient, the observability is real — but for many teams the gap is "convenient" rather than "essential."
dbt Labs has been addressing this in two ways since roughly 2023:
1. Expand beyond pure transformation. The Transform acquisition and the resulting dbt Semantic Layer are an explicit attempt to make dbt the canonical metrics definition layer for the modern data stack — a place where metrics live once and propagate to every BI tool, notebook, and AI agent downstream. If that works, the semantic layer becomes a much stickier product than the transformation layer alone, because metrics changes ripple through everything that consumes them.
2. Enterprise governance and dbt Mesh. dbt Mesh is the feature set for splitting a giant dbt project across multiple teams, with explicit interfaces (data contracts) between projects. This is an enterprise-focused capability that does not exist in open-source dbt Core in the same form, and it gives dbt Cloud a real "you cannot run this at Fortune 500 scale without us" pitch.
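To make the "explicit interfaces (data contracts)" idea in the second point concrete, here is a minimal sketch of what checking a contract between two projects could look like. It is an illustration of the concept only, not dbt Mesh's implementation: the check_contract function, the column names, and the type strings are all hypothetical.

```python
def check_contract(contract, actual):
    """Compare a declared contract against the columns a model actually produces.

    Both arguments map column name -> type string.
    Returns a list of violations; an empty list means the contract holds.
    """
    violations = []
    for column, expected_type in contract.items():
        if column not in actual:
            violations.append(f"missing column: {column}")
        elif actual[column] != expected_type:
            violations.append(
                f"type mismatch on {column}: "
                f"expected {expected_type}, got {actual[column]}"
            )
    return violations


# Hypothetical contract published by the team that owns the model.
CONTRACT = {"order_id": "integer", "amount": "numeric", "ordered_at": "timestamp"}

# Hypothetical columns found after an upstream change.
ACTUAL = {"order_id": "integer", "amount": "text", "region": "text"}

if __name__ == "__main__":
    for problem in check_contract(CONTRACT, ACTUAL):
        print(problem)
```

In this sketch extra columns such as region are allowed, on the assumption that adding a column does not break downstream consumers while removing or retyping one does. A stricter contract could reject extras as well.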
Both bets are reasonable. Neither has fully solved the monetization problem. dbt Labs is not quite at the $250M ARR threshold that would make an IPO obvious in the current market, and the company has been operating in a more disciplined growth mode since the 2022 round, like every other modern data stack vendor that raised at peak valuations.
dbt is the rare data product that genuinely changed how an entire generation of data teams works. Pre-dbt, transformation was a nightmare mix of stored procedures and Airflow tasks. Post-dbt, transformation is version-controlled, tested SQL with explicit dependencies, owned by an "analytics engineer" who didn't even have a job title five years ago. That cultural shift is real, durable, and almost entirely attributable to dbt Labs.
The harder question is whether mindshare leadership can be converted into revenue leadership at the scale a $4.2B-valuation venture-backed company needs. The candid answer in 2026: maybe, but it's not yet a sure thing. dbt Core is too good, dbt Cloud is a convenience rather than a forced upgrade for many teams, the Transform / Semantic Layer expansion is real but slow to land, and the enterprise governance story is still early. Meanwhile, the warehouse vendors, particularly Snowflake and Databricks, have started adding their own transformation features (Snowflake's Snowpark, Databricks' dbt-style pipelines in Workflows), which puts pressure on the standalone transformation product over the long run.
The most likely outcome: dbt Labs continues to be the obvious leader of the transformation and semantic-layer category, slowly grows into a respectable enterprise software business, and eventually goes public or gets acquired by one of the warehouse vendors. The category dbt invented is permanent. The economics of the company that invented it are still being worked out.
TextQL Ana connects natively to dbt projects, both via the dbt Cloud API and by reading dbt project metadata directly from a Git repository. The connection is one of the most strategically important integrations Ana has, because a well-built dbt project is the best possible grounding for a natural-language analyst agent: the models are documented, the metrics are defined, the column descriptions exist, and the tests catch the obvious mistakes. The dbt Semantic Layer in particular gives Ana a clean, reliable definition of every metric, so that when a business user asks "what was revenue last quarter," the answer comes from the same definition the dashboards use. dbt is, in many ways, the canonical input for how Ana should reason about a customer's data.
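The "same definition the dashboards use" point can be illustrated with a small sketch of a metric that is defined once and compiled to SQL on demand. This is a conceptual example only: it is not the dbt Semantic Layer API and not how Ana is implemented, and the Metric class, the METRICS registry, the compile_metric function, and the table and column names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Metric:
    name: str          # name exposed to consumers
    expression: str    # SQL aggregate that defines the metric
    table: str         # modeled table the metric is computed from
    time_column: str   # column used for date filtering


# Single source of truth: every consumer compiles queries from this registry.
METRICS = {
    "revenue": Metric("revenue", "sum(amount)", "orders_enriched", "ordered_at"),
}


def compile_metric(name, start, end):
    """Build the SQL for one metric over the half-open range [start, end).

    Dates are datetime.date objects, so only ISO-formatted literals are
    interpolated. A production system should use bound parameters instead.
    """
    m = METRICS[name]
    return (
        f"select {m.expression} as {m.name} "
        f"from {m.table} "
        f"where {m.time_column} >= '{start.isoformat()}' "
        f"and {m.time_column} < '{end.isoformat()}'"
    )


if __name__ == "__main__":
    # "What was revenue last quarter?" for a hypothetical Q4 2025.
    print(compile_metric("revenue", date(2025, 10, 1), date(2026, 1, 1)))
```

Whether the question comes from a dashboard, a notebook, or a natural-language agent, the query is generated from the same registry entry, so the answers agree by construction.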