Cube
Cube is the leading open-source headless BI and semantic layer. It exposes business metrics over SQL, REST, GraphQL, and MDX so any tool can query the same definitions.
Cube is the open-source headless BI and semantic layer maintained by Cube Dev, a company founded in 2019 by Artyom Keydunov and Pavel Tiunov. It is the most popular standalone semantic layer in the modern data stack, with roughly 17K GitHub stars and adoption ranging from solo developers building embedded analytics into their SaaS products to enterprises trying to standardize metric definitions across multiple BI tools.
The pitch is simple: define your metrics once in Cube, query them from anywhere. Cube exposes the same metric definitions through SQL, REST, GraphQL, and MDX, so a Tableau dashboard, a React app, an LLM agent, and an Excel pivot table can all ask "give me revenue by region last month" and get the same answer.
Before Cube was Cube, it was an open-source project called Cube.js, released by Artyom Keydunov in 2019 as a framework for building embedded analytics — the kind of dashboards SaaS companies put inside their own products to show customers their data. Keydunov had been a founder at Statsbot (a Slack-based business intelligence tool) and noticed that every developer trying to build in-product analytics ran into the same problem: writing analytical SQL by hand, dealing with caching, optimizing for performance, and somehow keeping metric definitions consistent.
Cube.js bundled the answers to all of those into a single open-source framework. It took off quickly with developers because the alternative — building this stack from scratch — was painful. Within two years, the founders realized that the same primitive (a centralized metric definition layer) was useful far beyond embedded analytics. It was, in fact, exactly what enterprise data teams meant when they said "semantic layer." They renamed the project from Cube.js to Cube, raised a Series B from Bain Capital Ventures, and started selling Cube Cloud to enterprises.
The pivot from "embedded analytics framework" to "headless BI / semantic layer" is the same playbook Vercel ran with Next.js: start as a framework developers love, then sell hosting and enterprise features to the companies built on top. It is one of the cleaner open-source-to-commercial stories in the data infrastructure space.
You define your data model in Cube using YAML (or JavaScript, for legacy projects). Each model is a cube — a logical entity that maps to one or more tables in your warehouse, with declared dimensions, measures, and joins. A simple example:
cubes:
  - name: orders
    sql_table: public.orders
    measures:
      - name: total_revenue
        sql: "{CUBE}.amount - {CUBE}.refund_amount"
        type: sum
      - name: order_count
        type: count
    dimensions:
      - name: status
        sql: status
        type: string
      - name: created_at
        sql: created_at
        type: time
    joins:
      - name: customers
        sql: "{CUBE}.customer_id = {customers}.id"
        relationship: many_to_one
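The join in the orders cube points at a customers cube that is not shown. A minimal sketch of what it might look like follows. The table name public.customers and the region column are assumptions made for illustration, not part of the original model. Cube generally expects a primary key on cubes that take part in joins, so the sketch marks id as one.

```yaml
cubes:
  - name: customers
    sql_table: public.customers   # hypothetical table name
    dimensions:
      - name: id
        sql: id
        type: number
        primary_key: true         # lets Cube join orders to customers without double counting
      - name: region              # hypothetical column, used for "revenue by region"
        sql: region
        type: string
```

With both cubes in place, a query for orders.total_revenue grouped by customers.region would follow the many_to_one join declared on orders.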
Once your cubes are defined, Cube exposes them through several APIs at the same time:
Through the SQL API, Postgres-compatible clients (a BI tool, for example, or a psql shell) can connect to Cube as if it were a database. The "tables" they see are your metric definitions.

Behind the scenes, Cube compiles incoming queries into optimized SQL against your warehouse, handles pre-aggregations (a caching layer that materializes common rollups for sub-second response times), and enforces row-level security and access control.
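To make the pre-aggregation idea concrete, here is a hedged sketch of a daily rollup added to the orders cube from the earlier example. The rollup name revenue_by_status_daily is hypothetical, and refresh and storage settings are left at their defaults.

```yaml
cubes:
  - name: orders
    sql_table: public.orders
    # measures, dimensions, and joins as defined earlier
    pre_aggregations:
      - name: revenue_by_status_daily   # hypothetical rollup name
        measures:
          - CUBE.total_revenue
          - CUBE.order_count
        dimensions:
          - CUBE.status
        time_dimension: CUBE.created_at
        granularity: day                # one materialized row per status per day
```

Queries that ask only for these measures, grouped by status at daily or coarser granularity, can be answered from the materialized rollup rather than by scanning public.orders. Both measures are additive (a sum and a count), which is what allows daily rows to be rolled up further into weeks or months.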
Cube is the most credible standalone semantic layer in 2026. It is the only project in the category that has both serious open-source momentum and serious enterprise customers. The multi-API approach (SQL/REST/GraphQL/MDX from one definition) is technically clever and commercially smart — it lets Cube slip into existing stacks without forcing customers to rip out their BI tool.
The two strategic risks are real, however:
1. dbt Labs is the natural competitor. When dbt Labs acquired Transform/Supergrain in 2023 and shipped the dbt Semantic Layer on MetricFlow, they put a free, integrated semantic layer inside the most-used transformation tool in data. Most teams already write dbt. Adding metrics: blocks in their existing dbt repo is lower friction than adopting Cube as a separate service. Cube has to keep proving it is meaningfully better, not just different.
2. Warehouse-native semantic layers. Snowflake Cortex Analyst, Databricks AI/BI Genie, and BigQuery's Looker Studio integration all ship "good enough" semantic features for free. The same commoditization-from-below pressure that threatens Monte Carlo threatens Cube. The defense is that Cube is portable across warehouses, which is exactly the thing the warehouse vendors will never offer.
The likely winning play for Cube is to be the default choice for embedded analytics and AI agents — the use cases where having a programmatic, multi-API, warehouse-agnostic semantic layer is a hard requirement, not a nice-to-have. The traditional BI semantic layer market may be lost to dbt and the warehouses, but the embedded and AI segments are growing and play to Cube's strengths.
Cube sits between the warehouse and the consumption layer: the BI tools, applications, and AI agents that query it.
A typical Cube user is either a developer building embedded analytics into a SaaS product, or a data platform team that wants a single source of truth for metrics across multiple BI tools.
TextQL Ana integrates with Cube as a semantic layer source. When a user asks "what was revenue by region last quarter," Ana can route the request through Cube's SQL or REST API, which guarantees the answer uses the customer's canonical metric definitions. Customers running Cube get more reliable AI-generated answers because the semantic layer enforces consistency. Cube and TextQL are complementary: Cube defines the metrics, Ana lets users ask for them in natural language.