Confluent
Confluent is the company founded in 2014 by Apache Kafka's original creators (Jay Kreps, Neha Narkhede, and Jun Rao) to commercialize Kafka. It went public in June 2021 and is the canonical example of an open-source commercial company in modern data infrastructure.
In plain English, Confluent's job is to make money from a piece of software that anyone can download for free, and to do it well enough to support a public company with thousands of employees and a multibillion-dollar valuation. That tension is the entire Confluent story.
This page is about the company, not the open-source project. For Kafka-the-protocol and Kafka-the-Apache-project, see Apache Kafka. The two pages exist separately because Kafka and Confluent are not the same thing, and conflating them leads to muddled thinking about both technical adoption and commercial strategy.
In 2014, Jay Kreps, Neha Narkhede, and Jun Rao — the three engineers who had built Kafka at LinkedIn — left LinkedIn to start Confluent. The pitch was straightforward: Kafka was already being adopted at scale across the industry, the underlying engineering was hard, and large companies would pay for someone to operate, support, and extend it. Benchmark led the Series A. By 2015 Confluent had a commercial distribution (Confluent Platform) and a paying enterprise customer base; by 2017 it had launched Confluent Cloud, a fully managed Kafka service running on AWS, GCP, and Azure.
Jay Kreps remained CEO. Narkhede left Confluent in 2020 and went on to found Oscilar. Jun Rao stayed in a technical role for years and became one of the most prolific Kafka committers in history. The founding team has the unusual distinction of being both the creators of the underlying technology and the commercial steward of it — a combination that gives Confluent enormous credibility in the Kafka community but does not, by itself, guarantee a healthy business.
Confluent sells essentially two things, plus a constellation of value-add features bundled with them.
Confluent Platform. The on-premises distribution of Kafka, packaged with enterprise features (RBAC, audit logging, multi-region clusters, tiered storage, control plane UI) and Confluent's commercial support. This is the original product, dating back to the company's first years, and it remains a meaningful part of Confluent's revenue, particularly with banks, telcos, and government customers who cannot or will not run in public cloud.
Confluent Cloud. Fully managed Kafka as a service, available on AWS, GCP, and Azure. Customers pay by ingress, egress, retention, and partition count. This is the strategic priority and the fastest-growing line of the business. Confluent Cloud competes directly with self-hosted Kafka, AWS MSK, Aiven, and the newer Kafka-compatible alternatives Redpanda and WarpStream (which Confluent later acquired).
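To make the four metered dimensions concrete, here is a minimal Python sketch of how a usage-based bill decomposes. The `Rates`, `Usage`, and `estimate_monthly_bill` names are hypothetical, and every unit price is a placeholder chosen for illustration, not a figure from Confluent's price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical unit prices. These are NOT Confluent's actual rates."""
    ingress_per_gb: float
    egress_per_gb: float
    storage_per_gb_month: float
    partition_per_hour: float


@dataclass(frozen=True)
class Usage:
    ingress_gb: float        # data written during the billing period
    egress_gb: float         # data read during the billing period
    avg_retained_gb: float   # average data held under the retention policy
    partitions: int          # partitions provisioned
    hours: float = 730.0     # roughly one month


def estimate_monthly_bill(usage: Usage, rates: Rates) -> dict[str, float]:
    """Break an estimated bill into its four metered dimensions plus a total."""
    lines = {
        "ingress": usage.ingress_gb * rates.ingress_per_gb,
        "egress": usage.egress_gb * rates.egress_per_gb,
        "retention": usage.avg_retained_gb * rates.storage_per_gb_month,
        "partitions": usage.partitions * usage.hours * rates.partition_per_hour,
    }
    lines["total"] = sum(lines.values())
    return lines


if __name__ == "__main__":
    # Placeholder numbers only; substitute the rates from a real quote.
    placeholder_rates = Rates(
        ingress_per_gb=0.05,
        egress_per_gb=0.05,
        storage_per_gb_month=0.08,
        partition_per_hour=0.002,
    )
    workload = Usage(ingress_gb=5_000, egress_gb=15_000,
                     avg_retained_gb=2_000, partitions=120)
    for item, cost in estimate_monthly_bill(workload, placeholder_rates).items():
        print(f"{item:>10}: {cost:,.2f}")
```

The point of the breakdown is that read fan-out and partition count can matter as much as raw write volume, which is why comparisons against self-hosted Kafka or MSK depend heavily on workload shape.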
The value-add layer wrapped around both products includes:
Schema Registry. A separate service that stores Avro, Protobuf, and JSON Schema definitions per topic and enforces compatibility rules. Without a schema registry, a Kafka deployment at any real scale becomes a JSON swamp — producers change field names, consumers break, no one knows what is in any topic. Confluent's Schema Registry was the first widely adopted solution and remains the de facto standard. It is licensed under the Confluent Community License (source-available, with restrictions on competing managed services), not Apache 2.0.
Confluent Hub. A marketplace of hundreds of Kafka Connect connectors, some open-source and some Confluent-only, for moving data between Kafka and Postgres, Snowflake, S3, Salesforce, Elasticsearch, and so on. This is the unsexy plumbing that makes Kafka actually useful in enterprise environments.
ksqlDB. Confluent's SQL-on-Kafka stream processor. Strategically de-emphasized in 2024-2025 in favor of Flink SQL.
Confluent Cloud Flink. Managed Apache Flink, integrated tightly with Confluent Cloud Kafka. The product Confluent built on top of its Immerok acquisition (more on that below).
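The "compatibility rules" that Schema Registry enforces are easiest to see with an example. The sketch below is a deliberately simplified, Avro-flavored backward-compatibility check written in plain Python. It is not Schema Registry's implementation or API: the function name `backward_compat_problems` and the sample schemas are hypothetical, and real Avro resolution also handles type promotion, aliases, unions, and nested records, all of which this sketch ignores.

```python
def backward_compat_problems(old_schema: dict, new_schema: dict) -> list[str]:
    """Return reasons the NEW schema cannot read data written with the OLD one.

    Simplified model (an assumption of this sketch, not the full Avro rules):
      * a field added in the new schema must carry a default value;
      * a field present in both schemas must keep exactly the same type;
      * a field dropped from the new schema is fine (the reader ignores it).
    An empty list means "compatible" under this model.
    """
    old_fields = {f["name"]: f for f in old_schema["fields"]}
    problems = []
    for field in new_schema["fields"]:
        name = field["name"]
        if name not in old_fields:
            if "default" not in field:
                problems.append(f"new field {name!r} has no default")
        elif old_fields[name]["type"] != field["type"]:
            problems.append(
                f"field {name!r} changed type from "
                f"{old_fields[name]['type']!r} to {field['type']!r}"
            )
    return problems


ORDER_V1 = {"type": "record", "name": "Order", "fields": [
    {"name": "user_id", "type": "string"},
    {"name": "amount", "type": "double"},
]}

# A producer renames user_id to customer_id: the classic consumer-breaking change.
ORDER_RENAMED = {"type": "record", "name": "Order", "fields": [
    {"name": "customer_id", "type": "string"},
    {"name": "amount", "type": "double"},
]}

# A producer adds an optional field with a default: a safe evolution.
ORDER_EXTENDED = {"type": "record", "name": "Order", "fields": [
    {"name": "user_id", "type": "string"},
    {"name": "amount", "type": "double"},
    {"name": "currency", "type": "string", "default": "USD"},
]}

if __name__ == "__main__":
    print(backward_compat_problems(ORDER_V1, ORDER_RENAMED))
    print(backward_compat_problems(ORDER_V1, ORDER_EXTENDED))
```

A registry that runs a check like this at registration time rejects the rename before any producer can publish with it, which is the mechanism that keeps a topic from turning into the "JSON swamp" described above.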
Confluent went public on NASDAQ in June 2021 at a valuation of around $11 billion, in the middle of a frothy moment for cloud infrastructure stocks. The stock surged in 2021, then traded down sharply through 2022 and 2023 as the broader infrastructure-stock correction hit. By 2024-2025 Confluent had stabilized at a much lower multiple than its IPO peak, with Wall Street fixated on a few questions: how fast can Confluent Cloud grow, what are the margins under per-GB pricing, and how durable is Confluent's moat against AWS MSK, Redpanda, and the broader Kafka-compatible ecosystem?
Confluent's revenue growth has remained healthy by absolute standards but has decelerated from the hyper-growth investors priced in at IPO. The company is the single largest commercial steward of an open-source data infrastructure project, and that role is both its biggest asset and its biggest structural challenge.
For most of its history, Confluent's stream processing pitch was ksqlDB — a SQL layer that compiled queries into Kafka Streams topologies. ksqlDB was easy to use for simple transformations but never matched Apache Flink for serious stateful processing, and the broader market was clearly converging on Flink as the dominant stream processing engine.
In January 2023, Confluent acquired Immerok, a Flink-focused company founded in 2022 by a group of senior Flink committers (many ex-Ververica). The acquisition was Confluent's signal that it had accepted Flink as the right answer for stream processing, even at the cost of cannibalizing its own ksqlDB story. By 2024, Confluent Cloud Flink (a managed Flink SQL service) was generally available and had become the centerpiece of Confluent's stream processing pitch.
The strategic logic: if Flink is going to win stream processing anyway, Confluent would rather sell managed Flink alongside managed Kafka than watch customers go to Ververica, Decodable, or AWS Managed Service for Apache Flink for the compute layer.
In September 2024, Confluent announced its acquisition of WarpStream, a Kafka-compatible streaming platform that pushes storage entirely onto S3 (eliminating local disks, broker rebalancing, and inter-AZ traffic costs). WarpStream's pitch had been straightforward: for many workloads, S3-backed storage is dramatically cheaper than running brokers on EBS volumes, and the latency tradeoff is acceptable.
The acquisition was a defensive move. WarpStream was eating into Confluent Cloud's pricing story for cost-sensitive workloads, and rather than compete with it, Confluent bought it. WarpStream now sits in Confluent's product lineup as a "BYOC" (bring your own cloud), cost-optimized offering alongside the fully managed, usage-priced Confluent Cloud.
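The inter-AZ cost argument behind WarpStream's pitch can be sketched with back-of-the-envelope arithmetic. The Python below estimates how many gigabytes cross availability-zone boundaries for each gigabyte produced to a conventional, disk-based Kafka cluster. The model is an assumption of this sketch, not a vendor formula: clients and partition leaders are spread evenly across zones, each replica lives in a different zone, consumers read from leaders (no rack-aware follower fetching), and the per-GB transfer rate is a placeholder you would replace with your cloud provider's actual price.

```python
def cross_az_gb_per_produced_gb(zones: int, replication_factor: int,
                                consumer_groups: int) -> float:
    """Estimate cross-zone GB moved per GB produced (simplified model)."""
    if zones < 1 or replication_factor < 1 or consumer_groups < 0:
        raise ValueError("zones and replication_factor must be >= 1, "
                         "consumer_groups must be >= 0")
    if replication_factor > zones:
        raise ValueError("this sketch assumes at most one replica per zone")

    # Chance that a client and the partition leader sit in different zones.
    remote_fraction = (zones - 1) / zones

    produce = remote_fraction                    # producer -> leader
    replicate = replication_factor - 1           # leader -> followers
    consume = consumer_groups * remote_fraction  # leader -> each consumer group
    return produce + replicate + consume


def monthly_cross_az_cost(produced_gb: float, rate_per_gb: float,
                          zones: int = 3, replication_factor: int = 3,
                          consumer_groups: int = 1) -> float:
    """Cross-zone transfer cost for a month of traffic at a placeholder rate."""
    multiplier = cross_az_gb_per_produced_gb(zones, replication_factor,
                                             consumer_groups)
    return produced_gb * rate_per_gb * multiplier


if __name__ == "__main__":
    # Placeholder rate; look up the real inter-AZ price for your provider.
    print(cross_az_gb_per_produced_gb(3, 3, 1))
    print(monthly_cross_az_cost(produced_gb=100_000, rate_per_gb=0.02))
```

Under these assumptions, a three-zone cluster with replication factor 3 and one consumer group moves a bit more than three cross-zone gigabytes for every gigabyte written. A design that routes writes through object storage avoids that line item, in exchange for object-store request and storage charges and the higher latency noted above, neither of which this sketch models.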
Here is the honest version of the Confluent story that no analyst report will state quite this plainly: Confluent's revenue depends on Kafka, but Kafka itself is open-source and cloneable. That is the entire structural challenge of the company.
The Kafka API has become more durable than the Kafka codebase: Kafka-compatible alternatives such as Redpanda and WarpStream speak the same wire protocol without running Apache Kafka's code. That is a sign of category maturity, not weakness, but it does mean Confluent's moat is the ecosystem (Schema Registry, Connect, Flink integration, enterprise support, governance) rather than the underlying protocol. The bet Jay Kreps and team are making is that the ecosystem and the operational experience are what enterprises actually pay for, and that pure-protocol clones cannot win the high-value workloads. So far that bet is largely working, but it requires Confluent to keep out-shipping the alternatives on every dimension that is not the wire protocol.
Confluent is the canonical "OSS commercial company" story in modern data infrastructure — arguably the cleanest example of the entire pattern alongside Databricks and HashiCorp. The founders created the technology, donated it to a foundation, built a company around commercializing it, and took it public. That arc gives Confluent enormous credibility and a deeply entrenched position in the Kafka ecosystem.
But it is also the structural challenge. A company that commercializes an open-source project lives or dies on its ability to add value above the core that customers cannot easily replicate. Confluent's add-ons (Schema Registry, Connect, Flink integration, governance, support) are real and valuable — but they have to keep being more valuable than the cost premium over self-hosted Kafka, AWS MSK, and the Kafka-compatible alternatives. So far Confluent has executed well. Whether it can continue to do so for the next decade is the central question for both customers and shareholders.
For customers, the practical advice is: if you want managed Kafka with the deepest feature set and the strongest support story, Confluent Cloud is the canonical choice. If you are cost-sensitive or already deep in AWS, MSK is a credible alternative. If you want the lowest-latency Kafka experience with the simplest operations, look at Redpanda. The right answer depends on which dimension matters most for your workload.
TextQL does not connect to Confluent or Kafka directly — streaming platforms are transports, not query engines. Instead, TextQL Ana queries the systems downstream of Confluent Cloud: the warehouses, lakehouses, and real-time OLAP databases where Confluent-managed Kafka events eventually land. Confluent's Flink SQL and Kafka Connect ecosystem is often part of the pipeline that lands those events into Snowflake, Databricks, Iceberg tables, or ClickHouse — and that landing zone is where TextQL plugs in.