Apache Flink
Apache Flink is the dominant open-source stream processing engine. It originated from the Stratosphere research project at TU Berlin in 2010, became an Apache top-level project in 2014, and is the engine of choice for serious real-time computation at companies like Netflix, Uber, Stripe, and Alibaba.
If you have a real-time data pipeline at any serious scale — joining streams, maintaining state, computing windowed aggregations, doing ML feature engineering on live events — the odds are high that you are running Flink, evaluating it, or using a managed service that runs Flink under the hood.
The shortest version of the Flink story is: it is the only open-source streaming engine that took the hard problems of stream processing seriously from day one and got them right. Event time, watermarks, exactly-once semantics, stateful processing with checkpointing, true low-latency operation — Flink shipped credible answers to all of these years before its competitors. That technical lead has compounded into category dominance.
Flink began life around 2010 as Stratosphere, an academic research project at Technical University of Berlin led by Volker Markl, with Stephan Ewen and Kostas Tzoumas as key contributors. Stratosphere was funded by the German Research Foundation and aimed to build a next-generation big data platform that would go beyond MapReduce — supporting iterative algorithms, in-memory computation, and a more general dataflow model than Hadoop offered.
In 2014, the Stratosphere team renamed the project Flink (German for "nimble" or "quick," with a squirrel as the logo) and donated it to the Apache Software Foundation. It graduated to a top-level project in December 2014. The same team founded data Artisans (later renamed Ververica) as the commercial sponsor, in the same way Confluent stewards Kafka or Databricks stewards Spark. Ververica was acquired by Alibaba in early 2019, after Alibaba had become one of the largest Flink users in the world.
The German academic origin matters culturally. Flink was built by people deeply immersed in database research and stream processing theory — people who had read all the papers and were determined to do streaming "correctly" rather than retrofit it onto a batch engine. The contrast with Spark Streaming, which started as a micro-batch hack on top of Spark and only later evolved into Structured Streaming, is sharp. Flink was streaming-native.
At its core, Flink is a distributed dataflow engine. You define a pipeline as a graph of operators (sources, transformations, sinks), and Flink deploys that graph across a cluster of workers (TaskManagers), with a coordinator (JobManager) handling scheduling, checkpointing, and failure recovery. The engine handles the hard parts:
Stateful processing. Operators can maintain keyed state (per-key counters, sets, lists) and operator state (per-task buffers). State is stored in a configurable state backend — typically RocksDB on local disk for large state, or in JVM heap for small state.
Checkpointing and exactly-once. Flink periodically takes a consistent snapshot of all operator state across the cluster using asynchronous barrier snapshotting, an adaptation of the Chandy-Lamport algorithm for dataflow graphs. On failure, the cluster restarts from the last checkpoint and replays events from the source. Combined with transactional or idempotent sinks, this delivers true end-to-end exactly-once semantics.
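To make the recovery model concrete, here is a toy simulation in plain Python (not Flink code; every name is illustrative). It shows why checkpoint-and-replay plus an idempotent sink yields the same result as a failure-free run: writes from the lost span are simply overwritten during replay.

```python
def run(events, checkpoint_every=3, crash_at=None):
    """Process events with periodic state snapshots; on a simulated crash,
    restore the last snapshot and replay the source from its offset."""
    state = {}             # keyed state: running count per key
    sink = {}              # idempotent sink: keyed writes, so replays overwrite
    checkpoint = ({}, 0)   # (state snapshot, source offset)
    i = 0
    while i < len(events):
        if crash_at is not None and i == crash_at:
            state, i = dict(checkpoint[0]), checkpoint[1]  # recover and rewind
            crash_at = None
            continue
        key = events[i]
        state[key] = state.get(key, 0) + 1
        sink[key] = state[key]                 # overwrite-by-key: safe to replay
        i += 1
        if i % checkpoint_every == 0:
            checkpoint = (dict(state), i)      # consistent snapshot + offset

    return sink

events = ["a", "b", "a", "a", "b"]
# a crash mid-stream produces the same sink contents as a clean run
assert run(events) == run(events, crash_at=4)
```

Real Flink snapshots state asynchronously via barriers flowing through the dataflow, but the contract is the same: restore state, rewind the source, replay.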
Event time and watermarks. Flink was the first open-source engine to take event time seriously as the primary time dimension for stream processing. Watermarks flow through the dataflow graph alongside events, telling each operator when it is safe to close a window.
Windowing. Tumbling, sliding, session, and global windows, all with configurable triggers and allowed lateness. The semantics are precise enough that you can express almost any windowed aggregation you would express in batch SQL, plus things batch SQL cannot express.
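The interaction between watermarks and windows is easier to see in a toy model than in prose. The sketch below is plain Python, not Flink's API; `WINDOW`, `LATENESS`, and `windowed_counts` are illustrative names. It models a bounded-out-of-orderness watermark (max event time minus allowed lateness) closing tumbling windows.

```python
WINDOW = 10    # tumbling window size, in event-time units
LATENESS = 2   # bounded out-of-orderness assumed by the watermark

def windowed_counts(stream):
    """stream: iterable of (key, event_time) pairs.
    Yields (window_start, key, count) once the watermark passes a
    window's end, i.e. once it is safe to close that window."""
    open_windows = {}           # (window_start, key) -> running count
    watermark = float("-inf")
    for key, ts in stream:
        start = (ts // WINDOW) * WINDOW             # assign to a tumbling window
        open_windows[(start, key)] = open_windows.get((start, key), 0) + 1
        watermark = max(watermark, ts - LATENESS)   # watermark only advances
        for (w, k) in sorted(open_windows):
            if w + WINDOW <= watermark:             # window end has passed
                yield (w, k, open_windows.pop((w, k)))

results = list(windowed_counts([("a", 1), ("a", 4), ("b", 9), ("a", 13), ("a", 25)]))
```

Note that a window only closes when a later event pushes the watermark past its end; in real Flink, sources also emit watermarks periodically so windows close even when a partition goes quiet.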
Multiple APIs. Flink supports the low-level DataStream API (Java/Scala), the higher-level Table API, and Flink SQL, which compiles SQL queries into the same dataflow runtime. Flink SQL has become the most strategically important interface in recent years — it is what Confluent and AWS expose to users.
By 2017-2019 it was clear Flink had won the open-source streaming war, for the reasons above: it shipped credible event-time semantics, exactly-once checkpointing, and large-scale state management years before its competitors, and that lead compounded.
Flink is not always the right answer: self-managing a Flink cluster is operationally demanding, and for simple stateless transformations or modest data volumes, lighter-weight tools are often sufficient.
The most important strategic shift in stream processing in 2024-2025 was Confluent's adoption of Flink as its preferred stream processing engine, replacing its earlier focus on ksqlDB. Confluent acquired Immerok (a Flink company founded by ex-Ververica engineers) in early 2023 and made Flink SQL the centerpiece of Confluent Cloud's stream processing offering. The signal: even Kafka's commercial steward has accepted that Flink is the right answer for serious streaming compute.
Flink reads from sources (Kafka, Kinesis, Pulsar, Iceberg, files, CDC streams from Debezium), processes events through a dataflow graph, and writes results to sinks (Kafka topics, warehouses, real-time OLAP databases, Iceberg tables, key-value stores). It is the compute layer between event transport and analytical serving.
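The source → transform → sink shape can be sketched schematically in plain Python (purely illustrative; in a real deployment the source would be something like a Kafka consumer and the sink an Iceberg table or warehouse writer):

```python
def source():
    # raw clickstream events, as a Kafka consumer might deliver them
    yield {"user": "u1", "page": "/pricing"}
    yield {"user": "u2", "page": "/docs"}
    yield {"user": "u1", "page": "/docs"}

def transform(events):
    # the "compute layer": aggregate raw events into well-shaped rows
    counts = {}
    for e in events:
        counts[e["user"]] = counts.get(e["user"], 0) + 1
    for user, views in sorted(counts.items()):
        yield {"user": user, "page_views": views}

# in practice the sink is a table or topic; a list stands in here
sink = list(transform(source()))
```

Flink's job is this shape, continuously and at scale: the transform step runs as a distributed, stateful, fault-tolerant dataflow rather than a batch loop.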
TextQL does not query Flink directly — Flink is a compute engine, not a queryable database. Instead, TextQL Ana queries the destinations Flink writes into: warehouses, lakehouses, and real-time analytics databases. The role Flink plays in a TextQL stack is to make sure the data those destinations contain is fresh, joined, enriched, and shaped the way analysts and AI systems need it. Flink is the engine that turns raw event streams into the well-modeled tables that TextQL's natural-language interface can reason about.