Reverse ETL
Reverse ETL tools sync data out of the warehouse and into the operational tools where business teams actually work — Salesforce, HubSpot, Marketo, Zendesk, ad platforms. The category was created in 2019 by Census and Hightouch and is now table stakes for modern data teams.
Reverse ETL is the category of tools that does the exact opposite of normal ETL. Where Fivetran and Airbyte pull data from SaaS apps into the warehouse, reverse ETL pushes data from the warehouse back into SaaS apps. Take a Salesforce account record that started in Salesforce, got synced to Snowflake, got enriched with product usage data, and got modeled by dbt: reverse ETL is what shoves the enriched version back into Salesforce so the sales rep can actually see it.
The simple way to think about it: the warehouse is where you compute the truth, and reverse ETL is the delivery truck that takes the truth to wherever it needs to be used. A modeled customer health score in Snowflake is useless if it stays in Snowflake. The customer success rep lives in Gainsight or HubSpot. The marketer lives in Iterable. The ad buyer lives in Google Ads. Reverse ETL is the wire that connects warehouse intelligence to the places work happens.
Until about 2019, "operational analytics" — the idea of acting on warehouse data inside operational tools — was a mess of one-off Python scripts, hand-rolled API integrations, and Zapier hacks. Every data team had a folder of cron jobs that pushed CSVs to Salesforce or fired webhooks at HubSpot. These scripts broke constantly, had no observability, and were owned by one engineer who would inevitably leave.
In 2018-2019, two companies, Census and Hightouch, independently noticed the same opportunity and built the same product: a managed pipeline that takes a SQL query against your warehouse and syncs the result to a SaaS destination, with idempotency, error handling, field mapping, and a UI that doesn't require a Python engineer.
The term "reverse ETL" was coined around the same time — often attributed to Astasia Myers at Redpoint Ventures, who wrote some of the first analyst pieces on the category, and to the founders of both companies who were trying to explain what they did. The name is technically a misnomer (it's really "ELT in reverse"), but it stuck because it told you exactly what the product did in two words.
By 2021, reverse ETL was the hottest new category in the data stack. By 2024, it had become so obviously useful that the warehouses themselves started absorbing the functionality. Snowflake announced reverse-ETL-style features. Databricks built outbound connectors. Salesforce shipped Data Cloud, which is essentially a reverse ETL pipeline pretending not to be one.
Three properties define a reverse ETL tool:
1. The warehouse is the source of truth. A reverse ETL tool reads from your data warehouse — not from raw production databases, not from a separate CDP, not from event streams. The whole architectural premise is that your warehouse already contains the modeled, joined, deduped version of your customer data, and the job is to ship that out the door.
2. SQL is the interface. You define what to sync by writing a SQL query (or pointing at a dbt model). The result of that query becomes the rows that get pushed to the destination. This is the killer design choice: it puts data and analytics engineers in the driver's seat, using the language they already speak, rather than requiring Salesforce admin skills.
3. Field mapping and idempotent upserts. The hard part of syncing data to SaaS apps is the tedium: matching warehouse columns to destination fields, handling create-vs-update logic, dealing with rate limits, retrying failed rows. Reverse ETL tools manage all of this so you don't have to.
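The three properties above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular vendor implements it: an in-memory SQLite database stands in for the warehouse, `FakeCRM` is a hypothetical in-memory destination rather than a real API client, and the field names in `FIELD_MAP` (including `Health_Score__c`) are invented for the example. Rate limiting and retries are left out to keep it short.

```python
import sqlite3

# Property 2: a SQL query defines what gets synced.
QUERY = "SELECT account_id, health_score FROM customer_health_score"

# Property 3a: field mapping (warehouse column -> destination field).
# Both destination field names are hypothetical.
FIELD_MAP = {"account_id": "ExternalId", "health_score": "Health_Score__c"}
KEY_FIELD = "ExternalId"


class FakeCRM:
    """In-memory stand-in for a SaaS destination (an assumption, not a real API)."""

    def __init__(self):
        self.records = {}
        self.creates = 0
        self.updates = 0

    def upsert(self, record):
        # Property 3b: idempotent upsert keyed on an external ID.
        key = record[KEY_FIELD]
        if key not in self.records:
            self.creates += 1
            self.records[key] = dict(record)
        elif self.records[key] != record:
            self.updates += 1
            self.records[key] = dict(record)
        # An identical record is a no-op, so re-running a sync is safe.


def build_demo_warehouse():
    """Property 1: the warehouse (here, SQLite) is the source of truth."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer_health_score "
        "(account_id TEXT PRIMARY KEY, health_score INTEGER)"
    )
    conn.executemany(
        "INSERT INTO customer_health_score VALUES (?, ?)",
        [("acct-1", 82), ("acct-2", 47)],
    )
    return conn


def sync(conn, query, destination):
    """Run the query, map each row to destination fields, and upsert it."""
    cursor = conn.execute(query)
    columns = [desc[0] for desc in cursor.description]
    synced = 0
    for row in cursor:
        source = dict(zip(columns, row))
        record = {dest: source[col] for col, dest in FIELD_MAP.items()}
        destination.upsert(record)
        synced += 1
    return synced


if __name__ == "__main__":
    warehouse = build_demo_warehouse()
    crm = FakeCRM()
    sync(warehouse, QUERY, crm)  # first run creates both records
    sync(warehouse, QUERY, crm)  # second run changes nothing
    print(crm.creates, crm.updates)
```

Running the same sync twice leaves the destination unchanged the second time, which is the property that makes scheduled syncs safe to retry. A production tool would add batching, rate-limit handling, and per-row error reporting on top of this loop.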
In 2019, reverse ETL was a novel, venture-backed category with two startups. By 2026, it has become table stakes for warehouses themselves. Snowflake, Databricks, and BigQuery have all shipped or partnered for outbound sync features. The Salesforce Data Cloud product (their flagship 2023-2024 launch) is essentially "reverse ETL but Salesforce charges for it." This is what happens to every successful data infrastructure category: the underlying platform absorbs it.
This does not mean Hightouch and Census are dead. Two reasons:
SaaS apps (Salesforce, Stripe, Zendesk)
↓ ETL (Fivetran, Airbyte)
Data warehouse (Snowflake, BigQuery, Databricks)
↓ dbt models, analytics engineering
Modeled tables (customer_health_score, ltv, churn_risk)
↓ Reverse ETL (Hightouch, Census)
Back to SaaS apps (Salesforce, HubSpot, Iterable, Google Ads)
The full loop is sometimes called the "composable CDP" or the "warehouse-native customer data platform": instead of buying a monolithic CDP like Segment that owns ingestion, identity, modeling, and activation, you assemble the same functionality from best-of-breed tools that all share the warehouse as the source of truth. Reverse ETL is the activation half of that loop.
Polytomic is a smaller third entrant focused on B2B sync use cases. Grouparoo was an open-source player acquired by Airbyte in 2022 and effectively wound down.
TextQL Ana is upstream of reverse ETL and complements it. Where reverse ETL pushes warehouse data into operational tools on a schedule, Ana lets business users ask questions of warehouse data in natural language at any moment. The two solve different halves of the same problem: reverse ETL handles the predictable, recurring syncs ("update every account's health score in Salesforce nightly"), while Ana handles the ad hoc questions ("which accounts in EMEA have the highest churn risk this week?"). Many TextQL customers run both.