Google Cloud Storage
Google Cloud Storage (GCS) is Google Cloud's object storage service. Launched in 2010, it competes with Amazon S3 as the storage substrate for data lakes and lakehouses, with native integration into BigQuery.
Google Cloud Storage (GCS) is Google's answer to Amazon S3. It does the same job — store arbitrary blobs in a flat key-value namespace, accessed over HTTP, with high durability and pay-as-you-go pricing — but it lives inside Google Cloud and is most interesting as the storage substrate beneath BigQuery and Vertex AI. If your data stack runs on GCP, GCS is where your data lake lives.
GCS launched in May 2010 (originally as "Google Storage for Developers"), four years after S3. By that point, the S3 API had already become the de facto standard, and Google had to make a decision: build a totally new API, or accept that the world had standardized on S3. They split the difference. GCS has its own native API (gs:// paths, JSON/XML methods, IAM-style permissions), and it offers an "interoperability mode" that speaks the S3 protocol so existing S3 client libraries work against GCS buckets. This was a pragmatic and correct call.
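To make the interoperability mode concrete, here is a minimal sketch of pointing an S3 client library at GCS. It assumes the third-party boto3 package and an HMAC key pair created for a GCS service account. The helper names (`s3_client_kwargs`, `make_gcs_s3_client`) and the bucket name in the usage comment are hypothetical, and the `region_name="auto"` setting follows the pattern commonly shown for this setup rather than anything stated above.

```python
# Sketch: reuse an S3 client library against GCS through the S3-compatible XML API.
# Assumption: you have created an HMAC key (access ID + secret) for a GCS service account.

GCS_INTEROP_ENDPOINT = "https://storage.googleapis.com"


def s3_client_kwargs(hmac_access_id: str, hmac_secret: str) -> dict:
    """Build keyword arguments for boto3.client() that target GCS instead of AWS."""
    return {
        "service_name": "s3",
        "region_name": "auto",  # assumed placeholder; GCS does not use AWS regions
        "endpoint_url": GCS_INTEROP_ENDPOINT,
        "aws_access_key_id": hmac_access_id,
        "aws_secret_access_key": hmac_secret,
    }


def make_gcs_s3_client(hmac_access_id: str, hmac_secret: str):
    """Create the client. boto3 is imported lazily so the helper above works without it."""
    import boto3  # third-party: pip install boto3

    return boto3.client(**s3_client_kwargs(hmac_access_id, hmac_secret))


# Usage (requires boto3, network access, and real HMAC credentials):
#   client = make_gcs_s3_client("GOOG1E...", "...")
#   client.list_objects_v2(Bucket="example-bucket")  # hypothetical bucket name
```

The only GCS-specific parts are the endpoint and the credentials. Everything after client construction is ordinary S3 client code, which is the point of the interoperability mode.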
GCS is an object store with the same fundamental shape as S3: buckets contain objects, objects are immutable blobs identified by a key, you PUT/GET/LIST/DELETE. Like S3, it has multiple storage classes for different access patterns. Like S3, it has 11 nines of durability for the standard tier. Like S3, it integrates with the rest of its parent cloud (in GCS's case, GCP) for IAM, encryption, logging, and monitoring.
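The bucket-and-object model is easier to see in a toy. The sketch below is an in-memory stand-in, not the GCS API: the `ToyBucket` class and the keys are made up for illustration. It shows the flat key namespace and how "folders" are only an illusion produced by listing with a prefix and a delimiter.

```python
# Toy model of an object store bucket: a flat mapping from key to immutable bytes.
# This is an illustration only; it does not talk to GCS.

class ToyBucket:
    def __init__(self, name: str):
        self.name = name
        self._objects = {}  # key -> bytes

    def put(self, key: str, data: bytes) -> None:
        # Objects are immutable blobs: a PUT replaces the whole object.
        self._objects[key] = bytes(data)

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def delete(self, key: str) -> None:
        del self._objects[key]

    def list(self, prefix: str = "", delimiter: str = None):
        """Return (keys, prefixes) under `prefix`.

        With a delimiter, keys that contain it after the prefix are rolled up
        into a single "directory-like" prefix instead of being listed one by one.
        """
        keys, prefixes = [], set()
        for key in sorted(self._objects):
            if not key.startswith(prefix):
                continue
            rest = key[len(prefix):]
            if delimiter and delimiter in rest:
                prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
            else:
                keys.append(key)
        return keys, sorted(prefixes)


bucket = ToyBucket("example-bucket")
bucket.put("raw/2024/a.parquet", b"a")
bucket.put("raw/2024/b.parquet", b"b")
bucket.put("raw/readme.txt", b"hi")

keys, prefixes = bucket.list(prefix="raw/", delimiter="/")
print(keys)      # ['raw/readme.txt']
print(prefixes)  # ['raw/2024/']
```

There are no directories anywhere in the data structure. The slash is just another character in the key, and the listing call does the grouping.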
The differences from S3 that matter:
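GCS offers multi-region bucket locations (US, EU, ASIA) that automatically replicate across regions for higher availability. AWS only added a similar feature later with multi-region access points.
Storage classes follow the same idea as S3, with different names: Standard, Nearline, Coldline, and Archive.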
You can set lifecycle rules to automatically transition objects between classes as they age, and you can use Autoclass for the "I don't want to think about it" option (the GCS equivalent of S3 Intelligent-Tiering).
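A lifecycle rule is a small JSON document attached to the bucket. The sketch below builds one in the shape used by the bucket's `lifecycle` field in the JSON API, as far as I know. The helper name and the age thresholds (30, 90, and 365 days) are hypothetical, so check the lifecycle documentation for the exact format your tool expects before applying anything like this.

```python
import json


def lifecycle_config(transitions, delete_after_days=None) -> dict:
    """Build a lifecycle configuration dict.

    transitions: list of (age_in_days, storage_class) pairs, for example
                 [(30, "NEARLINE"), (90, "COLDLINE")].
    delete_after_days: optional age at which objects are deleted.
    """
    rules = []
    for age_days, storage_class in transitions:
        rules.append({
            "action": {"type": "SetStorageClass", "storageClass": storage_class},
            "condition": {"age": age_days},
        })
    if delete_after_days is not None:
        rules.append({
            "action": {"type": "Delete"},
            "condition": {"age": delete_after_days},
        })
    return {"lifecycle": {"rule": rules}}


# Hypothetical policy: cool objects down as they age, then delete after a year.
config = lifecycle_config([(30, "NEARLINE"), (90, "COLDLINE")], delete_after_days=365)
print(json.dumps(config, indent=2))
```

Autoclass replaces this kind of hand-written policy with automatic transitions based on access patterns, which is why it is the "don't want to think about it" option.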
GCS is the second-place object store and that's fine. It is a competent, well-designed, well-priced product that does what an object store should do. It will never be the de facto standard — S3 won that war — but it doesn't need to be. GCS exists to be the storage layer for Google Cloud, and inside Google Cloud it's the only choice that makes sense. The tight BigQuery integration is the actual reason a team picks GCS, and that integration is good enough that GCS-on-BigQuery is a genuinely competitive lakehouse architecture.
The honest version: if you are on GCP, use GCS. If you are on AWS, use S3. If you are picking a cloud, this layer is not the deciding factor. All three major clouds have object stores that are good enough for any reasonable workload, and the differences between them matter much less than the differences in the surrounding ecosystem (the warehouses, the ML platforms, the data engineering tools).
The one place GCS has a real edge is data transfer pricing when paired with BigQuery. Because BigQuery and GCS run on the same network, BigQuery queries against GCS data incur no data transfer cost when the bucket and the dataset are in the same location, which simplifies the pricing model considerably. AWS has similar internal-network pricing for S3-to-Athena, but the BigQuery + GCS combination is particularly clean.
GCS sits at the bottom of the data stack on GCP, beneath warehouses, query engines, and table formats. The typical pattern:
Source data
↓
GCS bucket (raw zone, often Parquet)
↓
BigQuery (external table or BigLake-managed)
↓
dbt models in BigQuery
↓
Looker / Hex / dashboards
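The step from the GCS raw zone to BigQuery in the flow above is usually a single DDL statement. The sketch below generates one for an external table over Parquet files. The helper name, the table name, and the bucket path are hypothetical. The statement follows BigQuery's `CREATE EXTERNAL TABLE ... OPTIONS (format, uris)` form, and a BigLake-managed table would need additional connection settings that are not shown here.

```python
def external_table_ddl(table: str, uris: list, fmt: str = "PARQUET") -> str:
    """Return a BigQuery DDL statement defining an external table over GCS objects.

    table: fully qualified name such as "my_project.raw.events" (hypothetical).
    uris:  gs:// URIs, optionally with a * wildcard.
    """
    for uri in uris:
        if not uri.startswith("gs://"):
            raise ValueError(f"expected a gs:// URI, got {uri!r}")
    uri_list = ", ".join(f"'{uri}'" for uri in uris)
    return (
        f"CREATE EXTERNAL TABLE `{table}`\n"
        f"OPTIONS (\n"
        f"  format = '{fmt}',\n"
        f"  uris = [{uri_list}]\n"
        f");"
    )


# Hypothetical names throughout: project, dataset, table, and bucket.
print(external_table_ddl(
    "my_project.raw.events",
    ["gs://example-raw-zone/events/*.parquet"],
))
```

The data never moves. BigQuery reads the Parquet files in place, and dbt models further down the stack can select from the external table like any other.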
GCS is also the standard storage backend for Google's other data products: Dataflow (managed Apache Beam), Dataproc (managed Spark/Hadoop), Vertex AI (ML training data and model artifacts), Pub/Sub (message export to buckets through Cloud Storage subscriptions), and Cloud Composer (managed Airflow).
TextQL Ana connects to GCS indirectly through BigQuery (or whichever query engine sits above it). When a business user asks Ana a question about data stored in a GCS-backed BigQuery external table, Ana sends the query to BigQuery and returns the result in natural language. Ana doesn't replace GCS or BigQuery — it sits on top, translating questions into the right SQL against the right tables.