2026-05-26 · 5 min read · sota.io Team

Starburst Galaxy EU Alternative 2026: GDPR, CLOUD Act & Trino SaaS Sovereignty

Post #4 in the sota.io EU Data Lakehouse Tools Series


Starburst Galaxy is the managed SaaS offering built on top of Trino (formerly PrestoSQL) — the open-source distributed SQL query engine originally created at Facebook. For EU data teams running federated analytics across data lakes, object stores, and operational databases, Galaxy offers powerful query-federation capabilities. But underneath the convenience lies a critical sovereignty problem: the control plane that orchestrates every federated query across your EU data sources runs in US-jurisdiction infrastructure.

This is the fourth post in our five-part EU Data Lakehouse Tools series. We've already covered Databricks, Snowflake, and dbt Cloud. Starburst presents a distinct sovereignty challenge: unlike warehouse platforms that store data, Starburst federates queries — which means your EU personal data transits a US-controlled orchestration layer even when the underlying data never leaves European servers.

What Is Starburst Galaxy?

Starburst Data Inc. is a Delaware C-Corp headquartered in Boston, Massachusetts. Founded in 2017 and later joined by the original Presto creators from Facebook, Starburst raised a $100M Series C in 2021 led by Andreessen Horowitz, bringing its total funding to $164M at the time. The company provides:

Galaxy's core value proposition is query federation: connect to S3, Delta Lake, Iceberg, Hive, PostgreSQL, MySQL, Elasticsearch, Kafka, and dozens of other sources — and run a single SQL query across all of them simultaneously. For EU organizations with data spread across multiple cloud stores and on-premises databases, this capability is genuinely powerful.
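As a toy analogue of that idea, the sketch below joins tables from two separately attached SQLite databases in one statement. It is not Trino or Galaxy: SQLite's ATTACH only stands in for Trino's per-source catalogs, and the database names `crm` and `lake`, the tables, and the rows are all hypothetical.

```python
import sqlite3


def federated_order_totals() -> list:
    """Join tables that live in two separately attached databases."""
    conn = sqlite3.connect(":memory:")
    # Each ATTACH stands in for one connected source (hypothetical names).
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute("ATTACH DATABASE ':memory:' AS lake")
    conn.execute("CREATE TABLE crm.customers (id INTEGER, country TEXT)")
    conn.execute("CREATE TABLE lake.orders (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO crm.customers VALUES (?, ?)", [(1, "DE"), (2, "FR")]
    )
    conn.executemany(
        "INSERT INTO lake.orders VALUES (?, ?)", [(1, 10.0), (1, 5.0), (2, 7.5)]
    )
    # One SQL statement spans both sources, as a federated engine would allow.
    rows = conn.execute(
        """
        SELECT c.country, SUM(o.amount)
        FROM crm.customers AS c
        JOIN lake.orders AS o ON o.customer_id = c.id
        GROUP BY c.country
        ORDER BY c.country
        """
    ).fetchall()
    conn.close()
    return rows
```

In Trino the same shape of query would address tables as catalog.schema.table, with the engine pushing work to each connector instead of reading local files.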

CLOUD Act Score: 16/25

Dimension | Score | Assessment
D1: HQ Jurisdiction | 5/5 | Delaware C-Corp, Boston MA; full US jurisdiction
D2: Data Routing | 3/5 | EU AWS/Azure regions available, but control plane runs US-side
D3: Subprocessors | 3/5 | AWS us-east-1, Azure eastus as primary control plane infrastructure
D4: Personnel Access | 3/5 | US-based SRE team can access Galaxy cluster configurations, query plans
D5: Legal Framework | 2/5 | SCCs only, no BCR, no CLOUD Act Shield commitment, no FISA challenge history
Total | 16/25 | Significant CLOUD Act exposure

A score of 16/25 places Starburst in the "meaningful risk" zone for GDPR Art.46 transfer mechanism compliance. The specific risk is not raw data storage — Galaxy doesn't inherently store your EU data — but rather the query orchestration layer that processes, routes, and caches query plans and result metadata.
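The total is simply the sum of the five dimension scores, each capped at 5. A minimal sketch of that arithmetic follows; the dictionary keys and the function name are illustrative and not part of any published scoring tool.

```python
MAX_PER_DIMENSION = 5

# Dimension scores as given in the assessment above.
GALAXY_SCORES = {
    "D1 HQ Jurisdiction": 5,
    "D2 Data Routing": 3,
    "D3 Subprocessors": 3,
    "D4 Personnel Access": 3,
    "D5 Legal Framework": 2,
}


def cloud_act_score(scores: dict) -> tuple:
    """Return (total, maximum) after checking each score is within range."""
    for name, value in scores.items():
        if not 0 <= value <= MAX_PER_DIMENSION:
            raise ValueError(f"{name}: score {value} is out of range")
    return sum(scores.values()), MAX_PER_DIMENSION * len(scores)
```

Calling `cloud_act_score(GALAXY_SCORES)` gives 5 + 3 + 3 + 3 + 2 = 16 out of a possible 25.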

Three Named Risk Patterns

1. Trino Query Federation Control Plane Pattern

Every SQL query executed in Galaxy follows this path: your client submits the query → Galaxy's coordinator (running in US jurisdiction) parses and optimizes the query plan → the coordinator distributes split tasks to workers → workers pull data from your EU sources → results aggregate in the coordinator → output returns to your client.

The critical CLOUD Act exposure point is the Galaxy coordinator cluster. Even when all your underlying data lives in EU AWS regions (S3 eu-west-1, Aurora eu-central-1), the coordinator that parses each query, optimizes its plan, distributes splits to workers, and aggregates the results operates in US-jurisdiction infrastructure. A CLOUD Act compelled disclosure targeting Starburst could expose every SQL query your EU data team has run for the past 90 days, including queries that enumerate EU personal data attributes.
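The query path described in this section can be written down as data to make the exposure explicit. This is only a sketch: the jurisdiction labels restate the claims made here (they are not independently verified), and placing the workers in the EU is an assumption based on the EU regions Galaxy offers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    step: str
    component: str
    jurisdiction: str  # as claimed in this article; workers in "EU" is assumed


QUERY_PATH = [
    Stage("submit query", "client", "EU"),
    Stage("parse and optimize plan", "coordinator", "US"),
    Stage("distribute splits", "coordinator", "US"),
    Stage("read from sources", "workers", "EU"),
    Stage("aggregate results", "coordinator", "US"),
    Stage("receive output", "client", "EU"),
]


def exposed_steps(path: list, jurisdiction: str = "US") -> list:
    """List the steps of a query that run under the given jurisdiction."""
    return [stage.step for stage in path if stage.jurisdiction == jurisdiction]
```

With these labels, every step handled by the coordinator is returned, which is the point of the pattern: the data sources can be entirely European while the orchestration steps are not.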

GDPR Art.32 implication: Processing operations that reveal your schema design, query patterns, and data access frequencies constitute "processing of personal data" when the queries operate on personal data. EDPB guidelines recognize that metadata about personal data processing is itself subject to Art.32 technical safeguards.

DORA Art.28 implication: For financial institutions using Galaxy for regulatory reporting or risk analytics, the coordinator's US jurisdiction creates a critical ICT third-party dependency that requires contractual documentation of "location of data processing" — which Galaxy cannot honestly represent as EU-only.

2. Ranger Policy Propagation Gap

Galaxy integrates with Apache Ranger for fine-grained access control across connected data sources. When you configure row-level security (RLS) policies in Galaxy — for example, restricting EU data analysts to only query rows where data_residency = 'EU' — those Ranger policies are stored and managed in Galaxy's US-hosted management plane.
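Conceptually, a row-level policy like this is a predicate attached to a group and applied before rows are returned. The sketch below models only that idea in plain Python. It is not Ranger's or Galaxy's API, and the group name `eu_analysts` and the sample rows are hypothetical.

```python
# Hypothetical policy state: group name -> column values a row must match.
POLICIES = {"eu_analysts": {"data_residency": "EU"}}


def apply_row_filter(rows: list, group: str, policies: dict = POLICIES) -> list:
    """Return only the rows the group's policy allows; deny when no policy exists."""
    required = policies.get(group)
    if required is None:
        return []  # deny by default for unknown groups
    return [
        row
        for row in rows
        if all(row.get(column) == value for column, value in required.items())
    ]


SAMPLE_ROWS = [
    {"customer_id": 1, "data_residency": "EU"},
    {"customer_id": 2, "data_residency": "US"},
]
```

The `POLICIES` mapping is the "policy state" at issue here: whoever can read or change it can see, or alter, exactly which rows each group is permitted to query.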

The operational CLOUD Act gap:

For GDPR Art.5(1)(f) "integrity and confidentiality" compliance, your access control system's own policy state being US-accessible undermines the sovereign access control model required for EU personal data.

3. Iceberg REST Catalog Exposure

Galaxy's deepest integration is with Apache Iceberg — the open table format for data lakehouses. Galaxy includes a managed Iceberg REST Catalog that stores:

This REST Catalog, which functions as the metadata brain of your lakehouse, runs as a Galaxy-managed service in US-jurisdiction infrastructure. Under CLOUD Act compelled disclosure, a US government request could obtain the complete structural blueprint of your EU personal data lakehouse: which columns exist, how the data is physically organized, how the schema has evolved over time, and statistical distributions that can reveal the demographics of your EU data subjects.

The Partition Specification Risk: Iceberg partition specs are particularly sensitive. A financial institution partitioning by customer_country_of_residence and risk_tier reveals both geographic distribution and creditworthiness modeling of EU data subjects. A healthcare provider partitioning by diagnosis_code_range exposes the structure of special category health data (GDPR Art.9). Galaxy's managed REST Catalog makes these partition specs US-jurisdiction-accessible by design.
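One way to act on this is to review partition specs for revealing field names before they reach a third-party catalog. The sketch below assumes a simplified spec shaped like Iceberg's JSON (a `fields` list with `name` and `transform`). The keyword list and the example spec are illustrative, not a complete sensitivity classifier.

```python
# Illustrative substrings that suggest a partition field reveals something sensitive.
SENSITIVE_HINTS = ("country", "residence", "risk", "diagnosis")


def sensitive_partition_fields(spec: dict) -> list:
    """Return names of partition fields whose names match a sensitive hint."""
    return [
        field["name"]
        for field in spec.get("fields", [])
        if any(hint in field["name"].lower() for hint in SENSITIVE_HINTS)
    ]


# Simplified, hypothetical partition spec for a financial institution's table.
EXAMPLE_SPEC = {
    "spec-id": 0,
    "fields": [
        {"name": "customer_country_of_residence", "transform": "identity"},
        {"name": "risk_tier", "transform": "identity"},
        {"name": "event_day", "transform": "day"},
    ],
}
```

Note that the check needs no table data at all. The field names alone carry the signal, which is why catalog metadata deserves the same scrutiny as the rows it describes.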

EU-Native Alternatives

For EU data teams needing Trino's federated query power without US control plane exposure:

Trino Self-Hosted (CLOUD Act Score: 0/25)

The obvious starting point. Trino is Apache 2.0 licensed — free to deploy on EU infrastructure. Key considerations:

Apache Spark + Iceberg (CLOUD Act Score: 0/25)

For batch-first use cases where Trino's low-latency federation isn't essential:

DuckDB (CLOUD Act Score: 0/25)

For smaller-scale federated analytics within a single node:

Ahana Cloud for Presto (EU Deployable)

Ahana (acquired by IBM) offers managed Presto (the original Facebook project, from which Trino later forked) with deployment options that can be configured for EU-only data residency. However, Ahana's control plane still falls under IBM's US jurisdiction, so evaluate it with the same CLOUD Act framework before deployment.

Contractual Protections: What Galaxy's DPA Provides (and Doesn't)

Starburst offers a Data Processing Addendum that includes:

What Galaxy's DPA does not provide:

For GDPR Art.46 compliance documentation, you can implement SCCs — but the Transfer Impact Assessment (TIA) required by EDPB guidelines must acknowledge that Galaxy's US-jurisdiction control plane creates a realistic risk of CLOUD Act exposure for EU query metadata, policy state, and catalog information.

Decision Framework: When Galaxy is Acceptable vs. When to Self-Host

Scenario | Galaxy Acceptable? | Recommendation
EU personal data analytics (GDPR Art.9 excluded) | Conditionally | Implement SCC + TIA, document residual risk
Special category or similarly sensitive data (health, biometric, financial) | No | Self-host Trino or Apache Spark
DORA-regulated financial institution | Assess | Legal review required; Art.28 documentation complex
NIS2-critical infrastructure | No | On-premises Trino with EU-only policy plane
Non-EU non-personal data analytics | Yes | Full Galaxy deployment appropriate
GDPR Art.9 data, minimized (only aggregates, no PII) | Yes with caution | Ensure no raw PII transits coordinator
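The table can also be encoded as a small function for use in an architecture review checklist. This is a sketch under one stated assumption: when several scenarios apply at once, the strictest verdict wins, in the order written below. The table itself does not define that precedence, and the parameter names are hypothetical.

```python
def galaxy_verdict(
    *,
    eu_personal_data: bool = False,
    special_category: bool = False,
    aggregates_only: bool = False,
    dora_regulated: bool = False,
    nis2_critical: bool = False,
) -> str:
    """Map workload traits to the table's verdicts (precedence order is assumed)."""
    if nis2_critical:
        return "No"
    if special_category and not aggregates_only:
        return "No"
    if dora_regulated:
        return "Assess"
    if special_category and aggregates_only:
        return "Yes with caution"
    if eu_personal_data:
        return "Conditionally"
    return "Yes"
```

For example, `galaxy_verdict(eu_personal_data=True, dora_regulated=True)` returns "Assess", because under the assumed precedence the DORA review takes priority over the general SCC + TIA route.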

The Trino Governance Stack: What EU Self-Hosters Actually Need

If you decide to self-host Trino rather than use Galaxy, here's the complete governance stack you need to replicate Galaxy's enterprise features:

Query Engine:

Catalog & Metadata:

Access Control:

Monitoring & Operations:

EU Cloud Providers with Trino-Ready Infrastructure:

GDPR Technical Safeguard Checklist for Trino Deployments

When EU data protection authorities review Trino-based data lakehouse deployments, these are the technical safeguards they expect under GDPR Art.32:

KRITIS-Dachgesetz Context (July 2026)

German organizations designated as critical infrastructure under KRITIS-Dachgesetz (effective 2026-07-17) face additional requirements for ICT systems that process critical infrastructure operational data. Starburst Galaxy deployments for operational analytics in energy, water, transport, or healthcare sectors require:

Self-hosted Trino with EU-only infrastructure satisfies KRITIS-Dachgesetz ICT requirements by design. Galaxy requires additional contractual and technical measures that may be difficult to obtain from a US-incorporated vendor.

Conclusion

Starburst Galaxy offers genuine value for EU data teams: managed Trino removes significant operational complexity, and the connector ecosystem makes cross-source federation fast to implement. The sovereignty gap, however, is not trivial to close with contractual protections alone.

The CLOUD Act risk pattern for Galaxy is distinctive: it's not about data storage but about query orchestration. Every SQL statement your EU data team executes — including queries that enumerate, count, or analyze EU personal data — transits a US-jurisdiction coordinator that is CLOUD Act-compellable. Combined with the Iceberg REST Catalog exposure and Ranger policy plane gap, Galaxy creates a three-layer sovereignty deficit that SCCs alone cannot remediate under EDPB's TIA guidelines.

For EU organizations with GDPR Art.9 special category data or DORA-regulated analytics workloads: Self-host Trino on EU infrastructure with EU-jurisdiction governance components. The operational investment is significant but represents the only path to genuine data lakehouse sovereignty.

For EU organizations with less sensitive workloads: Galaxy with SCCs, a completed TIA, and documented residual risk acknowledgment may be acceptable — with legal and DPO sign-off.

The next and final post in this series will be our comprehensive EU Data Lakehouse Comparison Finale: Databricks vs. Snowflake vs. dbt Cloud vs. Starburst vs. EU-native OSS — with a decision framework for architects and DPOs choosing their 2026 data lakehouse strategy.


