2026-04-30

AWS OpenSearch EU Alternative 2026: Search Queries, Log Analytics, and the GDPR Blind Spot

Post #722 in the sota.io EU Compliance Series

Amazon OpenSearch Service is the managed successor to Amazon Elasticsearch Service and the default infrastructure for search functionality, log aggregation, security event analysis, and behavioral analytics across the AWS ecosystem. European applications use OpenSearch to power product search, site-wide full-text search, CloudWatch log routing, SIEM pipelines, and real-time dashboards via OpenSearch Dashboards (a fork of Kibana).

Amazon operates OpenSearch Service in several European regions, including eu-west-1 (Ireland), eu-central-1 (Frankfurt), eu-west-3 (Paris), and eu-north-1 (Stockholm). The data (search indices, log records, dashboards) resides on European infrastructure. Most development teams treat this as a GDPR-compliant baseline.

It is not. Amazon Web Services, Inc. is a Delaware corporation headquartered in Seattle, Washington. The CLOUD Act (18 U.S.C. § 2713) compels US companies to produce data stored anywhere in the world when a valid US government order is served. Your OpenSearch cluster in Frankfurt — containing every search query your users ever typed, every log record your application ever emitted, every behavioral event your analytics pipeline ever indexed — is reachable by a US authority serving a request on Amazon in Seattle.

This is the identical structural problem documented across the AWS stack: AWS Redshift, AWS Kinesis, AWS Glue, AWS RDS, AWS S3. OpenSearch adds a dimension that makes it particularly sensitive: search queries are personal data, and your log data almost certainly contains personal data hidden inside operational records.

What AWS OpenSearch Stores About Your Users

OpenSearch is not a passive storage layer. It actively indexes, tokenizes, and structures every document you push into it. Understanding what ends up in an OpenSearch index requires thinking beyond the documents you explicitly send.

When a user types a search query into your application and your application forwards that query to OpenSearch, the query itself constitutes personal data under GDPR Art. 4(1) if it can be linked to an identifiable natural person.

Consider what users search for in a typical e-commerce, healthcare, or financial application:

"metformin side effects" (health data — Art. 9 special category)
"divorce lawyers Berlin"
"unemployment benefits eligibility"
"HIV test locations near me" (health data — Art. 9)
"debt consolidation for €45000"

If your application logs search queries alongside any user identifier — a session ID, a user ID, an IP address, a browser fingerprint — those queries become personal data. If you store search history for autocomplete or analytics purposes, you have created a personal data record of every sensitive topic your users have explored.

Many teams route this data into OpenSearch for analytics: "what are users searching for?", "which queries return zero results?", "what are the top searches by category?". This is standard product analytics practice. Under GDPR, it is processing of personal data that requires a lawful basis, a retention policy, and — potentially — protection against US government access.
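One way to keep "what are users searching for?" analytics while reducing what the index reveals is to replace the raw user identifier with a keyed hash before the event is indexed. The following is a minimal sketch under stated assumptions: the field names (`user_ref`, `query`, `day`) and the key handling are hypothetical, and pseudonymized data generally remains personal data under GDPR, so this reduces exposure rather than removing the need for a lawful basis.

```python
import hashlib
import hmac
from datetime import datetime, timezone


def pseudonymize_user(user_id: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 hash of the user ID (hex string)."""
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def build_search_event(user_id: str, query: str, secret_key: bytes) -> dict:
    """Build the analytics document to index instead of the raw (user_id, query) pair."""
    return {
        "user_ref": pseudonymize_user(user_id, secret_key),  # no raw identifier
        "query": query.strip().lower(),                       # normalized query text
        "day": datetime.now(timezone.utc).strftime("%Y-%m-%d"),  # day granularity only
    }


if __name__ == "__main__":
    # Hypothetical key; in practice it would live in a secret store outside the search cluster.
    key = b"replace-with-a-secret-from-your-vault"
    print(build_search_event("user-1842", "  Metformin Side Effects ", key))
```

Keeping the key outside the search cluster means the index alone does not link a query back to a user ID. The query text itself can still identify someone, so a retention policy is still needed.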

Full-Text Index as Implicit Personal Data Store

An OpenSearch index is not a simple copy of your documents. The indexing process creates an inverted index: a data structure mapping every tokenized term to every document that contains it. The inverted index is stored as Lucene segment files inside your OpenSearch cluster.

If your application indexes customer orders, the inverted index contains the customer's name, address, product preferences, and purchase history — even if you never explicitly intended to make that data searchable. If you index application logs, the inverted index contains every email address, phone number, IP address, and user identifier that appeared anywhere in a log line.

The OpenSearch index may contain more personal data than your primary database — and it may retain that data longer, because index cleanup is harder than database row deletion.
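To make the inverted-index point concrete, here is a toy sketch in Python. It is not Lucene's analyzer or segment format: the tokenizer is a simple regular expression chosen for illustration, and the log lines are invented. It only shows how an identifier that appears anywhere in a document becomes a lookup key.

```python
import re
from collections import defaultdict


def build_inverted_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map every token to the set of document IDs that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        # Toy tokenizer: keeps e-mail addresses and IP addresses as single terms.
        for term in re.findall(r"[\w.@+-]+", text.lower()):
            index[term].add(doc_id)
    return index


# Invented log lines for illustration only.
logs = {
    "log-1": "login ok user=anna@example.com ip=203.0.113.7",
    "log-2": "payment failed user=anna@example.com",
    "log-3": "healthcheck ok",
}

index = build_inverted_index(logs)
print(sorted(index["anna@example.com"]))  # every log record that mentions this user
```

Nobody declared the e-mail address a searchable field, yet one lookup returns every record that mentions it.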

Log Data and the Art. 9 Risk in Application Logs

The most common OpenSearch deployment pattern for AWS users routes application logs from CloudWatch Logs or Kinesis Data Firehose into an OpenSearch domain for search and visualization via OpenSearch Dashboards. This creates a searchable log analytics system.

Application logs routinely contain:

User email addresses in authentication events
IP addresses (personal data under GDPR for identifiable users)
User IDs and session tokens in request logs
Stack traces containing user-submitted form values
API request bodies inadvertently logged by debug tooling
Payment processor responses containing masked card data

None of this was deliberately sent to OpenSearch. It arrived as part of log records that developers never reviewed for personal data content. The log pipeline ingests everything. OpenSearch indexes everything. The result is an uncontrolled personal data store growing without a defined retention policy, purpose limitation, or deletion mechanism.

If any of those log records contain health data or biometric data, you may have created an Art. 9 special category data store without realizing it. Data about criminal convictions and offences carries its own restrictions under Art. 10.
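A common mitigation is to scrub obvious identifiers from log lines before they leave the application host. The sketch below is a simplified illustration, not a complete PII detector: the two regular expressions only catch e-mail addresses and IPv4 addresses, and the placeholder strings are arbitrary choices.

```python
import re

# Deliberately simple patterns; real logs will need more (phone numbers, tokens, names).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact_log_line(line: str) -> str:
    """Replace e-mail addresses and IPv4 addresses with fixed placeholders."""
    line = EMAIL_RE.sub("[EMAIL]", line)
    line = IPV4_RE.sub("[IP]", line)
    return line


if __name__ == "__main__":
    print(redact_log_line("login ok user=anna@example.com ip=203.0.113.7"))
```

The IPv4 pattern also matches four-part version strings such as `1.2.3.4`, so expect some false positives. Redaction at the source complements a retention policy and does not replace one.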

Deletion Is a Search Engineering Problem

GDPR Art. 17 grants data subjects the right to erasure. Implementing erasure in a relational database is a DELETE statement. Implementing erasure in an OpenSearch index is a search engineering problem.

To delete a specific user's data from OpenSearch, you must:

Issue a Delete By Query request targeting every index that might contain data about the user
Hope that your application indexed user identifiers consistently enough that the query finds all relevant documents
Wait for segment merges to physically purge the deleted documents from disk: OpenSearch only marks documents as deleted at first, and the data remains in the Lucene segment files until those segments are merged
Handle the fact that partial matches (log records containing a user ID inside a JSON payload) may not be captured by your deletion query

For applications that index logs, the deletion problem becomes nearly unsolvable. A user's email address appears in thousands of log records across dozens of indices. Finding and deleting all of them without disrupting ongoing log ingestion requires careful engineering that most teams never implement at the time they build the logging pipeline.
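The erasure steps above can be sketched as two REST calls: a Delete By Query, followed by a force merge that expunges the documents marked as deleted. The Python below only builds the requests and does not send them. The field names `user_id` and `message`, and the `logs-*` index pattern, are assumptions about your mappings. Whether the phrase match finds an e-mail address inside free text depends on the analyzer configured for that field.

```python
import json


def build_erasure_requests(user_id: str, email: str, index_pattern: str) -> list[dict]:
    """Return the HTTP requests needed to erase one user's documents from matching indices."""
    query = {
        "query": {
            "bool": {
                "should": [
                    {"term": {"user_id": user_id}},           # structured identifier (assumed field)
                    {"match_phrase": {"message": email}},     # identifier buried in log text (assumed field)
                ],
                "minimum_should_match": 1,
            }
        }
    }
    return [
        # Marks matching documents as deleted; version conflicts do not abort the run.
        {"method": "POST", "path": f"/{index_pattern}/_delete_by_query?conflicts=proceed", "body": query},
        # Rewrites segments that contain deleted documents so the data leaves the disk.
        {"method": "POST", "path": f"/{index_pattern}/_forcemerge?only_expunge_deletes=true", "body": None},
    ]


if __name__ == "__main__":
    for request in build_erasure_requests("user-1842", "anna@example.com", "logs-*"):
        print(request["method"], request["path"])
        if request["body"] is not None:
            print(json.dumps(request["body"], indent=2))
```

Force merges are I/O-heavy, so they are usually scheduled outside peak ingestion. Snapshots taken before the deletion still contain the data and need their own retention handling.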

If your OpenSearch cluster is under US CLOUD Act jurisdiction, you now have a category of personal data you cannot reliably erase and cannot protect from US government access.

Purpose Limitation and the Analytics Accumulation Problem

GDPR Art. 5(1)(b) requires that personal data be collected for specified, explicit, and legitimate purposes and not further processed in a manner incompatible with those purposes.

OpenSearch is the natural destination for analytics data — because it makes that data easily queryable via OpenSearch Dashboards. Once data is in OpenSearch for one purpose (debugging, monitoring), it is trivially queried for other purposes (user behavior analysis, commercial profiling). The low friction of OpenSearch Dashboards makes purpose creep structurally likely.

A log sent to OpenSearch for debugging purposes ends up in a dashboard used for marketing analysis. A search query sent to OpenSearch for "what are users searching for" analytics ends up in a user-level behavior profile. This is purpose limitation failure in practice, and it happens because OpenSearch makes cross-purpose queries easy rather than hard.

OpenSearch Dashboards and Art. 22 Profiling Risk

OpenSearch Dashboards (a fork of Kibana) enables non-technical users to build visualizations and dashboards over indexed data. A product manager can create a dashboard showing individual user search histories, purchase patterns, and session behaviors without writing any code.

If those dashboards feed decisions about users (which users to target with promotions, which to flag for review, which to restrict), you may be approaching Art. 22 territory. Art. 22 restricts decisions based solely on automated processing, including profiling, that produce legal or similarly significant effects; such decisions require explicit consent, necessity for contract performance, or authorization by EU/Member State law. Even where a human makes the final call, the underlying profiling is still processing that needs a lawful basis. Most teams building OpenSearch Dashboards have not assessed this risk.

OpenSearch Serverless: Zero Infrastructure Visibility

Amazon OpenSearch Serverless removes the management of clusters entirely — you push documents in and queries return results. The underlying infrastructure is entirely managed by AWS with no customer visibility into data placement within the selected region, no control over which physical machines store which index shards, and no mechanism for customers to inspect infrastructure-level access logs.

This opacity is problematic for GDPR Art. 32 compliance, which requires "appropriate technical and organisational measures" to ensure data security. You cannot audit what you cannot see.

EU-Native OpenSearch Alternatives

The EU alternative landscape for OpenSearch covers two distinct use cases: application search (product search, full-text search, autocomplete) and log analytics / observability (log storage, search, dashboards). These are best served by different tools.

Meilisearch — EU-Native Application Search

Meilisearch is an open-source search engine from Meili SAS, a French company headquartered in Paris. It was designed from the ground up for fast, typo-tolerant application search.

Why Meilisearch is GDPR-native by design:

Built-in document deletion: A single API call enqueues the deletion of a document and its indexed representations, and once that task completes the document no longer appears in search results. There is no Lucene-style segment merge to wait for.
Tenant tokens: Meilisearch supports per-user search tokens that restrict search results to data belonging to a specific tenant — preventing cross-user data exposure at the API level.
Self-hostable: Run Meilisearch on Hetzner, OVHcloud, or any EU-sovereign infrastructure. No AWS dependency, no US jurisdiction.
Privacy-first search analytics: Meilisearch Cloud (hosted in Frankfurt) offers search analytics without user-level tracking by default.

Meilisearch is released under the MIT license. For applications that need product search, documentation search, or site-wide full-text search, Meilisearch replaces OpenSearch with lower operational complexity and better GDPR alignment.
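The tenant tokens mentioned above are JSON Web Tokens signed with a parent API key, carrying `searchRules`, `apiKeyUid`, and an optional `exp` claim. The sketch below builds one with the Python standard library only. The index name `orders`, the `tenant_id` filter, and the key values are hypothetical, and the official SDKs provide their own helpers that would normally be used instead.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_tenant_token(api_key: str, api_key_uid: str, search_rules: dict, ttl_seconds: int = 3600) -> str:
    """Create an HS256-signed tenant token restricting searches to the given rules."""
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {
        "searchRules": search_rules,              # per-index filters enforced server-side
        "apiKeyUid": api_key_uid,                 # uid of the parent key used for signing
        "exp": int(time.time()) + ttl_seconds,    # expiry as a Unix timestamp
    }
    signing_input = (
        _b64url(json.dumps(header, separators=(",", ":")).encode("utf-8"))
        + "."
        + _b64url(json.dumps(payload, separators=(",", ":")).encode("utf-8"))
    )
    signature = hmac.new(api_key.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


if __name__ == "__main__":
    # Hypothetical values: a search-only parent key and a per-tenant filter.
    token = make_tenant_token(
        api_key="parent-search-key",
        api_key_uid="00000000-0000-4000-8000-000000000000",
        search_rules={"orders": {"filter": "tenant_id = 42"}},
    )
    print(token)
```

The filter only takes effect if `tenant_id` is configured as a filterable attribute on the index. The token is generated on your backend, so the parent key never reaches the browser.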

Typesense — Self-Hosted EU Alternative

Typesense is an open-source typo-tolerant search engine optimized for low-latency instant search. It is released under the GPL v3 license and is self-hostable on any EU cloud provider.

Typesense is particularly strong for:

Instant search with sub-10ms response times
Multi-tenant SaaS applications with per-tenant data isolation
E-commerce product search with faceted filtering
Applications requiring GDPR-compliant search without external data transfer

Typesense Cloud runs on Google Cloud — teams requiring pure EU sovereignty should self-host Typesense on Hetzner or OVHcloud rather than using the managed service.

Quickwit — EU-Native Log Analytics

Quickwit is an open-source log management and search engine built by a French team (Quickwit SAS). It was designed specifically to replace Elasticsearch and OpenSearch for log analytics use cases at petabyte scale.

Quickwit's GDPR-relevant architectural differences from OpenSearch:

Decoupled storage: Quickwit separates indexing compute from storage. Index data lives in object storage (e.g., Scaleway Object Storage, OVHcloud Object Storage) that you control. Deleting a user's data requires updating the index metadata and removing the relevant objects — a deterministic operation, not an OpenSearch merge process.
Immutable index segments: Quickwit's index format is append-only and split-aware, making it easier to implement time-based retention policies that physically delete old data.
Sub-second query latency on terabyte-scale indices using columnar storage (similar to ClickHouse's approach).

Quickwit is a natural EU alternative for teams currently routing CloudWatch logs or Kinesis streams into OpenSearch for operational analytics. Pair Quickwit with Grafana for dashboards and you have a full EU-sovereign observability stack.

Self-Hosted OpenSearch on EU Infrastructure

OpenSearch itself is open-source (Apache 2.0 license). Running OpenSearch on EU-sovereign infrastructure (Hetzner Dedicated, OVHcloud Bare Metal, Scaleway Instances) removes the CLOUD Act jurisdiction problem entirely.

The trade-offs versus Amazon OpenSearch Service:

You manage cluster sizing, scaling, patching, and backups
No managed integrations with CloudWatch, Kinesis, or other AWS services
Higher operational overhead than a fully managed service
Lower cost at scale (hardware is cheaper than managed service markup)

For teams with existing OpenSearch expertise and operational capacity, self-hosted OpenSearch on Hetzner is the lowest-friction migration path.

Manticore Search — MySQL-Compatible EU Alternative

Manticore Search is an open-source search engine with a MySQL-compatible wire protocol. Teams that use OpenSearch via SQL-like queries or need to integrate with MySQL tooling can migrate to Manticore Search running on EU infrastructure.

Manticore Search supports full-text search, faceted search, and basic aggregations — suitable for application search use cases. It is not a drop-in replacement for OpenSearch's full analytics capabilities.

Migration Architecture: OpenSearch to EU-Sovereign Search

Migrating from Amazon OpenSearch Service involves two separate migration tracks depending on your use case.

Track 1: Application Search Migration (OpenSearch → Meilisearch or Typesense)

Application search typically indexes structured documents (products, articles, users) and serves low-latency search queries with typo tolerance and faceting.

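# Step 1: Export your OpenSearch index to NDJSON (one compact JSON document per line).
# The scroll API returns one page per request, so keep fetching until a page is empty.
# Add whatever authentication your domain requires (SigV4 or basic auth).
HOST="https://your-domain.eu-central-1.es.amazonaws.com"
PAGE=$(curl -s -X GET "$HOST/products/_search?scroll=1m&size=1000" \
  -H "Content-Type: application/json" \
  -d '{"query": {"match_all": {}}}')
: > products.ndjson
while [ "$(printf '%s' "$PAGE" | jq '.hits.hits | length')" -gt 0 ]; do
  printf '%s' "$PAGE" | jq -c '.hits.hits[]._source' >> products.ndjson
  SCROLL_ID=$(printf '%s' "$PAGE" | jq -r '._scroll_id')
  PAGE=$(curl -s -X POST "$HOST/_search/scroll" \
    -H "Content-Type: application/json" \
    -d "{\"scroll\": \"1m\", \"scroll_id\": \"$SCROLL_ID\"}")
done

# Step 2: Import into Meilisearch (each document needs a primary key field such as "id")
curl -X POST "http://localhost:7700/indexes/products/documents" \
  -H "Content-Type: application/x-ndjson" \
  --data-binary @products.ndjson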

Update your application to use the Meilisearch SDK instead of the OpenSearch SDK. The query model is simpler: Meilisearch handles typo tolerance, relevance ranking, and faceting without complex query DSL.
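As a rough illustration of how much simpler the query model is, the sketch below builds a Meilisearch search body (`q`, `filter`, `facets`, `limit`) from the same inputs that would otherwise go into an OpenSearch `bool` query. The helper name, the field `brand`, and the minimal quote escaping are assumptions made for this example, not part of either product's SDK.

```python
from typing import Optional


def to_opensearch_query(text: str, field: str, term_filters: dict[str, str]) -> dict:
    """The OpenSearch DSL shape that an application search typically sends."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {field: {"query": text, "fuzziness": "AUTO"}}}],
                "filter": [{"term": {name: value}} for name, value in sorted(term_filters.items())],
            }
        }
    }


def to_meilisearch_search(
    text: str,
    term_filters: dict[str, str],
    facets: Optional[list[str]] = None,
    limit: int = 20,
) -> dict:
    """Equivalent request body for POST /indexes/<index>/search in Meilisearch."""
    body: dict = {"q": text, "limit": limit}
    if term_filters:
        # Minimal escaping of double quotes; review before using with untrusted input.
        body["filter"] = " AND ".join(
            '{} = "{}"'.format(name, value.replace('"', '\\"'))
            for name, value in sorted(term_filters.items())
        )
    if facets:
        body["facets"] = facets
    return body


if __name__ == "__main__":
    print(to_opensearch_query("running shoes", "title", {"brand": "acme"}))
    print(to_meilisearch_search("running shoes", {"brand": "acme"}, facets=["brand"]))
```

Typo tolerance needs no `fuzziness` parameter on the Meilisearch side because it is applied by default. Fields used in `filter` or `facets` must be declared as filterable attributes in the index settings first.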

Track 2: Log Analytics Migration (OpenSearch → Quickwit + Grafana)

Log analytics migration requires replacing the ingestion pipeline, not just the storage layer.

Current AWS pattern:

Application → CloudWatch Logs → Kinesis Firehose → OpenSearch → Dashboards

EU-sovereign pattern:

Application → Vector/Fluent Bit → Quickwit → Grafana

Vector (maintained by Datadog, MPL-2.0 license, self-hostable) is the recommended log shipping agent. It replaces both the CloudWatch Logs agent and Kinesis Data Firehose:

# vector.toml — route application logs to Quickwit
[sources.app_logs]
type = "file"
include = ["/var/log/app/*.log"]

[transforms.parse_json]
type = "remap"
inputs = ["app_logs"]
source = '''
. = parse_json!(.message)
'''

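[sinks.quickwit]
type = "http"
method = "post"
inputs = ["parse_json"]
uri = "http://quickwit.internal:7280/api/v1/my-index/ingest"
# Quickwit's ingest endpoint expects newline-delimited JSON
encoding.codec = "json"
framing.method = "newline_delimited"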

Grafana connects to Quickwit via the Quickwit data source plugin, providing dashboards equivalent to OpenSearch Dashboards.

If you are currently running Amazon OpenSearch Service and have not performed a personal data audit, GDPR requires several immediate actions:

1. Data mapping (Art. 30 Records of Processing): Document what personal data your OpenSearch indices contain, under what legal basis it is processed, and what retention period applies. This requires inspecting your index mappings and understanding what data your ingestion pipeline sends.

2. Implement retention policies: OpenSearch Index State Management (ISM) can automatically delete indices older than a specified period. Configure ISM policies based on your documented retention periods — not on disk capacity.

3. Implement erasure capability: Before Art. 17 requests arrive, test whether you can actually delete a specific user's data from all relevant indices. If you cannot, this is a compliance gap requiring remediation.

4. Assess the CLOUD Act exposure: If your OpenSearch cluster contains special category data (health data, data about political opinions, biometric data) and is operated as a managed service by AWS, you have a structural compliance problem that ISM policies and encryption do not solve. Only moving to EU-sovereign infrastructure resolves it.

5. Review Dashboards access controls: OpenSearch Dashboards with Fine-Grained Access Control can prevent unauthorized users from querying personal data. Ensure dashboards that expose user-level data require appropriate authentication and are restricted to staff with a documented need to access that data.
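For the retention step in the list above, an ISM policy is a JSON document sent with `PUT _plugins/_ism/policies/<policy_id>`. The sketch below generates a minimal two-state policy that deletes indices once they exceed the documented retention period. The state names, the `logs-*` pattern, the priority, and the 30-day figure are placeholder assumptions to adapt to your own records of processing.

```python
import json


def build_retention_policy(index_pattern: str, retention_days: int) -> dict:
    """Build an ISM policy body that deletes matching indices after retention_days."""
    return {
        "policy": {
            "description": f"Delete {index_pattern} indices after {retention_days} days",
            "default_state": "active",
            "states": [
                {
                    "name": "active",
                    "actions": [],
                    "transitions": [
                        # Move to the delete state once the index is old enough.
                        {"state_name": "delete", "conditions": {"min_index_age": f"{retention_days}d"}}
                    ],
                },
                {"name": "delete", "actions": [{"delete": {}}], "transitions": []},
            ],
            # Attach the policy automatically to newly created indices matching the pattern.
            "ism_template": [{"index_patterns": [index_pattern], "priority": 100}],
        }
    }


if __name__ == "__main__":
    # Body for: PUT _plugins/_ism/policies/logs-retention  (policy ID is hypothetical)
    print(json.dumps(build_retention_policy("logs-*", 30), indent=2))
```

This only works as a deletion mechanism if indices are time-based (for example one index per day), because ISM removes whole indices rather than individual documents.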

Conclusion

AWS OpenSearch Service is the most underappreciated personal data risk in the AWS ecosystem. It is used for two purposes that both create GDPR exposure: application search (where users' queries are personal data) and log analytics (where logs contain personal data that was never intended to be searchable). Both use cases accumulate data under US CLOUD Act jurisdiction without deterministic deletion mechanisms.

For new applications in 2026, the EU-sovereign path is clear:

Application search: Meilisearch (EU company, self-hosted or Meilisearch Cloud Frankfurt) or Typesense self-hosted on Hetzner
Log analytics: Quickwit (French company, self-hosted) + Vector + Grafana on EU infrastructure

For teams migrating from Amazon OpenSearch Service, the migration complexity depends heavily on how deeply your application is coupled to the OpenSearch query DSL. Application search migrations to Meilisearch or Typesense are typically straightforward — the query model is simpler. Log analytics migrations require rearchitecting the ingestion pipeline but eliminate an entire category of GDPR compliance risk.

Part of the sota.io EU Compliance Series — covering every AWS service through the lens of GDPR and the CLOUD Act.

