2026-05-01

AWS Comprehend Medical EU Alternative 2026: Health NLP, Article 9 Exposure, and GDPR-Compliant Alternatives

Post #738 in the sota.io EU Compliance Series

AWS Comprehend Medical is Amazon's managed NLP service purpose-built for clinical text. It extracts medical entities — diagnoses, medications, dosages, procedures, anatomical regions, test results — from free-form text such as physician notes, discharge summaries, radiology reports, and patient questionnaires. It maps those entities to standardised ontologies (ICD-10-CM, RxNorm, SNOMED CT) and can detect Protected Health Information (PHI) for de-identification purposes. Teams building clinical decision support tools, healthcare analytics pipelines, insurance claims processing, or medical record digitisation are drawn to Comprehend Medical for its accuracy and its seamless integration with S3, Lambda, and the broader AWS ecosystem.

That clinical precision creates a GDPR exposure that is fundamentally different from general-purpose NLP services. When you send a free-form physician note to Comprehend Medical, you are not sending a business document or a customer email. You are sending health information, the type of data that GDPR Article 9 defines as a special category requiring explicit consent or another specific Article 9(2) condition that is considerably harder to establish than the standard Article 6 grounds. The output Comprehend Medical returns is more sensitive than the input: it converts ambiguous clinical prose into structured, machine-readable diagnoses, drug names, and conditions that can be aggregated, cross-referenced, and used for profiling in ways the original document could not.

AWS Comprehend Medical is operated by Amazon Web Services, Inc., a Delaware corporation headquartered in Seattle, Washington. The CLOUD Act (18 U.S.C. § 2713) requires US providers to disclose data in their possession, custody, or control when served with valid US legal process, regardless of where that data is stored. Running Comprehend Medical in an EU region such as eu-west-1 (Ireland) changes where the computation occurs. It does not change the legal framework under which Amazon can be compelled to produce outputs, logs, and model inference records.

This guide covers six GDPR exposure points that healthcare developers and digital health teams must understand before routing clinical text through the Comprehend Medical API.

What AWS Comprehend Medical Actually Does

At its core, Comprehend Medical exposes two primary API surfaces. DetectEntitiesV2 takes a block of clinical text and returns a list of entities, each with a category (MEDICATION, MEDICAL_CONDITION, TEST_TREATMENT_PROCEDURE, ANATOMY, etc.), a more specific type (such as DX_NAME or TEST_NAME), a confidence score, and optional trait annotations (such as NEGATION or DIAGNOSIS). DetectPHI scans text for Protected Health Information fields (names, dates, phone numbers, addresses, medical record numbers) and returns their positions for redaction.
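To make the response shape concrete, here is a minimal sketch of flattening a DetectEntitiesV2 result. The `summarise_entities` helper and the `FakeClient` stand-in are hypothetical names introduced for illustration, and the sample response is hand-written rather than real service output. The client is injected so the sketch runs offline; with real credentials you would pass a boto3 `comprehendmedical` client instead.

```python
from typing import Any, Protocol


class EntityDetector(Protocol):
    """Anything exposing boto3's detect_entities_v2(Text=...) call shape."""

    def detect_entities_v2(self, *, Text: str) -> dict[str, Any]: ...


def summarise_entities(client: EntityDetector, note: str) -> list[dict[str, Any]]:
    """Send one clinical note for entity detection and flatten the response."""
    response = client.detect_entities_v2(Text=note)
    return [
        {
            "text": entity["Text"],
            "category": entity["Category"],
            "type": entity["Type"],
            "score": entity["Score"],
            "traits": [trait["Name"] for trait in entity.get("Traits", [])],
            "span": (entity["BeginOffset"], entity["EndOffset"]),
        }
        for entity in response.get("Entities", [])
    ]


class FakeClient:
    """Offline stand-in (hypothetical) returning a hand-written response."""

    def detect_entities_v2(self, *, Text: str) -> dict[str, Any]:
        start = Text.index("metformin")
        return {
            "Entities": [
                {
                    "Id": 0,
                    "Text": "metformin",
                    "Category": "MEDICATION",
                    "Type": "GENERIC_NAME",
                    "Score": 0.99,  # invented value for illustration
                    "BeginOffset": start,
                    "EndOffset": start + len("metformin"),
                    "Traits": [],
                }
            ]
        }


if __name__ == "__main__":
    # With real credentials the client would be created roughly like this:
    #   import boto3
    #   client = boto3.client("comprehendmedical", region_name="eu-west-1")
    print(summarise_entities(FakeClient(), "Patient takes metformin 500 mg daily."))
```

Note that every field in the flattened row (text, category, score, character span) is itself health data about the patient once it leaves this function.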

Beyond entity detection, Comprehend Medical provides InferICD10CM and InferRxNorm, which map detected diagnoses to ICD-10-CM billing codes and medications to RxNorm identifiers respectively. InferSNOMEDCT provides mappings to SNOMED CT clinical terminology. These inference endpoints convert clinical prose into the structured ontology codes used in healthcare data exchange — transforming a note that says "patient presented with crushing chest pain, diaphoresis, and ST elevation" into a structured record containing ICD-10 code I21.9 (Acute myocardial infarction, unspecified).
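A rough sketch of what consuming an InferICD10CM response looks like, assuming the documented shape in which each entity carries a list of `ICD10CMConcepts` with `Code`, `Description`, and `Score`. The `top_icd10_codes` helper and `CodedMention` type are hypothetical, and the sample response and its scores are invented for illustration.

```python
from typing import Any, NamedTuple


class CodedMention(NamedTuple):
    source_text: str
    code: str
    description: str
    score: float


def top_icd10_codes(response: dict[str, Any]) -> list[CodedMention]:
    """Pick the highest-scoring ICD-10-CM concept for each detected entity."""
    mentions = []
    for entity in response.get("Entities", []):
        concepts = entity.get("ICD10CMConcepts", [])
        if not concepts:
            continue
        best = max(concepts, key=lambda concept: concept["Score"])
        mentions.append(
            CodedMention(entity["Text"], best["Code"], best["Description"], best["Score"])
        )
    return mentions


# Hand-written sample in the response shape; the scores are invented.
SAMPLE_RESPONSE = {
    "Entities": [
        {
            "Text": "crushing chest pain",
            "Category": "MEDICAL_CONDITION",
            "Type": "DX_NAME",
            "Score": 0.97,
            "ICD10CMConcepts": [
                {
                    "Code": "I21.9",
                    "Description": "Acute myocardial infarction, unspecified",
                    "Score": 0.71,
                },
                {"Code": "R07.9", "Description": "Chest pain, unspecified", "Score": 0.64},
            ],
        }
    ]
}

if __name__ == "__main__":
    for mention in top_icd10_codes(SAMPLE_RESPONSE):
        print(mention)
```

The point of the sketch is how little code separates free-form prose from a row that can be joined against any other dataset keyed on ICD-10 codes.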

The service processes text synchronously for documents under 20,000 characters and asynchronously via S3 for batch jobs covering entire medical record archives. Both paths send clinical text to AWS compute infrastructure and return structured entity records that remain in AWS logging systems after the API call completes.

Risk 1: ICD-10 and RxNorm Outputs Are Structured Article 9 Data

The most important thing to understand about Comprehend Medical is that its outputs are health data under GDPR Article 9, not merely its inputs. A physician note is already health data, but it is often ambiguous, contextual, and in need of clinical interpretation. The ICD-10-CM codes returned by InferICD10CM are unambiguous: I21.9 is a heart attack diagnosis, F32.1 is a moderate single episode of major depressive disorder, Z87.891 is a personal history of nicotine dependence.

This matters for two reasons. First, structured diagnostic codes are far more amenable to automated processing, aggregation, and profiling than free-form clinical prose. Second, they are machine-readable in a way that facilitates cross-referencing with other datasets — insurance records, employment data, prescription histories — in ways that the original note did not enable. Article 22 of GDPR restricts decisions based solely on automated processing that significantly affect data subjects. Running insurance claims through InferICD10CM and using the output to flag high-risk patients for increased premiums or coverage denials is likely to trigger Article 22 in addition to Article 9.

The lawful basis for processing special category data under Article 9(2) must be established before submitting clinical text to Comprehend Medical. For most healthcare applications, this requires explicit consent under Article 9(2)(a) or processing necessary for the provision of health care under Article 9(2)(h). Neither basis is automatically satisfied by the fact that a patient interacted with a healthcare provider.

Risk 2: Confidence Scores Create Probabilistic Health Records

Comprehend Medical returns a confidence score (0.0–1.0) for every detected entity. A note that mentions "possible MS" or "rule out lupus" may return "MS" or "lupus" as MEDICAL_CONDITION entities with confidence scores of 0.65 or 0.70: below the threshold a clinician would use for a confirmed diagnosis, but high enough for an analytics pipeline to incorporate into a patient risk profile.

Probabilistic health records are a specific GDPR concern. The EDPB's guidance on health data processing emphasises that the likely inference of a health condition constitutes health data under Article 4(15), even where no confirmed diagnosis exists. A patient who presented with neurological symptoms and was evaluated for MS but not diagnosed still has a probabilistic MS entity in their Comprehend Medical output — and that record, once created, follows the same Article 9 obligations as a confirmed diagnosis.

If your pipeline stores Comprehend Medical outputs without distinguishing confirmed diagnoses from probabilistic or differential diagnoses — or if your downstream systems process them identically — you may be maintaining health records that assert conditions patients do not have, with no mechanism for the data subject to exercise their Article 16 right of rectification.
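One way to keep that distinction explicit is to store an assertion status next to every entity rather than the raw score alone. The sketch below assumes entity dicts with `Text`, `Score`, and `Traits` keys; the `classify_entity` helper, the status labels, and the 0.85 threshold are all hypothetical policy choices, not vendor recommendations.

```python
from dataclasses import dataclass
from typing import Any

# Assumed policy value: the controller must choose and document its own threshold.
CONFIRMED_THRESHOLD = 0.85


@dataclass(frozen=True)
class HealthAssertion:
    text: str
    score: float
    status: str  # "confirmed", "uncertain", or "negated"


def classify_entity(
    entity: dict[str, Any], threshold: float = CONFIRMED_THRESHOLD
) -> HealthAssertion:
    """Label one entity dict with an explicit assertion status."""
    traits = {trait["Name"] for trait in entity.get("Traits", [])}
    if "NEGATION" in traits:
        status = "negated"
    elif "DIAGNOSIS" in traits and entity["Score"] >= threshold:
        status = "confirmed"
    else:
        # Anything else is kept apart from confirmed diagnoses downstream.
        status = "uncertain"
    return HealthAssertion(entity["Text"], entity["Score"], status)


if __name__ == "__main__":
    # Hand-written examples; scores are invented for illustration.
    print(classify_entity({"Text": "MS", "Score": 0.65, "Traits": []}))
    print(
        classify_entity(
            {
                "Text": "type 2 diabetes",
                "Score": 0.97,
                "Traits": [{"Name": "DIAGNOSIS", "Score": 0.93}],
            }
        )
    )
```

Persisting the status field gives downstream systems something to filter on and gives a rectification request a concrete record to correct.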

Risk 3: HIPAA De-identification Is Not GDPR Anonymisation

Comprehend Medical's DetectPHI endpoint is designed to identify Protected Health Information fields for de-identification under HIPAA, a US regulatory framework. The HIPAA Safe Harbor de-identification standard and the GDPR anonymisation standard are not equivalent.

HIPAA Safe Harbor requires removal of 18 specified identifier types. GDPR anonymisation requires that data be processed in a manner that makes re-identification of the data subject "impossible" in light of all means "reasonably likely to be used." The Article 29 Working Party's Opinion 05/2014 on anonymisation techniques, issued by the EDPB's predecessor, sets a substantially higher bar: even if all direct identifiers are removed, data remains personal if re-identification is reasonably possible through combination with other available datasets.

Clinical text processed through DetectPHI and redacted at the HIPAA level retains quasi-identifiers — rare diagnoses, unusual procedure combinations, specific anatomical references — that enable re-identification at scale. Running Comprehend Medical's PHI detector and treating the output as GDPR-anonymised data is a compliance error that will not withstand regulatory scrutiny.
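The quasi-identifier problem can be illustrated with a simple group-size check in the style of k-anonymity. This is a swapped-in heuristic, not the GDPR legal test, and passing it does not establish anonymisation. The `small_groups` helper, the field names, and the records are all hypothetical.

```python
from collections import Counter
from typing import Iterable, Sequence


def small_groups(
    records: Iterable[dict], quasi_identifiers: Sequence[str], k: int = 5
) -> dict[tuple, int]:
    """Return quasi-identifier combinations shared by fewer than k records."""
    counts = Counter(
        tuple(record[field] for field in quasi_identifiers) for record in records
    )
    return {combo: size for combo, size in counts.items() if size < k}


# Invented records with all direct identifiers already removed.
RECORDS = [
    {"age_band": "40-49", "region": "Bavaria", "code": "E11.9"},
    {"age_band": "40-49", "region": "Bavaria", "code": "E11.9"},
    {"age_band": "40-49", "region": "Bavaria", "code": "E11.9"},
    {"age_band": "40-49", "region": "Bavaria", "code": "E75.22"},  # rare diagnosis
]

if __name__ == "__main__":
    print(small_groups(RECORDS, ["age_band", "region", "code"], k=2))
```

In this toy dataset the single record with the rare diagnosis code stands alone in its group, which is exactly the kind of record that survives identifier redaction yet remains linkable to one person.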

Risk 4: Batch Processing Archives Creates Long-Duration CLOUD Act Exposure

The asynchronous batch endpoint (StartEntitiesDetectionV2Job, StartPHIDetectionJob) is designed for processing entire medical record archives stored in S3. Healthcare teams use it to digitise legacy records, process historical discharge summaries, or run bulk ICD-10 coding passes across years of clinical documentation.

Each batch job creates a prolonged period of CLOUD Act exposure. The clinical text lands in S3, triggers the Comprehend Medical job, generates intermediate processing logs, and produces output files — all of which exist in AWS infrastructure under US jurisdiction for the duration of the job. A batch job processing 100,000 discharge summaries may run for hours, creating a window during which the entire archive is simultaneously present in AWS compute and storage systems.

For a healthcare provider processing years of patient records, this means the entire clinical history of potentially tens of thousands of patients is transiently accessible under US legal authority. The Article 44 prohibition on transfers to third countries applies to this scenario: sending EU patient health data to US-operated infrastructure for processing, even temporarily, constitutes a transfer requiring adequate safeguards under Chapter V.

Risk 5: Cross-Service Data Flows Expand Jurisdiction

Comprehend Medical integrates directly with other AWS services. Common architectures pipe clinical text from S3 through Lambda to Comprehend Medical, store outputs in DynamoDB or Redshift, and query them via Athena. Each service in this chain is independently subject to the CLOUD Act, and each creates its own logging trail.

The practical consequence is that Article 30 Record of Processing Activities (RoPA) for a Comprehend Medical pipeline must document not just Comprehend Medical itself but every downstream service that touches the outputs. A DynamoDB table storing ICD-10 codes extracted from patient notes is a special-category data store requiring the same Article 9 documentation as the source records. A Redshift analytics cluster running queries across those codes is processing health data at scale under US jurisdiction. An Athena query log contains the query text, which may itself reveal the diagnostic categories being investigated.

Teams that document "we use Comprehend Medical for NLP" in their RoPA while omitting the downstream S3/DynamoDB/Redshift/Athena chain are maintaining incomplete records that will be inadequate in a GDPR audit.
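A completeness check of this kind is easy to automate. The sketch below assumes you keep a list of services in data-flow order and a set of services that already have a RoPA entry; the `undocumented_services` helper and both inventories are hypothetical.

```python
def undocumented_services(pipeline: list[str], ropa_entries: set[str]) -> list[str]:
    """Return pipeline services that have no matching RoPA entry, in flow order."""
    return [service for service in pipeline if service not in ropa_entries]


# Hypothetical inventory mirroring the architecture described above.
PIPELINE = ["S3", "Lambda", "Comprehend Medical", "DynamoDB", "Redshift", "Athena"]
ROPA_ENTRIES = {"Comprehend Medical"}

if __name__ == "__main__":
    print(undocumented_services(PIPELINE, ROPA_ENTRIES))
```

Running a check like this in CI against the deployed architecture is one possible way to stop the RoPA drifting away from what the pipeline actually does.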

Risk 6: Model Training and Customisation Creates Persistent Exposure

Comprehend Medical pipelines can be customised through the broader Amazon Comprehend custom classification and custom entity recognition capabilities. Healthcare teams sometimes train custom models on proprietary clinical datasets (specialty-specific terminology, institution-specific coding conventions, rare disease nomenclature) by submitting annotated training corpora to AWS.

Training data submitted to AWS model customisation endpoints becomes part of AWS's model training infrastructure. AWS's data processing terms permit use of submitted data for service improvement purposes unless explicitly opted out, and the opt-out mechanisms have varied across service versions. Clinical text used to train a custom model enters a longer-duration processing relationship than a single inference call, one that may persist through multiple model training iterations, evaluation runs, and model snapshots.

Under Article 17 GDPR, data subjects have the right to erasure. For a healthcare provider that has submitted clinical training data to AWS model customisation endpoints, erasure requests create a near-impossible compliance obligation: AWS does not provide mechanisms to identify and remove specific patient records from trained model weights. The Article 17 right cannot be fulfilled for data encoded into model parameters.

EU-Native Alternatives for Medical NLP

scispaCy

scispaCy is an open-source library built on spaCy, providing pre-trained models specifically designed for biomedical and clinical NLP. The en_core_sci_lg model provides general biomedical entity recognition; its models draw on corpora such as MedMentions and BC5CDR, and additional components handle clinical abbreviation detection and entity linking to UMLS. scispaCy runs entirely on-premises or on EU-hosted infrastructure: it is a Python library with no external API calls during inference.

For ICD-10 mapping, scispaCy's entity linking component resolves entities to UMLS concepts using a locally stored knowledge base; those concepts can then be mapped to ICD-10 codes through a UMLS crosswalk you host yourself. The entire pipeline (text processing, entity extraction, ontology linking) operates within the EU jurisdiction you choose. This eliminates CLOUD Act exposure by design, not by contractual assurance. scispaCy is maintained by the Allen Institute for AI and is used in academic medical centres and digital health companies across Europe.
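A minimal sketch of local entity extraction with a scispaCy model, assuming the `scispacy` package and the `en_core_sci_lg` model are already installed on the machine. The `extract_entities` and `load_scispacy` helpers are hypothetical names; the only library calls used are `spacy.load` and the standard `Doc.ents` span attributes.

```python
from typing import Any, Callable


def extract_entities(nlp: Callable[[str], Any], text: str) -> list[dict[str, Any]]:
    """Run a loaded spaCy pipeline locally and return entity spans as plain dicts."""
    doc = nlp(text)
    return [
        {
            "text": ent.text,
            "label": ent.label_,
            "start": ent.start_char,
            "end": ent.end_char,
        }
        for ent in doc.ents
    ]


def load_scispacy(model_name: str = "en_core_sci_lg") -> Callable[[str], Any]:
    """Load a scispaCy model that has already been installed locally."""
    import spacy  # imported lazily so extract_entities has no hard dependency

    return spacy.load(model_name)


# Typical use, once the model is installed:
#   nlp = load_scispacy()
#   extract_entities(nlp, "Patient takes metformin 500 mg daily.")
```

Because the model is loaded from local disk and inference happens in-process, the clinical text never crosses a network boundary in this sketch.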

John Snow Labs Spark NLP for Healthcare

Spark NLP for Healthcare is a commercial NLP library providing production-grade clinical NLP capabilities that, according to its vendor, match or exceed Comprehend Medical's accuracy on most medical entity recognition benchmarks. It provides pre-trained models for NER, assertion status detection (confirmed/possible/negated), relation extraction between clinical entities, de-identification, ICD-10-CM coding, RxNorm normalisation, and SNOMED CT mapping.

Spark NLP for Healthcare can be deployed on any infrastructure: on-premises Spark clusters, EU-hosted Kubernetes, or EU data-sovereign cloud providers. It processes data where you deploy it, with no external API calls during inference. John Snow Labs, headquartered in the United States but with extensive EU customer deployments, provides models that are trained on publicly available clinical corpora (MIMIC, i2b2, MTSamples) and can be fine-tuned on proprietary EU clinical datasets without submitting data to a US cloud provider.

BioBERT and ClinicalBERT Variants

The broader ecosystem of BERT-based clinical language models provides a foundation for building custom medical NLP pipelines without cloud API dependencies. BioBERT (pre-trained on PubMed abstracts and PMC full-text articles) and ClinicalBERT (pre-trained on MIMIC-III clinical notes) are available through Hugging Face and can be fine-tuned for specific clinical NLP tasks using EU-hosted GPU compute.

For teams building custom medical NLP systems, this approach provides maximum flexibility and data sovereignty. Fine-tuning happens on infrastructure you control. Model weights remain in your custody. The inference pipeline runs within your EU security perimeter. The trade-off is engineering investment: a production-grade clinical NLP system built on BioBERT requires more effort than a Comprehend Medical API integration. For regulated healthcare applications where data sovereignty is non-negotiable, this trade-off is typically justified.

medspaCy and Rule-Based Augmentation

For specific clinical NLP tasks where precision matters most (detecting negation, identifying temporal references in clinical notes, extracting medication dosages), medspaCy provides a rule-based NLP framework built on spaCy. Rule-based systems have no dependency on cloud APIs and their inference logic is fully auditable, which matters for healthcare applications subject to clinical validation requirements.

medspaCy integrates with scispaCy for named entity recognition and adds context detection, section classification, and temporality analysis. For clinical de-identification tasks, the mSpaCy package and the MIMIC de-identification system provide HIPAA-compliant and GDPR-compatible de-identification without routing text to external APIs. These tools can be combined to build a complete clinical NLP pipeline that operates entirely within EU infrastructure.
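To show why rule-based context detection is auditable, here is a deliberately simplified NegEx-style check in plain Python. It is not medspaCy's ConText implementation: the trigger list, the five-token window, and the `is_negated` helper are all assumptions made for illustration, and real rule sets are far larger.

```python
import re

# Small illustrative trigger list; production rule sets are much more extensive.
NEGATION_TRIGGERS = ("no evidence of", "denies", "negative for", "without", "no")
SCOPE_TOKENS = 5  # assumed look-back window, in tokens


def is_negated(sentence: str, target: str) -> bool:
    """Return True if a negation trigger appears in the tokens just before target."""
    lowered = sentence.lower()
    position = lowered.find(target.lower())
    if position == -1:
        raise ValueError(f"{target!r} not found in sentence")
    # Keep only the last few alphabetic tokens that precede the target mention.
    preceding = re.findall(r"[a-z]+", lowered[:position])
    window = " ".join(preceding[-SCOPE_TOKENS:])
    return any(
        re.search(rf"\b{re.escape(trigger)}\b", window)
        for trigger in NEGATION_TRIGGERS
    )


if __name__ == "__main__":
    print(is_negated("Patient denies chest pain.", "chest pain"))
    print(is_negated("Patient reports chest pain.", "chest pain"))
```

Every decision this function makes can be traced to a named trigger and a window size, which is the property that makes rule-based components easier to validate clinically than a learned model.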

Flair with Clinical Models

Flair is an open-source NLP framework that provides strong sequence labelling capabilities for clinical entity recognition. The Flair ecosystem includes pre-trained models for biomedical NER (trained on BC2GM, BC4CHEMD, BC5CDR, JNLPBA corpora) and supports fine-tuning on custom clinical datasets. Flair models run on standard PyTorch and can be deployed on EU-hosted GPU or CPU infrastructure.

For European digital health companies that need to process clinical text in multiple European languages — German medical notes, French discharge summaries, Dutch radiology reports — Flair's multilingual capabilities and the availability of multilingual clinical NER models (trained on corpora from European academic medical centres) make it particularly well-suited to EU deployment contexts.

Deployment Pattern: EU-Sovereign Clinical NLP Stack

A GDPR-compliant clinical NLP pipeline combining these tools might operate as follows:

Section detection: Clinical text arrives from the source system (EHR, PACS, document management) and is pre-processed using medspaCy's section classifier to identify the relevant clinical sections.

Entity recognition: scispaCy or Spark NLP for Healthcare performs named entity recognition, identifying medications, diagnoses, procedures, and anatomical references.

Entity linking: Detected entities are mapped to UMLS, ICD-10-CM, RxNorm, or SNOMED CT using locally hosted concept databases.

Assertion status: Assertion status detection (using Spark NLP's assertion models or medspaCy's ConText algorithm) identifies negated, possible, and historical mentions, preventing probabilistic entities from being treated as confirmed diagnoses.

De-identification: De-identification using the mSpaCy or Spark NLP de-identification models processes the original text for redaction pipelines.

The entire stack runs on EU-hosted infrastructure — a Kubernetes cluster in a Frankfurt data centre, an on-premises GPU server in a hospital data room, or a managed compute cluster at a German or French cloud provider subject to EU jurisdiction only. No clinical text leaves the EU regulatory perimeter at any stage of processing.
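The stage ordering described above can be sketched as a chain of local callables. Everything here is hypothetical scaffolding: `ClinicalDoc`, `run_pipeline`, and the three stub stages stand in for real medspaCy, scispaCy, or Spark NLP components and do no actual NLP.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ClinicalDoc:
    text: str
    sections: dict[str, str] = field(default_factory=dict)
    entities: list[dict] = field(default_factory=list)


Stage = Callable[[ClinicalDoc], ClinicalDoc]


def run_pipeline(doc: ClinicalDoc, stages: list[Stage]) -> ClinicalDoc:
    """Apply each locally hosted stage in order and return the enriched document."""
    for stage in stages:
        doc = stage(doc)
    return doc


# Stub stages (assumptions) standing in for real section, NER and assertion models.
def split_sections(doc: ClinicalDoc) -> ClinicalDoc:
    doc.sections = {"assessment": doc.text}
    return doc


def tag_entities(doc: ClinicalDoc) -> ClinicalDoc:
    doc.entities = [{"text": "lupus", "status": None}] if "lupus" in doc.text else []
    return doc


def mark_assertions(doc: ClinicalDoc) -> ClinicalDoc:
    uncertain = "rule out" in doc.text.lower()
    for entity in doc.entities:
        entity["status"] = "possible" if uncertain else "confirmed"
    return doc


if __name__ == "__main__":
    stages = [split_sections, tag_entities, mark_assertions]
    print(run_pipeline(ClinicalDoc("Rule out lupus."), stages).entities)
```

Keeping each stage behind the same simple signature makes it straightforward to swap one library for another and to audit that no stage reaches outside the deployment perimeter.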

Before deploying any clinical NLP pipeline, EU healthcare teams should verify each of the following:

Lawful basis under Article 9(2): Has a specific Article 9(2) lawful basis been identified and documented for each processing activity? Explicit consent, healthcare provision, and public health research each have different requirements. Can you demonstrate the lawful basis for every category of clinical text processed?

Article 30 RoPA completeness: Does your Record of Processing Activities document the complete data flow — from source clinical text through NLP processing to downstream storage and analytics? Are all intermediate processing steps (batch jobs, temporary storage, model inference logs) included?

Jurisdiction analysis: Where does NLP processing physically occur? Who operates that infrastructure? Are there contractual mechanisms (Standard Contractual Clauses, Binding Corporate Rules) in place for any cross-border transfers? Is the processing covered by an adequacy decision?

Probabilistic health data policy: How does your system distinguish confirmed diagnoses from probabilistic or differential entities detected by NLP? Are confidence thresholds documented? Can data subjects exercise Article 16 rectification rights for probabilistic health records?

Anonymisation standard: If clinical outputs are de-identified and processed as non-personal data, has the de-identification been validated against GDPR's re-identification standard — not merely HIPAA Safe Harbor? Are quasi-identifier risks documented?

Erasure obligation coverage: For any NLP system trained on clinical data, can you fulfil Article 17 erasure requests for individual patients whose data was used in training? If not, the training data approach must be reconsidered before deployment.

Conclusion

AWS Comprehend Medical is a capable clinical NLP service. Its GDPR problem is not primarily one of security controls or contractual protections; it is one of jurisdiction and output sensitivity. Clinical text processed by a US-operated service, even in an EU region, remains subject to US federal authority. And the ICD-10 codes, RxNorm identifiers, and SNOMED CT mappings that Comprehend Medical returns are structured special-category health data that carries the full weight of GDPR Article 9, more so than the clinical prose from which they were extracted.

EU digital health teams building clinical NLP pipelines have mature, production-grade alternatives: scispaCy for fast, open-source biomedical NLP; Spark NLP for Healthcare for commercial accuracy with full data sovereignty; BioBERT and ClinicalBERT variants for custom models on EU infrastructure; medspaCy and Flair for rule-based and sequence labelling tasks. None of these alternatives require sending clinical text to US cloud infrastructure.

The choice is not between capability and compliance. EU-native clinical NLP tools have reached the accuracy levels required for production healthcare applications. The choice is between a convenience-first approach that creates structural GDPR Article 9 exposure and an architecture that builds data sovereignty in from the start.

For healthcare applications, there is no ambiguity about which approach is correct.

Running EU-sovereign workloads alongside NLP processing pipelines? sota.io is a European PaaS that keeps your infrastructure — and your clinical data — in EU jurisdiction.
