EU AI Act Art.59 Personal Data Processing for AI Development in Regulatory Sandboxes — Developer Guide (2026)
EU AI Act Article 59 solves one of the hardest practical problems in AI development within the EU: where does the training data come from? High-quality AI systems require large, representative datasets — but the personal data that makes datasets representative is typically collected for purposes other than AI development. GDPR's purpose limitation principle (Art.5(1)(b)) normally prohibits using that data for new, incompatible purposes.
Art.59 creates a targeted exception: within AI regulatory sandboxes established under Art.57, providers may further process personal data originally collected for other purposes — provided the processing satisfies six cumulative conditions designed to protect data subjects while enabling legitimate AI innovation.
Art.59 is not a blanket GDPR derogation. It does not override GDPR's lawful basis requirements for the original collection, does not remove data subject rights, and does not permit processing outside the sandbox boundary. What it does is resolve the purpose compatibility question for a carefully bounded category of AI development activity — sandbox testing and validation — where the public interest in reliable, representative AI systems justifies the further use of existing data.
This guide covers Art.59(1)–(4) in full, the GDPR Art.6(4) compatibility framework, each of the six conditions, the sandbox-boundary constraint, special category data interaction, CLOUD Act jurisdiction risk for sandbox training data, and Python implementation for sandbox data processing governance.
Art.59 sits in Chapter VI (Measures in Support of Innovation) of the EU AI Act (Regulation (EU) 2024/1689) and applies from 2 August 2026, the Act's general application date, which is also the deadline by which Member States must have at least one AI regulatory sandbox operational at national level (Art.57(1)).
Art.59 in the Chapter VI Innovation Framework
| Article | Mechanism | Approach |
|---|---|---|
| Art.57 | AI Regulatory Sandbox | Controlled environment, sandbox plan, authority partnership |
| Art.58 | Detailed Sandbox Arrangements | Eligibility, application, and operating modalities for sandboxes |
| Art.59 | Personal Data Processing in Sandbox | Further-processing exception for sandbox training/testing data |
| Art.60–61 | Real-World Testing Outside Sandbox | Testing-plan approval (tacit after 30 days), informed consent |
| Art.62–63 | Further Innovation Support | SME and start-up measures, derogations for specific operators |
Art.59 is architecturally dependent on Art.57: it operates only within a formally established AI regulatory sandbox. Providers using the Art.60 real-world-testing pathway outside a sandbox do not gain Art.59's further-processing exception — their data processing remains governed entirely by standard GDPR rules and the original consent or lawful basis.
The Core Problem Art.59 Solves
Purpose Limitation Under GDPR
GDPR Art.5(1)(b) establishes the purpose limitation principle: personal data shall be collected for specified, explicit, and legitimate purposes and not further processed in a manner incompatible with those purposes. This principle is fundamental to GDPR and not subject to general exemption.
GDPR Art.6(4) provides a compatibility assessment framework for determining whether further processing is compatible with the original purpose. The Art.6(4) factors are:
- Link between purposes: Is there a link between the original and new purpose?
- Context of collection: What was the context in which data was collected, and reasonable expectations of data subjects?
- Nature of data: Is special category data (Art.9) or criminal conviction data (Art.10) involved?
- Consequences of further processing: What are the possible consequences for data subjects?
- Safeguards in place: Are there appropriate safeguards (encryption, pseudonymisation)?
If further processing is compatible under Art.6(4), no new lawful basis is needed. If incompatible, a new lawful basis is required. Art.59 provides that new lawful basis for sandbox AI development.
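A minimal sketch of this two-step decision, using illustrative names of this guide's own (not terms from the Act):

```python
def determine_processing_basis(compatible_under_art_6_4: bool,
                               in_art_57_sandbox: bool,
                               all_art_59_conditions_met: bool) -> str:
    """Decision flow for further processing of personal data.

    If Art.6(4) compatibility holds, the original lawful basis carries over.
    Otherwise Art.59 can supply a basis, but only inside an Art.57 sandbox
    and only when every Art.59(1) condition is satisfied.
    """
    if compatible_under_art_6_4:
        return "proceed_under_original_basis"
    if in_art_57_sandbox and all_art_59_conditions_met:
        return "proceed_under_art_59_exception"
    return "further_processing_not_permitted"
```

The third branch is the default: outside a sandbox, incompatible further processing needs a different GDPR solution entirely.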
The AI Development Data Problem
Without Art.59, a provider wanting to train a high-risk AI system on hospital patient data (originally collected for treatment) would face:
- GDPR Art.5(1)(b) purpose limitation: AI training is not medical treatment
- GDPR Art.6(4) incompatibility: likely found incompatible given sensitivity
- GDPR Art.9 special category: health data requires Art.9(2) basis
- Practical result: either the AI cannot be trained on real clinical data, or a new consent campaign must be run
Art.59 addresses this impasse for the sandbox context, recognising that high-quality AI systems that serve public interests require representative real-world data — and that the controlled sandbox environment provides sufficient safeguards to justify carefully bounded further processing.
Art.59(1): The Permission and Its Conditions
The Grant
Art.59(1) states that personal data lawfully collected for other purposes may be processed in the AI regulatory sandbox for the purpose of developing, training, testing, and validating AI systems — provided the following conditions are cumulatively satisfied.
The phrase "lawfully collected for other purposes" is important: Art.59 does not create a basis for the original collection. Data must have been collected lawfully (under Art.6(1) or Art.9(2) as applicable). Art.59 only addresses what can be done with that data once it is in the provider's possession.
Condition 1: AI Systems in the Public Interest
Art.59(1)(a): The AI system being developed must be in the public interest within a field listed in Art.57(1) or otherwise justified under the sandbox authorisation.
What counts as public interest for Art.59:
- Healthcare AI: disease detection, treatment recommendation, clinical trial optimisation
- Public safety AI: fraud detection, border management, critical infrastructure protection
- Environmental AI: climate modelling, energy grid optimisation, pollution monitoring
- Legal/administrative AI: court efficiency tools, benefits assessment, public service delivery
- Education AI: learning outcome prediction, accessibility tools for public institutions
What does NOT qualify:
- Pure commercial AI with no public benefit dimension
- AI developed to serve competitive private interests without broader societal benefit
- AI that serves public interest only incidentally — the public interest purpose must be primary
The public interest requirement creates a meaningful filter. A healthcare startup developing a patient triage AI qualifies; a fintech developing a private credit scoring model for competitive advantage likely does not.
Condition 2: Processing Necessary and Proportionate
Art.59(1)(b): The further processing must be necessary and proportionate to the AI development goal. This mirrors the proportionality principle applied throughout GDPR and the EU AI Act.
Necessity assessment:
- Would anonymised or synthetic data be sufficient? If yes, further processing of personal data may fail necessity
- Is the full dataset required, or would a representative sample suffice?
- Can the AI objective be achieved with less sensitive attributes?
Proportionality assessment:
- Is the scope of data subjects proportionate to the training objective?
- Is the processing duration limited to what is needed?
- Is the retention of personal data limited to what the training actually requires?
Providers should document the necessity and proportionality assessment in the sandbox plan under Art.57(3). Sandbox authorities review this assessment as part of plan approval.
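One way to keep this assessment auditable is a structured record in the sandbox-plan tooling. The sketch below uses field names of this guide's own invention; the Act prescribes the assessment, not a data model:

```python
from dataclasses import dataclass

@dataclass
class NecessityProportionalityCheck:
    """Illustrative record of the Art.59(1)(b) assessment for the sandbox plan."""
    synthetic_data_insufficient: bool   # anonymised/synthetic data ruled out?
    sample_minimised: bool              # smallest dataset meeting the objective?
    attributes_minimised: bool          # only required features retained?
    duration_limited: bool              # processing window bounded?
    retention_limited: bool             # personal data deleted after training?

    def passes(self) -> bool:
        # Necessity and proportionality are cumulative: one failure fails the check
        return all(vars(self).values())
```

A failed check is a signal to redesign the training pipeline before submitting the sandbox plan, not merely a gap to document.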
Condition 3: No Adverse Effects on Data Subjects
Art.59(1)(c): The further processing must not produce adverse effects for the data subjects whose data is used for AI training and testing.
What "adverse effects" means:
- Direct harm: decisions taken about subjects based on the AI being trained on their data
- Indirect harm: re-identification risk from processing combined datasets
- Reputational harm: inference of sensitive attributes from seemingly innocuous data
- Discriminatory harm: training on biased historical data that encodes discrimination
Practical safeguards to prevent adverse effects:
- Strict separation between sandbox AI and production systems — sandbox-trained models must not make live decisions affecting subjects whose data was used for training
- Pseudonymisation before training, with key management separate from the training environment
- Data subject notification where feasible (though Art.59 does not mandate it in all cases)
- Regular bias audits to detect discriminatory pattern encoding in training
The "no adverse effects" condition is ongoing, not just assessed at the point of processing. Providers must monitor whether sandbox training produces outputs that adversely affect subjects — for example, by detecting whether outputs correlate with protected characteristics in ways that could harm individuals if deployed.
Condition 4: Subject to Appropriate Technical and Organisational Measures
Art.59(1)(d): Processing must be subject to appropriate technical and organisational measures (TOMs) protecting data subjects' rights and freedoms.
Required TOMs for Art.59 compliance:
| TOM Category | Examples |
|---|---|
| Pseudonymisation | Tokenisation of identity attributes; key management separate from training |
| Encryption | Data encrypted at rest and in transit; key management documented |
| Access control | Role-based access; training data accessible only to necessary personnel |
| Data minimisation | Feature selection limited to required attributes; remove unnecessary columns |
| Aggregation | Where possible, aggregate rather than use individual records |
| Audit logging | All access to sandbox training data logged and retained |
| Incident response | Breach detection and response procedures for sandbox data |
| Testing | Regular adversarial testing for re-identification risk |
The TOM requirement interacts with GDPR Art.25 (data protection by design and by default), which applies independently. Art.59 TOMs are additional, AI-development-specific safeguards layered on top of standard GDPR Art.25 obligations.
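To illustrate the pseudonymisation TOM, a common pattern is keyed tokenisation with HMAC-SHA256, where the key is held outside the training environment (for example in an EU-located HSM). This is a sketch, not a vetted design; real deployments also need key rotation and a re-identification risk analysis:

```python
import hmac
import hashlib

def pseudonymise_id(subject_id: str, secret_key: bytes) -> str:
    """Keyed pseudonymisation of an identity attribute.

    The same (id, key) pair always yields the same token, so records can be
    joined inside the sandbox; without the key, the mapping cannot be reversed
    or recomputed by training personnel.
    """
    return hmac.new(secret_key, subject_id.encode("utf-8"),
                    hashlib.sha256).hexdigest()
```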
Condition 5: Sandbox Plan Includes Data Processing
Art.59(1)(e): The data processing under Art.59 must be covered by the sandbox plan agreed with the competent authority under Art.57(3).
This is the most significant structural condition: Art.59 cannot be invoked unilaterally. The further processing of personal data must be disclosed to and accepted by the sandbox supervisory authority as part of the formal sandbox plan. This means:
- The sandbox plan must describe what personal data will be processed
- The plan must specify the sources and original purpose of collection
- The plan must identify the Art.6(1) and Art.9(2) basis for original collection
- The plan must explain why Art.59 further processing is necessary and proportionate
- The competent authority reviews and accepts (or conditions) the data processing element
Providers who wish to use Art.59 cannot simply process the data and claim the sandbox exception after the fact. The exception requires prior authorisation through the sandbox plan process.
Condition 6: No Transfer Outside the Union
Art.59(1)(f): Personal data processed under Art.59 must not be transferred to third countries or international organisations unless adequate protection equivalent to GDPR standards is ensured.
This condition reflects the EU's approach to maintaining the data protection level of the GDPR framework even in innovation contexts. The Art.59 exception is specifically designed for EU-supervised sandbox environments; extending the data to third-country jurisdictions would undermine the supervisory oversight that justifies the exception.
Practical implications:
- Training data subject to Art.59 must remain in EU-located infrastructure
- Cloud compute used for sandbox training must be in EU data centres
- Model weights trained on Art.59 data must not be replicated to non-EU systems
- Collaborating parties outside the EU cannot access the Art.59 training data
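These constraints can also be enforced mechanically in infrastructure checks. A minimal sketch, with made-up region identifiers standing in for a vetted EU/EEA allowlist:

```python
# Illustrative allowlist only; a real deployment needs a vetted EU/EEA region list
EU_EEA_REGIONS = {"eu-central-1", "eu-west-1", "europe-west3"}

def check_transfer_boundary(storage_regions: set[str],
                            replica_regions: set[str]) -> list[str]:
    """Flag any storage or replication target outside the EU/EEA allowlist
    as a potential Art.59(1)(f) violation."""
    violations = []
    for region in sorted(storage_regions | replica_regions):
        if region not in EU_EEA_REGIONS:
            violations.append(f"non-EU region configured: {region}")
    return violations
```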
Art.59(2): The Sandbox Boundary Constraint
Art.59(2) establishes the critical sandbox boundary rule: personal data processed under Art.59 must remain within the sandbox environment and must not be used outside it.
What "Remain Within the Sandbox" Means
The sandbox boundary is both a data boundary and a system boundary:
Data boundary: Personal data processed under Art.59 cannot be extracted from the sandbox environment. This includes:
- Raw training data
- Intermediate data representations (embeddings, feature matrices)
- Data used for validation or testing (validation sets, test sets)
System boundary: AI model outputs derived from Art.59 data cannot be used in production systems without a separate data protection assessment. The sandbox-trained model is an AI artefact — once deployed outside the sandbox, it processes new data under standard GDPR rules. But weights trained on Art.59 data carry residual risk: if the model memorises training data (a known ML risk), production use could re-expose Art.59 subjects.
Art.59(2) and Model Deployment
Art.59(2) creates a compliance gap that many providers overlook: the model trained on Art.59 data in the sandbox needs a separate legal basis for deployment outside the sandbox.
The Art.59 exception covers the training/testing process. When the model is deployed in production:
- New personal data is processed under standard GDPR rules
- The model's training data heritage (Art.59) is not disclosed to production data subjects
- If the model memorises training examples, there is a re-identification risk for Art.59 subjects from production inference
Recommended practice: Before deploying a sandbox-trained model, conduct a DPIA (GDPR Art.35) that specifically addresses the Art.59 training data heritage and the risk of training data memorisation in the deployed model.
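One crude screening input for that DPIA is a memorisation gap: if the model fits its training members much better than comparable held-out non-members, membership inference and re-identification of Art.59 subjects become more plausible. A sketch (a heuristic, not a formal privacy guarantee):

```python
from statistics import fmean

def memorisation_gap(member_losses: list[float],
                     nonmember_losses: list[float]) -> float:
    """Mean loss on held-out non-members minus mean loss on training members.

    A gap near zero suggests the model generalises rather than memorises;
    a large positive gap is a membership-inference red flag that the DPIA
    should address before deployment outside the sandbox.
    """
    return fmean(nonmember_losses) - fmean(member_losses)
```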
Art.59(3): Data Subject Rights During Sandbox Processing
Art.59(3) confirms that data subjects' rights under GDPR continue to apply during Art.59 processing. Art.59 does not suspend access rights (Art.15), rectification rights (Art.16), erasure rights (Art.17), or objection rights (Art.21).
The Right to Erasure Challenge
The right to erasure (GDPR Art.17) creates a practical challenge for AI training: if a data subject exercises their erasure right, the provider must delete their data — but what does "deletion" mean for a trained model?
The EU AI Act does not resolve the machine unlearning question directly, but the Art.59 framework implies:
- Training data must be deleted from the sandbox storage
- Trained model weights may need to be retrained if the deleted data had material influence
- Derived artefacts (embeddings, feature representations traceable to the subject) must be deleted
Machine unlearning — the ability to "forget" a training example from a model — is an active research area. Providers using Art.59 should implement:
- Tracking of which data subjects contributed to which training batches
- The ability to identify whether a subject's data was in a model's training set
- A procedure for retraining or differential privacy techniques if erasure is requested
Notification Under GDPR Art.13/14
Where providers process Art.59 data, GDPR Art.13 (data collected from the subject) or Art.14 (data not collected from the subject) notification obligations apply. The notification must disclose the further processing purpose (AI development in the sandbox) unless an Art.14(5) exception applies (disproportionate effort, impossible, or would seriously impair the purposes).
For large-scale healthcare or financial datasets where individual notification is impractical, providers may rely on the Art.14(5)(b) disproportionate effort exception — but must make the further processing information publicly available and take other appropriate measures (GDPR Art.14(5)(b)).
Art.59(4): Supervisory Authority Cooperation
Art.59(4) requires that sandbox competent authorities cooperate with national data protection authorities (DPAs) in supervising the data processing aspects of Art.59.
What DPA Cooperation Means in Practice
The competent authority for the AI regulatory sandbox (typically a market surveillance or sector regulator) has AI expertise but not necessarily GDPR expertise. DPA cooperation ensures:
- DPAs can advise on whether the Art.59 conditions are met
- DPAs can conduct joint inspections of sandbox data processing
- DPAs can raise concerns about sandbox plans that involve sensitive or high-risk data processing
- DPAs retain their GDPR enforcement powers — Art.59 does not create an AI-sandbox exemption from DPA jurisdiction
For providers, DPA cooperation means that sandbox plans involving significant personal data processing may be subject to DPA review — either proactively (as part of plan approval) or reactively (in response to complaints or incidents).
Member State Implementation Variation
Some Member States may require DPA pre-approval as part of the sandbox authorisation process. Others may rely on DPA consultation during plan review. Providers should check the specific requirements of the national sandbox authority in each Member State where they operate or intend to seek sandbox participation.
GDPR Interaction Deep Dive
Art.59 and GDPR Art.6(4) Compatibility
Art.59 supplements (but does not replace) the GDPR Art.6(4) compatibility analysis. A provider should conduct both:
Step 1: GDPR Art.6(4) analysis
- Assess whether sandbox AI development is compatible with the original collection purpose under the five Art.6(4) factors
- If compatible: no new lawful basis needed — proceed under the original basis
- If incompatible: proceed to Step 2
Step 2: Art.59 exception
- Verify the AI system qualifies (public interest, sandbox environment)
- Verify all six Art.59(1) conditions are satisfied
- Ensure the sandbox plan is in place and includes data processing approval
- Implement required TOMs
Providers should not skip Step 1. If the Art.6(4) analysis shows compatibility, Art.59 is not needed and the processing is simpler. Art.59 is the safety valve for cases where compatibility cannot be established.
Art.59 and Special Category Data (GDPR Art.9)
Art.59 does not create a new Art.9(2) basis for special category data. If sandbox training involves health data, biometric data, genetic data, or other Art.9 categories, the provider must establish an independent Art.9(2) basis — typically:
- Art.9(2)(g): substantial public interest, with proportionate safeguards — the most likely basis for healthcare AI development
- Art.9(2)(j): scientific or historical research purposes — applicable where the AI development has a research component
- Art.9(2)(a): explicit consent — feasible for smaller datasets but impractical at scale
Art.59 then operates as the purpose extension mechanism for the further processing, while the Art.9(2) basis covers the special category dimension. Both must be satisfied.
| Data Type | Original Collection Basis | Art.59 Further Processing | Art.9(2) Basis (if needed) |
|---|---|---|---|
| Medical records | Art.6(1)(c) legal obligation | Art.59 exception | Art.9(2)(g) public interest |
| Biometric data | Art.6(1)(a) consent | Art.59 exception | Art.9(2)(a) explicit consent |
| Employment data | Art.6(1)(b) contract | Art.59 exception | Art.9(2)(b) employment law |
| Financial data | Art.6(1)(c) legal obligation | Art.59 exception | Not applicable (non-special) |
CLOUD Act Jurisdiction Risk for Sandbox Training Data
Art.59(1)(f) prohibits transfers outside the EU — but the CLOUD Act creates a different risk: US authorities may compel production of data held on US cloud infrastructure even without a transfer.
The CLOUD Act Compellability Problem
If an EU provider uses AWS, Azure, or Google Cloud for sandbox AI training under Art.59, the training data — pseudonymised healthcare records, financial histories, biometric datasets — is potentially subject to CLOUD Act orders:
- US government serves CLOUD Act order on the US cloud provider (AWS/Microsoft/Google)
- Cloud provider must produce data held by it or under its control — including EU data centre data
- GDPR Art.48 conflict: GDPR prohibits transfers ordered by third-country authorities unless under MLAT or similar instruments
- Art.59(1)(f) violation: the production to a US authority constitutes a transfer outside the EU
The CLOUD Act risk is particularly acute for Art.59 data because:
- Sandbox training data is often sensitive (healthcare, financial, biometric)
- Large volumes are processed for training — a bigger surface area
- Model weights are derivative artefacts that may also be subject to compellability
- The sandbox plan may be discoverable as evidence in US proceedings
Mitigation: EU-sovereign infrastructure with no US corporate parent is the cleanest Art.59 infrastructure solution. Providers using EU-only cloud providers (such as European PaaS platforms without US corporate affiliates) avoid the CLOUD Act compellability risk entirely for training workloads.
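A sandbox plan can capture this screening question explicitly. The sketch below simplifies heavily: the inputs and the US-jurisdiction test are assumptions, and a real analysis requires counsel review of the full corporate chain:

```python
def assess_compellability(provider: str, parent_jurisdiction: str,
                          data_centre_region: str) -> dict:
    """CLOUD Act exposure screen for sandbox infrastructure (illustrative).

    Exposure follows *control*, not data location: an EU data centre run by
    a US-parented provider can still receive a CLOUD Act order.
    """
    us_parent = parent_jurisdiction.strip().upper() in {"US", "USA", "UNITED STATES"}
    eu_region = data_centre_region.strip().upper().startswith("EU")
    return {
        "provider": provider,
        "eu_data_centre": eu_region,
        "cloud_act_exposed": us_parent,
        "art_59_1_f_concern": us_parent,  # compelled production = third-country transfer
    }
```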
Practical Infrastructure Requirements for Art.59
Art.59 Compliant Infrastructure Architecture:
[Training Data Storage]
├── EU sovereign cloud or on-premises only
├── Encryption at rest (provider-managed keys, EU HSM)
├── Access logging: all reads auditable
└── No replication to non-EU regions
[Compute Environment]
├── EU-located training nodes
├── No telemetry to US-controlled cloud services
├── Isolated network: no outbound to third-country services
└── Ephemeral environments: clean-up after training complete
[Model Artefacts]
├── Weights stored in EU-sovereign storage
├── No export to non-EU evaluation or inference environments
└── If deployed: separate DPIA required before production use
[Access Control]
├── Role-based: only authorised ML engineers
├── Multi-factor authentication
├── No remote access from non-EU locations (or VPN with EU exit)
└── Third-party contractors: data processing agreements required
Python Implementation: Sandbox Data Processing Governance
"""
EU AI Act Art.59 Sandbox Data Processing Governance
Implements compliance tracking for further processing of personal data
in AI regulatory sandboxes under Art.59(1) conditions.
"""
from dataclasses import dataclass, field
from datetime import datetime, date
from enum import Enum
from typing import List, Optional, Dict
import hashlib
import json
class DataCategory(Enum):
"""GDPR data categories relevant to Art.59"""
GENERAL = "general"
SPECIAL_CATEGORY = "special_category" # Art.9
CRIMINAL = "criminal" # Art.10
class OriginalPurposeLegalBasis(Enum):
"""GDPR Art.6(1) bases for original collection"""
CONSENT = "art_6_1_a_consent"
CONTRACT = "art_6_1_b_contract"
LEGAL_OBLIGATION = "art_6_1_c_legal_obligation"
VITAL_INTERESTS = "art_6_1_d_vital_interests"
PUBLIC_TASK = "art_6_1_e_public_task"
LEGITIMATE_INTERESTS = "art_6_1_f_legitimate_interests"
class Art9Basis(Enum):
"""GDPR Art.9(2) bases for special category data"""
EXPLICIT_CONSENT = "art_9_2_a"
EMPLOYMENT_LAW = "art_9_2_b"
VITAL_INTERESTS = "art_9_2_c"
LEGITIMATE_ACTIVITIES = "art_9_2_d"
MANIFESTLY_PUBLIC = "art_9_2_e"
LEGAL_CLAIMS = "art_9_2_f"
PUBLIC_INTEREST = "art_9_2_g"
PREVENTIVE_MEDICINE = "art_9_2_h"
PUBLIC_HEALTH = "art_9_2_i"
RESEARCH = "art_9_2_j"
@dataclass
class Art6_4CompatibilityAssessment:
"""
GDPR Art.6(4) compatibility analysis.
Must be completed before invoking Art.59 exception.
"""
link_between_purposes: str # Factor 1: link analysis
context_of_collection: str # Factor 2: data subject expectations
nature_of_data_risk: str # Factor 3: sensitivity assessment
consequences_for_subjects: str # Factor 4: impact analysis
safeguards_in_place: str # Factor 5: technical/org measures
compatibility_conclusion: bool # True = compatible (Art.59 not needed)
compatibility_rationale: str
assessed_by: str
assessed_date: date
def requires_art59(self) -> bool:
"""Returns True if Art.59 exception is needed"""
return not self.compatibility_conclusion
@dataclass
class Art59Condition:
"""Single Art.59(1) condition assessment"""
condition_id: str
condition_text: str
satisfied: bool
evidence: str
assessed_date: date
@dataclass
class SandboxDataProcessingRecord:
"""
Art.59 compliance record for personal data processing
in an AI regulatory sandbox.
"""
record_id: str
sandbox_reference: str # Art.57 sandbox identifier
sandbox_plan_reference: str # Art.57(3) approved plan reference
ai_system_description: str
public_interest_justification: str # Art.59(1)(a)
# Data source characterisation
data_source_description: str
data_category: DataCategory
original_collection_purpose: str
original_legal_basis: OriginalPurposeLegalBasis
art9_basis: Optional[Art9Basis] = None # Required if special category
# Subject scope
approximate_subject_count: int = 0
subject_population_description: str = ""
# Art.6(4) assessment
compatibility_assessment: Optional[Art6_4CompatibilityAssessment] = None
# Art.59(1) conditions
conditions: List[Art59Condition] = field(default_factory=list)
# Processing boundaries
processing_start: Optional[date] = None
processing_end: Optional[date] = None
sandbox_boundary_confirmed: bool = False # Art.59(2)
# Infrastructure
processing_location: str = ""
eu_sovereign_infrastructure: bool = False
cloud_act_risk_assessed: bool = False
# Subject rights mechanism
erasure_mechanism_documented: bool = False
notification_approach: str = ""
# Approval
dpa_consulted: bool = False
competent_authority_approved: bool = False
approved_date: Optional[date] = None
def all_conditions_satisfied(self) -> bool:
return all(c.satisfied for c in self.conditions)
def compliance_summary(self) -> Dict:
issues = []
if not self.compatibility_assessment:
issues.append("GDPR Art.6(4) compatibility assessment not completed")
if self.data_category == DataCategory.SPECIAL_CATEGORY and not self.art9_basis:
issues.append("Art.9(2) basis required for special category data")
if not self.all_conditions_satisfied():
failed = [c.condition_id for c in self.conditions if not c.satisfied]
issues.append(f"Art.59(1) conditions not satisfied: {failed}")
if not self.sandbox_boundary_confirmed:
issues.append("Art.59(2) sandbox boundary not confirmed")
if not self.eu_sovereign_infrastructure:
issues.append("WARNING: Non-EU infrastructure may violate Art.59(1)(f)")
if not self.cloud_act_risk_assessed:
issues.append("CLOUD Act compellability risk assessment missing")
if not self.erasure_mechanism_documented:
issues.append("Data subject erasure mechanism not documented")
if not self.competent_authority_approved:
issues.append("Competent authority approval pending")
return {
"record_id": self.record_id,
"compliant": len(issues) == 0,
"issues": issues,
"conditions_satisfied": self.all_conditions_satisfied(),
"sandbox_boundary_ok": self.sandbox_boundary_confirmed,
"infrastructure_ok": self.eu_sovereign_infrastructure,
}
class Art59ConditionChecker:
"""
Systematic checker for all six Art.59(1) conditions.
"""
CONDITION_TEMPLATES = [
("C1", "AI system serves public interest in an Art.57(1) qualifying field"),
("C2", "Processing is necessary and proportionate to the AI development goal"),
("C3", "Processing produces no adverse effects for data subjects"),
("C4", "Appropriate technical and organisational measures are in place"),
("C5", "Processing is covered by the approved sandbox plan under Art.57(3)"),
("C6", "No transfer of personal data outside the EU/EEA"),
]
@staticmethod
def create_conditions_template(assessed_by: str) -> List[Art59Condition]:
"""Create blank condition records for assessment"""
today = date.today()
return [
Art59Condition(
condition_id=cid,
condition_text=text,
satisfied=False,
evidence="[ASSESSMENT REQUIRED]",
assessed_date=today,
)
for cid, text in Art59ConditionChecker.CONDITION_TEMPLATES
]
@staticmethod
def validate_record(record: SandboxDataProcessingRecord) -> List[str]:
"""Return list of compliance gaps"""
gaps = []
# Check all 6 conditions exist
condition_ids = {c.condition_id for c in record.conditions}
required = {cid for cid, _ in Art59ConditionChecker.CONDITION_TEMPLATES}
missing = required - condition_ids
if missing:
gaps.append(f"Missing condition assessments: {missing}")
# Check all satisfied
if not record.all_conditions_satisfied():
failed = [
f"{c.condition_id}: {c.condition_text}"
for c in record.conditions if not c.satisfied
]
for f in failed:
gaps.append(f"Condition not satisfied: {f}")
return gaps
class SandboxDataSubjectTracker:
"""
Tracks data subject contributions to sandbox training.
Enables erasure right compliance under GDPR Art.17.
"""
def __init__(self, record_id: str):
self.record_id = record_id
self.subject_contributions: Dict[str, List[str]] = {}
# Maps pseudonymised_id -> list of training_batch_ids
def register_contribution(
self,
subject_pseudoid: str,
batch_id: str,
) -> None:
"""Record that a subject's data was used in a training batch"""
if subject_pseudoid not in self.subject_contributions:
self.subject_contributions[subject_pseudoid] = []
self.subject_contributions[subject_pseudoid].append(batch_id)
def subject_in_training(self, subject_pseudoid: str) -> bool:
"""Check if a subject's data was used in any training batch"""
return subject_pseudoid in self.subject_contributions
def get_affected_batches(self, subject_pseudoid: str) -> List[str]:
"""Get all training batches that included the subject's data"""
return self.subject_contributions.get(subject_pseudoid, [])
def generate_erasure_impact(self, subject_pseudoid: str) -> Dict:
"""
Assess erasure request impact for a subject.
Returns remediation requirements.
"""
batches = self.get_affected_batches(subject_pseudoid)
if not batches:
return {
"subject_in_training": False,
"erasure_scope": "data_only",
"retraining_required": False,
}
return {
"subject_in_training": True,
"training_batches": batches,
"erasure_scope": "data_and_model_weights_review",
"retraining_required": True,
"retraining_rationale": (
"Subject's data contributed to model training. "
"Model must be retrained without subject data, or "
"differential privacy techniques applied to demonstrate "
"subject data has been effectively forgotten."
),
}
@dataclass
class Art59AuditLog:
"""
Immutable audit log for Art.59 data processing.
"""
timestamp: datetime
record_id: str
action: str
actor: str
data_accessed: str
purpose: str
ip_location: str
def to_log_entry(self) -> str:
return json.dumps({
"ts": self.timestamp.isoformat(),
"record": self.record_id,
"action": self.action,
"actor": self.actor,
"data": self.data_accessed,
"purpose": self.purpose,
"location": self.ip_location,
})
# Example: Healthcare AI development under Art.59
def example_healthcare_ai_art59():
    record = SandboxDataProcessingRecord(
        record_id="ART59-2025-001",
        sandbox_reference="DE-SANDBOX-BfArM-2025-07",
        sandbox_plan_reference="SANDBOX-PLAN-v2.1-APPROVED-2025-08-01",
        ai_system_description=(
            "High-risk AI system for early detection of diabetic retinopathy "
            "in ophthalmology imaging (Annex III Category 5b: essential services)"
        ),
        public_interest_justification=(
            "Diabetic retinopathy is the leading cause of preventable blindness in the EU. "
            "Early AI-assisted detection enables treatment before irreversible damage. "
            "System serves public health interest under Art.57(1)."
        ),
        data_source_description=(
            "Retinal imaging records from University Hospital clinical database, "
            "originally collected under patient treatment consent and legal obligation "
            "to maintain medical records."
        ),
        data_category=DataCategory.SPECIAL_CATEGORY,
        original_collection_purpose="Medical treatment and clinical record-keeping",
        original_legal_basis=OriginalPurposeLegalBasis.LEGAL_OBLIGATION,
        art9_basis=Art9Basis.PUBLIC_INTEREST,  # Art.9(2)(g)
        approximate_subject_count=12000,
        subject_population_description="Adult diabetic patients in hospital care 2020-2024",
        processing_location="Frankfurt, Germany (EU sovereign colocation)",
        eu_sovereign_infrastructure=True,
        cloud_act_risk_assessed=True,
        erasure_mechanism_documented=True,
        notification_approach=(
            "Art.14(5)(b) disproportionate effort: 12,000 historical patients. "
            "Public notice published on hospital website and data register."
        ),
        dpa_consulted=True,
        competent_authority_approved=True,
        approved_date=date(2025, 8, 1),
        sandbox_boundary_confirmed=True,
        processing_start=date(2025, 8, 15),
        processing_end=date(2026, 2, 14),
    )

    # Add compatibility assessment
    record.compatibility_assessment = Art6_4CompatibilityAssessment(
        link_between_purposes=(
            "AI development purpose (diabetic retinopathy detection) is closely linked "
            "to the original treatment purpose — both aim at improving patient outcomes."
        ),
        context_of_collection=(
            "Data collected in clinical context. Patients would reasonably expect "
            "their data to be used for improving healthcare quality, including AI tools."
        ),
        nature_of_data_risk=(
            "Special category health data (Art.9). High sensitivity requires "
            "strict pseudonymisation and access controls."
        ),
        consequences_for_subjects=(
            "No direct decisions are taken about training subjects. "
            "Model does not feed back into clinical records of contributing patients."
        ),
        safeguards_in_place=(
            "Pseudonymisation with separate key management. "
            "Encryption at rest and in transit. Role-based access. "
            "Audit logging. No transfer outside EU."
        ),
        compatibility_conclusion=False,  # Not compatible without Art.59
        compatibility_rationale=(
            "AI model training is a new purpose not within scope of original treatment "
            "purpose. Art.59 exception required."
        ),
        assessed_by="DPO, Chief Privacy Officer",
        assessed_date=date(2025, 7, 15),
    )

    # Add Art.59(1) conditions
    record.conditions = [
        Art59Condition("C1", "AI system serves public interest in Art.57(1) qualifying field",
                       True, "Diabetic retinopathy detection: preventable blindness prevention", date(2025, 7, 15)),
        Art59Condition("C2", "Processing necessary and proportionate",
                       True, "Clinical images necessary for training; 12,000 subjects proportionate to accuracy target", date(2025, 7, 15)),
        Art59Condition("C3", "No adverse effects for data subjects",
                       True, "Model not deployed for clinical decisions on training subjects; pseudonymised", date(2025, 7, 15)),
        Art59Condition("C4", "Appropriate TOMs in place",
                       True, "Pseudonymisation, encryption, RBAC, audit logs, EU infrastructure documented", date(2025, 7, 15)),
        Art59Condition("C5", "Processing covered by approved sandbox plan",
                       True, "SANDBOX-PLAN-v2.1 Section 4: Data Processing, approved BfArM 2025-08-01", date(2025, 8, 1)),
        Art59Condition("C6", "No transfer outside EU",
                       True, "Frankfurt colocation, no US cloud, no third-country transfer", date(2025, 7, 15)),
    ]

    summary = record.compliance_summary()
    print(f"Art.59 Compliant: {summary['compliant']}")
    if summary['issues']:
        for issue in summary['issues']:
            print(f"  ISSUE: {issue}")
    return record
Art.59 vs General GDPR Research Exemptions
GDPR Art.89 provides a research and statistics processing framework with derogations from certain data subject rights. How does Art.89 relate to Art.59?
| Dimension | GDPR Art.89 | EU AI Act Art.59 |
|---|---|---|
| Scope | Scientific/historical research, statistics | AI development/testing in regulatory sandbox |
| Environment | Any | AI regulatory sandbox only |
| Rights derogations | Art.15/16/18/21 may be restricted | No derogation — all rights maintained |
| Purpose link | Research purpose established independently | AI development in public interest |
| Supervisory oversight | DPA oversight | Joint: sandbox authority + DPA (Art.59(4)) |
| Transfer prohibition | Not explicit | Explicit: Art.59(1)(f) |
| Model deployment | N/A | Separate assessment required |
Art.59 is narrower than Art.89 in some respects (no rights derogations, sandbox-only) but more enabling in others (targeted exception for purpose extension, multi-authority oversight framework). For AI development that does not occur within a formal sandbox, Art.89 remains relevant where the development has a genuine research character.
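The routing logic between the three frameworks can be sketched as a small decision function. This is an illustration of the document's analysis under stated assumptions, not a legal test; the function name and flags are hypothetical:

```python
def select_processing_route(art6_4_compatible: bool,
                            in_art57_sandbox: bool,
                            research_character: bool) -> str:
    """Rough routing sketch for further processing of personal data.

    Assumes the original collection was lawful under GDPR Art.6(1).
    Illustrative only — the real analysis is a documented legal assessment.
    """
    if art6_4_compatible:
        # Compatible further processing: just document the Art.6(4) assessment.
        return "GDPR Art.6(4) compatibility"
    if in_art57_sandbox:
        # Incompatible purpose inside a formal sandbox: Art.59 exception.
        return "AI Act Art.59 (six cumulative conditions)"
    if research_character:
        # Outside the sandbox, genuine research may fall under Art.89.
        return "GDPR Art.89 research framework"
    return "No route — establish a fresh GDPR Art.6(1) basis"
```

For example, an incompatible AI-training purpose inside a formal sandbox resolves to the Art.59 route, while the same purpose outside a sandbox must fall back on Art.89 or a fresh legal basis.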
Art.59 Compliance Checklist (35 Items)
Pre-Processing: Legal Foundation
- 1. Confirmed personal data was lawfully collected under GDPR Art.6(1)
- 2. Identified original collection purpose and legal basis
- 3. For special category data: identified Art.9(2) basis for original collection
- 4. Completed GDPR Art.6(4) compatibility assessment (documented)
- 5. Confirmed Art.59 exception is needed (compatibility assessment found incompatible)
- 6. Confirmed AI system qualifies: high-risk AI in public interest domain
- 7. Confirmed AI development occurs within Art.57 sandbox environment
Sandbox Plan Requirements
- 8. Art.57(3) sandbox plan includes data processing description
- 9. Sandbox plan specifies data sources and original collection purpose
- 10. Sandbox plan identifies legal bases for original collection and Art.59 exception
- 11. Sandbox plan documents necessity and proportionality assessment
- 12. Competent authority has approved sandbox plan including data processing
- 13. DPA has been consulted (Art.59(4)) or notified as required by Member State rules
- 14. Sandbox authority approval date documented
Art.59(1) Conditions — All Six
- 15. C1: Public interest justification documented for AI system
- 16. C2: Necessity assessment: personal data required (synthetic/anonymised insufficient)
- 17. C2: Proportionality assessment: data scope minimum required for AI goal
- 18. C3: No adverse effects: confirmed model not used to make decisions about training subjects
- 19. C3: Re-identification risk assessed and mitigated
- 20. C4: Pseudonymisation implemented with separate key management
- 21. C4: Encryption at rest and in transit
- 22. C4: Role-based access control documented and enforced
- 23. C4: Audit logging of all training data access
- 24. C5: Sandbox plan approved — data processing section confirmed in scope
- 25. C6: No transfer to third countries: infrastructure confirmed EU-only
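Item 20's "pseudonymisation with separate key management" can be sketched with a keyed hash: the sandbox holds only pseudonyms, while the key stays with a custodian outside the sandbox. The key value and custodian arrangement below are illustrative assumptions:

```python
import hashlib
import hmac

def pseudonymise(subject_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from a subject identifier (HMAC-SHA256).

    The key is held outside the sandbox (separate key management), so the
    sandbox dataset alone cannot be trivially re-identified (C3/C4).
    """
    return hmac.new(secret_key, subject_id.encode(), hashlib.sha256).hexdigest()

# Same identifier + key -> same pseudonym, so records join consistently
# across training batches without exposing the raw identifier.
key = b"held-by-hospital-key-custodian"  # illustrative; use a real KMS in practice
assert pseudonymise("patient-0042", key) == pseudonymise("patient-0042", key)
assert pseudonymise("patient-0042", key) != pseudonymise("patient-0043", key)
```

A keyed construction rather than a plain hash matters here: without the key, an attacker cannot confirm a guessed identifier by re-hashing it.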
Sandbox Boundary (Art.59(2))
- 26. Technical controls prevent data extraction from sandbox environment
- 27. Model weights not replicated to non-EU or non-sandbox environments
- 28. Sandbox-trained model flagged for separate DPIA before production deployment
Data Subject Rights
- 29. Erasure mechanism documented: procedure for GDPR Art.17 requests
- 30. Subject contribution tracking: which subjects contributed to which batches
- 31. Retraining procedure documented for erasure requests post-training
- 32. Notification approach documented: Art.13/14 or Art.14(5)(b) justification
- 33. Access request (Art.15) response procedure in place
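Items 30-31 imply an index from pseudonymised subjects to the training batches they contributed to, so an Art.17 erasure request can identify exactly which batches need retraining or unlearning. A minimal sketch (class and method names are hypothetical):

```python
from collections import defaultdict

class SubjectBatchIndex:
    """Track which pseudonymised subjects fed which training batches."""

    def __init__(self) -> None:
        self._batches_by_subject: dict[str, set[str]] = defaultdict(set)

    def record(self, subject_pseudonym: str, batch_id: str) -> None:
        # Called whenever a subject's records enter a training batch.
        self._batches_by_subject[subject_pseudonym].add(batch_id)

    def batches_to_retrain(self, subject_pseudonym: str) -> set[str]:
        """On a GDPR Art.17 request: return affected batches and drop the mapping."""
        return self._batches_by_subject.pop(subject_pseudonym, set())

idx = SubjectBatchIndex()
idx.record("ps-01", "batch-2025-08-A")
idx.record("ps-01", "batch-2025-09-B")
idx.record("ps-02", "batch-2025-09-B")
# Erasure request for ps-01: retrain (or apply unlearning to) both batches.
affected = idx.batches_to_retrain("ps-01")
```

The index itself is personal data (it links pseudonyms to processing activity), so it belongs inside the sandbox boundary with the same access controls as the training data.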
CLOUD Act and Infrastructure
- 34. Infrastructure is EU-sovereign with no US corporate parent
- 35. CLOUD Act compellability risk assessment documented and mitigated
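Because the conditions are cumulative, the checklist works as a go/no-go gate: a single unmet item blocks processing. A sketch of that gate (item labels abbreviated; the dict contents are illustrative):

```python
def checklist_gate(items: dict[str, bool]) -> list[str]:
    """Return labels of unmet checklist items; an empty list means go."""
    return [label for label, done in items.items() if not done]

checklist = {
    "1. Original collection lawful (Art.6(1))": True,
    "5. Art.6(4) assessment found incompatible": True,
    "12. Sandbox plan approved by authority": True,
    "25. Infrastructure EU-only (C6)": False,  # e.g. still on a US cloud
}
blocking = checklist_gate(checklist)
# Non-empty -> Art.59 processing must not start until every item is satisfied.
```

In this example `blocking` contains the C6 infrastructure item, so processing cannot begin despite 34 other items being green — mirroring the cumulative-conditions rule.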
Common Mistakes
Mistake 1: Invoking Art.59 without sandbox membership
Art.59 only applies within a formally established Art.57 AI regulatory sandbox. There is no standalone Art.59 exception for AI development outside the sandbox. Providers who process personal data for AI development outside a sandbox must establish a standard GDPR basis.
Mistake 2: Skipping the Art.6(4) compatibility assessment
If further processing is actually compatible under Art.6(4), Art.59 is not needed and the simpler route is to document compatibility. Only invoke Art.59 when the Art.6(4) analysis shows incompatibility.
Mistake 3: Treating Art.59 as a suspension of GDPR rights
Data subjects retain all GDPR rights during Art.59 processing. Art.59 does not restrict access, rectification, erasure, or objection rights. Providers must have procedures for exercising these rights during sandbox processing.
Mistake 4: Deploying a sandbox-trained model without a separate DPIA
Art.59 covers the training/testing phase. Deploying a model trained on Art.59 data in production requires a separate legal analysis, including a DPIA that addresses training data memorisation risk.
Mistake 5: Using US cloud infrastructure for Art.59 processing
Art.59(1)(f) prohibits transfers outside the EU. CLOUD Act compellability orders against US cloud providers are a realistic risk for sensitive training data. EU-sovereign infrastructure is the compliant choice.
Mistake 6: Omitting Art.9(2) basis for special category training data
Art.59 supplements but does not replace Art.9(2). If training data includes health, biometric, genetic, or other special category data, a separate Art.9(2) basis must be established.
Key Takeaways
- Art.59 is sandbox-bound: it only applies within a formally established Art.57 AI regulatory sandbox — not for general AI development.
- Six conditions are cumulative: all must be satisfied. A single failed condition invalidates the Art.59 exception.
- Prior authorisation is required: the data processing must be in the sandbox plan approved by the competent authority. Retroactive invocation is not possible.
- Data subject rights survive: Art.59 does not suspend any GDPR rights. Erasure requests require machine unlearning procedures.
- Deployment requires a new legal assessment: Art.59 covers training, not production use. Models trained on Art.59 data need a separate DPIA before deployment.
- EU-sovereign infrastructure is the safe choice: Art.59(1)(f) prohibits transfers outside the EU, and CLOUD Act compellability creates a risk on US cloud infrastructure even without explicit transfer.
- DPAs retain authority: Art.59(4) requires DPA cooperation. DPAs can review and can enforce GDPR regardless of sandbox participation.
See Also
- EU AI Act Art.57: AI Regulatory Sandboxes — Developer Guide — Art.59 is exclusively sandbox-bound; Art.57 establishes the formal sandbox that must exist before Art.59 personal data processing can be invoked
- EU AI Act Art.58: Real-World Testing Outside AI Regulatory Sandboxes — Developer Guide — Art.58 covers real-world testing outside the sandbox with its own consent requirements; the two routes are mutually exclusive for the same testing activity
- EU AI Act Art.60: SME and Startup Measures for AI Innovation — Developer Guide — SMEs using Art.60 priority sandbox access benefit most from Art.59's personal data provisions, since sandbox testing is their primary compliance-cost lever
- GDPR Art.25: Privacy by Design for Developers — Hosting Guide — Art.59 does not suspend GDPR; Art.25 privacy-by-design obligations run in parallel with sandbox personal data processing
- EU AI Act Art.10: Training Data Governance for High-Risk AI — Developer Guide — training data collected under Art.59 must still satisfy Art.10 data governance requirements before the model can be placed on the market