2026-04-09·12 min read·sota.io team

EU AI Act Regulatory Sandbox (Art.57-63): Developer Guide for High-Risk AI Testing

The EU AI Act's conformity assessment process for high-risk AI systems is demanding. Annex III systems require risk management systems (Art. 9), training data governance documentation (Art. 10), technical documentation packages (Art. 11), transparency statements (Art. 13), human oversight mechanisms (Art. 14), and post-market monitoring plans (Art. 72) — all completed before market placement.

For many developers, particularly SMEs and startups building novel high-risk AI systems, completing this documentation without live data from real-world deployments is circular: you need deployment data to properly document risks, but you cannot deploy without documenting risks first.

The EU AI Act Regulatory Sandbox (Articles 57-63) breaks this cycle. It creates a controlled legal environment where developers can test high-risk AI systems in real conditions — with real users, real data, and real operational contexts — before completing full conformity assessment. This guide explains what the sandbox is, who qualifies, how to apply, what liability applies during testing, and what infrastructure choices determine your data governance compliance.

What the Regulatory Sandbox Is (Art. 57)

Article 57 defines the AI regulatory sandbox as a controlled framework established by national competent authorities that allows providers and prospective providers to develop, train, test, and validate innovative AI systems for a limited period under regulatory supervision.

The sandbox is not a waiver of EU AI Act obligations. It is a supervised testing environment where:

Normal conformity assessment timelines are suspended for the duration
Developers gain direct access to national regulatory authorities for guidance
Real-world testing is permitted on a defined scope without full Art. 43 conformity certification
The authority monitors testing conditions and can terminate the sandbox at any point

Three conditions define what qualifies as a sandbox under Art. 57:

Innovative AI system: the system must have "significant novelty" in approach, capability, or application domain — incremental improvements to existing certified systems do not qualify
Development or pre-deployment phase: the sandbox covers systems not yet placed on the EU market; it cannot retroactively regularise already-deployed systems
Genuine supervisory purpose: the sandbox must provide regulatory value — authorities are not obliged to approve applications that do not advance their supervisory capacity

Each Member State must establish at least one regulatory sandbox at national level. Member States may also establish joint sandboxes covering multiple jurisdictions, though as of 2026 most operate at national level. The European AI Office coordinates cross-border sandbox activities and maintains a sandbox register.

Who Can Apply (Art. 58-59)

Article 58 establishes that regulatory sandboxes are available to any provider or prospective provider of a high-risk AI system — but grants priority access to SMEs and startups.

The priority definition uses EU Commission definitions:

Micro enterprises: fewer than 10 employees, annual turnover or balance sheet under €2M
Small enterprises: fewer than 50 employees, annual turnover or balance sheet under €10M
Medium enterprises: fewer than 250 employees, annual turnover under €50M or balance sheet under €43M
Startups: enterprises meeting the SME definition AND operating for fewer than 5 years since commercial registration

Priority means competent authorities must process SME/startup applications within shorter timeframes and cannot reject SME applications solely on capacity grounds without providing alternative supervisory arrangements.

Article 59 — Eligibility Conditions:

To enter the sandbox, providers must demonstrate:

1. Annex III Scope: The AI system must fall within at least one of the eight high-risk categories in Annex III:

Biometric identification and categorisation (Art. III.1)
Critical infrastructure management (Art. III.2)
Education and vocational training (Art. III.3)
Employment and workers management (Art. III.4)
Access to essential services and benefits (Art. III.5)
Law enforcement (Art. III.6)
Migration, asylum and border control (Art. III.7)
Administration of justice and democratic processes (Art. III.8)

General-purpose AI systems, even large-scale foundation models, do not automatically qualify for the sandbox unless their specific deployment use case maps to an Annex III category.

2. Pre-Market Status: The system must not already be placed on the EU market. Sandbox participation is prospective — it covers the development-to-market transition, not regularisation of existing deployments.

3. Data Protection Compliance Plan: Art. 59 requires providers to submit a GDPR compliance plan specifically covering sandbox testing. This includes data minimisation measures, purpose limitation controls, and — critically — documentation of where testing data will be stored and processed. This is where infrastructure jurisdiction becomes directly relevant.

4. Serious Incident Protocol: Providers must submit a protocol for identifying, documenting, and reporting serious incidents that occur during sandbox testing to both the competent authority and (for personal data processing) to the relevant Data Protection Authority.

5. Exit Plan: Providers must document how they will transition from sandbox to full conformity assessment upon sandbox conclusion, or how they will cease testing if conformity cannot be achieved.

Application Process and Duration

The application process varies by Member State, but Art. 59 specifies minimum elements:

Technical description of the AI system (sufficient to establish Annex III scope)
Description of the intended real-world testing environment
GDPR compliance plan
Serious incident protocol
Exit plan
SME/startup status documentation (if priority access claimed)

The competent authority must respond within a defined period (Member States specify, typically 30-90 days) and may request additional documentation. Rejection must be reasoned and written — providers may challenge rejections through administrative appeal.

Duration: Article 58 specifies a 12-month maximum sandbox period, with a possible 12-month extension under exceptional circumstances. The authority may terminate sandbox participation early if:

A serious incident occurs that was not disclosed per the incident protocol
The provider fails to implement corrective measures requested by the authority
The scope of testing materially differs from the approved application

During the sandbox period, providers submit periodic progress reports (frequency defined by national authority, typically quarterly) documenting testing results, incidents, and compliance measures.

Real-World Testing Conditions (Art. 60)

Article 60 covers real-world testing within the sandbox framework — allowing providers to test AI systems with actual subjects in authentic operational contexts, not just laboratory simulations.

This is the most practically significant provision for developers building AI systems where performance characteristics can only be validated with real users, real workflows, or real data streams.

What real-world testing permits:

Deployment to actual end users within the defined sandbox scope
Processing of real personal data under the approved GDPR plan
Integration with live operational systems (critical infrastructure testing requires elevated supervision)
Collection of operational performance data that feeds the Art. 9 risk management system

What real-world testing does NOT permit:

Testing on vulnerable populations without enhanced safeguards (Art. 60 specifies heightened requirements for systems that may affect children, patients, or criminal suspects)
Deploying to subjects without explicit informed consent to sandbox participation
Scope expansion beyond the approved application without authority approval
Using sandbox as a mechanism to commercially deploy systems before conformity — any revenue from sandbox testing requires explicit authority approval

Monitoring Requirements During Real-World Testing:

Providers must implement real-time monitoring during real-world testing that satisfies two parallel requirements:

AI Act Art. 72 post-market monitoring: the testing period generates the initial operational data that feeds the post-market monitoring system required for full market placement
Sandbox authority reporting: monitoring data must be accessible to the supervisory authority on request — this creates a direct access channel that requires the authority to be able to read and audit your operational logs

The authority access requirement has infrastructure implications: if your testing infrastructure is deployed on US-incorporated cloud providers, the supervisory authority's direct access channel may conflict with CLOUD Act data access provisions. EU national authorities are not equipped to navigate US legal frameworks to access monitoring data they are entitled to under Art. 60.

Pre-Market Testing Without Full Conformity (Art. 61)

Article 61 permits pre-market testing that goes beyond sandbox conditions — specifically, testing on real subjects outside the controlled sandbox environment — but under strict conditions.

This provision applies to systems where laboratory or sandbox testing cannot fully replicate operational conditions: for example, a medical AI system whose accuracy can only be validated against clinical workflows, or a border control system whose performance requires actual border crossing conditions.

Pre-market testing under Art. 61 requires:

Written consent from the supervisory authority (not merely notification — full approval)
Specific testing protocol identifying exactly what will be tested, with whom, in what context, for what duration
Independent safety monitoring: a third party not affiliated with the provider must oversee testing and can halt it if safety conditions are violated
Data isolation: data collected during pre-market testing must be isolated from commercial operations and deleted after the testing period unless retained for risk management documentation

The liability implications of Art. 61 testing differ from Art. 60 sandbox testing. Within the sandbox (Art. 60), the competent authority shares supervisory responsibility for the testing environment. Under Art. 61 pre-market testing, the provider retains full liability for outcomes — the authority approval creates a compliance framework, not a liability transfer.

SME and Startup Support Mechanisms (Art. 62)

Article 62 establishes specific support obligations for competent authorities toward SMEs and startups:

1. Fee Structure: Authorities must publish sandbox fees (if any) and must offer reduced or waived fees for SME/startup applicants. Several Member States have committed to fee-free sandbox access for companies under 50 employees.

2. Guidance Access: Competent authorities must provide SME/startup applicants with direct access to regulatory guidance — not merely published documents, but genuine advisory capacity. This is intended to compensate for SMEs' inability to afford external legal counsel at the scale that large enterprises can.

3. Technical Assistance: For complex Annex III categories (law enforcement, critical infrastructure), authorities may provide technical assistance teams that work alongside providers during sandbox testing. This is especially relevant for startups without in-house compliance expertise.

4. Fast-Track Processing: SME/startup applications receive expedited review compared to large enterprise applications. Member States implement this differently — some use dedicated SME sandbox teams, others prioritise SME applications in the review queue.

5. Cross-Border Coordination: The European AI Office maintains a sandbox coordination mechanism that helps SMEs operating in multiple Member States navigate sandbox applications across jurisdictions simultaneously.

# Sandbox application checklist — Art.59 required elements
sandbox_application = {
    "system_description": {
        "annex_iii_category": "Art.III.X",  # Specify exact category
        "innovation_claim": "...",           # What makes this 'significantly novel'
        "deployment_scope": "...",           # Who, where, how many subjects
        "technical_architecture": "..."     # High-level system description
    },
    "testing_environment": {
        "location": "...",                  # Country/region
        "infrastructure": "EU-native",      # Regulatory access requirement
        "duration_months": 12,             # Max Art.58 duration
        "subject_count_estimate": 0        # Affected natural persons
    },
    "gdpr_compliance_plan": {
        "legal_basis": "...",              # Art.6 GDPR basis for testing
        "data_minimisation": "...",        # Technical measures
        "retention_period": "...",         # How long data kept post-sandbox
        "jurisdiction": "EU",             # Storage jurisdiction (CLOUD Act relevance)
        "dpa_notification": False          # Required if sensitive categories
    },
    "serious_incident_protocol": {
        "definition": "...",              # What constitutes serious incident
        "detection_method": "...",        # How incidents are identified
        "reporting_timeline_hours": 24,   # To competent authority
        "escalation_path": "..."         # Internal escalation chain
    },
    "exit_plan": {
        "conformity_pathway": "Art.43A",  # Or Art.43B for self-assessment
        "transition_timeline": "...",     # From sandbox end to full conformity
        "abort_conditions": [...]         # If conformity not achievable
    }
}

AILD Intersection: Does the Causation Presumption Apply to Sandbox Testing?

A critical question for developers entering the AI Act Regulatory Sandbox: does the AI Liability Directive (AILD) Art. 4 causation presumption apply to harm caused during sandbox testing?

The AILD Art. 4 causation presumption triggers when:

A claimant proves a defendant violated an AI Act obligation
The claimant establishes that the violation plausibly caused the damage
Courts then presume causation (rebuttable by the provider)

During sandbox testing, a provider is explicitly operating in a pre-conformity environment. They have not yet passed Art. 43 conformity assessment. They may not have completed their Art. 9 risk management system or Art. 11 technical documentation package to full standard.

Does this mean every sandbox incident triggers Art. 4 causation presumption?

The legal analysis is nuanced:

Argument for presumption applying: AILD Art. 4 refers to violations of obligations "arising from" the AI Act — including the conformity obligations themselves. A provider who has not yet achieved conformity is technically in violation of Art. 43 requirements. If harm occurs during real-world sandbox testing (Art. 60), a claimant could argue the provider violated conformity obligations, triggering the presumption.

Argument against presumption applying: The sandbox is a legally authorised framework under Art. 57-63. Participation in the sandbox is itself an EU AI Act mechanism — not a violation of it. The competent authority's supervision creates a regulatory context that arguably displaces the presumption: the authority has approved the testing, is monitoring it, and is co-responsible for the supervisory framework. Treating sandbox-approved testing as a violation of the very Act that creates the sandbox creates a logical contradiction.

Practical implication: Until CJEU jurisprudence or Commission guidance clarifies this intersection, providers should treat sandbox testing as a period of elevated documentation discipline. The sandbox Art. 60 monitoring requirements (real-time logging, authority access channels) should be implemented with AILD Art. 3 evidence disclosure in mind: assume every incident log may be discoverable in subsequent liability proceedings. Document testing conditions, authority approvals, and corrective responses with the same legal defensibility as post-conformity operational logs.

import hashlib
import json
import time

def sandbox_incident_log(
    incident_type: str,
    subject_impact: str,
    detection_method: str,
    response_taken: str,
    authority_notified: bool,
    sandbox_reference: str  # Competent authority sandbox ID
) -> dict:
    """
    Sandbox testing incident log — dual purpose:
    1. Art.60 sandbox monitoring (authority access)
    2. AILD Art.3 evidence preservation (legal defensibility)
    
    Tamper-evident hashing ensures log integrity for
    potential AILD Art.3 disclosure proceedings.
    """
    entry = {
        "timestamp_utc": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "sandbox_reference": sandbox_reference,
        "incident_type": incident_type,
        "subject_impact": subject_impact,
        "detection_method": detection_method,
        "response_taken": response_taken,
        "authority_notified": authority_notified,
        "reporting_deadline_hours": 24,  # Art.59 serious incident reporting
    }
    
    # Tamper-evident hash for legal admissibility
    entry_hash = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    entry["integrity_hash"] = entry_hash
    
    return entry

Infrastructure Requirements for Sandbox Compliance

The sandbox data governance requirements create specific infrastructure constraints that most developers underestimate until they are already in the application process.

Constraint 1 — Competent Authority Access

Art. 60 real-world testing requires that the supervisory authority can access monitoring data on request. The authority is a national EU institution operating under EU administrative law. It cannot issue US court orders, navigate CLOUD Act proceedings, or compel disclosure from US-incorporated cloud providers through normal administrative channels.

If your sandbox testing infrastructure runs on AWS, Azure, or GCP (all US-incorporated), the authority's Art. 60 access right may become practically unenforceable. The authority cannot access your monitoring logs without initiating US legal proceedings — which no national EU competent authority has the mandate or resources to do for routine sandbox supervision.

Constraint 2 — GDPR Compliance Plan Storage

The Art. 59 GDPR compliance plan must identify where testing data will be stored. National Data Protection Authorities review this plan as part of sandbox approval. Several Member States' DPAs have explicitly stated they will not approve sandbox applications where testing data is stored on infrastructure subject to CLOUD Act jurisdiction — US-incorporated cloud providers in particular.

This is not a theoretical concern. Sandbox testing involves real users, real personal data, and often Annex III-category sensitive data (health data for medical AI, employment data for recruitment AI, criminal justice data for law enforcement AI). DPAs apply heightened scrutiny to Chapter V GDPR transfers for this data — and testing infrastructure that creates CLOUD Act exposure fails that scrutiny.

Constraint 3 — Cross-Border Sandbox Coordination

For providers operating in multiple Member States simultaneously, the European AI Office coordinates sandbox activities across national authorities. The coordination mechanism requires that testing data be accessible to each involved national authority — creating a multi-jurisdiction access requirement that compound the infrastructure constraints above.

EU-native infrastructure operated by EU-incorporated entities — with no US nexus — provides the cleanest compliance posture for sandbox applications. The supervisory authority gets direct access through EU administrative channels. The GDPR compliance plan passes DPA scrutiny. Cross-border coordination does not create CLOUD Act exposure at any node.

# Sandbox infrastructure compliance check
def assess_sandbox_infrastructure(provider_config: dict) -> dict:
    """
    Validates infrastructure configuration against sandbox
    compliance requirements (Art.59 GDPR plan + Art.60 authority access).
    """
    checks = []
    
    # Cloud Act exposure check
    us_incorporated_providers = ["aws", "azure", "gcp", "oracle-cloud"]
    infrastructure_provider = provider_config.get("cloud_provider", "").lower()
    
    if infrastructure_provider in us_incorporated_providers:
        checks.append({
            "check": "CLOUD_ACT_EXPOSURE",
            "status": "FAIL",
            "impact": "Art.60 authority access may be unenforceable",
            "remedy": "Migrate testing infrastructure to EU-incorporated provider"
        })
    else:
        checks.append({
            "check": "CLOUD_ACT_EXPOSURE",
            "status": "PASS",
            "detail": "No US nexus — EU authority access unobstructed"
        })
    
    # Data residency check
    if provider_config.get("data_residency") != "EU":
        checks.append({
            "check": "GDPR_DATA_RESIDENCY",
            "status": "FAIL",
            "impact": "Art.59 GDPR compliance plan likely rejected by DPA",
            "remedy": "Restrict testing data to EU-resident storage"
        })
    else:
        checks.append({
            "check": "GDPR_DATA_RESIDENCY",
            "status": "PASS"
        })
    
    # Audit log accessibility
    if not provider_config.get("authority_read_access_available"):
        checks.append({
            "check": "AUTHORITY_ACCESS_CHANNEL",
            "status": "WARN",
            "impact": "Art.60 requires authority access to monitoring data",
            "remedy": "Establish read-only authority access to monitoring logs"
        })
    
    return {
        "overall_status": "PASS" if all(c["status"] == "PASS" for c in checks) else "REVIEW",
        "checks": checks,
        "sandbox_application_ready": all(
            c["status"] in ["PASS", "WARN"] for c in checks
        )
    }

Sandbox Exit: Transition to Full Conformity

The sandbox period (up to 12 months) ends with one of three outcomes:

1. Successful transition to Art. 43 conformity assessment: the provider uses sandbox data to complete their Art. 9 risk management system, Art. 11 technical documentation, and Art. 13-14 transparency and oversight mechanisms. Conformity assessment proceeds under the applicable pathway (Art. 43A for notified body certification, Art. 43B for self-assessment where permitted).

2. Extended sandbox: Under exceptional circumstances, the competent authority may grant a 12-month extension. This typically requires demonstrating that conformity is achievable but requires additional real-world testing data that cannot be obtained within the initial period.

3. Termination without market placement: The provider determines that conformity is not achievable, or that the business case does not justify completing the conformity process. The Art. 59 exit plan governs data deletion, subject notification, and authority reporting in this scenario.

Post-sandbox obligations persist: even if a provider does not proceed to market placement, they retain obligations for data deletion, incident report retention, and authority cooperation for residual investigation purposes for a defined period (typically the limitation period under applicable national tort law, typically 3-10 years).

Key Takeaways

The sandbox breaks the documentation chicken-and-egg problem — you need operational data to properly document risks, but you need documented risks to deploy. Art. 57-63 explicitly creates a pathway for getting real operational data under regulatory supervision before completing full conformity assessment.

SMEs and startups have structural advantages in sandbox access — Art. 58-59 creates priority access and Art. 62 creates affirmative support obligations. If you are an SME building a novel Annex III AI system, the regulatory sandbox is the intended pathway, not the exception.

AILD causation presumption during sandbox testing is legally uncertain — treat sandbox incident logging with the same legal defensibility discipline as post-conformity operational logging. Document everything. Hash it. Preserve it.

Infrastructure jurisdiction is a sandbox application prerequisite, not an afterthought — the Art. 59 GDPR compliance plan requires identifying infrastructure, DPAs scrutinise CLOUD Act exposure for Annex III data categories, and Art. 60 authority access requires EU administrative channel viability. EU-native infrastructure resolves all three constraints simultaneously.

The sandbox period generates your conformity evidence base — the Art. 60 monitoring data, incident logs, and risk observations collected during sandbox testing become the primary inputs to your Art. 9 risk management system and Art. 11 technical documentation. Treat the sandbox as the first chapter of your conformity file, not a separate regulatory process.