EU AI Act Art.55: AI Office Evaluation Powers over GPAI Models with Systemic Risk — Developer Guide (2026)
EU AI Act Article 55 establishes the AI Office's investigation and evaluation powers over providers of general-purpose AI (GPAI) models with systemic risk. While Art.53 imposes obligations that providers implement and report on themselves — adversarial testing, incident reporting, cybersecurity, energy transparency — Art.55 gives the AI Office independent authority to verify, evaluate, and act on those obligations from the outside.
Art.55 is the enforcement interface of Chapter V: the mechanism through which the AI Office exercises oversight authority over the most powerful AI models on the EU market. It answers a critical question for frontier AI developers: if the AI Office decides to look closely at your model, what exactly can they do, what must you provide, and what happens next?
Art.55 became applicable on 2 August 2025 as part of Chapter V of the EU AI Act (Regulation (EU) 2024/1689). Any provider of a GPAI model that meets the Art.51 systemic risk classification — whether via the 10²⁵ FLOPs compute presumption or a Commission decision — must be prepared to respond to AI Office model evaluations under Art.55 from that date forward.
For EU infrastructure providers and PaaS operators — including sota.io — Art.55 is significant for a structural reason: when model weights, training data, and inference infrastructure are hosted on EU-established infrastructure under EU jurisdiction, the Art.55 evaluation process operates under a single legal order. Providers relying on US-incorporated cloud infrastructure face a dual-jurisdiction problem: AI Office demands under Art.55 and potential US government access under the CLOUD Act operate simultaneously.
Art.55 in the Chapter V GPAI Enforcement Architecture
Art.55 sits at the active oversight end of the Chapter V obligation structure:
| Article | Title | Function |
|---|---|---|
| Art.51 | Systemic risk classification | Defines threshold — who Art.55 applies to |
| Art.52 | General GPAI obligations | Baseline self-compliance (all GPAI providers) |
| Art.53 | Systemic risk obligations | Enhanced self-compliance (adversarial testing, incident reporting) |
| Art.54 | Authorised representatives | Non-EU provider gateway obligation |
| Art.55 | AI Office evaluation powers | External oversight — AI Office action trigger |
| Art.56 | Codes of practice | Compliance pathway and conformity presumption |
Art.55 is distinct from the general market surveillance powers in Art.74-75. Art.74-75 cover investigations by national Market Surveillance Authorities (MSAs) into high-risk AI systems generally; Art.75 specifically addresses GPAI components in high-risk systems. Art.55, by contrast, operates at the Chapter V level — it is the AI Office's own evaluation mechanism for GPAI models with systemic risk, independent of whether those models are embedded in a high-risk AI system.
Art.55(1): AI Office Mandate to Conduct Model Evaluations
Art.55(1) establishes the AI Office's core authority: it may conduct model evaluations of GPAI models with systemic risk, either on its own initiative or following a referral.
What "Model Evaluation" Means Under Art.55
A model evaluation under Art.55 is a structured technical assessment that may include:
| Evaluation Type | Scope | Trigger |
|---|---|---|
| Capability evaluation | What the model can and cannot do across a defined task spectrum | Baseline mandate, periodic review |
| Systemic risk evaluation | Whether the model's capabilities create risks at scale — CBRN uplift, manipulation, autonomous action | Post-incident review, new evidence |
| Adversarial testing verification | Whether the provider's Art.53(1)(a) adversarial testing was conducted adequately and under appropriate protocols | Inadequate Art.53 reports, CoP deviation |
| Cybersecurity posture assessment | Whether Art.53(1)(c) cybersecurity measures are proportionate to the model's systemic risk level | Following a security incident or discovered vulnerability |
| Energy efficiency audit | Whether Art.53(1)(d) energy reporting accurately reflects training and inference energy consumption | Discrepancies in reported figures |
The AI Office may initiate an evaluation without requiring a prior Art.73 incident report or a formal finding of non-compliance. Art.55 is a proactive oversight tool — the AI Office does not need a complaint or breach suspicion to begin an evaluation.
Evaluation Frequency and Triggers
Art.55 does not specify mandatory evaluation intervals. Evaluations are triggered by:
- Periodic monitoring — The AI Office's ongoing GPAI market monitoring function (Art.68) generates evaluation candidates
- Art.53(1)(b) incident reports — Serious incidents reported by providers may trigger a follow-up evaluation
- Third-party alerts — Researchers, downstream providers, or civil society organisations may flag concerns
- Scientific Panel recommendations — The Scientific Panel under Art.68 may recommend evaluation of specific models
- Code of Practice deviation — If a provider departs from the Art.56 CoP without equivalent alternative measures, the AI Office may evaluate
- Commission request — The Commission may direct the AI Office to evaluate a specific model
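The six trigger categories above can be modelled as a small taxonomy for internal tracking. A minimal sketch — the enum names and the trigger-to-evaluation mapping are our own illustrative labels, not terminology from the Act:

```python
from enum import Enum

class EvaluationTrigger(Enum):
    """Illustrative taxonomy of Art.55 evaluation triggers (labels are ours)."""
    PERIODIC_MONITORING = "art68_market_monitoring"
    INCIDENT_REPORT = "art53_1_b_incident_report"
    THIRD_PARTY_ALERT = "third_party_alert"
    SCIENTIFIC_PANEL = "scientific_panel_recommendation"
    COP_DEVIATION = "art56_cop_deviation"
    COMMISSION_REQUEST = "commission_request"

# A plausible (non-normative) mapping from trigger to the evaluation type
# it most commonly initiates, following the evaluation-type table earlier:
LIKELY_EVALUATION = {
    EvaluationTrigger.PERIODIC_MONITORING: "capability_evaluation",
    EvaluationTrigger.INCIDENT_REPORT: "systemic_risk_evaluation",
    EvaluationTrigger.COP_DEVIATION: "adversarial_testing_verification",
}
```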
Art.55(2): AI Office Information and Access Powers
Art.55(2) enumerates the specific powers the AI Office may exercise in conducting a model evaluation:
Art.55(2)(a): Documentation Requests
The AI Office may require the provider to submit:
- Annex XI technical documentation compiled under Art.52(1)(a)
- Adversarial testing results from Art.53(1)(a) evaluations
- Incident reports and incident management records from Art.53(1)(b)
- Cybersecurity measure documentation under Art.53(1)(c)
- Energy consumption data under Art.53(1)(d)
- Any additional technical documentation the AI Office considers necessary for the evaluation
Response deadlines are not fixed in Art.55 itself but must be "reasonable" — typically aligned with Art.74(2)(a) standards of 10 working days for initial documentation production, with extensions for large-scale or complex documentation packages.
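Working-day deadlines are easy to get wrong when computed by hand. A minimal sketch of the calculation, counting weekdays only — a production deadline calendar would also need to exclude public holidays:

```python
from datetime import date, timedelta

def add_working_days(start: date, working_days: int) -> date:
    """Advance `start` by `working_days`, counting Monday through Friday only."""
    current = start
    added = 0
    while added < working_days:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Mon=0 ... Fri=4
            added += 1
    return current

# A notification received on Friday 1 Aug 2025 gives a 10-working-day
# deadline of Friday 15 Aug 2025:
print(add_working_days(date(2025, 8, 1), 10))  # 2025-08-15
```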
Art.55(2)(b): Model Access for Evaluation
Beyond documentation, the AI Office may request access to the model itself for the purpose of running evaluation tests. This includes:
- API access — A dedicated evaluation API instance with agreed query limits and logging
- Controlled environment testing — The AI Office or its designated experts may conduct evaluations in a provider-controlled environment that prevents model weight exfiltration while allowing capability assessment
- Benchmark execution — Running standardised evaluation benchmarks (as developed under Art.56 and the AI Office's standardisation work) against the production model
This is the most operationally significant Art.55 power for providers. Preparing for model access requests requires:
- A model access governance protocol — who within the organisation can authorise AI Office access
- An evaluation environment — isolated from production, with appropriate logging and rate controls
- A legal review process — ensuring that providing model access does not itself create CLOUD Act compellability issues
Art.55(2)(c): Expert Resource Requests
The AI Office may request that providers make human resources available to support the evaluation — including engineers who understand the model's architecture, training methodology, and safety evaluation design. This is distinct from documentation; it requires live technical collaboration.
Art.55(3): Independent Experts and the Scientific Panel
Art.55(3) authorises the AI Office to designate independent experts to conduct or participate in model evaluations. These experts are drawn from the Scientific Panel established under Art.68(1).
Scientific Panel Role in Art.55 Evaluations
The Scientific Panel's involvement transforms Art.55 evaluations from an administrative documentation review into a technical peer review process:
| Role | Scientific Panel Function |
|---|---|
| Capability assessment | Independent technical assessment of model capabilities across defined risk domains |
| Adversarial testing oversight | Review of provider's Art.53(1)(a) testing methodology and results for adequacy |
| Threshold evaluation | Technical advice on whether a model crosses the Art.51 systemic risk threshold |
| Novel risk identification | Flagging emerging systemic risks not covered by existing evaluation frameworks |
| CoP adequacy assessment | Evaluating whether a provider's CoP alternative measures are equivalent to the standard requirements |
Confidentiality Obligations of Experts
Independent experts designated under Art.55(3) are bound by strict professional secrecy obligations. They cannot disclose:
- Model architecture details
- Training data composition
- Evaluation results until publication is authorised
- Commercially sensitive technical parameters
This confidentiality framework gives providers assurance that participating in Art.55 evaluations does not result in proprietary technical information entering the public domain.
Art.55(4): Provider Cooperation Obligation
Art.55(4) imposes a mandatory cooperation duty on providers of GPAI models with systemic risk. Non-cooperation is not a compliance option.
What Cooperation Requires
| Cooperation Dimension | Specific Obligations |
|---|---|
| Documentation production | Submit requested documentation within specified timeframes |
| Model access | Provide evaluation access to the model as specified by the AI Office |
| Personnel availability | Make technical staff available for expert consultation during evaluation |
| Infrastructure access | Provide access to evaluation infrastructure (read-only, controlled) |
| Response accuracy | Provide complete, accurate, and non-misleading responses to AI Office inquiries |
| Non-obstruction | Refrain from actions that delay, impede, or undermine the evaluation process |
Sanctions for Non-Cooperation
Art.55(4) non-cooperation triggers enforcement under Art.101, which covers violations by GPAI model providers:
| Violation | Maximum Fine |
|---|---|
| Non-cooperation with AI Office model evaluation | €15 million or 3% of global annual turnover (whichever is higher) |
| Providing false or misleading information | €15 million or 3% of global annual turnover |
For a company with €10 billion annual revenue (e.g., a major frontier AI laboratory), the 3% prong equals €300 million — far above the nominal €15 million figure, which acts as a floor rather than a cap under the "whichever is higher" rule.
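The "whichever is higher" mechanics reduce to a single `max()` call. A sketch of the exposure calculation (function name is ours):

```python
def art101_max_fine(global_annual_turnover_eur: float) -> float:
    """Art.101 GPAI fine ceiling: the higher of EUR 15 million
    or 3% of global annual turnover."""
    return max(15_000_000.0, 0.03 * global_annual_turnover_eur)

# EUR 10 billion turnover: the 3% prong dominates
print(f"{art101_max_fine(10_000_000_000):,.0f}")  # 300,000,000
# EUR 100 million turnover: the EUR 15 million floor dominates
print(f"{art101_max_fine(100_000_000):,.0f}")  # 15,000,000
```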
Art.55(4) × Art.21: Cooperation as Systemic Obligation
Art.55(4) does not operate in isolation. It intersects with Art.21, which imposes a general cooperation obligation on all AI value chain participants. For downstream SaaS developers building on GPAI APIs:
- If the AI Office requests cooperation from a GPAI provider under Art.55, and that provider's compliance depends on information held by a downstream integrator, Art.21 may extend the cooperation duty into the downstream layer
- Downstream providers using GPAI APIs under contractual arrangements should ensure those contracts include Art.55 cooperation clauses — obligations for the upstream GPAI provider to notify downstream integrators of AI Office evaluation proceedings that may require downstream information production
Art.55(5): Corrective Measure Recommendations
Following an Art.55 evaluation, the AI Office may issue recommendations for corrective measures if it finds that:
- The provider's Art.53 obligations were not adequately fulfilled
- The model poses systemic risks not addressed by current mitigation measures
- The provider's Code of Practice implementation is insufficient
- New systemic risks have emerged since the last evaluation
Nature of Recommendations
Art.55(5) recommendations are not merely advisory opinions; they carry formal weight on a comply-or-explain basis:
| Recommendation Type | Legal Effect |
|---|---|
| Corrective action recommendation | Provider must comply or explain non-compliance in writing |
| Risk mitigation measure recommendation | Specific technical or operational changes to address identified risks |
| Adversarial testing scope expansion | Extending Art.53(1)(a) testing to newly identified risk categories |
| AI Office evaluation follow-up | Triggering a second evaluation after a defined correction period |
If the provider does not implement recommended corrective measures and cannot provide equivalent alternative measures, the AI Office may:
- Refer to national Market Surveillance Authorities for formal enforcement under Art.74
- Issue a formal finding under Art.79
- Recommend Commission action including market restriction measures
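The escalation path just described can be sketched as a simple decision function. This is our own illustrative encoding of the text above, not a procedure defined in the Act; the return labels are hypothetical:

```python
def escalation_outcome(measures_implemented: bool,
                       equivalent_alternatives_shown: bool) -> str:
    """Sketch of the follow-up after an Art.55(5) corrective measure
    recommendation: correction leads to a follow-up evaluation; neither
    compliance nor demonstrated equivalence opens the escalation options."""
    if measures_implemented or equivalent_alternatives_shown:
        return "follow_up_evaluation"
    return "escalate:art74_msa_referral|art79_finding|commission_action"

print(escalation_outcome(measures_implemented=False,
                         equivalent_alternatives_shown=False))
```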
Art.55(5) × Art.56 Code of Practice Interaction
The Code of Practice (Art.56) creates a safe harbour dynamic relative to Art.55 corrective measure recommendations:
- A provider that adheres to the CoP is presumed to comply with Art.52-55 obligations
- Art.55 corrective measure recommendations therefore primarily target providers outside the CoP or providers that have deviated from CoP requirements
- If a CoP-adherent provider receives an Art.55 corrective measure recommendation, it can invoke CoP conformity as a defence — shifting the burden to the AI Office to demonstrate that CoP adherence was insufficient
This interaction makes Art.56 CoP participation a risk management tool against Art.55 regulatory exposure, not merely a compliance pathway.
Art.55 × CLOUD Act: The Dual-Jurisdiction Evaluation Problem
Art.55 evaluations create a legally complex scenario for providers whose infrastructure is subject to both EU AI Act obligations and US CLOUD Act jurisdiction.
The Dual-Compellability Scenario
| Legal Authority | Demand | Legal Basis |
|---|---|---|
| EU AI Office (Art.55) | Model documentation, evaluation access, technical information | Regulation (EU) 2024/1689 Art.55 |
| US Government (CLOUD Act) | Model weights, training data, evaluation results | CLOUD Act 18 U.S.C. § 2713 |
A provider operating under both regimes faces a structural conflict:
- Complying with Art.55 requires producing documentation and providing model access to the AI Office
- The CLOUD Act means that US authorities can compel production of the same information independently — including information produced for or held in connection with AI Office evaluations
- Art.70 (confidentiality obligations) protects AI Office evaluation records within EU proceedings, but does not override US CLOUD Act jurisdiction over US-incorporated or US-controlled entities
EU Infrastructure as Risk Mitigation
Providers that train, store, and operate GPAI models on EU-established, EU-incorporated infrastructure — not US cloud infrastructure — reduce CLOUD Act exposure for:
- Model weights and checkpoints
- Training data and RLHF preference records
- Adversarial testing results compiled under Art.53(1)(a)
- Art.55 evaluation records and correspondence
For downstream SaaS developers building applications on GPAI APIs, this creates a procurement argument: when selecting between GPAI API providers, a provider whose model infrastructure is wholly EU-established and EU-operated can demonstrate a cleaner Art.55 compliance profile without dual-jurisdiction contamination.
sota.io operates as an EU GmbH with all infrastructure in Frankfurt (EU-West) — providing downstream developers deploying EU-facing applications with a deployment environment that does not add additional US-jurisdiction exposure on top of their GPAI provider dependencies.
Art.55 Evaluation Lifecycle: A Developer's Practical Timeline
```
AI Office Identifies Evaluation Target (Art.55(1))
         │
         ▼
Formal Notification to Provider
(notification of evaluation scope + initial documentation request)
         │
         ▼
Provider Documentation Production Period
(10-15 working days for initial package — Art.55(2)(a))
         │
         ▼
AI Office/Scientific Panel Documentation Review
(2-4 weeks — internal AI Office process)
         │
         ▼
Model Access Phase (if required)
(Art.55(2)(b) — provider sets up evaluation environment)
         │
         ▼
Technical Evaluation by AI Office + Independent Experts
(Art.55(3) — 4-8 weeks typical)
         │
         ▼
Preliminary Findings Consultation
(provider receives draft findings, opportunity to respond)
         │
         ▼
Final Evaluation Report
         │
    ┌────┴────┐
    │         │
No Issues   Issues Found
    │         │
    ▼         ▼
Evaluation  Corrective Measure
Complete    Recommendation (Art.55(5))
                 │
            ┌────┴────┐
            │         │
     CoP-Adherent   Non-CoP
       Provider     Provider
            │         │
            ▼         ▼
       CoP Safe     Art.74
       Harbour      Referral
       Defence      Risk
```
Python Implementation: Art.55 Evaluation Tracking and Readiness Assessment
```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional
import uuid


class EvaluationType(Enum):
    CAPABILITY = "capability_evaluation"
    SYSTEMIC_RISK = "systemic_risk_evaluation"
    ADVERSARIAL_TESTING_VERIFICATION = "adversarial_testing_verification"
    CYBERSECURITY_POSTURE = "cybersecurity_posture"
    ENERGY_EFFICIENCY_AUDIT = "energy_efficiency_audit"
    FULL_EVALUATION = "full_evaluation"


class CooperationStatus(Enum):
    NOTIFIED = "evaluation_notified"
    DOCUMENTATION_PRODUCTION = "documentation_production"
    DOCUMENTATION_SUBMITTED = "documentation_submitted"
    MODEL_ACCESS_GRANTED = "model_access_granted"
    EVALUATION_IN_PROGRESS = "evaluation_in_progress"
    PRELIMINARY_FINDINGS = "preliminary_findings"
    RESPONSE_SUBMITTED = "response_submitted"
    EVALUATION_COMPLETE = "evaluation_complete"
    CORRECTIVE_MEASURES_ISSUED = "corrective_measures_issued"


@dataclass
class Art55EvaluationRecord:
    """Tracks an Art.55 AI Office model evaluation proceeding."""

    evaluation_id: str = field(
        default_factory=lambda: f"ART55-{uuid.uuid4().hex[:8].upper()}"
    )
    evaluation_type: EvaluationType = EvaluationType.FULL_EVALUATION
    notification_date: datetime = field(default_factory=datetime.now)
    status: CooperationStatus = CooperationStatus.NOTIFIED

    # Documentation production (Art.55(2)(a))
    documentation_deadline: Optional[datetime] = None
    documentation_submitted_date: Optional[datetime] = None
    documentation_package_contents: list[str] = field(default_factory=list)

    # Model access phase (Art.55(2)(b))
    model_access_requested: bool = False
    model_access_granted_date: Optional[datetime] = None
    evaluation_environment_url: Optional[str] = None

    # Expert involvement (Art.55(3))
    scientific_panel_involved: bool = False
    designated_expert_count: int = 0

    # Outcomes (Art.55(5))
    preliminary_findings_date: Optional[datetime] = None
    provider_response_deadline: Optional[datetime] = None
    final_report_date: Optional[datetime] = None
    corrective_measures: list[str] = field(default_factory=list)
    cop_adherence_invoked: bool = False

    def set_documentation_deadline(self, working_days: int = 10) -> None:
        """Set the documentation production deadline (default 10 working days).

        Simplified: skips weekends but does not account for public holidays.
        """
        deadline = self.notification_date
        days_added = 0
        while days_added < working_days:
            deadline += timedelta(days=1)
            if deadline.weekday() < 5:  # Monday=0 ... Friday=4
                days_added += 1
        self.documentation_deadline = deadline

    def days_to_documentation_deadline(self) -> int:
        """Calendar days remaining; -1 if no deadline has been set."""
        if not self.documentation_deadline:
            return -1
        delta = self.documentation_deadline - datetime.now()
        return delta.days

    def is_at_risk_of_non_cooperation_fine(self) -> bool:
        """True if the documentation deadline has passed without submission."""
        if self.documentation_submitted_date:
            return False
        if not self.documentation_deadline:
            return False
        return datetime.now() > self.documentation_deadline

    def generate_cooperation_status_report(self) -> dict:
        return {
            "evaluation_id": self.evaluation_id,
            "type": self.evaluation_type.value,
            "current_status": self.status.value,
            "notification_date": self.notification_date.isoformat(),
            "documentation_deadline": (
                self.documentation_deadline.isoformat()
                if self.documentation_deadline
                else None
            ),
            "days_to_doc_deadline": self.days_to_documentation_deadline(),
            "at_risk_non_cooperation_fine": self.is_at_risk_of_non_cooperation_fine(),
            "model_access_requested": self.model_access_requested,
            "model_access_granted": self.model_access_granted_date is not None,
            "scientific_panel_involved": self.scientific_panel_involved,
            "corrective_measures_count": len(self.corrective_measures),
            "cop_adherence_invoked": self.cop_adherence_invoked,
        }


@dataclass
class GPAIModelEvaluationReadinessChecker:
    """Pre-evaluation readiness assessment for Art.55 compliance."""

    model_name: str
    provider_eu_established: bool
    infrastructure_jurisdiction: str  # "EU", "US", or "MIXED"
    cop_adherent: bool
    cop_documentation_current: bool
    annex_xi_documentation_complete: bool
    adversarial_testing_conducted: bool
    adversarial_testing_reported: bool
    incident_reporting_system_active: bool
    cybersecurity_measures_documented: bool
    energy_consumption_documented: bool
    evaluation_environment_ready: bool
    legal_counsel_designated: bool
    technical_contact_designated: bool

    def compute_readiness_score(self) -> int:
        """Score 0-14 across the Art.55 readiness dimensions."""
        checks = [
            self.provider_eu_established,
            self.infrastructure_jurisdiction == "EU",
            self.cop_adherent,
            self.cop_documentation_current,
            self.annex_xi_documentation_complete,
            self.adversarial_testing_conducted,
            self.adversarial_testing_reported,
            self.incident_reporting_system_active,
            self.cybersecurity_measures_documented,
            self.energy_consumption_documented,
            self.evaluation_environment_ready,
            self.legal_counsel_designated,
            self.technical_contact_designated,
            self.infrastructure_jurisdiction != "US",  # no CLOUD Act dual-jurisdiction
        ]
        return sum(checks)

    def identify_readiness_gaps(self) -> list[str]:
        gaps = []
        if not self.provider_eu_established:
            gaps.append("Art.54 authorised representative may be required (non-EU provider)")
        if self.infrastructure_jurisdiction == "US":
            gaps.append("CLOUD Act dual-jurisdiction risk: model weights on US infrastructure compellable by US authorities independently of Art.55")
        if self.infrastructure_jurisdiction == "MIXED":
            gaps.append("CLOUD Act partial risk: assess which infrastructure components are US-controlled")
        if not self.cop_adherent:
            gaps.append("No Art.56 Code of Practice adherence: CoP safe harbour unavailable against Art.55(5) corrective measures")
        if not self.cop_documentation_current:
            gaps.append("CoP documentation not current: adherence records must be maintained continuously")
        if not self.annex_xi_documentation_complete:
            gaps.append("Annex XI technical documentation incomplete: primary Art.55(2)(a) production target")
        if not self.adversarial_testing_conducted:
            gaps.append("No Art.53(1)(a) adversarial testing conducted: Art.55 evaluation will expose non-compliance")
        if not self.adversarial_testing_reported:
            gaps.append("Adversarial testing not reported to AI Office: Art.53(1)(a) reporting obligation unmet")
        if not self.incident_reporting_system_active:
            gaps.append("No Art.53(1)(b) incident reporting system: serious incidents may be unreported")
        if not self.cybersecurity_measures_documented:
            gaps.append("Cybersecurity measures not documented: Art.53(1)(c) compliance cannot be demonstrated")
        if not self.energy_consumption_documented:
            gaps.append("Energy consumption not documented: Art.53(1)(d) transparency obligation unmet")
        if not self.evaluation_environment_ready:
            gaps.append("No evaluation environment prepared: Art.55(2)(b) model access request cannot be fulfilled promptly")
        if not self.legal_counsel_designated:
            gaps.append("No designated legal counsel for Art.55 proceedings: cooperation responses need legal review")
        if not self.technical_contact_designated:
            gaps.append("No designated technical contact: Art.55(2)(c) personnel availability obligation unfulfilled")
        return gaps

    def generate_readiness_report(self) -> dict:
        score = self.compute_readiness_score()
        gaps = self.identify_readiness_gaps()
        return {
            "model": self.model_name,
            "readiness_score": f"{score}/14",
            "readiness_percentage": f"{(score / 14) * 100:.1f}%",
            "cop_safe_harbour_available": self.cop_adherent and self.cop_documentation_current,
            "cloud_act_risk": self.infrastructure_jurisdiction in ("US", "MIXED"),
            "gaps_count": len(gaps),
            "gaps": gaps,
            "recommendation": (
                "AI Office evaluation ready" if score >= 12
                else "Significant gaps — evaluation response at risk" if score >= 8
                else "Critical gaps — immediate remediation required"
            ),
        }


# Example usage
if __name__ == "__main__":
    checker = GPAIModelEvaluationReadinessChecker(
        model_name="ExampleGPAI-v2",
        provider_eu_established=True,
        infrastructure_jurisdiction="EU",
        cop_adherent=True,
        cop_documentation_current=True,
        annex_xi_documentation_complete=True,
        adversarial_testing_conducted=True,
        adversarial_testing_reported=True,
        incident_reporting_system_active=True,
        cybersecurity_measures_documented=True,
        energy_consumption_documented=True,
        evaluation_environment_ready=True,
        legal_counsel_designated=True,
        technical_contact_designated=True,
    )
    report = checker.generate_readiness_report()
    print(f"Readiness: {report['readiness_score']} — {report['recommendation']}")
    # Output: Readiness: 14/14 — AI Office evaluation ready
```
Art.55 Readiness Checklist (14 Items)
| # | Item | Art.55 Reference |
|---|---|---|
| 1 | EU establishment confirmed or Art.54 authorised representative appointed | Art.54 × Art.55 |
| 2 | Annex XI technical documentation compiled and current | Art.55(2)(a) × Art.52(1)(a) |
| 3 | Adversarial testing program established under Art.53(1)(a) | Art.55(2)(a) × Art.53(1)(a) |
| 4 | Adversarial testing results reported to AI Office | Art.55(2)(a) × Art.53(1)(a) |
| 5 | Serious incident reporting system active and tested | Art.55(2)(a) × Art.53(1)(b) |
| 6 | Cybersecurity measures documented at Art.53(1)(c) level | Art.55(2)(a) × Art.53(1)(c) |
| 7 | Energy consumption data compiled for training and inference | Art.55(2)(a) × Art.53(1)(d) |
| 8 | Dedicated model evaluation environment prepared (isolated, logged) | Art.55(2)(b) |
| 9 | Model access governance protocol documented (who can authorise AI Office access) | Art.55(2)(b) |
| 10 | Technical personnel designated for AI Office evaluation support | Art.55(2)(c) |
| 11 | Legal counsel designated for Art.55 cooperation responses | Art.55(4) |
| 12 | Art.56 Code of Practice adherence documented and current | Art.55(5) × Art.56 |
| 13 | CLOUD Act jurisdiction assessment completed for model infrastructure | Art.55 × CLOUD Act |
| 14 | Internal Art.55 response protocol established (notification → response pipeline) | Art.55(4) |