EU AI Act Art.55: AI Office Evaluation Powers over GPAI Models with Systemic Risk — Developer Guide (2026)
EU AI Act Article 55 establishes the AI Office's investigation and evaluation powers over providers of general-purpose AI (GPAI) models with systemic risk. While Art.53 imposes obligations that providers implement and report on themselves — adversarial testing, incident reporting, cybersecurity, energy transparency — Art.55 gives the AI Office independent authority to verify, evaluate, and act on those obligations from the outside.
Art.55 is the enforcement interface of Chapter V: the mechanism through which the AI Office exercises oversight authority over the most powerful AI models on the EU market. It answers a critical question for frontier AI developers: if the AI Office decides to look closely at your model, what exactly can they do, what must you provide, and what happens next?
Art.55 became applicable on 2 August 2025 as part of Chapter V of the EU AI Act (Regulation (EU) 2024/1689). Any provider of a GPAI model that meets the Art.51 systemic risk classification — whether via the 10²⁵ FLOPs compute presumption or a Commission decision — must be prepared to respond to AI Office model evaluations under Art.55 from that date forward.
For EU infrastructure providers and PaaS operators — including sota.io — Art.55 is significant for a structural reason: when model weights, training data, and inference infrastructure are hosted on EU-established infrastructure under EU jurisdiction, the Art.55 evaluation process operates under a single legal order. Providers relying on US-incorporated cloud infrastructure face a dual-jurisdiction problem: AI Office demands under Art.55 and potential US government access under the CLOUD Act operate simultaneously.
Art.55 in the Chapter V GPAI Enforcement Architecture
Art.55 sits at the active oversight end of the Chapter V obligation structure:
| Article | Title | Function |
|---|---|---|
| Art.51 | Systemic risk classification | Defines threshold — who Art.55 applies to |
| Art.52 | General GPAI obligations | Baseline self-compliance (all GPAI providers) |
| Art.53 | Systemic risk obligations | Enhanced self-compliance (adversarial testing, incident reporting) |
| Art.54 | Authorised representatives | Non-EU provider gateway obligation |
| Art.55 | AI Office evaluation powers | External oversight — AI Office action trigger |
| Art.56 | Codes of practice | Compliance pathway and conformity presumption |
Art.55 is distinct from the general market surveillance powers in Art.74-75. Art.74-75 cover investigations by national Market Surveillance Authorities (MSAs) into high-risk AI systems generally; Art.75 specifically addresses GPAI components in high-risk systems. Art.55, by contrast, operates at the Chapter V level — it is the AI Office's own evaluation mechanism for GPAI models with systemic risk, independent of whether those models are embedded in a high-risk AI system.
Art.55(1): AI Office Mandate to Conduct Model Evaluations
Art.55(1) establishes the AI Office's core authority: it may conduct model evaluations of GPAI models with systemic risk, either on its own initiative or following a referral.
What "Model Evaluation" Means Under Art.55
A model evaluation under Art.55 is a structured technical assessment that may include:
| Evaluation Type | Scope | Trigger |
|---|---|---|
| Capability evaluation | What the model can and cannot do across a defined task spectrum | Baseline mandate, periodic review |
| Systemic risk evaluation | Whether the model's capabilities create risks at scale — CBRN uplift, manipulation, autonomous action | Post-incident review, new evidence |
| Adversarial testing verification | Whether the provider's Art.53(1)(a) adversarial testing was conducted adequately and under appropriate protocols | Inadequate Art.53 reports, CoP deviation |
| Cybersecurity posture assessment | Whether Art.53(1)(c) cybersecurity measures are proportionate to the model's systemic risk level | Following a security incident or discovered vulnerability |
| Energy efficiency audit | Whether Art.53(1)(d) energy reporting accurately reflects training and inference energy consumption | Discrepancies in reported figures |
The AI Office may initiate an evaluation without requiring a prior Art.73 incident report or a formal finding of non-compliance. Art.55 is a proactive oversight tool — the AI Office does not need a complaint or breach suspicion to begin an evaluation.
Evaluation Frequency and Triggers
Art.55 does not specify mandatory evaluation intervals. Evaluations are triggered by:
- Periodic monitoring — The AI Office's ongoing GPAI market monitoring function (Art.68) generates evaluation candidates
- Art.53(1)(b) incident reports — Serious incidents reported by providers may trigger a follow-up evaluation
- Third-party alerts — Researchers, downstream providers, or civil society organisations may flag concerns
- Scientific Panel recommendations — The Scientific Panel under Art.68 may recommend evaluation of specific models
- Code of Practice deviation — If a provider departs from the Art.56 CoP without equivalent alternative measures, the AI Office may evaluate
- Commission request — The Commission may direct the AI Office to evaluate a specific model
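The six trigger categories above can be modelled as a small taxonomy for internal tracking. A minimal sketch — the enum names and the trigger-to-evaluation mapping are our own illustrative labels, not terminology from the Act:

```python
from enum import Enum

class EvaluationTrigger(Enum):
    """Illustrative taxonomy of Art.55 evaluation triggers (labels are ours)."""
    PERIODIC_MONITORING = "art68_market_monitoring"
    INCIDENT_REPORT = "art53_1_b_incident_report"
    THIRD_PARTY_ALERT = "third_party_alert"
    SCIENTIFIC_PANEL = "scientific_panel_recommendation"
    COP_DEVIATION = "art56_cop_deviation"
    COMMISSION_REQUEST = "commission_request"

# A plausible (non-normative) mapping from trigger to the evaluation type
# it most commonly initiates, following the evaluation-type table earlier:
LIKELY_EVALUATION = {
    EvaluationTrigger.PERIODIC_MONITORING: "capability_evaluation",
    EvaluationTrigger.INCIDENT_REPORT: "systemic_risk_evaluation",
    EvaluationTrigger.COP_DEVIATION: "adversarial_testing_verification",
}
```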
Art.55(2): AI Office Information and Access Powers
Art.55(2) enumerates the specific powers the AI Office may exercise in conducting a model evaluation:
Art.55(2)(a): Documentation Requests
The AI Office may require the provider to submit:
- Annex XI technical documentation compiled under Art.52(1)(a)
- Adversarial testing results from Art.53(1)(a) evaluations
- Incident reports and incident management records from Art.53(1)(b)
- Cybersecurity measure documentation under Art.53(1)(c)
- Energy consumption data under Art.53(1)(d)
- Any additional technical documentation the AI Office considers necessary for the evaluation
Response deadlines are not fixed in Art.55 itself but must be "reasonable" — typically aligned with Art.74(2)(a) standards of 10 working days for initial documentation production, with extensions for large-scale or complex documentation packages.
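Working-day deadlines are easy to get wrong when computed by hand. A minimal sketch of the calculation, counting weekdays only — a production deadline calendar would also need to exclude public holidays:

```python
from datetime import date, timedelta

def add_working_days(start: date, working_days: int) -> date:
    """Advance `start` by `working_days`, counting Monday through Friday only."""
    current = start
    added = 0
    while added < working_days:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Mon=0 ... Fri=4
            added += 1
    return current

# A notification received on Friday 1 Aug 2025 gives a 10-working-day
# deadline of Friday 15 Aug 2025:
print(add_working_days(date(2025, 8, 1), 10))  # 2025-08-15
```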
Art.55(2)(b): Model Access for Evaluation
Beyond documentation, the AI Office may request access to the model itself for the purpose of running evaluation tests. This includes:
- API access — A dedicated evaluation API instance with agreed query limits and logging
- Controlled environment testing — The AI Office or its designated experts may conduct evaluations in a provider-controlled environment that prevents model weight exfiltration while allowing capability assessment
- Benchmark execution — Running standardised evaluation benchmarks (as developed under Art.56 and the AI Office's standardisation work) against the production model
This is the most operationally significant Art.55 power for providers. Preparing for model access requests requires:
- A model access governance protocol — who within the organisation can authorise AI Office access
- An evaluation environment — isolated from production, with appropriate logging and rate controls
- A legal review process — ensuring that providing model access does not itself create CLOUD Act compellability issues
Art.55(2)(c): Expert Resource Requests
The AI Office may request that providers make human resources available to support the evaluation — including engineers who understand the model's architecture, training methodology, and safety evaluation design. This is distinct from documentation; it requires live technical collaboration.
Art.55(3): Independent Experts and the Scientific Panel
Art.55(3) authorises the AI Office to designate independent experts to conduct or participate in model evaluations. These experts are drawn from the Scientific Panel established under Art.68(1).
Scientific Panel Role in Art.55 Evaluations
The Scientific Panel's involvement transforms Art.55 evaluations from an administrative documentation review into a technical peer review process:
| Role | Scientific Panel Function |
|---|---|
| Capability assessment | Independent technical assessment of model capabilities across defined risk domains |
| Adversarial testing oversight | Review of provider's Art.53(1)(a) testing methodology and results for adequacy |
| Threshold evaluation | Technical advice on whether a model crosses the Art.51 systemic risk threshold |
| Novel risk identification | Flagging emerging systemic risks not covered by existing evaluation frameworks |
| CoP adequacy assessment | Evaluating whether a provider's CoP alternative measures are equivalent to the standard requirements |
Confidentiality Obligations of Experts
Independent experts designated under Art.55(3) are bound by strict professional secrecy obligations. They cannot disclose:
- Model architecture details
- Training data composition
- Evaluation results until publication is authorised
- Commercially sensitive technical parameters
This confidentiality framework gives providers assurance that participating in Art.55 evaluations does not result in proprietary technical information entering the public domain.
Art.55(4): Provider Cooperation Obligation
Art.55(4) imposes a mandatory cooperation duty on providers of GPAI models with systemic risk. Non-cooperation is not a compliance option.
What Cooperation Requires
| Cooperation Dimension | Specific Obligations |
|---|---|
| Documentation production | Submit requested documentation within specified timeframes |
| Model access | Provide evaluation access to the model as specified by the AI Office |
| Personnel availability | Make technical staff available for expert consultation during evaluation |
| Infrastructure access | Provide access to evaluation infrastructure (read-only, controlled) |
| Response accuracy | Provide complete, accurate, and non-misleading responses to AI Office inquiries |
| Non-obstruction | Refrain from actions that delay, impede, or undermine the evaluation process |
Sanctions for Non-Cooperation
Art.55(4) non-cooperation triggers enforcement under Art.101, which covers violations by GPAI model providers:
| Violation | Maximum Fine |
|---|---|
| Non-cooperation with AI Office model evaluation | €15 million or 3% of global annual turnover (whichever is higher) |
| Providing false or misleading information | €15 million or 3% of global annual turnover |
For a company with €10 billion annual revenue (e.g., a major frontier AI laboratory), the 3% prong equals €300 million — far above the nominal €15 million figure, which acts as a floor rather than a cap under the "whichever is higher" rule.
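The "whichever is higher" mechanics reduce to a single `max()` call. A sketch of the exposure calculation (function name is ours):

```python
def art101_max_fine(global_annual_turnover_eur: float) -> float:
    """Art.101 GPAI fine ceiling: the higher of EUR 15 million
    or 3% of global annual turnover."""
    return max(15_000_000.0, 0.03 * global_annual_turnover_eur)

# EUR 10 billion turnover: the 3% prong dominates
print(f"{art101_max_fine(10_000_000_000):,.0f}")  # 300,000,000
# EUR 100 million turnover: the EUR 15 million floor dominates
print(f"{art101_max_fine(100_000_000):,.0f}")  # 15,000,000
```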
Art.55(4) × Art.21: Cooperation as Systemic Obligation
Art.55(4) does not operate in isolation. It intersects with Art.21, which imposes a general cooperation obligation on all AI value chain participants. For downstream SaaS developers building on GPAI APIs:
- If the AI Office requests cooperation from a GPAI provider under Art.55, and that provider's compliance depends on information held by a downstream integrator, Art.21 may extend the cooperation duty into the downstream layer
- Downstream providers using GPAI APIs under contractual arrangements should ensure those contracts include Art.55 cooperation clauses — obligations for the upstream GPAI provider to notify downstream integrators of AI Office evaluation proceedings that may require downstream information production
Art.55(5): Corrective Measure Recommendations
Following an Art.55 evaluation, the AI Office may issue recommendations for corrective measures if it finds that:
- The provider's Art.53 obligations were not adequately fulfilled
- The model poses systemic risks not addressed by current mitigation measures
- The provider's Code of Practice implementation is insufficient
- New systemic risks have emerged since the last evaluation
Nature of Recommendations
Art.55(5) recommendations are not merely advisory opinions; they carry formal weight on a comply-or-explain basis:
| Recommendation Type | Legal Effect |
|---|---|
| Corrective action recommendation | Provider must comply or explain non-compliance in writing |
| Risk mitigation measure recommendation | Specific technical or operational changes to address identified risks |
| Adversarial testing scope expansion | Extending Art.53(1)(a) testing to newly identified risk categories |
| AI Office evaluation follow-up | Triggering a second evaluation after a defined correction period |
If the provider does not implement recommended corrective measures and cannot provide equivalent alternative measures, the AI Office may:
- Refer to national Market Surveillance Authorities for formal enforcement under Art.74
- Issue a formal finding under Art.79
- Recommend Commission action including market restriction measures
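The escalation path just described can be sketched as a simple decision function. This is our own illustrative encoding of the text above, not a procedure defined in the Act; the return labels are hypothetical:

```python
def escalation_outcome(measures_implemented: bool,
                       equivalent_alternatives_shown: bool) -> str:
    """Sketch of the follow-up after an Art.55(5) corrective measure
    recommendation: correction leads to a follow-up evaluation; neither
    compliance nor demonstrated equivalence opens the escalation options."""
    if measures_implemented or equivalent_alternatives_shown:
        return "follow_up_evaluation"
    return "escalate:art74_msa_referral|art79_finding|commission_action"

print(escalation_outcome(measures_implemented=False,
                         equivalent_alternatives_shown=False))
```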
Art.55(5) × Art.56 Code of Practice Interaction
The Code of Practice (Art.56) creates a safe harbour dynamic relative to Art.55 corrective measure recommendations:
- A provider that adheres to the CoP is presumed to comply with Art.52-55 obligations
- Art.55 corrective measure recommendations therefore primarily target providers outside the CoP or providers that have deviated from CoP requirements
- If a CoP-adherent provider receives an Art.55 corrective measure recommendation, it can invoke CoP conformity as a defence — shifting the burden to the AI Office to demonstrate that CoP adherence was insufficient
This interaction makes Art.56 CoP participation a risk management tool against Art.55 regulatory exposure, not merely a compliance pathway.
Art.55 × CLOUD Act: The Dual-Jurisdiction Evaluation Problem
Art.55 evaluations create a legally complex scenario for providers whose infrastructure is subject to both EU AI Act obligations and US CLOUD Act jurisdiction.
The Dual-Compellability Scenario
| Legal Authority | Demand | Legal Basis |
|---|---|---|
| EU AI Office (Art.55) | Model documentation, evaluation access, technical information | Regulation (EU) 2024/1689 Art.55 |
| US Government (CLOUD Act) | Model weights, training data, evaluation results | CLOUD Act 18 U.S.C. § 2713 |
A provider operating under both regimes faces a structural conflict:
- Complying with Art.55 requires producing documentation and providing model access to the AI Office
- The CLOUD Act means that US authorities can compel production of the same information independently — including information produced for or held in connection with AI Office evaluations
- Art.70 (confidentiality obligations) protects AI Office evaluation records within EU proceedings, but does not override US CLOUD Act jurisdiction over US-incorporated or US-controlled entities
EU Infrastructure as Risk Mitigation
Providers that train, store, and operate GPAI models on EU-established, EU-incorporated infrastructure — not US cloud infrastructure — reduce CLOUD Act exposure for:
- Model weights and checkpoints
- Training data and RLHF preference records
- Adversarial testing results compiled under Art.53(1)(a)
- Art.55 evaluation records and correspondence
For downstream SaaS developers building applications on GPAI APIs, this creates a procurement argument: when selecting between GPAI API providers, a provider whose model infrastructure is wholly EU-established and EU-operated can demonstrate a cleaner Art.55 compliance profile without dual-jurisdiction contamination.
sota.io operates as an EU GmbH with all infrastructure in Frankfurt (EU-West) — providing downstream developers deploying EU-facing applications with a deployment environment that does not add additional US-jurisdiction exposure on top of their GPAI provider dependencies.
Art.55 Evaluation Lifecycle: A Developer's Practical Timeline
```
AI Office Identifies Evaluation Target (Art.55(1))
         │
         ▼
Formal Notification to Provider
(notification of evaluation scope + initial documentation request)
         │
         ▼
Provider Documentation Production Period
(10-15 working days for initial package — Art.55(2)(a))
         │
         ▼
AI Office/Scientific Panel Documentation Review
(2-4 weeks — internal AI Office process)
         │
         ▼
Model Access Phase (if required)
(Art.55(2)(b) — provider sets up evaluation environment)
         │
         ▼
Technical Evaluation by AI Office + Independent Experts
(Art.55(3) — 4-8 weeks typical)
         │
         ▼
Preliminary Findings Consultation
(provider receives draft findings, opportunity to respond)
         │
         ▼
Final Evaluation Report
         │
    ┌────┴────┐
    │         │
No Issues   Issues Found
    │         │
    ▼         ▼
Evaluation  Corrective Measure
Complete    Recommendation (Art.55(5))
                 │
            ┌────┴────┐
            │         │
     CoP-Adherent   Non-CoP
       Provider     Provider
            │         │
            ▼         ▼
       CoP Safe     Art.74
       Harbour      Referral
       Defence      Risk
```
Python Implementation: Art.55 Evaluation Tracking and Readiness Assessment
```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional
import uuid


class EvaluationType(Enum):
    CAPABILITY = "capability_evaluation"
    SYSTEMIC_RISK = "systemic_risk_evaluation"
    ADVERSARIAL_TESTING_VERIFICATION = "adversarial_testing_verification"
    CYBERSECURITY_POSTURE = "cybersecurity_posture"
    ENERGY_EFFICIENCY_AUDIT = "energy_efficiency_audit"
    FULL_EVALUATION = "full_evaluation"


class CooperationStatus(Enum):
    NOTIFIED = "evaluation_notified"
    DOCUMENTATION_PRODUCTION = "documentation_production"
    DOCUMENTATION_SUBMITTED = "documentation_submitted"
    MODEL_ACCESS_GRANTED = "model_access_granted"
    EVALUATION_IN_PROGRESS = "evaluation_in_progress"
    PRELIMINARY_FINDINGS = "preliminary_findings"
    RESPONSE_SUBMITTED = "response_submitted"
    EVALUATION_COMPLETE = "evaluation_complete"
    CORRECTIVE_MEASURES_ISSUED = "corrective_measures_issued"


@dataclass
class Art55EvaluationRecord:
    """Tracks an Art.55 AI Office model evaluation proceeding."""

    evaluation_id: str = field(
        default_factory=lambda: f"ART55-{uuid.uuid4().hex[:8].upper()}"
    )
    evaluation_type: EvaluationType = EvaluationType.FULL_EVALUATION
    notification_date: datetime = field(default_factory=datetime.now)
    status: CooperationStatus = CooperationStatus.NOTIFIED

    # Documentation production (Art.55(2)(a))
    documentation_deadline: Optional[datetime] = None
    documentation_submitted_date: Optional[datetime] = None
    documentation_package_contents: list[str] = field(default_factory=list)

    # Model access phase (Art.55(2)(b))
    model_access_requested: bool = False
    model_access_granted_date: Optional[datetime] = None
    evaluation_environment_url: Optional[str] = None

    # Expert involvement (Art.55(3))
    scientific_panel_involved: bool = False
    designated_expert_count: int = 0

    # Outcomes (Art.55(5))
    preliminary_findings_date: Optional[datetime] = None
    provider_response_deadline: Optional[datetime] = None
    final_report_date: Optional[datetime] = None
    corrective_measures: list[str] = field(default_factory=list)
    cop_adherence_invoked: bool = False

    def set_documentation_deadline(self, working_days: int = 10) -> None:
        """Set the documentation production deadline (default 10 working days).

        Simplified: skips weekends but does not account for public holidays.
        """
        deadline = self.notification_date
        days_added = 0
        while days_added < working_days:
            deadline += timedelta(days=1)
            if deadline.weekday() < 5:  # Monday=0 ... Friday=4
                days_added += 1
        self.documentation_deadline = deadline

    def days_to_documentation_deadline(self) -> int:
        """Calendar days remaining; -1 if no deadline has been set."""
        if not self.documentation_deadline:
            return -1
        delta = self.documentation_deadline - datetime.now()
        return delta.days

    def is_at_risk_of_non_cooperation_fine(self) -> bool:
        """True if the documentation deadline has passed without submission."""
        if self.documentation_submitted_date:
            return False
        if not self.documentation_deadline:
            return False
        return datetime.now() > self.documentation_deadline

    def generate_cooperation_status_report(self) -> dict:
        return {
            "evaluation_id": self.evaluation_id,
            "type": self.evaluation_type.value,
            "current_status": self.status.value,
            "notification_date": self.notification_date.isoformat(),
            "documentation_deadline": (
                self.documentation_deadline.isoformat()
                if self.documentation_deadline
                else None
            ),
            "days_to_doc_deadline": self.days_to_documentation_deadline(),
            "at_risk_non_cooperation_fine": self.is_at_risk_of_non_cooperation_fine(),
            "model_access_requested": self.model_access_requested,
            "model_access_granted": self.model_access_granted_date is not None,
            "scientific_panel_involved": self.scientific_panel_involved,
            "corrective_measures_count": len(self.corrective_measures),
            "cop_adherence_invoked": self.cop_adherence_invoked,
        }


@dataclass
class GPAIModelEvaluationReadinessChecker:
    """Pre-evaluation readiness assessment for Art.55 compliance."""

    model_name: str
    provider_eu_established: bool
    infrastructure_jurisdiction: str  # "EU", "US", or "MIXED"
    cop_adherent: bool
    cop_documentation_current: bool
    annex_xi_documentation_complete: bool
    adversarial_testing_conducted: bool
    adversarial_testing_reported: bool
    incident_reporting_system_active: bool
    cybersecurity_measures_documented: bool
    energy_consumption_documented: bool
    evaluation_environment_ready: bool
    legal_counsel_designated: bool
    technical_contact_designated: bool

    def compute_readiness_score(self) -> int:
        """Score 0-14 across the Art.55 readiness dimensions."""
        checks = [
            self.provider_eu_established,
            self.infrastructure_jurisdiction == "EU",
            self.cop_adherent,
            self.cop_documentation_current,
            self.annex_xi_documentation_complete,
            self.adversarial_testing_conducted,
            self.adversarial_testing_reported,
            self.incident_reporting_system_active,
            self.cybersecurity_measures_documented,
            self.energy_consumption_documented,
            self.evaluation_environment_ready,
            self.legal_counsel_designated,
            self.technical_contact_designated,
            self.infrastructure_jurisdiction != "US",  # no CLOUD Act dual-jurisdiction
        ]
        return sum(checks)

    def identify_readiness_gaps(self) -> list[str]:
        gaps = []
        if not self.provider_eu_established:
            gaps.append("Art.54 authorised representative may be required (non-EU provider)")
        if self.infrastructure_jurisdiction == "US":
            gaps.append("CLOUD Act dual-jurisdiction risk: model weights on US infrastructure compellable by US authorities independently of Art.55")
        if self.infrastructure_jurisdiction == "MIXED":
            gaps.append("CLOUD Act partial risk: assess which infrastructure components are US-controlled")
        if not self.cop_adherent:
            gaps.append("No Art.56 Code of Practice adherence: CoP safe harbour unavailable against Art.55(5) corrective measures")
        if not self.cop_documentation_current:
            gaps.append("CoP documentation not current: adherence records must be maintained continuously")
        if not self.annex_xi_documentation_complete:
            gaps.append("Annex XI technical documentation incomplete: primary Art.55(2)(a) production target")
        if not self.adversarial_testing_conducted:
            gaps.append("No Art.53(1)(a) adversarial testing conducted: Art.55 evaluation will expose non-compliance")
        if not self.adversarial_testing_reported:
            gaps.append("Adversarial testing not reported to AI Office: Art.53(1)(a) reporting obligation unmet")
        if not self.incident_reporting_system_active:
            gaps.append("No Art.53(1)(b) incident reporting system: serious incidents may be unreported")
        if not self.cybersecurity_measures_documented:
            gaps.append("Cybersecurity measures not documented: Art.53(1)(c) compliance cannot be demonstrated")
        if not self.energy_consumption_documented:
            gaps.append("Energy consumption not documented: Art.53(1)(d) transparency obligation unmet")
        if not self.evaluation_environment_ready:
            gaps.append("No evaluation environment prepared: Art.55(2)(b) model access request cannot be fulfilled promptly")
        if not self.legal_counsel_designated:
            gaps.append("No designated legal counsel for Art.55 proceedings: cooperation responses need legal review")
        if not self.technical_contact_designated:
            gaps.append("No designated technical contact: Art.55(2)(c) personnel availability obligation unfulfilled")
        return gaps

    def generate_readiness_report(self) -> dict:
        score = self.compute_readiness_score()
        gaps = self.identify_readiness_gaps()
        return {
            "model": self.model_name,
            "readiness_score": f"{score}/14",
            "readiness_percentage": f"{(score / 14) * 100:.1f}%",
            "cop_safe_harbour_available": self.cop_adherent and self.cop_documentation_current,
            "cloud_act_risk": self.infrastructure_jurisdiction in ("US", "MIXED"),
            "gaps_count": len(gaps),
            "gaps": gaps,
            "recommendation": (
                "AI Office evaluation ready" if score >= 12
                else "Significant gaps — evaluation response at risk" if score >= 8
                else "Critical gaps — immediate remediation required"
            ),
        }


# Example usage
if __name__ == "__main__":
    checker = GPAIModelEvaluationReadinessChecker(
        model_name="ExampleGPAI-v2",
        provider_eu_established=True,
        infrastructure_jurisdiction="EU",
        cop_adherent=True,
        cop_documentation_current=True,
        annex_xi_documentation_complete=True,
        adversarial_testing_conducted=True,
        adversarial_testing_reported=True,
        incident_reporting_system_active=True,
        cybersecurity_measures_documented=True,
        energy_consumption_documented=True,
        evaluation_environment_ready=True,
        legal_counsel_designated=True,
        technical_contact_designated=True,
    )
    report = checker.generate_readiness_report()
    print(f"Readiness: {report['readiness_score']} — {report['recommendation']}")
    # Output: Readiness: 14/14 — AI Office evaluation ready
```
Art.55 Readiness Checklist (14 Items)
| # | Item | Art.55 Reference |
|---|---|---|
| 1 | EU establishment confirmed or Art.54 authorised representative appointed | Art.54 × Art.55 |
| 2 | Annex XI technical documentation compiled and current | Art.55(2)(a) × Art.52(1)(a) |
| 3 | Adversarial testing program established under Art.53(1)(a) | Art.55(2)(a) × Art.53(1)(a) |
| 4 | Adversarial testing results reported to AI Office | Art.55(2)(a) × Art.53(1)(a) |
| 5 | Serious incident reporting system active and tested | Art.55(2)(a) × Art.53(1)(b) |
| 6 | Cybersecurity measures documented at Art.53(1)(c) level | Art.55(2)(a) × Art.53(1)(c) |
| 7 | Energy consumption data compiled for training and inference | Art.55(2)(a) × Art.53(1)(d) |
| 8 | Dedicated model evaluation environment prepared (isolated, logged) | Art.55(2)(b) |
| 9 | Model access governance protocol documented (who can authorise AI Office access) | Art.55(2)(b) |
| 10 | Technical personnel designated for AI Office evaluation support | Art.55(2)(c) |
| 11 | Legal counsel designated for Art.55 cooperation responses | Art.55(4) |
| 12 | Art.56 Code of Practice adherence documented and current | Art.55(5) × Art.56 |
| 13 | CLOUD Act jurisdiction assessment completed for model infrastructure | Art.55 × CLOUD Act |
| 14 | Internal Art.55 response protocol established (notification → response pipeline) | Art.55(4) |