2026-06-11·5 min read·sota.io Team

EU AI Act Art.55 GPAI Systemic Risk Obligations: Adversarial Testing, Incident Reporting, and What Downstream Developers Must Verify

Post #4 in the EU AI Act GPAI Compliance 2026 Series


The first three posts in this series covered the GPAI landscape, the three-tier obligation stack, and the contractual rights that Art.53 gives downstream developers against their GPAI providers. This post addresses the elevated tier: what happens when a GPAI model is classified as having systemic risk under Art.51, and what the Art.55 obligations that follow mean in practice for both providers and the SaaS developers integrating those models.

This distinction matters because the models most SaaS developers actually use — GPT-4o, Claude 3 Opus and Sonnet, Gemini Ultra — are almost certainly classified as systemic risk GPAI models under the EU AI Act. If your product sends user data to any of these APIs, your provider is subject to Art.55 obligations that are both more demanding and more consequential than the baseline Art.53 requirements covered in the previous post. Understanding those obligations tells you what your provider should be doing, what documentation you can demand, and where your own compliance exposure shifts.

Art.51: How a GPAI Model Gets Classified as Having Systemic Risk

Article 51 of the EU AI Act establishes the classification framework for GPAI models with systemic risk. The article creates two pathways to classification:

The Threshold Pathway

A GPAI model is presumed to have high-impact capabilities, and is therefore classified as having systemic risk, if the cumulative compute used to train it exceeds 10²⁵ floating-point operations (FLOPs). This threshold was selected to capture only the most capable frontier models while excluding smaller open-source and fine-tuned models from elevated obligations.

At the time of publication, this threshold captures models trained on infrastructure equivalent to or exceeding the scale used for GPT-4, Gemini Ultra, and Claude 3 Opus. Smaller models — including many that can run on consumer hardware — fall below the threshold.

The threshold is not permanent. Article 51(3) empowers the European Commission to amend the 10²⁵ FLOP threshold by delegated act as the state of the technology evolves.
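To get a feel for where the threshold sits, training compute for dense transformer models is often approximated as roughly 6 FLOPs per parameter per training token. The sketch below applies that rule of thumb. It is a community heuristic, not the measurement method prescribed by the Regulation, and the parameter and token counts are hypothetical values chosen only to land on either side of the threshold.

```python
# Rule-of-thumb estimate only: the Regulation does not prescribe this formula.
SYSTEMIC_RISK_THRESHOLD_FLOPS = 1e25  # presumption threshold discussed above


def estimate_training_flops(n_parameters: float, n_training_tokens: float) -> float:
    """Approximate dense-transformer training compute as 6 * N * D."""
    return 6.0 * n_parameters * n_training_tokens


def exceeds_threshold(n_parameters: float, n_training_tokens: float) -> bool:
    """True when the estimate is strictly greater than 10^25 FLOPs."""
    estimate = estimate_training_flops(n_parameters, n_training_tokens)
    return estimate > SYSTEMIC_RISK_THRESHOLD_FLOPS


# Hypothetical model sizes (assumptions, not figures for any real model).
for name, params, tokens in [
    ("small model", 7e9, 2e12),      # 7B parameters, 2T tokens
    ("frontier model", 5e11, 1e13),  # 500B parameters, 10T tokens
]:
    flops = estimate_training_flops(params, tokens)
    print(f"{name}: ~{flops:.1e} FLOPs, above threshold: {exceeds_threshold(params, tokens)}")
```

An estimate like this is only a first screen. The provider's own compute accounting, and any notification it has made, is what determines classification.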

The Designation Pathway

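In addition to the automatic threshold, Art.51(1)(b) allows the Commission, acting on its own initiative or following a qualified alert from the scientific panel, to designate a GPAI model as having systemic risk on grounds other than raw training compute, having regard to the criteria set out in Annex XIII. Considerations relevant to that assessment include: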

The number of registered end users of the model (scale of deployment)
The capabilities of the model in domains of particular sensitivity (cybersecurity, biosecurity, critical infrastructure)
Whether the model serves as a foundation model for a significant portion of downstream AI systems
Risk assessments submitted by the provider or third parties

This pathway allows the Commission to capture highly capable but computationally efficient models: systems trained with fewer FLOPs than the threshold but still capable of generating systemic risk at Union level.

What Classification Means for Your GPAI Provider

If your provider's model qualifies under either pathway, they become subject to Art.55 obligations in addition to their baseline Art.53 obligations. These are not alternatives — they are additive. A systemic risk GPAI provider must comply with both.

Art.52: The Classification Procedure

Article 52 governs the procedural mechanics of how systemic risk classification happens and how providers interact with the AI Office around it.

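Under Art.52, a provider whose GPAI model meets the 10²⁵ FLOP threshold must notify the Commission (in practice, through the AI Office) without delay, and in any event within two weeks of the threshold being met or of it becoming known that it will be met. With that notification, the provider can submit substantiated arguments that the model, despite meeting the threshold, does not present systemic risks because of its specific characteristics; the Commission then decides whether to accept them. A provider whose model has been designated by Commission decision can later submit a reasoned request for reassessment. Qualified alerts are a separate mechanism: they come from the scientific panel, not from providers.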

For downstream developers, this procedural chapter has one practical implication: if your provider has not notified the AI Office, and their model should qualify, they may already be in non-compliance. Responsible providers — OpenAI, Anthropic, Google DeepMind — have engaged publicly with the AI Office classification process and publicly acknowledge their systemic risk status or expected classification. Providers that avoid this discussion entirely should be treated as a compliance risk.

Art.55: The Elevated Obligations for Systemic Risk GPAI Providers

Article 55 is where the substantive elevated obligations live. These go beyond Art.53's documentation and transparency requirements and impose active ongoing compliance duties on providers.

Art.55(1)(a): Model Evaluation Against State-of-the-Art Protocols

Providers must perform model evaluation using standardised protocols reflecting the state of the art, including conducting and documenting adversarial testing of the model with a view to identifying and mitigating systemic risks.

The Regulation does not itself prescribe these protocols, and Annex XIII sets out classification criteria rather than evaluation methods. Under Art.55(2), providers can rely on codes of practice within the meaning of Art.56 to demonstrate compliance until a harmonised standard is published. In the meantime, providers can be expected to draw on recognised third-party evaluation frameworks (including NIST AI RMF, MLCommons, and AI safety evaluation benchmarks from academic institutions) alongside their own internal red-teaming programmes.

What this means in practice:

Adversarial testing under Art.55 is not a one-time exercise. It is an ongoing obligation. Providers must:

Conduct structured red-teaming exercises that attempt to elicit harmful, dangerous, or capability-exceeding outputs from the model
Document the methodology, scope, and outcomes of each evaluation
Identify systemic risks surfaced during evaluation
Take and document mitigation measures for identified risks
Repeat the evaluation process when the model is substantially modified or when new risk categories emerge

The evaluation must cover risks at Union level — meaning the aggregate effect of the model's deployment across all users and use cases in the EU, not just worst-case individual interactions.

What downstream developers can demand:

Under Art.53 (the general GPAI documentation obligation), downstream providers can already request documentation necessary for their own compliance. For systemic risk GPAI providers, Art.55 creates an additional basis for this request: the adversarial testing results, including the identified risk categories and the mitigation measures taken, constitute documentation directly relevant to a downstream developer's own risk assessment.

If you are building a high-risk AI system on a systemic risk GPAI API, you should have access to a summary of the provider's most recent adversarial testing outcomes — at minimum, the risk categories identified and the steps taken to address them. You do not need (and will typically not get) raw red-teaming transcripts, but you should expect a structured summary.
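One way a downstream team might record such a summary internally is sketched below. The class, its field names, and the completeness check are hypothetical illustrations, not a format defined by the Regulation or offered by any provider.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class AdversarialTestingSummary:
    """Hypothetical internal record of a provider's adversarial testing summary."""
    provider: str
    model: str
    evaluation_date: date
    methodology: str = ""
    risk_categories: list[str] = field(default_factory=list)
    # Maps each risk category to the mitigation the provider reports for it.
    mitigations: dict[str, str] = field(default_factory=dict)

    def gaps(self) -> list[str]:
        """List what is missing before the summary can support a risk assessment."""
        missing = []
        if not self.methodology:
            missing.append("methodology not described")
        if not self.risk_categories:
            missing.append("no risk categories listed")
        for category in self.risk_categories:
            if category not in self.mitigations:
                missing.append(f"no mitigation recorded for: {category}")
        return missing


# Placeholder values for illustration only.
summary = AdversarialTestingSummary(
    provider="ExampleProvider",
    model="example-model",
    evaluation_date=date(2026, 5, 1),
    methodology="structured red-teaming",
    risk_categories=["offensive cyber capabilities", "misinformation at scale"],
    mitigations={"offensive cyber capabilities": "refusal training and output filtering"},
)
print(summary.gaps())
```

Keeping the record in a structured form makes it easier to see which follow-up questions to send back to the provider and to attach the result to your own technical documentation.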

Art.55(1)(b): Assessing and Mitigating Systemic Risks

Beyond the evaluation protocol, providers must actively assess and mitigate possible systemic risks — including their sources — that may arise at Union level from the development, placing on the market, or putting into service of the GPAI model.

This is distinct from the evaluation obligation. Evaluation is the process of finding risks. This obligation requires actual mitigation measures to be implemented and documented when risks are found.

Risk categories that the AI Office has indicated are in scope include:

Capability uplift for weapons of mass destruction (biological, chemical, nuclear, radiological)
Large-scale cyberattacks and offensive cyber capabilities
Serious risks to democratic institutions and electoral processes
Risks to critical infrastructure at Union level
Risks from the autonomous agency of AI models integrated with external systems or tools

For most mainstream SaaS use cases — customer service, content generation, code assistance, data analysis — the primary risk categories are more diffuse: misinformation at scale, privacy inference from large data sets, and economic displacement. These still fall within scope of the Art.55(1)(b) assessment obligation but are typically lower severity than the catastrophic risk categories above.

Art.55(1)(c): Incident Reporting to the EU AI Office

Providers of systemic risk GPAI models must keep track of, document, and report without undue delay to the EU AI Office and, as appropriate, to national competent authorities, relevant information about serious incidents and possible corrective measures to address them.

This is a distinct incident reporting obligation from Art.73, which applies to providers and deployers of high-risk AI systems. There are three important differences:

1. Who is the recipient. Art.55 incident reports go primarily to the EU AI Office, with national competent authorities (NCAs) informed as appropriate. Art.73 reports go to the market surveillance authorities of the Member States where the incident occurred. For a GPAI provider operating at Union level, the AI Office is the main contact point.

2. What counts as a serious incident. For GPAI models, a serious incident is one that has or may have an actual, material impact at Union level — such as a widespread capability misuse event, a major safety failure enabling significant harm, or a cybersecurity compromise of the model's serving infrastructure. The threshold is higher than for Art.73 incidents (which capture individual-level serious harm).

3. The corrective measure obligation. Unlike Art.73, which focuses on incident notification, Art.55 requires providers to include in their report the corrective measures they are taking or plan to take. This creates an accountability loop between incident discovery and remediation.

What downstream developers can demand:

Serious incidents affecting a GPAI model can directly affect downstream products. If your GPAI API provider reports a serious incident to the AI Office, you may need to:

Assess whether the incident affects your specific integration
Update your own risk assessment if the incident reveals a new capability boundary
Consider whether your downstream high-risk AI system was affected in ways that trigger your own Art.73 reporting obligations

Your API agreement should include a contractual obligation on the GPAI provider to notify you of serious incidents that could affect your downstream integration, in addition to their regulatory obligation to notify the AI Office. This is a gap in many current enterprise API agreements.
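The follow-up questions listed above can be turned into a simple triage routine. The sketch below is a minimal illustration: the `ProviderIncident` fields and the wording of the actions are assumptions made for this example, not a schema that any provider or regulator publishes.

```python
from dataclasses import dataclass


@dataclass
class ProviderIncident:
    """Hypothetical shape of an incident notice received from a GPAI provider."""
    affects_models_in_use: bool    # does it touch a model or endpoint you call?
    reveals_new_capability: bool   # does it expose behaviour your risk assessment missed?
    harmed_high_risk_system: bool  # was your own high-risk AI system affected?


def triage(incident: ProviderIncident) -> list[str]:
    """Map an incident notice to the downstream follow-up actions it calls for."""
    if not incident.affects_models_in_use:
        return ["record the notice; no integration impact identified"]
    actions = ["assess impact on your integration"]
    if incident.reveals_new_capability:
        actions.append("update your risk assessment")
    if incident.harmed_high_risk_system:
        actions.append("check whether your own Art.73 reporting duty is triggered")
    return actions


print(triage(ProviderIncident(True, True, False)))
```

A routine like this is only useful if the notice reaches you in the first place, which is why the contractual notification clause matters.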

Art.55(1)(d): Cybersecurity Protection

Providers must ensure an adequate level of cybersecurity protection for both the GPAI model itself and its physical infrastructure.

This obligation reflects a risk specific to systemic risk GPAI models: a successful attack on a frontier model's serving infrastructure could have effects at Union scale. The cybersecurity obligation under Art.55 is distinct from the Art.15 cybersecurity requirement for high-risk AI systems — Art.15 applies to AI system providers; Art.55(1)(d) applies specifically to GPAI model providers and covers the underlying model and its infrastructure.

In practice, this requires:

Access controls on model weights and training artefacts
Secure API serving infrastructure with appropriate authentication and rate limiting
Incident detection and response capabilities for the serving layer
Protection against model extraction attacks and adversarial prompt injection at infrastructure scale

For downstream developers, this creates an indirect assurance: if your GPAI provider demonstrates Art.55(1)(d) compliance, their infrastructure has met a regulatory-level security bar that reduces your own residual risk from provider compromise.

Art.56: Codes of Practice as the Compliance Demonstration Mechanism

Article 56 establishes codes of practice as the primary mechanism through which GPAI providers demonstrate compliance with their Art.53 and Art.55 obligations.

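The AI Office has facilitated the development of a GPAI Code of Practice at Union level. Participation in the Code is not mandatory, but it serves as a recognised route to compliance: under Art.53(4) and Art.55(2), a provider can rely on the Code to demonstrate compliance with the obligations it covers until a harmonised standard is published. The formal presumption of conformity attaches to harmonised standards, and a provider that does not adhere to an approved Code must demonstrate alternative adequate means of compliance for assessment by the Commission.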

The Code addresses both general GPAI provider obligations (Art.53) and systemic risk provider obligations (Art.55). For downstream developers, the Code creates a useful proxy: if your provider has publicly committed to and is audited against the GPAI Code of Practice, you have a structured third-party verification of their Art.55 compliance — without needing to conduct your own provider audit.

How to verify Code participation:

Under Art.52(6), the Commission must publish and keep up to date a list of GPAI models with systemic risk. Adherence to the Code of Practice is a separate, voluntary commitment that providers make publicly. You should check:

Whether your provider is listed in the AI Office's GPAI model register
Whether their classification status (systemic risk or not) is confirmed
Whether they have committed to or been audited against the GPAI Code of Practice

Major providers — OpenAI, Anthropic, Google, Meta — have engaged with the AI Office's Code development process. The publicly accessible documentation they publish (model cards, safety evaluations, transparency reports) reflects their participation in this process.

Five-Step Downstream Developer Verification Checklist

This checklist applies when you are integrating a GPAI model API and need to assess your provider's Art.55 compliance status.

Step 1 — Confirm classification status. Check the EU AI Office's GPAI model register (or your provider's public documentation) to confirm whether the model you are using is classified as having systemic risk. If the provider has not confirmed classification status and their model likely qualifies, treat them as classified and apply the same due diligence.

Step 2 — Request adversarial testing summary. Ask your provider for a summary of their most recent adversarial testing outcomes under Art.55(1)(a), including the risk categories identified and the mitigation measures applied. For enterprise API agreements, this should be a contractual deliverable. For standard API terms, the provider's AI safety evaluation publication typically contains this information.

Step 3 — Review systemic risk assessment scope. Confirm that your provider's systemic risk assessment under Art.55(1)(b) covers the use case categories relevant to your integration. If you are building in a sensitive domain (health, finance, infrastructure), verify that your use case type was within scope of the assessment.

Step 4 — Check incident notification obligations in your agreement. Review your API agreement or terms of service to confirm whether your provider has committed to notify you of serious incidents under Art.55(1)(c). If this obligation is absent, raise it with your legal team and consider adding it to your enterprise agreement.

Step 5 — Verify Code of Practice participation. Confirm whether your provider participates in the AI Office-endorsed GPAI Code of Practice. This is currently the strongest available third-party verification of Art.55 compliance.
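The five steps above can be tracked per provider with a small helper. This is a minimal sketch; the step labels paraphrase the checklist, and the function name and data layout are hypothetical.

```python
# Step labels paraphrase the five-step checklist; numbering follows the text.
CHECKLIST = {
    1: "Classification status confirmed",
    2: "Adversarial testing summary received",
    3: "Systemic risk assessment scope reviewed",
    4: "Incident notification clause checked",
    5: "Code of Practice participation verified",
}


def open_items(completed: set[int]) -> list[str]:
    """Return the checklist steps that still need evidence, in order."""
    return [
        f"Step {number}: {label}"
        for number, label in CHECKLIST.items()
        if number not in completed
    ]


# Example: a provider for which steps 1, 2, and 5 are already documented.
for item in open_items({1, 2, 5}):
    print(item)
```

Storing the completed step numbers alongside links to the supporting evidence gives you a record you can reuse in your own technical documentation.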

What Changes on August 2, 2026

The GPAI chapter (Art.51–56) became applicable on August 2, 2025 — one year after the Regulation entered into force. This means Art.55 obligations are already in force for providers of systemic risk GPAI models.

For downstream SaaS developers, August 2, 2026 is the date on which their own high-risk AI system obligations under Chapter III start to apply. The one-year gap between the two dates is not coincidental: the regulation is designed so that by August 2, 2026, the full GPAI compliance chain is operational. Systemic risk providers will have been meeting Art.55 obligations for a year, and downstream developers can rely on that compliance chain when demonstrating their own Chapter III conformity.

If your GPAI provider is not demonstrating Art.55 compliance by the time you need to demonstrate your own Chapter III conformity, you have a gap in your technical documentation that national competent authorities can identify during market surveillance.

The Next Post in This Series

Post #5 will synthesise the full GPAI compliance picture — covering what the complete obligation stack looks like from the perspective of a SaaS developer who both uses GPAI APIs and may be developing their own AI capabilities — with a consolidated compliance checklist for August 2026.

For developers running on EU-hosted infrastructure, using providers who support data residency within the EU, the compliance chain is significantly simpler: provider obligations, infrastructure attestations, and incident reporting channels all remain within EU jurisdiction. The infrastructure layer is one of the few compliance variables a SaaS developer fully controls — which is why the choice of hosting provider matters well before the NCA audit begins.
