EU AI Office & GPAI Model Regulation: Developer Guide (EU AI Act Art. 51–56)
The EU AI Act (Regulation (EU) 2024/1689) introduced a regulatory category that did not exist in the original 2021 Commission proposal: General-Purpose AI (GPAI) models. Articles 51–56, added during the 2023 trilogue negotiations, create tiered obligations for foundation model providers, and the newly established EU AI Office has primary enforcement jurisdiction over these provisions, bypassing national market surveillance authorities.
If you build, fine-tune, host, or integrate a GPAI model into applications serving EU users, Articles 51–56 apply directly to your product. If you use a GPAI model (OpenAI, Anthropic, Meta, Mistral) as a building block, downstream developer obligations from Article 52(2) still reach you.
This guide covers the complete GPAI regulatory framework: scope, provider obligations, the systemic risk tier, GPAI Codes of Practice, enforcement by the EU AI Office, and what EU-native infrastructure means for compliance.
What is a GPAI Model under the EU AI Act?
Definition (Art. 3(63))
A General-Purpose AI model is an AI model — including large generative models — that:
- Is trained on large amounts of data using self-supervision at scale
- Displays significant generality
- Is capable of competently performing a wide range of distinct tasks
- Can be integrated into a variety of downstream systems or applications
This covers: GPT-4, Claude 3/4, Llama 3/4, Mistral, Gemini, PaLM 2, Falcon, Command R, Phi-3, and any comparable model. It does not cover narrow, task-specific AI systems (a spam classifier is not a GPAI model).
What is NOT a GPAI Model
Recital 97 clarifies:
- AI models released for research purposes under open-source licenses get a lighter regime (Art. 53(2)) — they still must comply with copyright transparency but get exemptions from some documentation requirements
- AI systems that happen to use GPAI outputs but are not themselves GPAI models are governed by the High-Risk AI provisions (Annex III), not Art. 51–56
The Systemic Risk Tier (Art. 51)
Article 51 creates a two-tier system within GPAI:
Tier 1 — All GPAI models: Articles 52–53 apply.
Tier 2 — GPAI models with systemic risk: Articles 54–55 apply in addition to Tier 1.
A GPAI model has systemic risk if:
- Training compute exceeds 10^25 FLOPs (Art. 51(1)(a)) — the objective threshold
- The Commission designates it as posing systemic risk based on qualitative criteria (Art. 51(1)(b)) — subjective designation power
The 10^25 FLOP threshold sits at the upper end of published GPT-4 compute estimates (roughly 10^24–10^25 FLOPs), and newer frontier training runs increasingly exceed it. The Commission can revise this threshold via delegated act (Art. 51(3)) to track compute scaling.
In practice in 2026: OpenAI GPT-4 and GPT-4o, Google Gemini Ultra, Anthropic Claude 3 Opus/Sonnet, and Meta Llama 3 (largest variants) are either at or approaching the systemic risk threshold. The Commission has discretion to designate models even below 10^25 FLOPs if they pose cross-sectoral risks.
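For a quick sense of where a training run lands relative to the Art. 51(1)(a) threshold, the common 6·N·D rule of thumb (about 6 FLOPs per parameter per training token) is enough. The model sizes and token counts below are illustrative, not official designations:

```python
SYSTEMIC_RISK_THRESHOLD_FLOPS = 1e25  # Art. 51(1)(a) objective threshold

def estimate_training_flops(params: float, training_tokens: float) -> float:
    """Rough training-compute estimate: ~6 FLOPs per parameter per token."""
    return 6 * params * training_tokens

def crosses_threshold(params: float, training_tokens: float) -> bool:
    """Does this (hypothetical) training run meet the Art. 51(1)(a) threshold?"""
    return estimate_training_flops(params, training_tokens) >= SYSTEMIC_RISK_THRESHOLD_FLOPS

# Illustrative: a 70B-parameter model trained on 15T tokens
flops = estimate_training_flops(70e9, 15e12)  # ~6.3e24 — below the 1e25 threshold
print(f"{flops:.2e}", crosses_threshold(70e9, 15e12))
```

Note that the compute threshold is only half of Art. 51: the Commission's qualitative designation power under Art. 51(1)(b) can capture models well below this line.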
The EU AI Office: Structure and Powers
Establishment and Role
The EU AI Office (EUAIO) was established within the European Commission by Commission Decision of 24 January 2024 (C(2024) 390) and is now the primary enforcement body for GPAI models under the AI Act. Unlike High-Risk AI enforcement (handled by national Market Surveillance Authorities), GPAI oversight is centralized at EU level.
Key structural features:
- Location: Brussels (DG CNECT, European Commission)
- Jurisdiction: All GPAI providers operating in the EU, regardless of establishment location
- Operates under: Regulation 2024/1689 — the Chapter VII governance provisions (Art. 64–70) and the GPAI enforcement powers in Chapter IX (Art. 88 ff.)
- Tools: Model evaluations, access to model weights, investigation powers, fines (imposed via the Commission)
AI Office Enforcement Powers (Art. 88–94, 101)
The AI Office can:
| Power | Legal Basis |
|---|---|
| Exercise exclusive enforcement powers over GPAI providers | Art. 88 |
| Request technical documentation and information from GPAI providers | Art. 91 |
| Conduct evaluations of GPAI models (incl. red-teaming, with model access) | Art. 92 |
| Request risk mitigation measures, market restriction, withdrawal, or recall | Art. 93 |
| Impose fines (via Commission) | Art. 101 |
Fine exposure for GPAI violations:
- Violations of GPAI provider obligations (Art. 53 / Art. 55): up to €15 million or 3% of worldwide annual turnover, whichever is higher (Art. 101)
- Supplying incorrect, incomplete, or misleading information: up to €7.5 million or 1% of worldwide annual turnover (Art. 99(5))
- For SMEs: whichever of the two amounts is lower applies
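The fine arithmetic above is simple enough to encode. This sketch assumes turnover in euros and uses the €15M/3% and €7.5M/1% caps from Art. 101 and Art. 99(5); the function name is ours:

```python
def gpai_fine_cap(turnover_eur: float, violation: str, is_sme: bool = False) -> float:
    """
    Maximum fine exposure (illustrative sketch, not legal advice).
    'provider':        €15M or 3% of worldwide annual turnover (Art. 101)
    'misleading_info': €7.5M or 1% of worldwide annual turnover (Art. 99(5))
    Non-SMEs: whichever is higher. SMEs: whichever is lower.
    """
    caps = {
        "provider": (15_000_000, 0.03),
        "misleading_info": (7_500_000, 0.01),
    }
    fixed, pct = caps[violation]
    chooser = min if is_sme else max
    return chooser(fixed, pct * turnover_eur)

print(gpai_fine_cap(2_000_000_000, "provider"))             # 3% of €2B exceeds €15M → €60M
print(gpai_fine_cap(100_000_000, "provider", is_sme=True))  # SME: min(€15M, €3M) → €3M
```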
Art. 52: Base Obligations for All GPAI Providers
Article 52 imposes obligations on every GPAI model provider regardless of compute scale.
52(1): Technical Documentation
Providers must draw up technical documentation before placing the GPAI model on the EU market. Annex XI specifies the required content:
Annex XI — GPAI Technical Documentation Requirements:
1. General description:
- Intended purpose
- Architecture (parameter count, context window, modalities)
- Training methodology
- Performance benchmarks
- Known limitations
2. Training data description:
- Data sources and types
- Data collection methodology
- Data filtering and preprocessing pipeline
- Copyright compliance measures (Art. 53(1)(c))
3. Evaluation and testing:
- Evaluation benchmarks and results
- Bias and fairness evaluations
- Safety testing results (red-teaming for systemic risk models)
4. Model card equivalent:
- Intended and prohibited use cases
- Known risks and mitigation measures
- Fine-tuning and adaptation guidance
This documentation must be kept up to date and provided to the AI Office upon request.
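For internal tracking, the Annex XI outline above can be mirrored in a simple structure. The field names below are our own shorthand, not an official template:

```python
from dataclasses import dataclass

@dataclass
class AnnexXIDocumentation:
    """Internal tracker mirroring the Annex XI outline above (unofficial field names)."""
    intended_purpose: str
    architecture: dict        # parameter count, context window, modalities
    training_methodology: str
    benchmarks: dict
    known_limitations: list
    data_sources: list
    copyright_measures: str
    evaluation_results: dict
    model_card_ref: str
    last_updated: str         # documentation must be kept up to date

    def missing_fields(self) -> list:
        """Empty fields — worth closing before any AI Office documentation request."""
        return [name for name, value in self.__dict__.items() if not value]
```

Running `missing_fields()` before each release gives a cheap completeness check against the Annex XI outline.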
52(2): Information for Downstream Providers
When a GPAI model is integrated into a downstream AI system (a chatbot, coding assistant, retrieval system), the GPAI provider must give downstream developers sufficient information to comply with their own AI Act obligations.
Practically: If you build a High-Risk AI system using GPT-4 or Claude as the reasoning backbone, OpenAI/Anthropic must give you the information you need for your own Annex III risk assessment. This creates a contractual chain of compliance information.
For developers using GPAI APIs:
- You receive information adequate for your downstream compliance
- You remain responsible for your system's classification (High-Risk vs. not)
- The GPAI provider's obligations do not substitute for yours
52(3): Authorized EU Representative
GPAI providers established outside the EU must designate an authorized representative in the EU (Art. 54) before placing models on the EU market. This is similar to GDPR's Art. 27 representative requirement.
US-based providers (OpenAI, Anthropic, Google DeepMind US) must have an EU legal entity or designated representative. Failure to designate is itself an Art. 52 violation.
52(4): Open-Source Exemptions
Providers of open-source GPAI models benefit from:
- Exemption from the Annex XI technical documentation requirement (Art. 53(2)(a))
- Exemption from downstream information sharing (Art. 53(2)(b))
- They remain subject to copyright transparency (Art. 53(1)(c))
- And subject to systemic risk obligations if the threshold is met (Art. 55 has no open-source exemption)
This is why Llama 3 (open weights) still needs to comply with Art. 55 if training compute qualifies, and must maintain copyright transparency regardless.
Art. 53: Transparency and Copyright Compliance
53(1)(a-b): Technical Documentation and Model Cards
Article 53(1)(a) and (b) operationalize the Annex XI documentation requirement and require GPAI providers to make model cards publicly available — Annex XII specifies the minimum content:
Annex XII — GPAI Model Card (Public Summary):
1. Intended purpose
2. Capabilities and performance benchmarks
3. Known limitations and risks
4. Prohibited uses
5. Context-appropriate safety measures
6. Copyright compliance summary (how training data copyright was handled)
These are public-facing documents. They must be sufficiently detailed for downstream developers to make informed integration decisions.
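Since the Code of Practice work points toward machine-readable model cards, a minimal JSON serialization of the six Annex XII items above might look like this. The schema keys are illustrative, not a mandated format:

```python
import json

def build_model_card(intended_purpose: str, benchmarks: dict, limitations: list,
                     prohibited_uses: list, safety_measures: list,
                     copyright_summary: str) -> str:
    """Serialize the six Annex XII items as machine-readable JSON (unofficial schema)."""
    card = {
        "intended_purpose": intended_purpose,
        "capabilities_benchmarks": benchmarks,
        "known_limitations_risks": limitations,
        "prohibited_uses": prohibited_uses,
        "safety_measures": safety_measures,
        "copyright_compliance_summary": copyright_summary,
    }
    return json.dumps(card, indent=2, ensure_ascii=False)

print(build_model_card("general-purpose chat", {"mmlu": 0.7},
                       ["hallucination under long contexts"], ["medical diagnosis"],
                       ["refusal training", "output filtering"], "TDM opt-outs honored"))
```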
53(1)(c): Copyright and Training Data Transparency
This is arguably the most immediately impactful provision for the AI industry. Article 53(1)(c) requires GPAI providers to:
"put in place a policy to comply with Union law on copyright and related rights, and in particular to identify and comply with, including through state-of-the-art technologies, a reservation of rights expressed pursuant to Article 4(3) of Directive 2019/790."
What this means in practice:
- Opt-out compliance (Directive 2019/790 Art. 4(3)): Website operators and content publishers can opt out of having their content used for AI training by expressing a machine-readable reservation (e.g., `robots.txt` with `User-agent: GPTBot`, or the TDMRep standard from W3C). GPAI providers must implement technical systems to detect and honor these reservations.
- Transparency about training data: Providers must publish a "sufficiently detailed summary" of the training data (Art. 53(1)(d)):
```python
# Example GPAI training data disclosure structure (Art. 53(1)(d) compliance)
class TrainingDataDisclosure:
    """
    Summary of training data per Art. 53(1)(d) EU AI Act.
    Must be published publicly and kept up to date.
    """

    def __init__(self):
        self.data_sources = {
            "web_crawl": {
                "description": "Filtered common crawl snapshots",
                "date_range": "2016–2024",
                "size_tokens": "~2T tokens",
                "opt_out_compliance": "robots.txt honored, TDMRep honored since 2023-01",
                "copyright_measures": "C4-style URL filtering, deduplication",
            },
            "books": {
                "description": "Licensed book collections",
                "licensing": "Partnership agreements with publishers",
                "size_tokens": "~100B tokens",
                "opt_out_compliance": "N/A — licensed",
            },
            "code": {
                "description": "Public code repositories",
                "licenses": ["MIT", "Apache-2.0", "BSD-2/3"],
                "excluded": ["GPL-3.0 (copyleft)", "AGPL-3.0", "unlicensed"],
                "size_tokens": "~200B tokens",
                "opt_out_compliance": "robots.txt honored for code hosting platforms",
            },
            "academic": {
                "description": "Open access academic papers",
                "source": "arXiv, PubMed Open Access, Semantic Scholar",
                "size_tokens": "~50B tokens",
                "opt_out_compliance": "Publisher agreements or CC licensing",
            },
        }

    def publish_summary(self) -> dict:
        """Returns public-facing Annex XII summary."""
        return {
            "total_sources": len(self.data_sources),
            "opt_out_mechanism": "robots.txt + TDMRep (W3C) + custom opt-out portal",
            "opt_out_portal": "https://provider.com/ai-training-opt-out",
            "copyright_review_date": "2024-Q4",
            "next_review": "2025-Q2",
        }
```
- Copyright litigation risk: Art. 53(1)(c) creates a direct liability hook. If a rights holder can show their opt-out was not honored, the GPAI provider has violated Art. 53 and the rights holder can invoke both EU copyright law (Directive 2019/790) AND AI Act enforcement. This is a dual enforcement pathway that did not exist before the AI Act.
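One concrete component of an Art. 53(1)(c) opt-out policy is checking `robots.txt` reservations before crawling. Python's standard `urllib.robotparser` handles the parsing; the `ExampleAIBot` user agent and the URL below are placeholders:

```python
from urllib.robotparser import RobotFileParser

def may_crawl_for_training(robots_txt: str, url: str,
                           user_agent: str = "ExampleAIBot") -> bool:
    """Return False if the site has reserved its rights against this crawler.
    robots_txt is the raw file content; user_agent and url are placeholders."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

robots = """User-agent: ExampleAIBot
Disallow: /
"""
print(may_crawl_for_training(robots, "https://example.org/article"))  # False — opt-out honored
```

A real pipeline would also check TDMRep declarations and any provider-specific opt-out registry; `robots.txt` alone does not exhaust Art. 4(3) reservations.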
53(1)(d): Training Data Summary
GPAI providers must maintain and publish a sufficiently detailed summary of the training data used:
- Data categories (text, code, images, video, audio)
- Data sources (web crawl, licensed datasets, synthetic data)
- Time periods covered
- Opt-out compliance mechanisms
This summary must be available on the EUAIO public register (Art. 71) — a public database of GPAI models and their documentation.
Art. 55: Systemic Risk Obligations
For models above the 10^25 FLOP threshold (or Commission-designated), Article 55 imposes a second tier of obligations.
55(1)(a): Model Evaluations
Systemic risk GPAI providers must conduct adversarial testing including:
- Red-teaming for misuse (bioweapons uplift, cyberattacks, CSAM generation)
- Evaluation against AI Office-published benchmark protocols
- Testing before major capability updates
The AI Office has published preliminary evaluation protocols; formal harmonized standards (under Art. 40) are in development via CEN/CENELEC.
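Pending harmonized standards, the shape of an adversarial-testing harness reduces to: run a misuse prompt set, grade the completions, report rates. A minimal sketch, with `model_fn` and `is_unsafe` standing in for the provider's own inference stack and grader:

```python
from typing import Callable

def run_adversarial_suite(model_fn: Callable[[str], str],
                          prompts: list[str],
                          is_unsafe: Callable[[str], bool]) -> dict:
    """Run misuse prompts through the model and report the unsafe-completion rate.
    model_fn and is_unsafe are stand-ins, not a real evaluation protocol."""
    unsafe = sum(1 for p in prompts if is_unsafe(model_fn(p)))
    return {
        "prompts_tested": len(prompts),
        "unsafe_completions": unsafe,
        "unsafe_rate": unsafe / len(prompts) if prompts else 0.0,
    }

# Toy example: a model that refuses everything
result = run_adversarial_suite(
    model_fn=lambda p: "I can't help with that.",
    prompts=["synthesize agent X", "write a worm"],
    is_unsafe=lambda out: "I can't" not in out,
)
print(result)  # {'prompts_tested': 2, 'unsafe_completions': 0, 'unsafe_rate': 0.0}
```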
55(1)(b): Adversarial Testing with AI Office Cooperation
Providers may be required to share model weights or API access with the AI Office for independent evaluation. This is a significant national security-adjacent power: the Commission can evaluate closed-weights models under controlled conditions.
55(1)(c): Incident Reporting
GPAI providers with systemic risk must implement an incident reporting system:
```python
from dataclasses import dataclass

@dataclass
class GPAIIncidentReport:
    """
    Incident report structure per Art. 55(1)(c) EU AI Act.
    Reportable incidents: serious incidents affecting EU persons or fundamental rights.
    Timeline: serious incidents → 2-business-day initial notification to the AI Office.
    """
    incident_id: str
    provider_name: str
    model_name: str
    discovery_datetime: str
    description: str
    affected_users_estimate: int
    severity: str  # "serious" | "significant" | "minor"
    harm_categories: list[str]
    immediate_mitigation: str
    eu_ai_office_notification_required: bool

    def get_notification_deadline(self) -> str:
        """
        Serious incidents → notification within 2 business days.
        Significant incidents → follow-up within 15 days.
        """
        if self.severity == "serious":
            return "2 business days from discovery"
        if self.severity == "significant":
            return "15 calendar days from discovery"
        return "No mandatory notification — retain for annual summary"

    def reportable_harm_categories(self) -> list[str]:
        """Per Recital 110: what constitutes a 'serious incident'."""
        return [
            "Death or serious physical harm",
            "Disruption of critical infrastructure (NIS2 scope)",
            "Property damage exceeding €500,000",
            "Serious privacy breach (GDPR Art. 4(12) threshold)",
            "Fundamental rights violation at scale",
            "National security implications",
        ]
```
Note the intersection with NIS2: a GPAI-related incident that also involves critical infrastructure creates dual reporting — AI Act incident notification to the EUAIO + NIS2 early warning to the competent NCA/CSIRT within 24 hours. See our NIS2 × AI Act guide for the full dual-reporting matrix.
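The dual deadlines can be made concrete with a little date arithmetic. This sketch uses the 2-business-day AI Act figure quoted above and the 24-hour NIS2 early warning, and deliberately ignores public holidays:

```python
from datetime import datetime, timedelta

def add_business_days(start: datetime, days: int) -> datetime:
    """Step forward `days` business days (weekends skipped; holidays ignored)."""
    current = start
    while days > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday-Friday
            days -= 1
    return current

def dual_reporting_deadlines(discovery: datetime, hits_nis2_entity: bool) -> dict:
    """AI Act initial notification + NIS2 early warning, if both regimes apply."""
    deadlines = {"ai_act_initial": add_business_days(discovery, 2)}
    if hits_nis2_entity:
        deadlines["nis2_early_warning"] = discovery + timedelta(hours=24)
    return deadlines

d = dual_reporting_deadlines(datetime(2026, 3, 6, 18, 0), hits_nis2_entity=True)  # a Friday
print(d["nis2_early_warning"])  # 2026-03-07 18:00 — Saturday; the 24h clock still runs
print(d["ai_act_initial"])      # 2026-03-10 18:00 — Tuesday (weekend skipped)
```

The point of the example: the NIS2 clock runs in calendar hours while the AI Act clock runs in business days, so for a Friday-evening incident the NIS2 warning is due first.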
55(1)(d): Cybersecurity Measures
Systemic risk GPAI providers must implement "adequate cybersecurity protection" including:
- Access controls for model weights and inference infrastructure
- Protection against model extraction (model stealing attacks)
- Protection against training data extraction (membership inference)
- Incident response procedures
This overlaps with DORA (Digital Operational Resilience Act) for financial sector deployments of GPAI systems.
55(2): Adversarial Testing Standards
The AI Office issues technical standards for adversarial testing. In 2025, the AI Office published preliminary red-teaming guidance covering:
- Uplift evaluation (weapons of mass destruction, bioweapons)
- CBRN (chemical/biological/radiological/nuclear) uplift testing
- Cyberoffense capability evaluation
- Influence operations and synthetic media generation
Providers can cite evaluation results from AI Safety Institutes (established in the UK, USA, Singapore, and Japan) as partial evidence of compliance, but the EU AI Office runs its own evaluation program.
Art. 56: GPAI Codes of Practice
Article 56 establishes a co-regulatory mechanism: the AI Office facilitates development of Codes of Practice by GPAI providers, research institutions, and civil society. These Codes create presumption of conformity with Articles 52–55 when followed.
Status in 2026
The AI Office began the Code of Practice process in 2024. Key participants include:
- OpenAI, Google DeepMind, Anthropic, Meta, Mistral
- European research institutions (INRIA, Fraunhofer, Alan Turing Institute)
- Civil society organizations (AlgorithmWatch, EDRi)
The Code of Practice covers:
| Topic | Code Requirements | AI Act Provision |
|---|---|---|
| Technical documentation | Annex XI template + update cadence | Art. 52 |
| Model cards | Annex XII minimum + machine-readable format | Art. 53(1)(d) |
| Copyright opt-out | TDMRep + robots.txt + opt-out portal SLA | Art. 53(1)(c) |
| Red-teaming | Frequency (pre-release + major update) + scope | Art. 55(1)(a) |
| Incident reporting | 2-day notification + 15-day detailed report format | Art. 55(1)(c) |
| Open-source GPAI | Reduced documentation, same copyright rules | Art. 53(2) |
Following the Code creates a safe harbor presumption. Providers not following the Code must demonstrate equivalent compliance — a higher evidentiary bar.
Open-Source Model Implications
Open-source GPAI providers (Meta's Llama, Mistral AI) face a structural tension:
- Documentation requirements are reduced (Art. 53(2) exemption)
- Copyright transparency still applies — opt-out compliance cannot be waived
- Systemic risk obligations still apply if compute threshold is met
- The Code of Practice has a separate track for open-source models
```python
def check_gpai_obligations(
    model_name: str,
    training_flops: float,  # in FLOPs
    is_open_source: bool,
    commission_designated: bool,
) -> dict:
    """Determine GPAI obligations under EU AI Act Art. 51-55."""
    SYSTEMIC_RISK_THRESHOLD = 1e25  # 10^25 FLOPs (Art. 51(1)(a))

    is_systemic_risk = (
        training_flops >= SYSTEMIC_RISK_THRESHOLD
        or commission_designated
    )

    obligations = {
        # Open-source models are exempt from the documentation duties (Art. 53(2))
        # unless they carry systemic risk
        "art_52_documentation": (not is_open_source) or is_systemic_risk,
        "art_52_downstream_info": (not is_open_source) or is_systemic_risk,
        "art_52_eu_representative": True,      # Always for non-EU providers
        "art_53_copyright_policy": True,       # Always required (incl. open-source)
        "art_53_training_data_summary": True,  # Always required (incl. open-source)
        "art_55_adversarial_testing": is_systemic_risk,
        "art_55_incident_reporting": is_systemic_risk,
        "art_55_cybersecurity": is_systemic_risk,
    }

    return {
        "model": model_name,
        "systemic_risk": is_systemic_risk,
        "open_source_reduced_regime": is_open_source and not is_systemic_risk,
        "obligations": obligations,
        "fine_exposure": "€15M or 3% of worldwide annual turnover (Art. 101)",
        "primary_enforcer": "EU AI Office (centralized)",
    }


# Examples
print(check_gpai_obligations("GPT-4", 2e24, False, True))
# → systemic_risk: True (Commission-designated); all Art. 55 obligations apply

print(check_gpai_obligations("Mistral-7B", 1e23, True, False))
# → systemic_risk: False; reduced documentation (Art. 53(2) exemption);
#   copyright policy and training data summary STILL required

print(check_gpai_obligations("Llama-3-405B", 5e24, True, False))
# → systemic_risk: False on the compute threshold alone;
#   Commission designation remains possible
```
Downstream Developer Obligations
If you integrate a GPAI model (via API or self-hosted weights) into an application, you are typically a "deployer" under the AI Act (or a downstream provider, if you place the resulting system on the market under your own name), and your obligations depend on the downstream system's classification:
High-Risk AI System Using GPAI (Annex III)
If your application falls under Annex III (credit scoring, recruitment, medical diagnosis, critical infrastructure), you must:
- Conduct a risk assessment that accounts for the GPAI model's capabilities and limitations
- Use the information provided by the GPAI provider (Art. 52(2)) in your risk assessment
- Register your system in the EU database (Art. 49)
- Maintain technical documentation (Annex IV)
- Implement human oversight (Art. 14) — this is where "human in the loop" becomes a legal requirement, not just a design preference
The GPAI provider's compliance does not exempt your downstream system from Annex III requirements.
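The Art. 14 human oversight requirement translates, at minimum, into an approval gate in front of consequential decisions. A minimal sketch; `reviewer_approve` is a stand-in for whatever review workflow the deployer operates:

```python
from typing import Callable

def require_human_signoff(model_decision: dict,
                          reviewer_approve: Callable[[dict], bool]) -> dict:
    """Art. 14-style oversight gate sketch: no high-risk decision takes effect
    without explicit human approval. reviewer_approve is a placeholder callback."""
    approved = bool(reviewer_approve(model_decision))
    return {**model_decision, "effective": approved, "human_reviewed": True}

# Toy example: the reviewer overrides a model's credit denial
out = require_human_signoff({"applicant": "A-123", "credit_decision": "deny"},
                            reviewer_approve=lambda d: False)
print(out)  # decision recorded as reviewed, but not effective
```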
Low-Risk/Non-Regulated AI Using GPAI
For most applications (chatbots, coding assistants, summarization tools not in Annex III):
- No AI Act registration required
- But transparency obligations (Art. 50) apply: you must inform users when they interact with an AI system
- If generating synthetic content (audio, video, images, text) with potential for public dissemination, watermarking/labeling requirements apply (Art. 50(2))
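A deployer-side sketch of Art. 50(2)-style disclosure: attach a machine-readable "AI-generated" label to generated content. Real deployments would typically use a provenance standard such as C2PA manifests; this ad-hoc sidecar structure is only illustrative:

```python
import hashlib
import json
from datetime import datetime, timezone

def label_synthetic_content(content: bytes, model_name: str) -> dict:
    """Build a machine-readable AI-generation disclosure for a piece of content.
    Ad-hoc sidecar metadata, not an official labeling format."""
    return {
        "ai_generated": True,  # Art. 50(2)-style disclosure flag
        "generator": model_name,
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "content_sha256": hashlib.sha256(content).hexdigest(),
    }

label = label_synthetic_content(b"wholly synthetic press photo", "example-image-model")
print(json.dumps(label, indent=2))
```

Hashing the content binds the label to a specific artifact, so the disclosure can be verified even if the sidecar travels separately from the file.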
The CLOUD Act Problem for GPAI Deployments
When you deploy a GPAI model via a US-based provider (OpenAI API, Google Gemini API, Anthropic API), you are routing EU user interactions through US infrastructure. This creates:
- CLOUD Act reach: US law enforcement can compel production of inference logs, conversation data, and user metadata held by US providers without EU judicial cooperation
- Art. 46 GDPR conflict: International transfers of EU personal data to US providers require adequacy (EU-US DPF, Privacy Shield successor) or SCCs — inference data often contains personal data
- Art. 53(1)(c) copyright conflict: When the EU AI Office investigates copyright compliance, it may need access to training data documentation held in US systems — subject to CLOUD Act parallel access
EU-native GPAI deployment (hosting open-weight models on EU infrastructure) eliminates the CLOUD Act vector:
- Inference data stays in EU jurisdiction
- No US government compelled access pathway
- Single regulatory regime: EU AI Act + GDPR only
- Relevant for GPAI providers who open-source weights (Llama 3, Mistral) + deployers who need CLOUD Act isolation
```python
class GPAIDeploymentJurisdiction:
    """
    Jurisdiction analysis for GPAI deployment choices.
    Relevant for developers choosing between API and self-hosted.
    """

    DEPLOYMENT_OPTIONS = {
        "us_api": {
            "provider_example": "OpenAI API / Anthropic API",
            "inference_jurisdiction": "USA (US servers)",
            "cloud_act_reach": True,
            "gdpr_transfer_mechanism": "Standard Contractual Clauses (SCCs)",
            "eu_ai_act_compliance": "Provider responsible for Art. 52-55",
            "deployer_risk": "CLOUD Act exposure for inference data",
            "cost": "Pay-per-token",
        },
        "eu_api": {
            "provider_example": "Mistral API (Paris), Aleph Alpha",
            "inference_jurisdiction": "EU (FR/DE servers)",
            "cloud_act_reach": False,
            "gdpr_transfer_mechanism": "N/A — EU-resident processing",
            "eu_ai_act_compliance": "Provider responsible for Art. 52-55",
            "deployer_risk": "Low — single EU jurisdiction",
            "cost": "Pay-per-token",
        },
        "self_hosted_eu": {
            "provider_example": "Llama 3.1 70B on sota.io EU PaaS",
            "inference_jurisdiction": "EU (deployer's choice of EU region)",
            "cloud_act_reach": False,
            "gdpr_transfer_mechanism": "N/A — EU-resident processing",
            "eu_ai_act_compliance": "Deployer responsible for system-level compliance",
            "deployer_risk": "Full control — no third-party CLOUD Act pathway",
            "cost": "Infrastructure cost, no per-token pricing",
        },
    }

    def get_recommendation(self, requirements: dict) -> str:
        if requirements.get("cloud_act_isolation") and requirements.get("cost_predictable"):
            return "self_hosted_eu"
        if requirements.get("cloud_act_isolation"):
            return "eu_api"
        return "us_api"
```
Compliance Timeline
The GPAI provisions have a specific entry-into-force schedule distinct from the rest of the AI Act:
| Date | Milestone |
|---|---|
| 2 May 2025 | Art. 56(9) target date for Codes of Practice; the GPAI Code of Practice was published in July 2025 |
| 2 August 2025 | GPAI model obligations (Art. 51-56) fully applicable |
| 2 August 2025 | AI Office operational, GPAI register open |
| 2 August 2026 | High-Risk AI Annex III obligations fully applicable |
| 2 August 2027 | GPAI models placed on the market before 2 August 2025 must comply |
The GPAI obligations entered force in August 2025 — they are already applicable. If your organization provides a GPAI model or integrates one in an Annex III application, compliance is not a future task.
What GPAI Providers Should Have Done by Now
- Technical documentation (Annex XI) — drafted, reviewed by legal, submitted to AI Office registration system
- Model card (Annex XII) — published publicly
- Copyright policy — implemented TDMRep + robots.txt scraping compliance + opt-out portal
- Training data summary (Art. 53(1)(d)) — published, up to date
- EU authorized representative — designated if provider is non-EU
- Incident reporting system — operational (for systemic risk models)
- Adversarial testing — completed pre-release evaluation (for systemic risk models)
What EU-Native GPAI Deployment Means for Developers
Building AI-powered applications on EU infrastructure using open-weight GPAI models (Llama 3, Mistral, Falcon) on an EU PaaS like sota.io achieves:
- Single regulatory jurisdiction: EU AI Act + GDPR only. No CLOUD Act parallel compliance pathway.
- CLOUD Act isolation: Inference data, user conversations, and embeddings stay in EU territory.
- Data residency for Art. 46 GDPR: No cross-border transfer SCCs needed for inference.
- Alignment with systemic risk avoidance: Self-hosted open-weight models below systemic risk threshold avoid the most demanding Art. 55 obligations at the deployer level.
- Audit trail within EU jurisdiction: All logs, model outputs, and monitoring data remain under EU judicial reach only.
For regulated industries (financial services under DORA, healthcare under MDR, critical infrastructure under NIS2), EU-native GPAI deployment is not just a compliance preference — it is increasingly a procurement requirement as regulators publish GPAI guidance specific to their sectors.
See Also
- EU AI Act Art.29 GPAI Provider Obligations: Developer Guide — Deep-dive into Art.29(1) Annex XI technical documentation, Art.29(2) downstream access, Art.29(3) systemic risk, and the Art.29 × Art.51/52/53/55 intersection matrix
- EU AI Act High-Risk AI & Conformity Assessment Guide
- EU AI Act Regulatory Sandbox: Art. 57-63 Developer Guide
- EU AI Liability Directive (AILD) Developer Guide
- EU NIS2 × AI Act: Critical Infrastructure Compliance
- EU Data Act: B2B Data Sharing and AI Training Data