2026-04-10 · 15 min read · sota.io team

EU AI Office & GPAI Model Regulation: Developer Guide (EU AI Act Arts. 51–56)

The EU AI Act (Regulation (EU) 2024/1689) introduced a regulatory category that did not exist in the original 2021 Commission proposal: General-Purpose AI (GPAI) models. Articles 51–56, added during the trilogue negotiations in 2023, create tiered obligations for foundation model providers, and the newly established EU AI Office has primary enforcement jurisdiction over these provisions — bypassing national market surveillance authorities.

If you build, fine-tune, host, or integrate a GPAI model into applications serving EU users, Articles 51–56 apply directly to your product. If you use a GPAI model (OpenAI, Anthropic, Meta, Mistral) as a building block, the Act still reaches you: Article 53(1)(b) entitles you to compliance information from the provider, and your downstream system carries its own obligations.

This guide covers the complete GPAI regulatory framework: scope, provider obligations, the systemic risk tier, GPAI Codes of Practice, enforcement by the EU AI Office, and what EU-native infrastructure means for compliance.

What is a GPAI Model under the EU AI Act?

Definition (Art. 3(63))

A General-Purpose AI model is an AI model — including large generative models — that:

  1. Is trained on large amounts of data using self-supervision at scale
  2. Displays significant generality
  3. Is capable of competently performing a wide range of distinct tasks
  4. Can be integrated into a variety of downstream systems or applications

This covers: GPT-4, Claude 3/4, Llama 3/4, Mistral, Gemini, PaLM 2, Falcon, Command R, Phi-3, and any comparable model. It does not cover narrow, task-specific AI systems (a spam classifier is not a GPAI model).

What is NOT a GPAI Model

The Recitals (Recital 97) clarify that the following fall outside the GPAI regime:

  - AI models used solely for research, development, or prototyping activities before being placed on the market
  - Narrow, task-specific models, even when trained on large datasets

The Systemic Risk Tier (Art. 51)

Article 51 creates a two-tier system within GPAI:

Tier 1 — All GPAI models: Articles 53–54 apply.

Tier 2 — GPAI models with systemic risk: Article 55 applies in addition to Tier 1, plus the classification and notification procedure of Article 52.

A GPAI model has systemic risk if:

  1. Training compute exceeds 10^25 FLOPs (Art. 51(2)) — the objective presumption threshold
  2. The Commission designates it as having high-impact capabilities under the qualitative criteria of Annex XIII (Art. 51(1)(b)) — discretionary designation power

The 10^25 FLOP threshold sits at the frontier of current training runs: GPT-4's training compute is commonly estimated at 10^24–10^25 FLOPs, and next-generation runs are expected to cross the line. The Commission can revise this threshold via delegated act (Art. 51(3)) to track compute scaling.

In practice in 2026: OpenAI GPT-4 and GPT-4o, Google Gemini Ultra, Anthropic Claude 3 Opus/Sonnet, and Meta Llama 3 (largest variants) are either at or approaching the systemic risk threshold. The Commission has discretion to designate models even below 10^25 FLOPs if they pose cross-sectoral risks.
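For a back-of-envelope check of where a training run lands relative to the threshold, the common 6ND approximation (training FLOPs ≈ 6 × parameters × training tokens, for dense transformers) is sufficient. This is a sketch, not a legal test; the parameter and token counts below are illustrative, not any vendor's actual figures:

```python
def estimated_training_flops(n_params: float, n_tokens: float) -> float:
    """Rough dense-transformer training compute: ~6 FLOPs per parameter per token."""
    return 6.0 * n_params * n_tokens

SYSTEMIC_RISK_THRESHOLD = 1e25  # Art. 51 presumption threshold

# Hypothetical run: 400B parameters on 15T tokens (illustrative numbers only)
flops = estimated_training_flops(400e9, 15e12)
print(f"{flops:.1e}")                     # 3.6e+25
print(flops >= SYSTEMIC_RISK_THRESHOLD)   # True
```

Note that the Act counts cumulative compute, so fine-tuning and continued pre-training runs add to the total.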

The EU AI Office: Structure and Powers

Establishment and Role

The EU AI Office (EUAIO) was established within the European Commission by Commission Decision of 24 January 2024 (C(2024) 390) and is now the primary enforcement body for GPAI models under the AI Act. Unlike High-Risk AI enforcement (handled by national Market Surveillance Authorities), GPAI oversight is centralized at EU level.

Key structural features:

  - Housed within the Commission's DG CNECT, not a separate agency
  - Supported by a Scientific Panel of independent experts (Art. 68) that can alert the Office to systemic risks
  - Coordinates with the European AI Board (Art. 65), where Member States are represented
  - Exclusive supervisory competence over GPAI providers (Art. 88); national authorities handle high-risk systems

AI Office Enforcement Powers (Arts. 88–93, 101)

The AI Office can:

| Power | Legal Basis |
| --- | --- |
| Exclusive supervision and enforcement over GPAI providers | Art. 88 |
| Request technical documentation and information from GPAI providers | Art. 91 |
| Conduct evaluations of GPAI models (incl. red-teaming), with model access where needed | Art. 92 |
| Order providers to take risk mitigation measures | Art. 93 |
| Impose fines (via Commission) | Art. 101 |

Fine exposure for GPAI violations (Art. 101): up to €15 million or 3% of total worldwide annual turnover, whichever is higher. This covers infringements of the GPAI obligations, failure to comply with documentation or access requests, and supplying incorrect or misleading information to the AI Office.

Art. 53: Base Obligations for All GPAI Providers

Article 53 imposes obligations on every GPAI model provider regardless of compute scale.

53(1)(a): Technical Documentation

Providers must draw up technical documentation before placing the GPAI model on the EU market. Annex XI specifies the required content:

Annex XI — GPAI Technical Documentation Requirements:
1. General description:
   - Intended purpose
   - Architecture (parameter count, context window, modalities)
   - Training methodology
   - Performance benchmarks
   - Known limitations

2. Training data description:
   - Data sources and types
   - Data collection methodology
   - Data filtering and preprocessing pipeline
   - Copyright compliance measures (Art. 53(1)(c))

3. Evaluation and testing:
   - Evaluation benchmarks and results
   - Bias and fairness evaluations
   - Safety testing results (red-teaming for systemic risk models)

4. Model card equivalent:
   - Intended and prohibited use cases
   - Known risks and mitigation measures
   - Fine-tuning and adaptation guidance

This documentation must be kept up to date and provided to the AI Office upon request.

53(1)(b): Information for Downstream Providers

When a GPAI model is integrated into a downstream AI system (a chatbot, coding assistant, retrieval system), the GPAI provider must give downstream developers sufficient information to comply with their own AI Act obligations.

Practically: If you build a High-Risk AI system using GPT-4 or Claude as the reasoning backbone, OpenAI/Anthropic must give you the information you need for your own Annex III risk assessment (Art. 53(1)(b)). This creates a contractual chain of compliance information.

For developers using GPAI APIs:

  - Request the provider's Annex XII information package before integration, and verify it covers capabilities, limitations, and integration guidance
  - Put the information flow in the contract; Art. 53(1)(b) gives you a statutory hook for it

Art. 54: Authorized EU Representative

GPAI providers established outside the EU must designate an authorized representative in the EU (Art. 54) before placing models on the EU market. This is similar to GDPR's Art. 27 representative requirement.

US-based providers (OpenAI, Anthropic, Google DeepMind US) must have an EU legal entity or designated representative. Failure to designate is itself an Art. 54 violation.

53(2): Open-Source Exemptions

Providers of open-source GPAI models benefit from:

  - Exemption from the Annex XI technical documentation duty (Art. 53(1)(a))
  - Exemption from the Annex XII downstream information duty (Art. 53(1)(b))

The exemption applies only if the model's parameters, architecture, and usage information are made publicly available, and it never applies to models with systemic risk. The copyright policy (Art. 53(1)(c)) and training data summary (Art. 53(1)(d)) apply regardless. This is why Llama 3 (open weights) still needs to comply with Art. 55 if its training compute qualifies, and must maintain copyright transparency in any case.

Annex XII: The GPAI Model Card

Annex XII specifies the information that Art. 53(1)(b) requires GPAI providers to make available to downstream providers; in practice, most providers publish it as a model card. Minimum content:

Annex XII — GPAI Model Card (Public Summary):
1. Intended purpose
2. Capabilities and performance benchmarks
3. Known limitations and risks
4. Prohibited uses
5. Context-appropriate safety measures
6. Copyright compliance summary (how training data copyright was handled)

Most providers make these documents public; at minimum they must be sufficiently detailed for downstream developers to make informed integration decisions.

53(1)(c): Copyright Policy

This is arguably the most immediately impactful provision for the AI industry. Article 53(1)(c) requires GPAI providers to:

"put in place a policy to comply with Union law on copyright and related rights, and in particular to identify and comply with, including through state-of-the-art technologies, a reservation of rights expressed pursuant to Article 4(3) of Directive 2019/790."

What this means in practice:

  1. Opt-out compliance (Directive 2019/790 Art. 4(3)): Website operators and content publishers can opt out of having their content used for AI training by expressing a machine-readable reservation of rights (e.g., robots.txt with User-agent: GPTBot, or the W3C TDM Reservation Protocol, TDMRep). GPAI providers must implement technical systems to detect and honor these reservations.

  2. Transparency about training data: Providers must publish a "sufficiently detailed summary" of training data (Art. 53(1)(d)):

# Example GPAI training data disclosure structure (Art. 53(1)(d) compliance)
class TrainingDataDisclosure:
    """
    Summary of training data per Art. 53(1)(d) EU AI Act.
    Must be published publicly and kept up to date.
    """
    
    def __init__(self):
        self.data_sources = {
            "web_crawl": {
                "description": "Filtered common crawl snapshots",
                "date_range": "2016–2024",
                "size_tokens": "~2T tokens",
                "opt_out_compliance": "robots.txt honored, TDMRep honored since 2023-01",
                "copyright_measures": "C4-style URL filtering, deduplication"
            },
            "books": {
                "description": "Licensed book collections",
                "licensing": "Partnership agreements with publishers",
                "size_tokens": "~100B tokens",
                "opt_out_compliance": "N/A — licensed"
            },
            "code": {
                "description": "Public code repositories",
                "licenses": ["MIT", "Apache-2.0", "BSD-2/3"],
                "excluded": ["GPL-3.0 (copyleft)", "AGPL-3.0", "unlicensed"],
                "size_tokens": "~200B tokens",
                "opt_out_compliance": "robots.txt honored for code hosting platforms"
            },
            "academic": {
                "description": "Open access academic papers",
                "source": "arXiv, PubMed Open Access, Semantic Scholar",
                "size_tokens": "~50B tokens",
                "opt_out_compliance": "Publisher agreements or CC licensing"
            }
        }
        
    def publish_summary(self) -> dict:
        """Returns public-facing Annex XII summary."""
        return {
            "total_sources": len(self.data_sources),
            "opt_out_mechanism": "robots.txt + TDMRep (W3C) + custom opt-out portal",
            "opt_out_portal": "https://provider.com/ai-training-opt-out",
            "copyright_review_date": "2024-Q4",
            "next_review": "2025-Q2"
        }
  3. Copyright litigation risk: Art. 53(1)(c) creates a direct liability hook. If a rights holder can show their opt-out was not honored, the GPAI provider has violated Art. 53 and the rights holder can invoke both EU copyright law (Directive 2019/790) AND AI Act enforcement. This is a dual enforcement pathway that did not exist before the AI Act.
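The detection side of an opt-out pipeline can be sketched with only the standard library: check robots.txt for the training crawler's user agent (a TDMRep well-known file would be a second, parallel check). The user agent ExampleTrainingBot is hypothetical:

```python
from urllib.robotparser import RobotFileParser

TRAINING_BOT = "ExampleTrainingBot"  # hypothetical training crawler user agent

def may_train_on(robots_txt: str, url: str, user_agent: str = TRAINING_BOT) -> bool:
    """Return False if the site's robots.txt reserves the URL against this crawler.

    robots.txt is only one signal; Art. 53(1)(c) also expects other
    machine-readable reservations (e.g., TDMRep) to be honored.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

robots = """
User-agent: ExampleTrainingBot
Disallow: /

User-agent: *
Allow: /
"""
print(may_train_on(robots, "https://example.org/article"))                           # False
print(may_train_on(robots, "https://example.org/article", user_agent="SearchBot"))   # True
```

The asymmetry in the example is the common real-world pattern: a site that stays open to search crawlers while reserving rights against training crawlers.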

53(1)(d): Training Data Summary

GPAI providers must draw up and make publicly available a sufficiently detailed summary of the content used for training, following a template provided by the AI Office, and keep it up to date.

Art. 55: Systemic Risk Obligations

For models above the 10^25 FLOP threshold (or Commission-designated), Article 55 imposes a second tier of obligations.

55(1)(a): Model Evaluations

Systemic risk GPAI providers must perform state-of-the-art model evaluations, including adversarial (red-team) testing, to identify and mitigate systemic risks. In practice this covers:

  - Pre-release capability evaluations against documented protocols
  - Structured red-teaming for dangerous capabilities and misuse pathways
  - Re-evaluation after major updates or significant fine-tuning

The AI Office has published preliminary evaluation protocols; formal harmonized standards (under Art. 40) are in development via CEN/CENELEC.

55(1)(b): Systemic Risk Assessment and Mitigation

Providers must assess and mitigate possible systemic risks at Union level, including their sources. Separately, under its Art. 92 evaluation powers, the AI Office may require model weights or API access for independent evaluation. This is a significant national security-adjacent power: the Commission can evaluate closed-weights models under controlled conditions.

55(1)(c): Incident Reporting

GPAI providers with systemic risk must implement an incident reporting system:

class GPAIIncidentReport:
    """
    Incident report structure per Art. 55(1)(c) EU AI Act.
    Reportable: serious incidents as defined in Art. 3(49).
    Timeline: serious incidents → 2-business-day initial notification to the
    AI Office (the Act's own wording is "without undue delay").
    """
    
    def __init__(
        self,
        incident_id: str,
        provider_name: str,
        model_name: str,
        discovery_datetime: str,
        description: str,
        affected_users_estimate: int,
        severity: str,  # "serious" | "significant" | "minor"
        harm_categories: list[str],
        immediate_mitigation: str,
        eu_ai_office_notification_required: bool
    ):
        self.incident_id = incident_id
        self.provider_name = provider_name
        self.model_name = model_name
        self.discovery_datetime = discovery_datetime
        self.description = description
        self.affected_users_estimate = affected_users_estimate
        self.severity = severity
        self.harm_categories = harm_categories
        self.immediate_mitigation = immediate_mitigation
        self.eu_ai_office_notification_required = eu_ai_office_notification_required
        
    def get_notification_deadline(self) -> str:
        """
        Art. 55(1)(c): Serious incidents → 2 business day notification.
        Art. 55(1)(c): Significant incidents → follow-up within 15 days.
        """
        if self.severity == "serious":
            return "2 business days from discovery"
        elif self.severity == "significant":
            return "15 calendar days from discovery"
        else:
            return "No mandatory notification — retain for annual summary"
    
    def reportable_harm_categories(self) -> list[str]:
        """Per Art. 3(49): what constitutes a 'serious incident'."""
        return [
            "Death or serious harm to a person's health",
            "Serious and irreversible disruption of critical infrastructure (NIS2 scope)",
            "Serious harm to property or the environment",
            "Infringement of Union-law obligations protecting fundamental rights",
        ]

Note the intersection with NIS2: a GPAI-related incident that also involves critical infrastructure creates dual reporting — AI Act incident notification to the EUAIO + NIS2 early warning to the competent NCA/CSIRT within 24 hours. See our NIS2 × AI Act guide for the full dual-reporting matrix.
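The dual-reporting overlap can be sketched as a deadline calculator. The 2-day AI Act figure follows this guide's reading of Art. 55(1)(c) (approximated as calendar days here), and the 24-hour early warning and 72-hour notification come from NIS2 Art. 23; function and field names are illustrative:

```python
from datetime import datetime, timedelta

def notification_deadlines(
    discovered: datetime,
    serious: bool,
    critical_infrastructure: bool,
) -> dict[str, datetime]:
    """Stacked notification deadlines for a GPAI incident (sketch)."""
    deadlines: dict[str, datetime] = {}
    if serious:
        # AI Act Art. 55(1)(c): initial notification to the EU AI Office
        # (2 business days approximated as 2 calendar days)
        deadlines["eu_ai_office_initial"] = discovered + timedelta(days=2)
    if critical_infrastructure:
        # NIS2 Art. 23: early warning to the competent CSIRT/NCA within 24h
        deadlines["nis2_early_warning"] = discovered + timedelta(hours=24)
        # NIS2 Art. 23: full incident notification within 72h
        deadlines["nis2_incident_notification"] = discovered + timedelta(hours=72)
    return deadlines

d = notification_deadlines(datetime(2026, 4, 10, 9, 0),
                           serious=True, critical_infrastructure=True)
print(sorted(d))  # ['eu_ai_office_initial', 'nis2_early_warning', 'nis2_incident_notification']
```

The key design point: the NIS2 clock is shorter than the AI Act clock, so a dual-scope incident is effectively on the 24-hour timeline.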

55(1)(d): Cybersecurity Measures

Systemic risk GPAI providers must ensure an adequate level of cybersecurity protection for the model and its physical infrastructure (Art. 55(1)(d)), which in practice means:

  - Protecting unreleased model weights against theft and exfiltration
  - Access controls and monitoring on training and inference infrastructure
  - Insider-threat mitigation and supply-chain security for the training pipeline

This overlaps with DORA (Digital Operational Resilience Act) for financial sector deployments of GPAI systems.

55(2): Demonstrating Compliance and Red-Teaming Guidance

Until harmonized standards are published, providers may rely on the GPAI Code of Practice to demonstrate compliance with the Art. 55 obligations (Art. 55(2)). In 2025, the AI Office published preliminary red-teaming guidance covering:

  1. CBRN (chemical, biological, radiological, nuclear) uplift evaluation, including bioweapons
  2. Cyberoffense capability evaluation
  3. Influence operations and synthetic media generation

Providers can use AI Safety Institutes (established in UK, USA, Singapore, Japan) results as partial evidence of compliance, but the EU AI Office runs its own evaluation program.

Art. 56: GPAI Codes of Practice

Article 56 establishes a co-regulatory mechanism: the AI Office facilitates development of Codes of Practice by GPAI providers, research institutions, and civil society. Providers may rely on an approved Code to demonstrate compliance with Articles 53 and 55 until harmonized standards are published.

Status in 2026

The AI Office began the Code of Practice process in 2024, with close to a thousand registered stakeholders. Key participants include:

  - Major GPAI providers (OpenAI, Google, Anthropic, Meta, Mistral AI, Aleph Alpha)
  - Independent academic experts chairing the drafting working groups
  - Civil society organizations and rights-holder associations

The Code of Practice covers:

| Topic | Code Requirements | AI Act Provision |
| --- | --- | --- |
| Technical documentation | Annex XI template + update cadence | Art. 53(1)(a) |
| Model cards | Annex XII minimum + machine-readable format | Art. 53(1)(b) |
| Copyright opt-out | TDMRep + robots.txt + opt-out portal SLA | Art. 53(1)(c) |
| Red-teaming | Frequency (pre-release + major update) + scope | Art. 55(1)(a) |
| Incident reporting | 2-day notification + 15-day detailed report format | Art. 55(1)(c) |
| Open-source GPAI | Reduced documentation, same copyright rules | Art. 53(2) |

Following the Code creates a safe harbor presumption. Providers not following the Code must demonstrate equivalent compliance — a higher evidentiary bar.

Open-Source Model Implications

Open-source GPAI providers (Meta's Llama, Mistral AI) face a structural tension: open weights earn relief from the documentation duties, but the copyright policy, the training data summary, and any systemic risk obligations still attach. The decision logic:

def check_gpai_obligations(
    model_name: str,
    training_flops: float,  # in FLOPs
    is_open_source: bool,
    commission_designated: bool
) -> dict:
    """
    Determine GPAI obligations under EU AI Act Art. 51-55.
    """
    
    SYSTEMIC_RISK_THRESHOLD = 1e25  # 10^25 FLOPs
    
    is_systemic_risk = (
        training_flops >= SYSTEMIC_RISK_THRESHOLD 
        or commission_designated
    )
    
    obligations = {
        "art_53_documentation": not is_open_source or is_systemic_risk,   # Annex XI; open-source exempt unless systemic risk (Art. 53(2))
        "art_53_downstream_info": not is_open_source or is_systemic_risk, # Annex XII; same exemption
        "art_54_eu_representative": True,      # Always for non-EU providers
        "art_53_copyright_policy": True,       # Always required (incl. open-source)
        "art_53_training_data_summary": True,  # Always required (incl. open-source)
        "art_55_adversarial_testing": is_systemic_risk,
        "art_55_incident_reporting": is_systemic_risk,
        "art_55_cybersecurity": is_systemic_risk,
    }
    
    return {
        "model": model_name,
        "systemic_risk": is_systemic_risk,
        "open_source_reduced_regime": is_open_source and not is_systemic_risk,
        "obligations": obligations,
        "fine_exposure": "Up to €15M or 3% of worldwide annual turnover (Art. 101)",
        "primary_enforcer": "EU AI Office (centralized)"
    }

# Examples
print(check_gpai_obligations("GPT-4", 2e24, False, True))
# → systemic_risk: True (Commission-designated)
# → all Art. 55 obligations apply

print(check_gpai_obligations("Mistral-7B", 1e23, True, False))
# → systemic_risk: False
# → open_source: reduced documentation (Art. 53(2) exemption)
# → copyright transparency: STILL required

print(check_gpai_obligations("Llama-3-405B", 5e24, True, False))
# → systemic_risk: uncertain (approaching threshold)
# → Commission designation possible

Downstream Developer Obligations

If you integrate a GPAI model (via API or self-hosted weights) into an application, you are typically the provider of the resulting AI system (and a "deployer" when you operate it), and your obligations depend on the downstream system's classification:

High-Risk AI System Using GPAI (Annex III)

If your application falls under Annex III (credit scoring, recruitment, medical diagnosis, critical infrastructure), you must:

  1. Conduct a risk assessment that accounts for the GPAI model's capabilities and limitations
  2. Use the information provided by the GPAI provider (Art. 53(1)(b)) in your risk assessment
  3. Register your system in the EU database (Art. 49)
  4. Maintain technical documentation (Annex IV)
  5. Implement human oversight (Art. 14) — this is where "human in the loop" becomes a legal requirement, not just a design preference

The GPAI provider's compliance does not exempt your downstream system from Annex III requirements.
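The split above can be expressed as a simple duty lookup; the duty strings are shorthand labels for this guide, not statutory text:

```python
def deployer_duties(annex_iii: bool) -> list[str]:
    """High-level duty list for an organization integrating a GPAI model (sketch)."""
    duties = ["Review provider information (Art. 53(1)(b) / Annex XII)"]
    if annex_iii:
        duties += [
            "Risk assessment covering the GPAI model's capabilities and limits",
            "Register the system in the EU database (Art. 49)",
            "Maintain Annex IV technical documentation",
            "Implement human oversight (Art. 14)",
        ]
    else:
        duties += ["Transparency duties under Art. 50 where applicable"]
    return duties

print(len(deployer_duties(annex_iii=True)))   # 5
print(deployer_duties(annex_iii=False)[-1])   # Transparency duties under Art. 50 where applicable
```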

Low-Risk/Non-Regulated AI Using GPAI

For most applications (chatbots, coding assistants, summarization tools not in Annex III):

  - Transparency duties under Art. 50 apply: disclose that users are interacting with an AI system and label synthetic content
  - No registration, conformity assessment, or Annex IV documentation is required
  - GDPR still governs any personal data in prompts and outputs

The CLOUD Act Problem for GPAI Deployments

When you deploy a GPAI model via a US-based provider (OpenAI API, Google Gemini API, Anthropic API), you are routing EU user interactions through US infrastructure. This creates:

  1. CLOUD Act reach: US law enforcement can compel production of inference logs, conversation data, and user metadata held by US providers without EU judicial cooperation
  2. GDPR transfer conflict: International transfers of EU personal data to US providers require an adequacy decision (Art. 45 GDPR — currently the EU-US Data Privacy Framework) or appropriate safeguards such as SCCs (Art. 46 GDPR), and inference data often contains personal data
  3. Art. 53(1)(c) copyright conflict: When the EU AI Office investigates copyright compliance, it may need access to training data documentation held in US systems — subject to CLOUD Act parallel access

EU-native GPAI deployment (hosting open-weight models on EU infrastructure) eliminates the CLOUD Act vector:

class GPAIDeploymentJurisdiction:
    """
    Jurisdiction analysis for GPAI deployment choices.
    Relevant for developers choosing between API and self-hosted.
    """
    
    DEPLOYMENT_OPTIONS = {
        "us_api": {
            "provider_example": "OpenAI API / Anthropic API",
            "inference_jurisdiction": "USA (US servers)",
            "cloud_act_reach": True,
            "gdpr_transfer_mechanism": "Standard Contractual Clauses (SCCs)",
            "eu_ai_act_compliance": "Provider responsible for Art. 52-55",
            "deployer_risk": "CLOUD Act exposure for inference data",
            "cost": "Pay-per-token"
        },
        "eu_api": {
            "provider_example": "Mistral API (Paris), Aleph Alpha",
            "inference_jurisdiction": "EU (FR/DE servers)",
            "cloud_act_reach": False,
            "gdpr_transfer_mechanism": "N/A — EU-resident processing",
            "eu_ai_act_compliance": "Provider responsible for Art. 52-55",
            "deployer_risk": "Low — single EU jurisdiction",
            "cost": "Pay-per-token"
        },
        "self_hosted_eu": {
            "provider_example": "Llama 3.1 70B on sota.io EU PaaS",
            "inference_jurisdiction": "EU (deployer's choice of EU region)",
            "cloud_act_reach": False,
            "gdpr_transfer_mechanism": "N/A — EU-resident processing",
            "eu_ai_act_compliance": "Deployer responsible for system-level compliance",
            "deployer_risk": "Full control — no third-party CLOUD Act pathway",
            "cost": "Infrastructure cost, no per-token pricing"
        }
    }
    
    def get_recommendation(self, requirements: dict) -> str:
        if requirements.get("cloud_act_isolation") and requirements.get("cost_predictable"):
            return "self_hosted_eu"
        elif requirements.get("cloud_act_isolation"):
            return "eu_api"
        else:
            return "us_api"

Compliance Timeline

The GPAI provisions have a specific entry-into-force schedule distinct from the rest of the AI Act:

| Date | Milestone |
| --- | --- |
| 2 August 2025 | GPAI model obligations (Arts. 51–56) fully applicable |
| 2 August 2025 | AI Office operational, GPAI register open |
| 2 February 2026 | Codes of Practice finalized (if process complete) |
| 2 August 2026 | High-Risk AI (Annex III) obligations fully applicable |
| 2 August 2027 | GPAI models already on the market before 2 August 2025 must comply |

The GPAI obligations have applied since 2 August 2025. If your organization provides a GPAI model or integrates one into an Annex III application, compliance is not a future task.
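The staggered dates can be encoded as a quick applicability check; the milestone names are shorthand for the table rows above:

```python
from datetime import date

MILESTONES = {
    "gpai_obligations": date(2025, 8, 2),      # Arts. 51-56 applicable
    "high_risk_annex_iii": date(2026, 8, 2),   # Annex III obligations
    "legacy_gpai_deadline": date(2027, 8, 2),  # models placed on market before Aug 2025
}

def applicable(milestone: str, today: date) -> bool:
    """True once the given AI Act milestone date has been reached."""
    return today >= MILESTONES[milestone]

print(applicable("gpai_obligations", date(2026, 4, 10)))     # True
print(applicable("high_risk_annex_iii", date(2026, 4, 10)))  # False
```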

What GPAI Providers Should Have Done by Now

  1. Technical documentation (Annex XI) — drafted, reviewed by legal, submitted to AI Office registration system
  2. Model card (Annex XII) — published publicly
  3. Copyright policy — implemented TDMRep + robots.txt scraping compliance + opt-out portal
  4. Training data summary (Art. 53(1)(d)) — published, up to date
  5. EU authorized representative — designated if provider is non-EU
  6. Incident reporting system — operational (for systemic risk models)
  7. Adversarial testing — completed pre-release evaluation (for systemic risk models)

What EU-Native GPAI Deployment Means for Developers

Building AI-powered applications on EU infrastructure using open-weight GPAI models (Llama 3, Mistral, Falcon) on an EU PaaS like sota.io achieves:

  1. Single regulatory jurisdiction: EU AI Act + GDPR only. No CLOUD Act parallel compliance pathway.
  2. CLOUD Act isolation: Inference data, user conversations, and embeddings stay in EU territory.
  3. Data residency for Art. 46 GDPR: No cross-border transfer SCCs needed for inference.
  4. Systemic risk tier avoidance: Open-weight models below the 10^25 FLOP threshold keep the most demanding Art. 55 obligations (which fall on providers of threshold-crossing models) out of your stack entirely.
  5. Audit trail within EU jurisdiction: All logs, model outputs, and monitoring data remain under EU judicial reach only.

For regulated industries (financial services under DORA, healthcare under MDR, critical infrastructure under NIS2), EU-native GPAI deployment is not just a compliance preference — it is increasingly a procurement requirement as regulators publish GPAI guidance specific to their sectors.
