2026-04-10 · 15 min read · sota.io team

EU AI Office & GPAI Model Regulation: Developer Guide (EU AI Act Arts. 51–56)

The EU AI Act (Regulation (EU) 2024/1689) introduced a regulatory category that did not exist in the original 2021 Commission proposal: General-Purpose AI (GPAI) models. Articles 51–56, added during the trilogue negotiations in 2023, create tiered obligations for foundation model providers, and the newly established EU AI Office has primary enforcement jurisdiction over these provisions — bypassing national market surveillance authorities.

If you build, fine-tune, host, or integrate a GPAI model into applications serving EU users, Articles 51–56 apply directly to your product. If you use a GPAI model (OpenAI, Anthropic, Meta, Mistral) as a building block, the Act still reaches you: Article 53(1)(b) entitles you to compliance information from the provider, and your downstream system carries its own obligations.

This guide covers the complete GPAI regulatory framework: scope, provider obligations, the systemic risk tier, GPAI Codes of Practice, enforcement by the EU AI Office, and what EU-native infrastructure means for compliance.

What is a GPAI Model under the EU AI Act?

Definition (Art. 3(63))

A General-Purpose AI model is an AI model — including large generative models — that:

  1. Is trained on large amounts of data using self-supervision at scale
  2. Displays significant generality
  3. Is capable of competently performing a wide range of distinct tasks
  4. Can be integrated into a variety of downstream systems or applications

This covers: GPT-4, Claude 3/4, Llama 3/4, Mistral, Gemini, PaLM 2, Falcon, Command R, Phi-3, and any comparable model. It does not cover narrow, task-specific AI systems (a spam classifier is not a GPAI model).

What is NOT a GPAI Model

The Recitals (Recital 97) clarify that the following fall outside the GPAI regime:

  - AI models used solely for research, development, or prototyping activities before being placed on the market
  - Narrow, task-specific models, even when trained on large datasets

The Systemic Risk Tier (Art. 51)

Article 51 creates a two-tier system within GPAI:

Tier 1 — All GPAI models: Articles 53–54 apply.

Tier 2 — GPAI models with systemic risk: Article 55 applies in addition to Tier 1, plus the classification and notification procedure of Article 52.

A GPAI model has systemic risk if:

  1. Training compute exceeds 10^25 FLOPs (Art. 51(2)) — the objective presumption threshold
  2. The Commission designates it as having high-impact capabilities under the qualitative criteria of Annex XIII (Art. 51(1)(b)) — discretionary designation power

The 10^25 FLOP threshold sits at the frontier of current training runs: GPT-4's training compute is commonly estimated at 10^24–10^25 FLOPs, and next-generation runs are expected to cross the line. The Commission can revise this threshold via delegated act (Art. 51(3)) to track compute scaling.

In practice in 2026: OpenAI GPT-4 and GPT-4o, Google Gemini Ultra, Anthropic Claude 3 Opus/Sonnet, and Meta Llama 3 (largest variants) are either at or approaching the systemic risk threshold. The Commission has discretion to designate models even below 10^25 FLOPs if they pose cross-sectoral risks.
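For a back-of-envelope check of where a training run lands relative to the threshold, the common 6ND approximation (training FLOPs ≈ 6 × parameters × training tokens, for dense transformers) is sufficient. This is a sketch, not a legal test; the parameter and token counts below are illustrative, not any vendor's actual figures:

```python
def estimated_training_flops(n_params: float, n_tokens: float) -> float:
    """Rough dense-transformer training compute: ~6 FLOPs per parameter per token."""
    return 6.0 * n_params * n_tokens

SYSTEMIC_RISK_THRESHOLD = 1e25  # Art. 51 presumption threshold

# Hypothetical run: 400B parameters on 15T tokens (illustrative numbers only)
flops = estimated_training_flops(400e9, 15e12)
print(f"{flops:.1e}")                     # 3.6e+25
print(flops >= SYSTEMIC_RISK_THRESHOLD)   # True
```

Note that the Act counts cumulative compute, so fine-tuning and continued pre-training runs add to the total.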

The EU AI Office: Structure and Powers

Establishment and Role

The EU AI Office (EUAIO) was established within the European Commission by Commission Decision of 24 January 2024 (C(2024) 390) and is now the primary enforcement body for GPAI models under the AI Act. Unlike High-Risk AI enforcement (handled by national Market Surveillance Authorities), GPAI oversight is centralized at EU level.

Key structural features:

  - Housed within the Commission's DG CNECT, not a separate agency
  - Supported by a Scientific Panel of independent experts (Art. 68) that can alert the Office to systemic risks
  - Coordinates with the European AI Board (Art. 65), where Member States are represented
  - Exclusive supervisory competence over GPAI providers (Art. 88); national authorities handle high-risk systems

AI Office Enforcement Powers (Arts. 88–93, 101)

The AI Office can:

| Power | Legal Basis |
| --- | --- |
| Exclusive supervision and enforcement over GPAI providers | Art. 88 |
| Request technical documentation and information from GPAI providers | Art. 91 |
| Conduct evaluations of GPAI models (incl. red-teaming), with model access where needed | Art. 92 |
| Order providers to take risk mitigation measures | Art. 93 |
| Impose fines (via Commission) | Art. 101 |

Fine exposure for GPAI violations (Art. 101): up to €15 million or 3% of total worldwide annual turnover, whichever is higher. This covers infringements of the GPAI obligations, failure to comply with documentation or access requests, and supplying incorrect or misleading information to the AI Office.

Art. 53: Base Obligations for All GPAI Providers

Article 53 imposes obligations on every GPAI model provider regardless of compute scale.

53(1)(a): Technical Documentation

Providers must draw up technical documentation before placing the GPAI model on the EU market. Annex XI specifies the required content:

Annex XI — GPAI Technical Documentation Requirements:
1. General description:
   - Intended purpose
   - Architecture (parameter count, context window, modalities)
   - Training methodology
   - Performance benchmarks
   - Known limitations

2. Training data description:
   - Data sources and types
   - Data collection methodology
   - Data filtering and preprocessing pipeline
   - Copyright compliance measures (Art. 53(1)(c))

3. Evaluation and testing:
   - Evaluation benchmarks and results
   - Bias and fairness evaluations
   - Safety testing results (red-teaming for systemic risk models)

4. Model card equivalent:
   - Intended and prohibited use cases
   - Known risks and mitigation measures
   - Fine-tuning and adaptation guidance

This documentation must be kept up to date and provided to the AI Office upon request.

53(1)(b): Information for Downstream Providers

When a GPAI model is integrated into a downstream AI system (a chatbot, coding assistant, retrieval system), the GPAI provider must give downstream developers sufficient information to comply with their own AI Act obligations.

Practically: If you build a High-Risk AI system using GPT-4 or Claude as the reasoning backbone, OpenAI/Anthropic must give you the information you need for your own Annex III risk assessment (Art. 53(1)(b)). This creates a contractual chain of compliance information.

For developers using GPAI APIs:

  - Request the provider's Annex XII information package before integration, and verify it covers capabilities, limitations, and integration guidance
  - Put the information flow in the contract; Art. 53(1)(b) gives you a statutory hook for it

Art. 54: Authorized EU Representative

GPAI providers established outside the EU must designate an authorized representative in the EU (Art. 54) before placing models on the EU market. This is similar to GDPR's Art. 27 representative requirement.

US-based providers (OpenAI, Anthropic, Google DeepMind US) must have an EU legal entity or designated representative. Failure to designate is itself an Art. 54 violation.

53(2): Open-Source Exemptions

Providers of open-source GPAI models benefit from:

  - Exemption from the Annex XI technical documentation duty (Art. 53(1)(a))
  - Exemption from the Annex XII downstream information duty (Art. 53(1)(b))

The exemption applies only if the model's parameters, architecture, and usage information are made publicly available, and it never applies to models with systemic risk. The copyright policy (Art. 53(1)(c)) and training data summary (Art. 53(1)(d)) apply regardless. This is why Llama 3 (open weights) still needs to comply with Art. 55 if its training compute qualifies, and must maintain copyright transparency in any case.

Annex XII: The GPAI Model Card

Annex XII specifies the information that Art. 53(1)(b) requires GPAI providers to make available to downstream providers; in practice, most providers publish it as a model card. Minimum content:

Annex XII — GPAI Model Card (Public Summary):
1. Intended purpose
2. Capabilities and performance benchmarks
3. Known limitations and risks
4. Prohibited uses
5. Context-appropriate safety measures
6. Copyright compliance summary (how training data copyright was handled)

Most providers make these documents public; at minimum they must be sufficiently detailed for downstream developers to make informed integration decisions.

53(1)(c): Copyright Policy

This is arguably the most immediately impactful provision for the AI industry. Article 53(1)(c) requires GPAI providers to:

"put in place a policy to comply with Union law on copyright and related rights, and in particular to identify and comply with, including through state-of-the-art technologies, a reservation of rights expressed pursuant to Article 4(3) of Directive 2019/790."

What this means in practice:

  1. Opt-out compliance (Directive 2019/790 Art. 4(3)): Website operators and content publishers can opt out of having their content used for AI training by expressing a machine-readable reservation of rights (e.g., robots.txt with User-agent: GPTBot, or the W3C TDM Reservation Protocol, TDMRep). GPAI providers must implement technical systems to detect and honor these reservations.

  2. Transparency about training data: Providers must publish a "sufficiently detailed summary" of training data (Art. 53(1)(d)):

# Example GPAI training data disclosure structure (Art. 53(1)(d) compliance)
class TrainingDataDisclosure:
    """
    Summary of training data per Art. 53(1)(d) EU AI Act.
    Must be published publicly and kept up to date.
    """
    
    def __init__(self):
        self.data_sources = {
            "web_crawl": {
                "description": "Filtered common crawl snapshots",
                "date_range": "2016–2024",
                "size_tokens": "~2T tokens",
                "opt_out_compliance": "robots.txt honored, TDMRep honored since 2023-01",
                "copyright_measures": "C4-style URL filtering, deduplication"
            },
            "books": {
                "description": "Licensed book collections",
                "licensing": "Partnership agreements with publishers",
                "size_tokens": "~100B tokens",
                "opt_out_compliance": "N/A — licensed"
            },
            "code": {
                "description": "Public code repositories",
                "licenses": ["MIT", "Apache-2.0", "BSD-2/3"],
                "excluded": ["GPL-3.0 (copyleft)", "AGPL-3.0", "unlicensed"],
                "size_tokens": "~200B tokens",
                "opt_out_compliance": "robots.txt honored for code hosting platforms"
            },
            "academic": {
                "description": "Open access academic papers",
                "source": "arXiv, PubMed Open Access, Semantic Scholar",
                "size_tokens": "~50B tokens",
                "opt_out_compliance": "Publisher agreements or CC licensing"
            }
        }
        
    def publish_summary(self) -> dict:
        """Returns public-facing Annex XII summary."""
        return {
            "total_sources": len(self.data_sources),
            "opt_out_mechanism": "robots.txt + TDMRep (W3C) + custom opt-out portal",
            "opt_out_portal": "https://provider.com/ai-training-opt-out",
            "copyright_review_date": "2024-Q4",
            "next_review": "2025-Q2"
        }
  3. Copyright litigation risk: Art. 53(1)(c) creates a direct liability hook. If a rights holder can show their opt-out was not honored, the GPAI provider has violated Art. 53 and the rights holder can invoke both EU copyright law (Directive 2019/790) AND AI Act enforcement. This is a dual enforcement pathway that did not exist before the AI Act.
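The detection side of an opt-out pipeline can be sketched with only the standard library: check robots.txt for the training crawler's user agent (a TDMRep well-known file would be a second, parallel check). The user agent ExampleTrainingBot is hypothetical:

```python
from urllib.robotparser import RobotFileParser

TRAINING_BOT = "ExampleTrainingBot"  # hypothetical training crawler user agent

def may_train_on(robots_txt: str, url: str, user_agent: str = TRAINING_BOT) -> bool:
    """Return False if the site's robots.txt reserves the URL against this crawler.

    robots.txt is only one signal; Art. 53(1)(c) also expects other
    machine-readable reservations (e.g., TDMRep) to be honored.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

robots = """
User-agent: ExampleTrainingBot
Disallow: /

User-agent: *
Allow: /
"""
print(may_train_on(robots, "https://example.org/article"))                           # False
print(may_train_on(robots, "https://example.org/article", user_agent="SearchBot"))   # True
```

The asymmetry in the example is the common real-world pattern: a site that stays open to search crawlers while reserving rights against training crawlers.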

53(1)(d): Training Data Summary

GPAI providers must draw up and make publicly available a sufficiently detailed summary of the content used for training, following a template provided by the AI Office, and keep it up to date.

Art. 55: Systemic Risk Obligations

For models above the 10^25 FLOP threshold (or Commission-designated), Article 55 imposes a second tier of obligations.

55(1)(a): Model Evaluations

Systemic risk GPAI providers must perform state-of-the-art model evaluations, including adversarial (red-team) testing, to identify and mitigate systemic risks. In practice this covers:

  - Pre-release capability evaluations against documented protocols
  - Structured red-teaming for dangerous capabilities and misuse pathways
  - Re-evaluation after major updates or significant fine-tuning

The AI Office has published preliminary evaluation protocols; formal harmonized standards (under Art. 40) are in development via CEN/CENELEC.

55(1)(b): Systemic Risk Assessment and Mitigation

Providers must assess and mitigate possible systemic risks at Union level, including their sources. Separately, under its Art. 92 evaluation powers, the AI Office may require model weights or API access for independent evaluation. This is a significant national security-adjacent power: the Commission can evaluate closed-weights models under controlled conditions.

55(1)(c): Incident Reporting

GPAI providers with systemic risk must implement an incident reporting system:

class GPAIIncidentReport:
    """
    Incident report structure per Art. 55(1)(c) EU AI Act.
    Reportable: serious incidents as defined in Art. 3(49).
    Timeline: serious incidents → 2-business-day initial notification to the
    AI Office (the Act's own wording is "without undue delay").
    """
    
    def __init__(
        self,
        incident_id: str,
        provider_name: str,
        model_name: str,
        discovery_datetime: str,
        description: str,
        affected_users_estimate: int,
        severity: str,  # "serious" | "significant" | "minor"
        harm_categories: list[str],
        immediate_mitigation: str,
        eu_ai_office_notification_required: bool
    ):
        self.incident_id = incident_id
        self.provider_name = provider_name
        self.model_name = model_name
        self.discovery_datetime = discovery_datetime
        self.description = description
        self.affected_users_estimate = affected_users_estimate
        self.severity = severity
        self.harm_categories = harm_categories
        self.immediate_mitigation = immediate_mitigation
        self.eu_ai_office_notification_required = eu_ai_office_notification_required
        
    def get_notification_deadline(self) -> str:
        """
        Art. 55(1)(c): Serious incidents → 2 business day notification.
        Art. 55(1)(c): Significant incidents → follow-up within 15 days.
        """
        if self.severity == "serious":
            return "2 business days from discovery"
        elif self.severity == "significant":
            return "15 calendar days from discovery"
        else:
            return "No mandatory notification — retain for annual summary"
    
    def reportable_harm_categories(self) -> list[str]:
        """Per Art. 3(49): what constitutes a 'serious incident'."""
        return [
            "Death or serious harm to a person's health",
            "Serious and irreversible disruption of critical infrastructure (NIS2 scope)",
            "Serious harm to property or the environment",
            "Infringement of Union-law obligations protecting fundamental rights",
        ]

Note the intersection with NIS2: a GPAI-related incident that also involves critical infrastructure creates dual reporting — AI Act incident notification to the EUAIO + NIS2 early warning to the competent NCA/CSIRT within 24 hours. See our NIS2 × AI Act guide for the full dual-reporting matrix.
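The dual-reporting overlap can be sketched as a deadline calculator. The 2-day AI Act figure follows this guide's reading of Art. 55(1)(c) (approximated as calendar days here), and the 24-hour early warning and 72-hour notification come from NIS2 Art. 23; function and field names are illustrative:

```python
from datetime import datetime, timedelta

def notification_deadlines(
    discovered: datetime,
    serious: bool,
    critical_infrastructure: bool,
) -> dict[str, datetime]:
    """Stacked notification deadlines for a GPAI incident (sketch)."""
    deadlines: dict[str, datetime] = {}
    if serious:
        # AI Act Art. 55(1)(c): initial notification to the EU AI Office
        # (2 business days approximated as 2 calendar days)
        deadlines["eu_ai_office_initial"] = discovered + timedelta(days=2)
    if critical_infrastructure:
        # NIS2 Art. 23: early warning to the competent CSIRT/NCA within 24h
        deadlines["nis2_early_warning"] = discovered + timedelta(hours=24)
        # NIS2 Art. 23: full incident notification within 72h
        deadlines["nis2_incident_notification"] = discovered + timedelta(hours=72)
    return deadlines

d = notification_deadlines(datetime(2026, 4, 10, 9, 0),
                           serious=True, critical_infrastructure=True)
print(sorted(d))  # ['eu_ai_office_initial', 'nis2_early_warning', 'nis2_incident_notification']
```

The key design point: the NIS2 clock is shorter than the AI Act clock, so a dual-scope incident is effectively on the 24-hour timeline.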

55(1)(d): Cybersecurity Measures

Systemic risk GPAI providers must ensure an adequate level of cybersecurity protection for the model and its physical infrastructure (Art. 55(1)(d)), which in practice means:

  - Protecting unreleased model weights against theft and exfiltration
  - Access controls and monitoring on training and inference infrastructure
  - Insider-threat mitigation and supply-chain security for the training pipeline

This overlaps with DORA (Digital Operational Resilience Act) for financial sector deployments of GPAI systems.

55(2): Demonstrating Compliance and Red-Teaming Guidance

Until harmonized standards are published, providers may rely on the GPAI Code of Practice to demonstrate compliance with the Art. 55 obligations (Art. 55(2)). In 2025, the AI Office published preliminary red-teaming guidance covering:

  1. CBRN (chemical, biological, radiological, nuclear) uplift evaluation, including bioweapons
  2. Cyberoffense capability evaluation
  3. Influence operations and synthetic media generation

Providers can use AI Safety Institutes (established in UK, USA, Singapore, Japan) results as partial evidence of compliance, but the EU AI Office runs its own evaluation program.

Art. 56: GPAI Codes of Practice

Article 56 establishes a co-regulatory mechanism: the AI Office facilitates development of Codes of Practice by GPAI providers, research institutions, and civil society. Providers may rely on an approved Code to demonstrate compliance with Articles 53 and 55 until harmonized standards are published.

Status in 2026

The AI Office began the Code of Practice process in 2024, with close to a thousand registered stakeholders. Key participants include:

  - Major GPAI providers (OpenAI, Google, Anthropic, Meta, Mistral AI, Aleph Alpha)
  - Independent academic experts chairing the drafting working groups
  - Civil society organizations and rights-holder associations

The Code of Practice covers:

| Topic | Code Requirements | AI Act Provision |
| --- | --- | --- |
| Technical documentation | Annex XI template + update cadence | Art. 53(1)(a) |
| Model cards | Annex XII minimum + machine-readable format | Art. 53(1)(b) |
| Copyright opt-out | TDMRep + robots.txt + opt-out portal SLA | Art. 53(1)(c) |
| Red-teaming | Frequency (pre-release + major update) + scope | Art. 55(1)(a) |
| Incident reporting | 2-day notification + 15-day detailed report format | Art. 55(1)(c) |
| Open-source GPAI | Reduced documentation, same copyright rules | Art. 53(2) |

Following the Code creates a safe harbor presumption. Providers not following the Code must demonstrate equivalent compliance — a higher evidentiary bar.

Open-Source Model Implications

Open-source GPAI providers (Meta's Llama, Mistral AI) face a structural tension: open weights earn relief from the documentation duties, but the copyright policy, the training data summary, and any systemic risk obligations still attach. The decision logic:

def check_gpai_obligations(
    model_name: str,
    training_flops: float,  # in FLOPs
    is_open_source: bool,
    commission_designated: bool
) -> dict:
    """
    Determine GPAI obligations under EU AI Act Art. 51-55.
    """
    
    SYSTEMIC_RISK_THRESHOLD = 1e25  # 10^25 FLOPs
    
    is_systemic_risk = (
        training_flops >= SYSTEMIC_RISK_THRESHOLD 
        or commission_designated
    )
    
    obligations = {
        "art_53_documentation": not is_open_source or is_systemic_risk,   # Annex XI; open-source exempt unless systemic risk (Art. 53(2))
        "art_53_downstream_info": not is_open_source or is_systemic_risk, # Annex XII; same exemption
        "art_54_eu_representative": True,      # Always for non-EU providers
        "art_53_copyright_policy": True,       # Always required (incl. open-source)
        "art_53_training_data_summary": True,  # Always required (incl. open-source)
        "art_55_adversarial_testing": is_systemic_risk,
        "art_55_incident_reporting": is_systemic_risk,
        "art_55_cybersecurity": is_systemic_risk,
    }
    
    return {
        "model": model_name,
        "systemic_risk": is_systemic_risk,
        "open_source_reduced_regime": is_open_source and not is_systemic_risk,
        "obligations": obligations,
        "fine_exposure": "Up to €15M or 3% of worldwide annual turnover (Art. 101)",
        "primary_enforcer": "EU AI Office (centralized)"
    }

# Examples
print(check_gpai_obligations("GPT-4", 2e24, False, True))
# → systemic_risk: True (Commission-designated)
# → all Art. 55 obligations apply

print(check_gpai_obligations("Mistral-7B", 1e23, True, False))
# → systemic_risk: False
# → open_source: reduced documentation (Art. 53(2) exemption)
# → copyright transparency: STILL required

print(check_gpai_obligations("Llama-3-405B", 5e24, True, False))
# → systemic_risk: uncertain (approaching threshold)
# → Commission designation possible

Downstream Developer Obligations

If you integrate a GPAI model (via API or self-hosted weights) into an application, you are typically the provider of the resulting AI system (and a "deployer" when you operate it), and your obligations depend on the downstream system's classification:

High-Risk AI System Using GPAI (Annex III)

If your application falls under Annex III (credit scoring, recruitment, medical diagnosis, critical infrastructure), you must:

  1. Conduct a risk assessment that accounts for the GPAI model's capabilities and limitations
  2. Use the information provided by the GPAI provider (Art. 53(1)(b)) in your risk assessment
  3. Register your system in the EU database (Art. 49)
  4. Maintain technical documentation (Annex IV)
  5. Implement human oversight (Art. 14) — this is where "human in the loop" becomes a legal requirement, not just a design preference

The GPAI provider's compliance does not exempt your downstream system from Annex III requirements.
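The split above can be expressed as a simple duty lookup; the duty strings are shorthand labels for this guide, not statutory text:

```python
def deployer_duties(annex_iii: bool) -> list[str]:
    """High-level duty list for an organization integrating a GPAI model (sketch)."""
    duties = ["Review provider information (Art. 53(1)(b) / Annex XII)"]
    if annex_iii:
        duties += [
            "Risk assessment covering the GPAI model's capabilities and limits",
            "Register the system in the EU database (Art. 49)",
            "Maintain Annex IV technical documentation",
            "Implement human oversight (Art. 14)",
        ]
    else:
        duties += ["Transparency duties under Art. 50 where applicable"]
    return duties

print(len(deployer_duties(annex_iii=True)))   # 5
print(deployer_duties(annex_iii=False)[-1])   # Transparency duties under Art. 50 where applicable
```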

Low-Risk/Non-Regulated AI Using GPAI

For most applications (chatbots, coding assistants, summarization tools not in Annex III):

  - Transparency duties under Art. 50 apply: disclose that users are interacting with an AI system and label synthetic content
  - No registration, conformity assessment, or Annex IV documentation is required
  - GDPR still governs any personal data in prompts and outputs

The CLOUD Act Problem for GPAI Deployments

When you deploy a GPAI model via a US-based provider (OpenAI API, Google Gemini API, Anthropic API), you are routing EU user interactions through US infrastructure. This creates:

  1. CLOUD Act reach: US law enforcement can compel production of inference logs, conversation data, and user metadata held by US providers without EU judicial cooperation
  2. GDPR transfer conflict: International transfers of EU personal data to US providers require an adequacy decision (Art. 45 GDPR — currently the EU-US Data Privacy Framework) or appropriate safeguards such as SCCs (Art. 46 GDPR), and inference data often contains personal data
  3. Art. 53(1)(c) copyright conflict: When the EU AI Office investigates copyright compliance, it may need access to training data documentation held in US systems — subject to CLOUD Act parallel access

EU-native GPAI deployment (hosting open-weight models on EU infrastructure) eliminates the CLOUD Act vector:

class GPAIDeploymentJurisdiction:
    """
    Jurisdiction analysis for GPAI deployment choices.
    Relevant for developers choosing between API and self-hosted.
    """
    
    DEPLOYMENT_OPTIONS = {
        "us_api": {
            "provider_example": "OpenAI API / Anthropic API",
            "inference_jurisdiction": "USA (US servers)",
            "cloud_act_reach": True,
            "gdpr_transfer_mechanism": "Standard Contractual Clauses (SCCs)",
            "eu_ai_act_compliance": "Provider responsible for Art. 52-55",
            "deployer_risk": "CLOUD Act exposure for inference data",
            "cost": "Pay-per-token"
        },
        "eu_api": {
            "provider_example": "Mistral API (Paris), Aleph Alpha",
            "inference_jurisdiction": "EU (FR/DE servers)",
            "cloud_act_reach": False,
            "gdpr_transfer_mechanism": "N/A — EU-resident processing",
            "eu_ai_act_compliance": "Provider responsible for Art. 52-55",
            "deployer_risk": "Low — single EU jurisdiction",
            "cost": "Pay-per-token"
        },
        "self_hosted_eu": {
            "provider_example": "Llama 3.1 70B on sota.io EU PaaS",
            "inference_jurisdiction": "EU (deployer's choice of EU region)",
            "cloud_act_reach": False,
            "gdpr_transfer_mechanism": "N/A — EU-resident processing",
            "eu_ai_act_compliance": "Deployer responsible for system-level compliance",
            "deployer_risk": "Full control — no third-party CLOUD Act pathway",
            "cost": "Infrastructure cost, no per-token pricing"
        }
    }
    
    def get_recommendation(self, requirements: dict) -> str:
        if requirements.get("cloud_act_isolation") and requirements.get("cost_predictable"):
            return "self_hosted_eu"
        elif requirements.get("cloud_act_isolation"):
            return "eu_api"
        else:
            return "us_api"

Compliance Timeline

The GPAI provisions have a specific entry-into-force schedule distinct from the rest of the AI Act:

| Date | Milestone |
| --- | --- |
| 2 August 2025 | GPAI model obligations (Arts. 51–56) fully applicable |
| 2 August 2025 | AI Office operational, GPAI register open |
| 2 February 2026 | Codes of Practice finalized (if process complete) |
| 2 August 2026 | High-Risk AI (Annex III) obligations fully applicable |
| 2 August 2027 | GPAI models already on the market before 2 August 2025 must comply |

The GPAI obligations have applied since 2 August 2025. If your organization provides a GPAI model or integrates one into an Annex III application, compliance is not a future task.
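The staggered dates can be encoded as a quick applicability check; the milestone names are shorthand for the table rows above:

```python
from datetime import date

MILESTONES = {
    "gpai_obligations": date(2025, 8, 2),      # Arts. 51-56 applicable
    "high_risk_annex_iii": date(2026, 8, 2),   # Annex III obligations
    "legacy_gpai_deadline": date(2027, 8, 2),  # models placed on market before Aug 2025
}

def applicable(milestone: str, today: date) -> bool:
    """True once the given AI Act milestone date has been reached."""
    return today >= MILESTONES[milestone]

print(applicable("gpai_obligations", date(2026, 4, 10)))     # True
print(applicable("high_risk_annex_iii", date(2026, 4, 10)))  # False
```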

What GPAI Providers Should Have Done by Now

  1. Technical documentation (Annex XI) — drafted, reviewed by legal, submitted to AI Office registration system
  2. Model card (Annex XII) — published publicly
  3. Copyright policy — implemented TDMRep + robots.txt scraping compliance + opt-out portal
  4. Training data summary (Art. 53(1)(d)) — published, up to date
  5. EU authorized representative — designated if provider is non-EU
  6. Incident reporting system — operational (for systemic risk models)
  7. Adversarial testing — completed pre-release evaluation (for systemic risk models)

What EU-Native GPAI Deployment Means for Developers

Building AI-powered applications on EU infrastructure using open-weight GPAI models (Llama 3, Mistral, Falcon) on an EU PaaS like sota.io achieves:

  1. Single regulatory jurisdiction: EU AI Act + GDPR only. No CLOUD Act parallel compliance pathway.
  2. CLOUD Act isolation: Inference data, user conversations, and embeddings stay in EU territory.
  3. Data residency for Art. 46 GDPR: No cross-border transfer SCCs needed for inference.
  4. Systemic risk tier avoidance: Open-weight models below the 10^25 FLOP threshold keep the most demanding Art. 55 obligations (which fall on providers of threshold-crossing models) out of your stack entirely.
  5. Audit trail within EU jurisdiction: All logs, model outputs, and monitoring data remain under EU judicial reach only.

For regulated industries (financial services under DORA, healthcare under MDR, critical infrastructure under NIS2), EU-native GPAI deployment is not just a compliance preference — it is increasingly a procurement requirement as regulators publish GPAI guidance specific to their sectors.
