EU AI Act Art.29 Obligations for Providers of General-Purpose AI Models: Developer Guide (2026)
EU AI Act Article 29 — together with Articles 51 through 55 — establishes the compliance framework for providers of general-purpose AI (GPAI) models. GPAI models include foundation models, large language models (LLMs), and multimodal models that are made available to downstream providers who integrate them into AI systems placed on the EU market. Unlike the high-risk AI system obligations in Chapter III (Art.9–Art.28), GPAI obligations in Chapter V apply to the upstream model layer, not to finished AI applications. Any developer, company, or open-source project that trains and distributes a GPAI model used by others in the EU must understand Art.29's requirements.
The enforcement body for GPAI obligations is the EU AI Office, established under Art.64, not national market surveillance authorities. This creates a single EU-level enforcement track for GPAI compliance — significantly different from the decentralized national enforcement that governs high-risk AI systems. Fines for GPAI obligation violations under Art.99 reach €15 million or 3 % of global annual turnover (whichever is higher) for most violations, and €7.5 million or 1.5 % for providing incorrect information. GPAI providers cannot treat these as abstract regulatory concerns: the EU AI Office has investigative powers including audits, document requests, and capability assessments under Art.68.
This guide covers:
- Art.29(1) technical documentation (Annex XI)
- Art.29(2) downstream provider access obligations
- Art.29(3) systemic risk assessment and the 10²⁵ FLOP threshold
- the Art.29 × Art.51/52/53/55 intersection matrix
- the CLOUD Act jurisdiction risk for GPAI training data and model weights stored on US infrastructure
- Python implementation for GPAIProviderRecord, DownstreamProviderAccessRecord, and GPAITransparencyChecker
- the 40-item Art.29 compliance checklist
Art.29 in the GPAI Regulatory Architecture
Art.29 is the structural hub of Chapter V. It does not stand alone — it cross-references and is cross-referenced by Articles 51 through 55, each of which adds an obligation layer depending on the model's characteristics:
| Article | Title | What It Does | Applies To |
|---|---|---|---|
| Art.51 | Classification of GPAI models with systemic risk | Sets 10²⁵ FLOP threshold and capability-based classification | All GPAI providers |
| Art.52 | Transparency obligations for certain AI systems | AI-generated content disclosure, deepfake labelling, chatbot disclosure | All GPAI providers + deployers |
| Art.53 | Obligations for providers of GPAI models | Technical documentation, copyright summary, downstream policy, open-weight rules | All GPAI providers |
| Art.54 | Authorised representatives and cooperation with AI Office | Registration, representative designation for non-EU providers | Non-EU GPAI providers |
| Art.55 | Obligations for providers of GPAI models with systemic risk | Adversarial testing, incident reporting, cybersecurity, energy efficiency | Systemic-risk GPAI providers only |
The recitals and EU AI Office guidance documents use "Art.29 obligations" as shorthand for the full Art.53 obligation set. When reading official guidance, "Art.29 obligations" typically means the complete Art.53 list applied to GPAI providers, with Art.55 adding the systemic-risk tier on top.
The critical developer insight: every GPAI provider faces Art.52 and Art.53 obligations. Only models classified under Art.51 face the additional Art.55 obligations — and that classification can arrive two ways: automatically, by crossing the FLOP threshold, or at any time, by EU AI Office capability designation. There is no pre-notification grace period for Art.55 obligations once the threshold is crossed.
Art.29(1): Technical Documentation for GPAI Providers
Art.29(1) requires GPAI providers to draw up and maintain technical documentation before the GPAI model is placed on the market and throughout its lifecycle. The content requirements are specified in Annex XI of the EU AI Act.
Annex XI: Required Documentation Elements
Annex XI specifies nine categories of technical documentation that GPAI providers must maintain:
| # | Annex XI Element | What Developers Must Document |
|---|---|---|
| 1 | General description of the GPAI model | Architecture overview, modality (text, image, multimodal), intended use categories, languages supported |
| 2 | Description of model elements, training, and development process | Pre-training architecture, fine-tuning approach, RLHF/RLAIF methods, hardware used, training duration |
| 3 | Information on training data | Data sources, data categories, data collection methodologies, geographic coverage, time periods |
| 4 | Computational resources used | FLOP count (critical for Art.51 threshold assessment), GPU/TPU hours, energy consumption per training run |
| 5 | Benchmarks and evaluation results | Standardised evaluation scores (MMLU, HellaSwag, BIG-Bench, safety benchmarks), evaluation methodology |
| 6 | Known or foreseeable risks | Risk taxonomy from training data, misuse potential, known failure modes documented from red-teaming |
| 7 | Cybersecurity measures | Model hardening, adversarial robustness testing results, access controls for model weights/API |
| 8 | Capabilities documentation | Downstream use case categories the model is suitable for, limitations, contraindicated uses |
| 9 | Copyright summary | Training data copyright categories, DSM Directive Art.4 opt-out compliance, rights reservation records |
Annex XI documentation is not a one-time exercise. It must be updated when the model is modified through fine-tuning, updated with new training data, or re-evaluated on capability benchmarks. The EU AI Office can request Annex XI documentation at any time under its investigative powers.
What "Updated Throughout Lifecycle" Means in Practice
For developers maintaining living GPAI models (continuous learning, periodic retraining, LoRA fine-tuning), Art.29(1) implies a documentation versioning requirement. Each material change to the model should generate a new documentation version with:
- Updated training data description if new data was incorporated
- Updated FLOP count if retraining occurred (for Art.51 threshold tracking)
- Updated benchmark results if capability evaluations were re-run
- Updated risk assessment if new failure modes were identified
Teams using model registries (MLflow, W&B, Hugging Face Hub) should ensure that the registry entries capture all Annex XI fields, not just internal engineering metadata.
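The versioning requirement above can be sketched as a minimal data structure. Everything here is illustrative — `AnnexXIDocVersion`, its field names, and the `major.minor` versioning scheme are assumptions, not anything mandated by Annex XI:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class AnnexXIDocVersion:
    """One version of a model's Annex XI documentation (illustrative fields)."""
    version: str                      # e.g. "1.0"
    as_of: date
    training_data_description: str    # Annex XI element 3
    cumulative_flops: float           # Annex XI element 4, Art.51 tracking
    benchmark_results: dict[str, float] = field(default_factory=dict)
    known_risks: list[str] = field(default_factory=list)

def bump_version(prev: AnnexXIDocVersion, **changes) -> AnnexXIDocVersion:
    """New documentation version after a material change; unchanged fields
    are carried forward, the minor version is incremented."""
    major, minor = prev.version.split(".")
    data = {**prev.__dict__, **changes, "version": f"{major}.{int(minor) + 1}"}
    return AnnexXIDocVersion(**data)
```

A registry hook can call `bump_version` whenever retraining, fine-tuning, or re-evaluation completes, so each Annex XI snapshot stays tied to a concrete model state.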
Art.29(2): Downstream Provider Access Obligations
Art.29(2) requires GPAI providers to make available to downstream providers — companies and developers who build AI systems using the GPAI model — the information and documentation they need to comply with their own obligations under the EU AI Act.
What Downstream Providers Need
Downstream providers building high-risk AI systems under Chapter III need specific inputs from the GPAI provider to complete their own compliance documentation:
| What Downstream Provider Needs | Source in Art.29(2) | Why They Need It |
|---|---|---|
| Technical documentation summary | Art.29(2)(a) | To document their AI system's technical foundation in their own Annex IV documentation |
| Copyright training data summary | Art.29(2)(b) | To assess whether their use of the GPAI model output triggers copyright liability |
| Known risks and limitations | Art.29(2)(c) | To complete their Art.9 risk management system for the high-risk AI system built on top |
| Capability categories | Art.29(2)(d) | To determine whether the GPAI model's use case fits within the intended use scope |
| Compliance summary document | Art.29(2)(e) | A standardised document that summarises the GPAI provider's EU AI Act compliance status |
The "compliance summary document" in Art.29(2)(e) is the key operational output. This document must be sufficient for a downstream provider to demonstrate that their upstream GPAI model layer is compliant, without the downstream provider having to independently assess the entire GPAI model. Practically, this functions like a software bill of materials (SBOM) but for AI compliance.
API Access and Model Weight Access
For GPAI models distributed via API access, Art.29(2) access obligations are fulfilled through API documentation, model cards, and the compliance summary document published alongside the API. Downstream providers accessing via API cannot inspect model internals directly — the API documentation and compliance summary are their primary compliance inputs.
For open-weight GPAI models (Llama-class, Mistral-class, Falcon-class), Art.29(2) access obligations are fulfilled through model card documentation that covers Annex XI elements, the copyright training data summary, and the downstream compliance summary. The EU AI Act specifically recognises the open-weight distribution model and provides adapted compliance pathways — but does not exempt open-weight providers from documentation obligations entirely.
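A minimal sketch of the `DownstreamProviderAccessRecord` structure referenced in the introduction, tracking which of the five Art.29(2) items have been made available to a given downstream provider. Field and method names are illustrative:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class DownstreamProviderAccessRecord:
    """Tracks Art.29(2)(a)-(e) items provided to one downstream provider."""
    model_id: str
    downstream_provider: str
    access_mode: str                       # "api" or "open_weight"
    provided_on: date
    tech_doc_summary: bool = False         # Art.29(2)(a)
    copyright_summary: bool = False        # Art.29(2)(b)
    known_risks: bool = False              # Art.29(2)(c)
    capability_categories: bool = False    # Art.29(2)(d)
    compliance_summary: bool = False       # Art.29(2)(e)

    def missing_items(self) -> list[str]:
        """Art.29(2) items not yet made available to this provider."""
        items = {
            "art29_2a_tech_doc_summary": self.tech_doc_summary,
            "art29_2b_copyright_summary": self.copyright_summary,
            "art29_2c_known_risks": self.known_risks,
            "art29_2d_capability_categories": self.capability_categories,
            "art29_2e_compliance_summary": self.compliance_summary,
        }
        return [name for name, done in items.items() if not done]
```

For API distribution one record per commercial downstream relationship is natural; for open-weight distribution a single record against the published model card may suffice.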
Integration with Art.25: Value Chain Responsibilities
Art.25 establishes the rule that any entity that substantially modifies a GPAI model, or places it on the market under its own name for a new intended purpose, becomes a new provider subject to the full provider obligation set. This creates a direct interaction with Art.29(2):
- A company that fine-tunes a GPAI model on proprietary data and offers it via API is a new GPAI provider under Art.25 and must maintain its own Annex XI documentation, not just reference the upstream provider's documentation.
- A company that wraps a GPAI API in a high-risk AI system (e.g., a medical decision support tool) is a downstream provider under Art.29(2) who can rely on the upstream GPAI provider's compliance summary for the model layer, but must independently fulfil Art.9–Art.23 for their own high-risk AI system.
The key question that determines which regime applies: did the downstream entity materially change the model's capabilities or intended use? If yes, Art.25 transformation applies and the downstream entity becomes a provider. If no, Art.29(2) downstream access documentation is sufficient.
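That decision rule can be captured in a few lines. The two boolean inputs are simplifications of a legal test that in practice needs counsel review; the function name and return strings are illustrative:

```python
def applicable_regime(materially_modified: bool, new_intended_purpose: bool) -> str:
    """Rough sketch of the Art.25 vs Art.29(2) question.

    materially_modified  -> did the entity materially change the model's
                            capabilities (e.g. fine-tuning on proprietary data)?
    new_intended_purpose -> is it placed on the market under the entity's own
                            name for a new intended purpose?
    """
    if materially_modified or new_intended_purpose:
        # Art.25 transformation: full provider obligations, own Annex XI docs
        return "art25_new_provider"
    # Downstream provider: may rely on upstream compliance summary
    return "art29_2_downstream"
```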
Art.29(3): Systemic Risk Assessment
Art.29(3) requires GPAI providers to assess whether their model meets the criteria for classification as a GPAI model with systemic risk under Art.51.
Art.51(1)(a): The 10²⁵ FLOP Threshold
Art.51(1)(a) establishes an automatic classification trigger: a GPAI model trained using a cumulative computational effort exceeding 10²⁵ floating-point operations (FLOPs) is presumed to have systemic risk. This threshold is:
- Technology-neutral: applies to any training paradigm (transformer pre-training, MoE, diffusion model training)
- Cumulative: counts the total FLOPs across all training stages (pre-training + fine-tuning in the same training run)
- Self-reported initially: GPAI providers must assess their own FLOP counts and report to the EU AI Office if they cross the threshold
- Subject to downward revision: the Commission can lower the threshold by delegated act as computational efficiency improves
As of 2026, models in the GPT-4 / Gemini Ultra / Claude 3 Opus class are generally understood to have crossed or approached the 10²⁵ FLOP threshold. Models in the 7B–70B parameter range trained on ≤ 10 trillion tokens are generally below the threshold, though exact FLOP counts depend on training duration and hardware efficiency.
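For a rough self-assessment against the threshold, the common rule-of-thumb estimate for dense transformers is FLOPs ≈ 6 × parameters × training tokens. This is a community heuristic, not an official Art.51 counting methodology, and it undercounts multi-stage or MoE training:

```python
ART51_THRESHOLD = 1e25  # Art.51(1)(a) cumulative FLOP threshold

def estimate_training_flops(params: float, tokens: float) -> float:
    """Rough dense-transformer training FLOP estimate: FLOPs ~= 6 * N * D.

    Heuristic only — cumulative Art.51 accounting must also include
    fine-tuning stages in the same training run.
    """
    return 6.0 * params * tokens

# 70B parameters on 10T tokens lands below the threshold, consistent
# with the text above (6 * 70e9 * 10e12 = 4.2e24 < 1e25).
flops_70b = estimate_training_flops(70e9, 10e12)
```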
Art.51(2): Capability-Based Classification
Art.51(2) allows the EU AI Office to classify a GPAI model as having systemic risk even below the 10²⁵ FLOP threshold based on a capability assessment. The criteria for capability-based classification include:
| Criterion | Description | Systemic Risk Indicators |
|---|---|---|
| Economic sector breadth | Number of economic sectors the model is deployed across | Models deployed in 5+ major sectors simultaneously |
| Critical infrastructure impact | Whether model is used in energy, finance, healthcare, transport, water | High-risk sector penetration without sector-specific safeguards |
| Fundamental rights impact | Whether model outputs affect fundamental rights at scale | Models used in hiring, credit, law enforcement, public benefit decisions |
| Societal reach | Number of EU users, volume of API calls, downstream AI systems built on model | > 10M EU users or > 100 downstream AI systems |
| Dual-use potential | Whether model capabilities extend to CBRN (chemical, biological, radiological, nuclear) risk generation | Models with uplift potential for weapons of mass destruction |
The capability-based classification process under Art.51(2) gives the EU AI Office discretion to designate models as systemic risk regardless of their FLOP count. This creates regulatory uncertainty for mid-size GPAI models (10²³–10²⁵ FLOPs) that have broad downstream deployment. GPAI providers in this range should monitor EU AI Office guidance and code of practice developments closely.
Notification Obligations
Once a GPAI provider determines that their model meets the Art.51 threshold — either by FLOP count or EU AI Office designation — they must:
- Notify the EU AI Office under Art.54(1)(a)
- Register in the EU database under Art.71 (GPAI model section)
- Implement Art.55 obligations immediately (adversarial testing, incident reporting, cybersecurity, energy efficiency)
- Update Annex XI documentation to reflect systemic risk classification
There is no grace period between crossing the threshold and implementing Art.55 obligations. GPAI providers training new models that approach the 10²⁵ FLOP threshold should implement Art.55 infrastructure before training concludes, not after.
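One way to act on "implement before training concludes" is to track cumulative FLOPs inside the training loop and raise a flag well before the threshold. A minimal sketch, with an assumed 80 % warning fraction:

```python
class FlopBudgetTracker:
    """Cumulative FLOP accounting during training (illustrative sketch).

    Warns before the Art.51 threshold so Art.55 infrastructure can be
    stood up ahead of classification; the 0.8 warning fraction is an
    assumption, not a regulatory number.
    """

    THRESHOLD = 1e25  # Art.51(1)(a)

    def __init__(self, warn_fraction: float = 0.8):
        self.cumulative = 0.0
        self.warn_fraction = warn_fraction

    def log_step(self, step_flops: float) -> str:
        self.cumulative += step_flops
        if self.cumulative >= self.THRESHOLD:
            return "systemic_risk"   # Art.55 obligations now apply
        if self.cumulative >= self.warn_fraction * self.THRESHOLD:
            return "approaching"     # stand up Art.55 infrastructure now
        return "ok"
```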
Art.29 × Art.52: Transparency Obligations
Art.52 establishes transparency requirements that apply at the output level of GPAI model deployments. While Art.52 also applies to deployers who use AI systems that generate synthetic content, GPAI providers face specific obligations when they distribute models that generate text, images, audio, or video.
Art.52(1): AI-Generated Content Disclosure
Art.52(1) requires that persons interacting with a chatbot or other AI system that operates through natural conversation know they are interacting with an AI, unless this is obvious. For GPAI providers distributing conversation APIs, this obligation typically falls on the downstream deployer — but GPAI providers must ensure their API terms and documentation clearly communicate this downstream obligation so it can be fulfilled.
Art.52(2): Deep Fake Disclosure
Art.52(2) requires that AI-generated or AI-manipulated image, audio, or video content — particularly where it depicts real persons — be disclosed as artificially generated or manipulated in a machine-readable format. For GPAI models capable of generating deepfake content (image/video/audio generation models), this means:
- The model's output should include machine-readable metadata indicating AI provenance
- The API documentation must clearly state that Art.52(2) disclosure obligations apply to downstream uses of the model's image/video/audio generation capabilities
- GPAI providers should implement C2PA (Content Credentials) or equivalent technical watermarking to enable downstream compliance
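A sketch of the machine-readable provenance idea. This is not the C2PA format — a real implementation would embed signed Content Credentials via a C2PA library — but it shows the minimum fields a downstream deployer would need for Art.52(2) disclosure:

```python
import hashlib

def attach_provenance(content_bytes: bytes, model_id: str) -> dict:
    """Illustrative machine-readable provenance record for generated media.

    All field names are assumptions; a production system would emit signed
    C2PA manifests embedded in the media file itself, not a side-channel dict.
    """
    return {
        "ai_generated": True,
        "generator_model_id": model_id,
        "content_sha256": hashlib.sha256(content_bytes).hexdigest(),
        "disclosure_standard": "c2pa-or-equivalent",  # placeholder label
    }
```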
Art.52(3): Synthetic Text Disclosure
Art.52(3) requires disclosure when AI-generated text is published on topics of public interest — including news, political commentary, and public affairs information. This creates a documentation obligation for GPAI text models: providers must clearly communicate in their downstream compliance documents that this disclosure obligation exists and that deployers using the model for public-interest content publishing must implement it.
Who Bears the Obligation: Provider vs Deployer
| Content Type | Primary Obligation Holder | GPAI Provider's Role |
|---|---|---|
| Chatbot disclosure (Art.52(1)) | Deployer who operates the chatbot | Document downstream obligation in compliance summary |
| Deepfake content (Art.52(2)) | Deployer who publishes the content; GPAI provider for technical enablement | Implement C2PA/watermarking in model output; document in downstream compliance |
| Synthetic text on public interest (Art.52(3)) | Deployer who publishes the text | Document downstream obligation; consider technical labelling |
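The table above can be operationalised as the `GPAITransparencyChecker` mentioned in the introduction: given a model's output modalities, list which Art.52 items the provider must document downstream. Modality strings and item labels are illustrative:

```python
class GPAITransparencyChecker:
    """Maps output modalities to Art.52 transparency items the GPAI
    provider must cover in its downstream compliance documentation."""

    def applicable_items(self, modalities: set[str]) -> list[str]:
        items = []
        if "text" in modalities:
            # Conversational + public-interest text obligations
            items.append("art52_1_chatbot_disclosure")
            items.append("art52_3_synthetic_text_disclosure")
        if modalities & {"image", "audio", "video"}:
            # Deepfake-capable media generation
            items.append("art52_2_deepfake_disclosure")
        return items
```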
Art.29 × Art.53: Core GPAI Provider Obligations
Art.53 is the primary obligation article for all GPAI providers — regardless of systemic risk classification. Art.53(1) lists four sub-obligations:
Art.53(1)(a): Technical Documentation + Copyright Summary
GPAI providers must maintain the Annex XI technical documentation (as described in the Art.29(1) section above) and produce a copyright training data summary. The copyright summary must include:
- Which data categories were used (web crawl, licensed data, public domain, proprietary)
- Whether the training data use falls within the DSM Directive Art.4 research exemption
- Which data sources implemented the TDM opt-out under DSM Art.4(3) and whether those opt-outs were respected
- A summary of any licensed data agreements covering training data
The copyright summary is particularly sensitive. The EU AI Act does not resolve the underlying question of whether training a model on copyrighted data without a license is lawful — that remains a matter for EU and member-state copyright law and ongoing litigation. What Art.53(1)(a) requires is transparency: GPAI providers must document what they did, not certify that it was lawful.
Art.53(1)(b): Policy for Downstream Providers
GPAI providers must publish and maintain a policy for downstream providers that describes how downstream providers can use the model in compliance with their own EU AI Act obligations. This policy must address:
- Which intended use categories are permitted
- Which use categories are restricted (e.g., prohibited applications under Art.5)
- What compliance documentation the downstream provider can rely on from the GPAI provider
- How to contact the GPAI provider if downstream compliance questions arise
- Update mechanisms if the GPAI provider's compliance status changes
Practically, this policy often takes the form of an Acceptable Use Policy (AUP) combined with a compliance summary document. For open-weight models, it may be embedded in the model card or licence terms.
Art.53(1)(c): DSM Directive Compliance and Opt-Out Mechanism
Art.53(1)(c) requires GPAI providers to comply with EU copyright law, including the DSM Directive's text and data mining (TDM) provisions. This specifically includes:
- Identifying and honouring TDM opt-outs implemented by rights holders via machine-readable means (robots.txt, IPTC Photo Metadata, licensing signals)
- Maintaining records of opt-out compliance in the copyright training data summary
- Not using data from sources that have implemented Art.4(3) TDM opt-outs for commercial training purposes
The practical challenge: at web-crawl scale, comprehensive opt-out tracking is technically difficult. The EU AI Office's code of practice for GPAI models (being developed in 2025–2026) is expected to provide technical standards for opt-out compliance at scale. GPAI providers should monitor the code of practice process and implement opt-out tracking infrastructure before the code becomes binding.
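As a record-keeping sketch only: a simple robots.txt-style check for a blanket disallow aimed at the provider's crawler. Real opt-out compliance also spans IPTC metadata, licence terms, and whatever technical standards the code of practice settles on — this function is nowhere near legally sufficient on its own, and the crawler name is a placeholder:

```python
def tdm_optout_for(robots_txt: str, crawler_agent: str = "MyTDMBot") -> bool:
    """Treat 'Disallow: /' for our crawler (or for '*') as a TDM opt-out
    signal to record in the copyright compliance log. Illustrative only."""
    applies = False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if not line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip().lower(), value.strip()
        if key == "user-agent":
            applies = value in ("*", crawler_agent)
        elif key == "disallow" and applies and value == "/":
            return True
    return False
```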
Art.53(1)(d): Training Data Transparency Summary
Art.53(1)(d) requires GPAI providers to make available to the EU AI Office, upon request, a detailed summary of the training data used. This summary goes beyond the Annex XI documentation and must include:
- The number of data sources and their geographic distribution
- The data collection and filtering pipeline description
- Quality assurance steps applied to training data
- Known data quality issues or biases identified during data preparation
Unlike Annex XI documentation (which is provided to downstream providers and published in accessible form), the Art.53(1)(d) training data summary may include commercially sensitive information that GPAI providers wish to protect. Art.78 of the EU AI Act establishes confidentiality protections for information provided to the AI Office — but these protections are narrower than trade secret protections in commercial contexts.
Art.29 × Art.55: Systemic Risk Additional Requirements
For GPAI models with systemic risk (Art.51 classification), Art.55 adds four categories of additional obligations on top of the Art.53 baseline:
Art.55(1)(a): Adversarial Testing / Red-Teaming
Art.55(1)(a) requires providers of systemic-risk GPAI models to perform model evaluations including adversarial testing to identify and document risks to health, safety, and fundamental rights. This must be done:
- Before the model is placed on the market
- Periodically after placement (the EU AI Office code of practice will define frequency)
- After material model updates that could affect systemic risk characteristics
Red-teaming for GPAI systemic risk purposes is broader than standard safety evaluations. It must cover:
| Red-Teaming Domain | Scope | Why Required |
|---|---|---|
| CBRN uplift | Can the model provide meaningful assistance in creating chemical, biological, radiological, or nuclear weapons? | Art.5 prohibition + systemic risk classification |
| Cyberattack facilitation | Can the model generate functional malware, explain critical infrastructure vulnerabilities, or assist in advanced persistent threats? | Cybersecurity risk |
| Large-scale manipulation | Can the model generate disinformation at scale, impersonate political figures convincingly, or enable coordinated inauthentic behaviour? | Democratic discourse and election integrity |
| Critical infrastructure attacks | Can the model provide operational planning for attacks on energy, water, finance, or transport infrastructure? | Critical infrastructure protection |
| Psychological harm | Does the model exhibit harmful tendencies in extended conversation contexts (encouraging self-harm, radicalization)? | Fundamental rights and health |
Adversarial testing results must be documented and retained. The EU AI Office can request these records at any time under its investigative powers. Results that reveal unresolved high-severity risks must be disclosed to the EU AI Office under Art.55(1)(b).
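A minimal record structure for adversarial testing findings, with the disclosure rule from the paragraph above. The domain names and severity scale are assumptions drawn from the table:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class RedTeamFinding:
    """One Art.55(1)(a) adversarial testing finding (illustrative fields)."""
    domain: str       # e.g. "cbrn_uplift", "cyberattack_facilitation"
    severity: str     # "low" | "medium" | "high" (assumed scale)
    resolved: bool
    tested_on: date

def must_disclose_to_ai_office(findings: list[RedTeamFinding]) -> bool:
    """Per the text above: unresolved high-severity risks trigger
    disclosure to the EU AI Office under Art.55(1)(b)."""
    return any(f.severity == "high" and not f.resolved for f in findings)
```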
Art.55(1)(b): Incident Reporting to EU AI Office
Providers of systemic-risk GPAI models must report serious incidents to the EU AI Office. A serious incident for GPAI purposes includes:
- Deaths, serious injuries, or significant damage to property attributable to GPAI model outputs
- Significant disruptions to critical infrastructure facilitated by GPAI model outputs
- GPAI model use in large-scale criminal activity with significant societal impact
- Discovery that the GPAI model produces CBRN or critical infrastructure attack content at significant scale
The reporting timeline and format will be specified in EU AI Office implementing acts. GPAI providers should implement incident monitoring pipelines that can detect and escalate relevant incidents, particularly when models are used via API by downstream providers at scale — incidents may be reported to the GPAI provider by downstream providers and must be escalated to the EU AI Office.
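A triage sketch for the incident pipeline. The category strings mirror the bullet list above; treating category membership as the escalation test is an assumption, since the reporting format awaits EU AI Office implementing acts:

```python
from dataclasses import dataclass

# Categories from the serious-incident list above (illustrative labels)
SERIOUS_CATEGORIES = {
    "death_or_serious_injury",
    "critical_infrastructure_disruption",
    "large_scale_criminal_use",
    "cbrn_content_at_scale",
}

@dataclass
class IncidentReport:
    incident_id: str
    category: str
    reported_by: str   # may be a downstream provider reporting upstream

def escalate_to_ai_office(report: IncidentReport) -> bool:
    """Serious incidents are escalated regardless of who reported them."""
    return report.category in SERIOUS_CATEGORIES
```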
Art.55(1)(c): Cybersecurity Measures
Art.55(1)(c) requires providers of systemic-risk GPAI models to implement adequate cybersecurity protection for the model itself — specifically model weights, training data, and capability evaluation records. Required cybersecurity measures include:
| Cybersecurity Measure | Rationale | Implementation |
|---|---|---|
| Model weight protection | Leaked weights create uncontrolled deployment outside Art.55 obligations | Encrypted storage, access controls, air-gapped training infrastructure |
| Training data security | Training data may contain sensitive information and is a compliance record | Data governance pipeline, access logging, tamper-evident storage |
| API security | Preventing jailbreak extraction of model capabilities and misuse at scale | Rate limiting, abuse detection, output monitoring |
| Evaluation record integrity | Capability evaluation records are regulatory documents that must be tamper-proof | Cryptographic signing of evaluation results, audit trail |
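The "cryptographic signing of evaluation results" row can be sketched with an HMAC over the canonical JSON form of a record. A production system would use asymmetric signatures, proper key management, and an append-only audit log; this only illustrates the tamper-evidence idea:

```python
import hashlib
import hmac
import json

def sign_evaluation_record(record: dict, key: bytes) -> str:
    """HMAC-SHA256 over the canonical (sorted, compact) JSON form,
    so any later edit to the record invalidates the signature."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hmac.new(key, canonical.encode(), hashlib.sha256).hexdigest()

def verify_evaluation_record(record: dict, key: bytes, signature: str) -> bool:
    """Constant-time comparison against a freshly computed signature."""
    return hmac.compare_digest(sign_evaluation_record(record, key), signature)
```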
Art.55(1)(d): Energy Efficiency Reporting
Art.55(1)(d) requires providers of systemic-risk GPAI models to document and report energy consumption for both training and inference. This covers:
- Total training energy consumption (kWh) for each training run
- Per-token inference energy consumption at representative scale
- Data centre location and grid carbon intensity during training
- Projected inference energy consumption at deployment scale
Energy efficiency reporting is both a transparency obligation and, in future implementing acts, potentially a performance standard. GPAI providers should implement energy tracking infrastructure in their training pipelines from the start — retrofitting energy accounting onto completed training runs is significantly more difficult.
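A back-of-the-envelope estimate for the training side of the report: GPU count × hours × average board power, scaled by data-centre PUE. The default PUE of 1.2 is an assumption to be replaced with measured values:

```python
def training_energy_kwh(gpu_count: int, hours: float, avg_power_w: float,
                        pue: float = 1.2) -> float:
    """Rough training-energy estimate (kWh) for an Art.55(1)(d) report.

    avg_power_w should be measured average board power, not TDP; pue is
    the data centre's power usage effectiveness. Both defaults here are
    illustrative assumptions.
    """
    return gpu_count * hours * avg_power_w * pue / 1000.0
```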
Systemic Risk Obligations Matrix
| Obligation | Art.55 Reference | Trigger | Timing |
|---|---|---|---|
| Adversarial testing | Art.55(1)(a) | Art.51 classification | Before market + periodic |
| Incident reporting | Art.55(1)(b) | Art.51 classification | Within 2 weeks of discovery (proposed) |
| Cybersecurity measures | Art.55(1)(c) | Art.51 classification | Before market + ongoing |
| Energy efficiency reporting | Art.55(1)(d) | Art.51 classification | Before market + per update |
| EU AI Office cooperation | Art.55(2) | Art.51 classification | Upon request |
| Code of practice participation | Art.56 | Art.51 classification | By mandate if no voluntary code |
CLOUD Act × Art.29: Jurisdiction Risk
The CLOUD Act (Clarifying Lawful Overseas Use of Data Act, 2018) allows US law enforcement to compel US-based cloud providers and their affiliates to produce data stored anywhere in the world, regardless of where the data physically resides. For GPAI providers using US infrastructure, this creates specific jurisdiction risks at the Art.29 compliance layer.
Training Data on US Infrastructure
GPAI training data stored in US-based cloud environments (AWS S3, Google Cloud Storage, Azure Blob) is subject to CLOUD Act disclosure requests. This creates two risks:
- Copyright summary exposure: If the EU AI Office requests the Art.53(1)(d) training data summary under its investigative powers, and that summary contains information about training data sources that the GPAI provider considers commercially sensitive, the data may already be accessible to US law enforcement under a CLOUD Act order — without the EU AI Act's Art.78 confidentiality protections applying.
- Litigation exposure: In copyright infringement litigation (EU or US), CLOUD Act-accessible training data records could be subpoenaed by opposing parties via US courts, bypassing EU procedural protections.
Model Weights as US Jurisdiction Assets
GPAI model weights stored on US infrastructure or trained on US hardware are subject to US export control regulations (EAR) and CLOUD Act jurisdiction. For systemic-risk GPAI models, this creates:
- Export control compliance risk: Advanced AI model weights may be subject to BIS export controls affecting distribution to certain jurisdictions
- Regulatory document integrity risk: If Art.55 evaluation records (adversarial testing results, incident reports) are stored on US infrastructure, they are simultaneously EU regulatory records (subject to EU AI Act) and US-jurisdiction assets (subject to CLOUD Act)
Capability Evaluation Records
Art.55(1)(a) adversarial testing results are particularly sensitive. These records document both the capabilities and the known failure modes of systemic-risk GPAI models. If stored on US infrastructure:
- They are accessible to US intelligence agencies under CLOUD Act / FISA orders
- They could be used to map the specific vulnerabilities and bypass methods of safety-critical AI systems
- Their integrity as regulatory compliance documents depends on tamper-evident storage that US jurisdiction access could compromise
EU-Native Infrastructure Advantage
GPAI providers building EU-native training and storage infrastructure — using EU-domiciled cloud providers (Scaleway, OVHcloud, Hetzner) for training data storage, model weight storage, and compliance record storage — avoid direct CLOUD Act jurisdiction exposure, since the Act reaches US providers and their affiliates rather than EU-only providers. The practical compliance advantage:
| Compliance Record | US Infrastructure Risk | EU-Native Mitigation |
|---|---|---|
| Annex XI documentation | CLOUD Act accessible | Store in EU-native encrypted vault |
| Copyright training data summary | Litigation subpoena risk | EU-native document management with Art.78 protections |
| Art.55 adversarial testing results | Intelligence access risk | Air-gapped EU storage, zero-knowledge encryption |
| Incident reports to EU AI Office | Disclosure timing complications | EU-native secure channel to AI Office |
| Model weights | Export control + CLOUD Act | EU-native HPC/model storage infrastructure |
Python Implementation
The following Python implementation provides data structures and logic for tracking Art.29 / Art.53 / Art.55 obligations for GPAI providers.
from __future__ import annotations
from dataclasses import dataclass, field
from datetime import date
from typing import Literal

# ---------------------------------------------------------------------------
# GPAIProviderRecord — tracks Art.29/53 obligations
# ---------------------------------------------------------------------------

@dataclass
class GPAIProviderRecord:
    """Tracks Art.29 + Art.53 obligations for a GPAI model provider."""

    model_id: str
    model_name: str
    parameter_count: int   # e.g. 70_000_000_000 for 70B
    training_flops: float  # cumulative FLOPs; 1e25 = Art.51 threshold
    open_weight: bool      # True = open-weight distribution (Llama-class)
    training_data_categories: list[str] = field(default_factory=list)

    # Art.53(1)(a) — Annex XI documentation completeness
    annex_xi_general_description: bool = False
    annex_xi_training_process: bool = False
    annex_xi_training_data: bool = False
    annex_xi_computational_resources: bool = False
    annex_xi_benchmark_results: bool = False
    annex_xi_known_risks: bool = False
    annex_xi_cybersecurity_measures: bool = False
    annex_xi_capabilities: bool = False
    annex_xi_copyright_summary: bool = False

    # Art.53(1)(b) — downstream provider policy
    downstream_policy_published: bool = False
    downstream_policy_url: str | None = None

    # Art.53(1)(c) — DSM opt-out compliance
    tdm_optout_tracking_implemented: bool = False
    copyright_compliance_record_available: bool = False

    # Art.53(1)(d) — training data transparency summary
    training_data_summary_available: bool = False

    # Art.51 — systemic risk classification
    eu_ai_office_notified: bool = False
    eu_database_registered: bool = False

    # Art.55 — systemic risk additional obligations (populated if systemic risk)
    adversarial_testing_completed: bool = False
    adversarial_testing_date: date | None = None
    incident_reporting_pipeline_active: bool = False
    cybersecurity_measures_implemented: bool = False
    energy_consumption_tracked: bool = False

    def is_systemic_risk(self) -> bool:
        """Art.51(1)(a): automatic classification above 10^25 FLOPs."""
        return self.training_flops >= 1e25

    def annex_xi_completeness_report(self) -> dict:
        """Returns a completeness report for all 9 Annex XI elements."""
        elements = {
            "general_description": self.annex_xi_general_description,
            "training_process": self.annex_xi_training_process,
            "training_data": self.annex_xi_training_data,
            "computational_resources": self.annex_xi_computational_resources,
            "benchmark_results": self.annex_xi_benchmark_results,
            "known_risks": self.annex_xi_known_risks,
            "cybersecurity_measures": self.annex_xi_cybersecurity_measures,
            "capabilities": self.annex_xi_capabilities,
            "copyright_summary": self.annex_xi_copyright_summary,
        }
        completed = sum(elements.values())
        return {
            "model_id": self.model_id,
            "elements": elements,
            "completed": completed,
            "total": len(elements),
            "completion_pct": round(completed / len(elements) * 100, 1),
            "annex_xi_complete": completed == len(elements),
        }

    def art53_compliance_report(self) -> dict:
        """Returns Art.53(1)(a)-(d) compliance status."""
        return {
            "model_id": self.model_id,
            "art53_1a_technical_doc": self.annex_xi_completeness_report()["annex_xi_complete"],
            "art53_1b_downstream_policy": self.downstream_policy_published,
            "art53_1c_copyright_optout": self.tdm_optout_tracking_implemented,
            "art53_1d_training_data_summary": self.training_data_summary_available,
            "all_art53_complete": all([
                self.annex_xi_completeness_report()["annex_xi_complete"],
                self.downstream_policy_published,
                self.tdm_optout_tracking_implemented,
                self.training_data_summary_available,
            ]),
        }

    def art55_compliance_report(self) -> dict | None:
        """Returns Art.55 compliance status. None if not systemic risk."""
        if not self.is_systemic_risk():
            return None
        return {
            "model_id": self.model_id,
            "is_systemic_risk": True,
            "eu_ai_office_notified": self.eu_ai_office_notified,
            "eu_database_registered": self.eu_database_registered,
            "art55_1a_adversarial_testing": self.adversarial_testing_completed,
            "adversarial_testing_date": (
                self.adversarial_testing_date.isoformat()
                if self.adversarial_testing_date else None
            ),
            "art55_1b_incident_reporting": self.incident_reporting_pipeline_active,
            "art55_1c_cybersecurity": self.cybersecurity_measures_implemented,
            "art55_1d_energy_tracking": self.energy_consumption_tracked,
            "all_art55_complete": all([
                self.eu_ai_office_notified,
                self.eu_database_registered,
                self.adversarial_testing_completed,
                self.incident_reporting_pipeline_active,
                self.cybersecurity_measures_implemented,
                self.energy_consumption_tracked,
            ]),
        }

    def full_compliance_report(self) -> dict:
        return {
            "model_id": self.model_id,
            "model_name": self.model_name,
            "parameter_count": self.parameter_count,
            "training_flops": self.training_flops,
            "is_systemic_risk": self.is_systemic_risk(),
            "annex_xi": self.annex_xi_completeness_report(),
"art53": self.art53_compliance_report(),
"art55": self.art55_compliance_report(),
}
# ---------------------------------------------------------------------------
# DownstreamProviderAccessRecord — Art.29(2) downstream access
# ---------------------------------------------------------------------------
@dataclass
class DownstreamProviderAccessRecord:
    """Tracks Art.29(2) downstream provider access obligations."""

    upstream_gpai_model_id: str
    downstream_provider_id: str
    access_type: Literal["api", "model_weights", "both"]
    # What the downstream provider has received
    compliance_summary_received: bool = False
    annex_xi_summary_received: bool = False
    copyright_summary_received: bool = False
    known_risks_documentation_received: bool = False
    downstream_policy_accepted: bool = False
    art53_documentation_available: bool = False
    # Access metadata
    access_granted_date: date | None = None
    compliance_summary_version: str | None = None

    def access_rights_summary(self) -> str:
        access_map = {
            "api": "API access only — model weights not available",
            "model_weights": "Model weights access — open-weight distribution",
            "both": "Full access — API + model weights",
        }
        return (
            f"Downstream provider {self.downstream_provider_id} has "
            f"{access_map[self.access_type]} to GPAI model "
            f"{self.upstream_gpai_model_id}."
        )

    def downstream_compliance_readiness(self) -> dict:
        """Checks whether downstream provider has all Art.29(2) inputs."""
        checks = {
            "compliance_summary": self.compliance_summary_received,
            "annex_xi_summary": self.annex_xi_summary_received,
            "copyright_summary": self.copyright_summary_received,
            "known_risks_documentation": self.known_risks_documentation_received,
            "downstream_policy_accepted": self.downstream_policy_accepted,
        }
        complete = sum(checks.values())
        return {
            "upstream_model": self.upstream_gpai_model_id,
            "downstream_provider": self.downstream_provider_id,
            "access_type": self.access_type,
            "checks": checks,
            "ready": complete == len(checks),
            "missing": [k for k, v in checks.items() if not v],
        }
# ---------------------------------------------------------------------------
# GPAITransparencyChecker — Art.52 transparency obligations
# ---------------------------------------------------------------------------
@dataclass
class GPAITransparencyChecker:
    """Checks Art.52 transparency obligations for a GPAI model deployment."""

    model_id: str
    output_types: list[str]  # e.g. ["text", "image", "audio", "video"]
    # Art.52(1) — chatbot disclosure
    chatbot_disclosure_implemented: bool = False
    # Art.52(2) — deepfake / synthetic media disclosure
    c2pa_watermarking_implemented: bool = False
    machine_readable_ai_provenance: bool = False
    # Art.52(3) — synthetic text on public interest topics
    synthetic_text_labelling_implemented: bool = False

    def generates_synthetic_media(self) -> bool:
        return any(t in self.output_types for t in ["image", "audio", "video"])

    def generates_text(self) -> bool:
        return "text" in self.output_types

    def check_ai_generated_disclosure(self, output_type: str) -> bool:
        """Returns True if disclosure obligation is met for given output type."""
        if output_type == "chatbot":
            return self.chatbot_disclosure_implemented
        if output_type in ("image", "audio", "video"):
            return self.c2pa_watermarking_implemented and self.machine_readable_ai_provenance
        if output_type == "text_public_interest":
            return self.synthetic_text_labelling_implemented
        return False

    def disclosure_completeness_report(self) -> dict:
        """Returns Art.52 compliance status for all applicable output types."""
        checks: dict[str, bool] = {}
        if self.generates_text():
            checks["art52_1_chatbot_disclosure"] = self.chatbot_disclosure_implemented
            checks["art52_3_synthetic_text_labelling"] = self.synthetic_text_labelling_implemented
        if self.generates_synthetic_media():
            checks["art52_2_c2pa_watermarking"] = self.c2pa_watermarking_implemented
            checks["art52_2_machine_readable_provenance"] = self.machine_readable_ai_provenance
        completed = sum(checks.values())
        return {
            "model_id": self.model_id,
            "output_types": self.output_types,
            "checks": checks,
            "completed": completed,
            "total": len(checks),
            "art52_complete": completed == len(checks),
            "missing_obligations": [k for k, v in checks.items() if not v],
        }
Art.29 Compliance Checklist
A practical 40-item checklist for GPAI providers implementing Art.29, Art.53, and Art.55 compliance:
Annex XI Technical Documentation (Art.53(1)(a))
- 1. Drafted general description of the GPAI model covering architecture, modality, and intended use categories
- 2. Documented training and development process including pre-training, fine-tuning, and RLHF/RLAIF methodology
- 3. Documented all training data sources including categories, geographic distribution, and time periods
- 4. Calculated and recorded total cumulative training FLOPs for Art.51 threshold assessment
- 5. Recorded hardware used (GPU/TPU type, count, training duration) in Annex XI documentation
- 6. Documented all benchmark evaluation results with methodology and evaluation dates
- 7. Documented known and foreseeable risks from training data and model capabilities
- 8. Documented cybersecurity measures protecting model weights, training data, and API access
- 9. Produced copyright training data summary covering DSM Directive Art.4 TDM analysis
DSM Directive Compliance (Art.53(1)(c))
- 10. Implemented TDM opt-out tracking in web crawl pipeline (robots.txt, IPTC, licensing signals)
- 11. Excluded from the training corpus any sources that asserted an Art.4(3) TDM opt-out for commercial use
- 12. Maintained records of opt-out compliance decisions for EU AI Office audit readiness
- 13. Reviewed licensed data agreements for scope of training data use permissions
- 14. Documented data collection and filtering pipeline for copyright compliance audit trail
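Item 10's opt-out tracking can be sketched with Python's standard-library robots.txt parser. This is a minimal sketch under stated assumptions: the crawler name `ExampleAIBot` and the URLs are illustrative, and robots.txt is only one Art.4(3) signal — a real pipeline must also honour IPTC metadata and licence terms.

```python
from urllib.robotparser import RobotFileParser


def tdm_optout_blocks(robots_txt: str, crawler: str, url: str) -> bool:
    """Return True if robots.txt signals a TDM opt-out for this crawler/URL.

    robots.txt is only one opt-out signal; IPTC metadata and licensing
    terms must be checked in separate pipeline stages.
    """
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return not rp.can_fetch(crawler, url)


# Illustrative: a publisher opting its whole site out for one AI crawler
# while leaving other crawlers unrestricted.
robots = "User-agent: ExampleAIBot\nDisallow: /\n\nUser-agent: *\nDisallow:"
print(tdm_optout_blocks(robots, "ExampleAIBot", "https://example.org/a"))  # True
print(tdm_optout_blocks(robots, "OtherBot", "https://example.org/a"))      # False
```

Recording each per-URL decision (item 12) is then a matter of logging the function's inputs and output alongside the crawl timestamp.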
Downstream Provider Policy (Art.53(1)(b))
- 15. Published downstream provider policy describing permitted and restricted use categories
- 16. Identified Art.5 prohibited use categories and explicitly restricted them in downstream policy
- 17. Produced compliance summary document for downstream providers building on the GPAI model
- 18. Established mechanism for downstream providers to receive Art.29(2) documentation updates
- 19. Implemented process for downstream providers to report compliance questions to the GPAI provider
Training Data Transparency (Art.53(1)(d))
- 20. Prepared detailed training data transparency summary for EU AI Office requests
- 21. Documented data quality assurance steps and known training data quality issues
- 22. Prepared to provide training data summary under Art.78 confidentiality protections
Systemic Risk Assessment (Art.51 / Art.29(3))
- 23. Assessed whether cumulative training FLOPs exceed the 10²⁵ threshold for Art.51(1)(a) automatic classification
- 24. Assessed whether model meets Art.51(2) capability-based criteria for systemic risk classification
- 25. Implemented monitoring to detect if FLOP threshold is crossed during ongoing training runs
- 26. If systemic risk: notified EU AI Office under Art.54(1)(a) before market placement
- 27. If systemic risk: registered in EU database under Art.71 (GPAI model section)
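For items 23 and 25, cumulative training compute for dense transformers is commonly approximated as 6 × parameters × training tokens (the forward-plus-backward "6ND" rule). The sketch below uses that approximation with illustrative figures; it is an estimation aid, not a substitute for the exact FLOP accounting Annex XI item 4 requires.

```python
SYSTEMIC_RISK_FLOP_THRESHOLD = 1e25  # Art.51(1)(a) threshold as used in this guide


def estimate_training_flops(parameters: float, training_tokens: float) -> float:
    """Approximate cumulative training compute with the common 6*N*D rule
    for dense transformers (forward + backward pass)."""
    return 6.0 * parameters * training_tokens


def crosses_systemic_risk_threshold(parameters: float, training_tokens: float) -> bool:
    """Check the estimate against the 10^25 FLOP classification threshold."""
    return estimate_training_flops(parameters, training_tokens) >= SYSTEMIC_RISK_FLOP_THRESHOLD


# Illustrative: a 70B-parameter model trained on 15T tokens.
flops = estimate_training_flops(70e9, 15e12)
print(f"{flops:.2e}")  # 6.30e+24 — below the 1e25 threshold
print(crosses_systemic_risk_threshold(70e9, 15e12))  # False
```

Running this check inside the training loop, against tokens processed so far, gives the ongoing-run monitoring that item 25 calls for.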
Adversarial Testing (Art.55(1)(a)) — Systemic Risk Models Only
- 28. Completed pre-market adversarial testing covering CBRN uplift, cyberattack facilitation, and large-scale manipulation
- 29. Documented red-teaming methodology and results in retainable compliance records
- 30. Established periodic adversarial testing schedule post-deployment
- 31. Implemented process to escalate unresolved high-severity red-team findings to EU AI Office
Incident Reporting (Art.55(1)(b)) — Systemic Risk Models Only
- 32. Implemented incident monitoring pipeline for API-level misuse detection
- 33. Established downstream provider incident reporting channel (so downstream providers can report serious incidents to GPAI provider)
- 34. Prepared incident report template aligned with EU AI Office expected format
Cybersecurity (Art.55(1)(c)) — Systemic Risk Models Only
- 35. Implemented encrypted storage and access controls for model weights
- 36. Implemented API rate limiting and abuse detection for model access
- 37. Implemented cryptographic signing of capability evaluation records for tamper-evidence
Energy Efficiency (Art.55(1)(d)) — Systemic Risk Models Only
- 38. Implemented energy tracking in training pipeline (kWh per training run)
- 39. Measured per-token inference energy consumption at representative scale
- 40. Documented data centre grid carbon intensity for training runs
See Also
- EU AI Act Art.51 GPAI Model Classification: Systemic Risk Threshold and Provider Obligations — Art.51 establishes the GPAI tier classification that determines which upstream provider obligations cascade downstream to Art.29 integrators
- EU AI Act Art.52 GPAI Model General Obligations: Technical Documentation, Training Data & Copyright — Art.52 defines the technical documentation and model card that Art.29 downstream providers are entitled to demand from upstream GPAI providers
- EU AI Act Art.9 Formal Verification for High-Risk AI: Developer Guide — Art.9 risk management system obligations that downstream GPAI users must fulfil for high-risk AI systems
- EU AI Act Art.28 Obligations for Distributors: Developer Guide — Art.28 distributor obligations and the Art.25 value-chain transformation rules
- EU AI Act Art.26 Obligations for Deployers: Developer Guide — Art.26 deployer obligations applicable when downstream providers deploy GPAI-based AI systems