EU AI Act Art.97: Commission Evaluation — What Gets Reviewed and When — Developer Guide (2026)
The EU AI Act is not a static regulation. Its architecture includes multiple built-in feedback mechanisms that allow the framework to evolve as AI technology changes. Most developers know about the Art.85 comprehensive review scheduled for 2027. Fewer understand that Article 97 creates a parallel, more targeted evaluation mechanism with different scope, different timing, and different developer implications.
Article 97 is the Commission's ongoing evaluation engine. Where Art.85 is the legislative reset triggered by a 3-year review, Art.97 is the regulatory calibration instrument — it evaluates specific technical and operational parameters of the Act on a rolling basis, feeds those findings into Commission reports, and provides the legal basis for delegated act amendments without requiring full legislative procedure.
For developers, the Art.97 cycle matters because it is the mechanism most likely to change your specific compliance obligations: what Annex III categories cover, where the GPAI systemic risk threshold sits, and how enforcement levels are calibrated.
What Article 97 Actually Says
Article 97 mandates the Commission to evaluate and report on the following specific matters — separately and on a more technical basis than the comprehensive Art.85 review.
The core evaluation obligation:
The Commission shall, within 4 years of the entry into force of this Regulation, carry out an assessment of and report to the European Parliament and to the Council on the following:
(a) Adequacy of the supervision framework: whether the governance structure under Chapters VI and VII is functioning effectively. This includes the AI Office's oversight of GPAI models, the European AI Board's coordination role, and the adequacy of national Market Surveillance Authority (MSA) resources.
(b) Scope of prohibited practices under Art.5: whether the list of absolutely forbidden AI applications remains appropriate, whether new capabilities have created gaps in the prohibition list, and whether any current prohibition has become technically obsolete or overly broad.
(c) Annex III high-risk classification: the most developer-relevant evaluation item. The Commission assesses whether the eight categories in Annex III remain appropriate, whether new AI application areas should be added, and whether any category should be removed or narrowed.
(d) Effectiveness of conformity assessment: whether the Art.43-49 conformity pathway is generating sufficient assurance, or whether new conformity mechanisms are needed for emerging AI architectures (multimodal models, foundation-model-based deployments).
(e) Penalty level calibration: whether Art.99 fine tiers are achieving deterrence. If enforcement data from MSA annual reports under Art.84 shows systematic under-reporting of violations or too little change in provider behaviour, the Commission can propose penalty adjustments.
(f) GPAI threshold review: the 10²⁵ FLOP systemic risk threshold in Art.51 is explicitly subject to the Art.97 evaluation. As compute efficiency improves and more models approach or cross this threshold, the Commission evaluates whether the trigger level remains technically meaningful.
The Art.85 vs Art.97 Distinction
Developers building compliance roadmaps past 2027 need to understand the difference between these two review mechanisms, because they operate on different timelines and have different legislative consequences.
| Dimension | Art.85 Review | Art.97 Evaluation |
|---|---|---|
| Trigger | 3 years after entry into force (≈ August 2027) | 4 years after entry into force (≈ August 2028) |
| Scope | Comprehensive — entire Act | Targeted — specific parameters |
| Output | Report to Parliament + Council → can trigger legislative amendment | Commission evaluation report → can trigger delegated acts |
| Annex III | Covered as part of full Act review | Primary focus — specific assessment |
| GPAI thresholds | Covered generally | Explicitly evaluated |
| Legislative mechanism | Full legislative procedure required | Delegated acts under Art.97(3) possible without full Parliament/Council vote |
| Developer impact | Broad potential changes | Surgical Annex III and threshold changes most likely |
The critical distinction is the legislative mechanism. Art.85 amendments require the full EU legislative process — Commission proposal → Parliament debate → Council vote → potentially 18-24 months. Art.97 evaluation can result in delegated acts — Commission-only amendments to Annexes I-III, with a shorter timeline and less political friction.
This means Art.97-triggered Annex III expansions can happen faster than Art.85-triggered changes to the core text.
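The distinction can be expressed as a small lookup. The sketch below is illustrative only: the `AmendmentRoute` enum, the `TARGET_ROUTES` mapping, and the target names are hypothetical labels that encode this guide's reading of the two mechanisms, not terms taken from the Act.

```python
from enum import Enum


class AmendmentRoute(Enum):
    DELEGATED_ACT = "delegated act following an Art.97 evaluation"
    LEGISLATIVE = "full legislative procedure following the Art.85 review"


# Hypothetical mapping based on the comparison table above: Annex and
# threshold changes are treated as delegated-act territory, while changes
# to the core text are treated as requiring the legislative route.
TARGET_ROUTES = {
    "annex_iii_category": AmendmentRoute.DELEGATED_ACT,
    "gpai_compute_threshold": AmendmentRoute.DELEGATED_ACT,
    "core_article_text": AmendmentRoute.LEGISLATIVE,
}


def amendment_route(target: str) -> AmendmentRoute:
    """Return the route this guide expects for a given amendment target."""
    try:
        return TARGET_ROUTES[target]
    except KeyError:
        raise ValueError(f"Unknown amendment target: {target!r}") from None


for target in TARGET_ROUTES:
    print(f"{target}: {amendment_route(target).value}")
```

A roadmap tool could use a lookup like this to decide how much lead time to assume for a given change: shorter for delegated-act targets, longer for anything touching the core text.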
The Art.84 → Art.97 Data Pipeline
The Art.97 evaluation does not happen in a vacuum. The Commission's assessment of Annex III adequacy, enforcement effectiveness, and penalty calibration depends on data — and that data flows from MSA annual reports under Art.84.
The pipeline works as follows:
MSA investigations (Art.74-76)
↓
Serious incident reports (Art.73)
↓
MSA annual reports to Commission (Art.84)
↓ aggregated across all 27 member states
Commission evaluation database
↓
Art.97 evaluation report (Year 4)
↓ if Annex III scope issues identified
Delegated act amending Annex III (Art.97(3))
For developers, this pipeline has a specific implication: your Art.73 serious incident reports and your EUAIDB registration data feed into the Commission's Art.97 evaluation. Sectors with high incident reporting rates are the most likely to receive increased scrutiny in the Art.97 assessment, and potentially Annex III expansion or stricter conformity requirements.
Three data categories are particularly influential in Art.97 evaluations:
1. Incident concentration by sector: If Art.84 annual reports show that a particular AI application type (say, AI-assisted insurance underwriting) generates disproportionate serious incident reports, Art.97 evaluation may trigger reclassification of that application into Annex III, even if it was previously considered non-high-risk.
2. Enforcement gap patterns: If MSAs consistently fail to identify Annex III violations in certain sectors — because current definitions are ambiguous, or because MSAs lack technical capacity for specific AI types — the Art.97 evaluation can recommend conformity procedure improvements or new harmonized standards.
3. Fine effectiveness signals: If Art.84 reports show that maximum Art.99 fines are being imposed without changing provider behaviour, the Art.97 evaluation provides the formal basis for proposing penalty escalation, either through delegated acts or through a recommendation for legislative amendment.
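The first signal, incident concentration, can be approximated internally before any official data is published. The following is a minimal sketch, assuming you have your own sector-level incident and deployment counts; the function name, the `ratio_threshold` cut-off of 2.0, and the sample figures are all hypothetical and do not reflect any Commission methodology.

```python
from collections import Counter


def incident_concentration(incidents, deployments, ratio_threshold=2.0):
    """Flag sectors whose share of incidents exceeds their share of deployments.

    incidents: iterable of sector names, one entry per serious incident report.
    deployments: dict mapping sector name -> number of deployed systems.
    ratio_threshold: hypothetical cut-off for "disproportionate".
    """
    incident_counts = Counter(incidents)
    total_incidents = sum(incident_counts.values())
    total_deployments = sum(deployments.values())
    if total_incidents == 0 or total_deployments == 0:
        return {}

    flagged = {}
    for sector, deployed in deployments.items():
        if deployed == 0:
            continue  # No deployments: a share ratio is undefined
        incident_share = incident_counts.get(sector, 0) / total_incidents
        deployment_share = deployed / total_deployments
        ratio = incident_share / deployment_share
        if ratio >= ratio_threshold:
            flagged[sector] = round(ratio, 2)
    return flagged


# Made-up illustrative data, not real Art.84 figures.
sample_incidents = ["insurance"] * 6 + ["education"] * 1 + ["logistics"] * 3
sample_deployments = {"insurance": 20, "education": 40, "logistics": 40}
print(incident_concentration(sample_incidents, sample_deployments))
```

In the sample data, insurance accounts for 60% of incidents but only 20% of deployments, so it is the only sector flagged.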
Annex III Expansion Risk: The Developer's Core Concern
The Art.97 evaluation of Annex III scope is the parameter most likely to affect your compliance obligations. Understanding which current AI applications are at risk of reclassification helps prioritize compliance investment.
Current Annex III categories (for reference):
| Category | Description | Reclassification risk |
|---|---|---|
| 1. Biometric | Remote identification, categorization, emotion recognition | Stable: already comprehensive |
| 2. Critical Infrastructure | Critical digital infrastructure, road traffic, water, gas, heating, and electricity supply | Medium: IoT/edge AI gaps |
| 3. Education | Access/admission decisions | Low: well-defined |
| 4. Employment | Hiring, performance assessment, termination | Medium: gig economy AI gaps |
| 5. Essential Services | Credit, health, insurance, social benefits | High: AI scoring expansion likely |
| 6. Law Enforcement | Risk assessment, crime prediction, evidence evaluation | High: newer AI investigative tools |
| 7. Migration/Asylum | Risk assessment, document verification | Medium: biometric AI expansion |
| 8. Justice/Democracy | Electoral influence, court decisions | High: generative AI political risks |
Applications currently outside Annex III that face reclassification risk by 2028:
- AI-assisted insurance underwriting (currently non-high-risk if below the direct benefit/exclusion trigger): incident volume in financial-services Art.84 data is likely to be high
- Autonomous HR performance scoring (continuous monitoring, not just hiring decisions): a coverage gap that Art.84 labor enforcement data may expose
- AI-driven content recommendation with political influence: Art.5(1)(a) prohibits manipulative techniques, but influence below the direct manipulation threshold sits in a gray zone
- Medical triage AI (below the Art.6(2) threshold if not autonomous diagnosis): Art.84 healthcare incident data may push it toward Annex III
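The risk ratings in the category table above can be kept as data so that a compliance roadmap can filter on them. This is a minimal sketch: the `Risk` enum and the decision to map "Stable" to `LOW` are simplifications made for illustration, and the ratings are this guide's editorial assessment rather than anything published by the Commission.

```python
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Ratings copied from the table above; "Stable" is mapped to LOW here,
# which is a simplification made for this sketch.
ANNEX_III_RISK = {
    "Biometric": Risk.LOW,
    "Critical Infrastructure": Risk.MEDIUM,
    "Education": Risk.LOW,
    "Employment": Risk.MEDIUM,
    "Essential Services": Risk.HIGH,
    "Law Enforcement": Risk.HIGH,
    "Migration/Asylum": Risk.MEDIUM,
    "Justice/Democracy": Risk.HIGH,
}


def categories_at_or_above(level: Risk) -> list[str]:
    """Return category names rated at or above `level`, sorted alphabetically."""
    return sorted(name for name, risk in ANNEX_III_RISK.items() if risk >= level)


print(categories_at_or_above(Risk.HIGH))
```

Using `IntEnum` keeps the ratings comparable with `>=`, so the same function answers "which categories are HIGH" and "which are MEDIUM or worse".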
GPAI Threshold Compression
The 10²⁵ FLOP systemic risk threshold for GPAI models under Art.51 is explicitly within the Art.97 evaluation scope. The trajectory suggests compression — not expansion.
Why the threshold is likely to decrease:
In 2024 when the threshold was set, reaching 10²⁵ FLOP was a frontier-model-only achievement. By 2026, multiple model families are approaching or crossing this threshold. Algorithmic improvements mean equivalent capability can be achieved at lower compute. By 2028 (Art.97 evaluation year), the 10²⁵ FLOP threshold may apply to what are then considered mid-tier commercial models.
If the Commission's Art.97 evaluation concludes that models with systemic-risk-level capabilities can now be trained below 10²⁵ FLOP, so that the designation no longer tracks actual AI risk, it may adjust the threshold downward, bringing more models under Art.55 systemic risk obligations (adversarial testing, model evaluation, incident reporting, cybersecurity measures).
For GPAI model providers below the current threshold:
- If your model is at 10²³–10²⁴ FLOP training compute, it is within the Art.97 evaluation range
- Proactive voluntary Art.56 code of practice adoption reduces regulatory risk if threshold moves
- Art.95 voluntary code of conduct adoption signals good faith to the Commission evaluation
- Maintain documentation structures compatible with Art.55 requirements even if not currently mandatory
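To see where a model sits relative to the threshold, the distance can be measured in orders of magnitude. The sketch below assumes the widely used `6 * N * D` rule of thumb for dense transformer training compute (N parameters, D training tokens); that heuristic, the function names, and the example model are assumptions for illustration and are not a measurement method defined by the Act.

```python
import math

SYSTEMIC_RISK_THRESHOLD_FLOP = 1e25  # Art.51 threshold cited in this guide


def estimate_training_flop(n_parameters: float, n_tokens: float) -> float:
    """Rough estimate using the common 6 * N * D heuristic for dense transformers."""
    return 6.0 * n_parameters * n_tokens


def orders_below_threshold(training_flop: float) -> float:
    """Orders of magnitude between a model and the threshold (negative if above)."""
    if training_flop <= 0:
        raise ValueError("training_flop must be positive")
    return math.log10(SYSTEMIC_RISK_THRESHOLD_FLOP / training_flop)


def in_review_range(training_flop: float, max_orders: float = 2.0) -> bool:
    """True if the model sits within `max_orders` of the threshold or above it."""
    return orders_below_threshold(training_flop) <= max_orders


# Hypothetical model: 7e9 parameters trained on 2e12 tokens.
flop = estimate_training_flop(7e9, 2e12)
print(f"{flop:.1e} FLOP, {orders_below_threshold(flop):.2f} orders below threshold")
print("In review range:", in_review_range(flop))
```

The `max_orders` default of 2.0 mirrors the "within 100x" range used elsewhere in this guide; tighten it if you only care about models close to the line.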
CLOUD Act Implications in Commission Evaluation Reports
The Art.97 evaluation reports have an additional significance that is not obvious from the text of the article itself: they are one of the primary mechanisms by which the Commission can explicitly address CLOUD Act conflicts with EU AI Act compliance requirements.
The current AI Act text does not mention the CLOUD Act by name. But Art.97 evaluation reports are policy documents — the Commission can include findings and recommendations that go beyond the statutory text. If Art.84 data shows that CLOUD Act compellability of compliance documentation is creating systematic enforcement problems (MSA requests for documentation held on US infrastructure being contested or delayed by US government national security assertions), Art.97 is the vehicle for a formal Commission position.
What a CLOUD Act finding in an Art.97 evaluation report could mean:
| Commission finding | Delegated act consequence |
|---|---|
| US cloud infrastructure creates "systematic risk of CLOUD Act compellability" for Annex III compliance docs | Annex IV amendment requiring EU-jurisdiction documentation storage for high-risk AI |
| GPAI systemic risk assessments accessible under US DOJ compulsion | Art.55 amendment specifying storage jurisdiction requirements |
| Art.73 incident reports from providers on US cloud compromised by CLOUD Act | Art.73 implementation act requiring EU-native reporting systems |
This is not speculative — the Commission's 2021 CLOUD Act assessment, the European Data Protection Board's 2023 guidelines, and multiple national DPA enforcement actions have all pointed toward this structural conflict. The Art.97 evaluation is the first formal legislative vehicle for addressing it.
For developers choosing infrastructure for AI compliance documentation today, this trajectory points in one direction: EU-native infrastructure now is insurance against regulatory mandate later.
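A first step is knowing where each compliance artifact currently lives. The following is a minimal sketch of such an inventory check; the `ComplianceArtifact` fields, the abbreviated `EU_JURISDICTIONS` set, and the sample entries are hypothetical, and the exposure rule is a deliberately simple heuristic rather than legal advice.

```python
from dataclasses import dataclass

# Illustrative subset only; a real check would cover all EU/EEA member states.
EU_JURISDICTIONS = {"DE", "FR", "NL", "IE", "SE", "FI"}


@dataclass(frozen=True)
class ComplianceArtifact:
    name: str                     # e.g. "Art.11 technical documentation"
    storage_country: str          # ISO 3166-1 alpha-2 code of the data centre
    provider_parent_country: str  # Country of the provider's ultimate parent


def cloud_act_exposed(artifact: ComplianceArtifact) -> bool:
    """Flag artifacts stored outside the EU set or held by a US-parented provider.

    Simplified heuristic: a US parent company can be compelled to produce
    data regardless of where that data is physically stored.
    """
    return (
        artifact.storage_country not in EU_JURISDICTIONS
        or artifact.provider_parent_country == "US"
    )


# Hypothetical inventory for illustration.
inventory = [
    ComplianceArtifact("Art.11 technical documentation", "DE", "DE"),
    ComplianceArtifact("Art.12 logs", "IE", "US"),
    ComplianceArtifact("Art.73 incident reports", "US", "US"),
]
for artifact in inventory:
    status = "EXPOSED" if cloud_act_exposed(artifact) else "ok"
    print(f"{artifact.name}: {status}")
```

Note that the second entry is flagged even though the data sits in Ireland: physical location alone does not remove exposure when the provider's parent is subject to US jurisdiction.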
Python Tooling: Regulatory Future-Proofing
```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AnnexIIICategory(Enum):
    BIOMETRIC = "biometric_id_categorization"
    CRITICAL_INFRA = "critical_infrastructure"
    EDUCATION = "education_training"
    EMPLOYMENT = "employment_hr"
    ESSENTIAL_SERVICES = "essential_services_credit_health"
    LAW_ENFORCEMENT = "law_enforcement"
    MIGRATION_ASYLUM = "migration_asylum"
    JUSTICE_DEMOCRACY = "justice_democracy_elections"
    NOT_HIGH_RISK = "not_currently_high_risk"


class ReclassificationRisk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    IMMINENT = "imminent_2028"


@dataclass
class Art97ComplianceAssessment:
    current_category: AnnexIIICategory
    reclassification_risk: ReclassificationRisk
    gpai_threshold_exposure: bool
    cloud_act_documentation_risk: bool
    recommended_actions: list[str]
    art97_evaluation_year: int = 2028
    art85_review_year: int = 2027


# Sector labels are this guide's own shorthand, not terms from the Act.
HIGH_RISK_SECTORS = {
    "financial_services", "insurance", "healthcare_triage",
    "political_content", "gig_economy_hr", "autonomous_hr_scoring",
}
MEDIUM_RISK_SECTORS = {
    "employment_general", "essential_services", "law_enforcement_adjacent",
    "critical_infrastructure_iot",
}


def assess_art97_exposure(
    ai_application_type: str,
    training_compute_flop: Optional[float],
    documentation_on_us_cloud: bool,
    current_annex_iii: bool,
    sector: str,
) -> Art97ComplianceAssessment:
    """
    Assess a developer's exposure to Art.97 Commission evaluation changes.
    Returns risk profile and recommended preparatory actions.

    Note: ai_application_type is accepted for record-keeping but does not
    influence the result; the risk rating is driven by `sector`.
    """
    # GPAI threshold exposure
    gpai_exposed = (
        training_compute_flop is not None
        and training_compute_flop >= 1e23  # Within 100x of current 10^25 threshold
    )

    # Reclassification risk by sector. For systems outside Annex III,
    # `category` is the category most likely to absorb them, not a current one.
    if sector in HIGH_RISK_SECTORS and not current_annex_iii:
        reclass_risk = ReclassificationRisk.HIGH
        category = AnnexIIICategory.ESSENTIAL_SERVICES  # Most likely catch-all expansion
    elif sector in MEDIUM_RISK_SECTORS and not current_annex_iii:
        reclass_risk = ReclassificationRisk.MEDIUM
        category = AnnexIIICategory.EMPLOYMENT
    elif current_annex_iii:
        reclass_risk = ReclassificationRisk.LOW  # Already covered
        category = AnnexIIICategory.ESSENTIAL_SERVICES  # Placeholder: actual category not inferred
    else:
        reclass_risk = ReclassificationRisk.LOW
        category = AnnexIIICategory.NOT_HIGH_RISK

    # Build action plan
    actions = []
    if reclass_risk in {ReclassificationRisk.HIGH, ReclassificationRisk.IMMINENT}:
        actions.extend([
            "Begin Art.11 technical documentation now (before reclassification mandate)",
            "Implement Art.9 risk management process in advance",
            "Establish Art.12 logging with EU-jurisdiction storage",
            "Adopt an Art.95 voluntary code of conduct as good-faith signal to Commission",
        ])
    if gpai_exposed:
        actions.extend([
            f"Training compute {training_compute_flop:.1e} FLOP: within Art.97 threshold review range",
            "Implement Art.55 documentation structures proactively",
            "Monitor AI Office GPAI register for threshold adjustment announcements",
            "Consider Art.95 voluntary code adoption before Art.55 becomes mandatory",
        ])
    if documentation_on_us_cloud:
        actions.extend([
            "CLOUD Act risk in Art.97 evaluation trajectory: migrate compliance docs to EU-native storage",
            "Review Art.73 incident report system jurisdiction before 2027 Art.85 review",
            "Document CLOUD Act exposure in DPO risk register for Art.97 evaluation window",
        ])
    # Plain string: the original f-string had no placeholders.
    actions.append(
        "Schedule Art.85 review prep for Q2 2027 and Art.97 evaluation prep for Q2 2028"
    )

    return Art97ComplianceAssessment(
        current_category=category,
        reclassification_risk=reclass_risk,
        gpai_threshold_exposure=gpai_exposed,
        cloud_act_documentation_risk=documentation_on_us_cloud,
        recommended_actions=actions,
    )


@dataclass
class AnnexIIIExpansionTracker:
    """
    Tracks indicators that a non-high-risk AI system may be
    reclassified under Art.97 Commission evaluation.
    """
    application_type: str
    sector: str
    incident_reports_submitted: int = 0
    msa_inquiry_received: bool = False
    competitor_reclassified: bool = False
    art84_data_exposure: bool = False

    def expansion_risk_score(self) -> float:
        """Returns 0.0 (no risk) to 1.0 (imminent reclassification).

        With the weights below, the highest reachable score is 0.9.
        """
        score = 0.0
        if self.msa_inquiry_received:
            score += 0.3  # MSA attention = high signal
        if self.incident_reports_submitted > 0:
            # 0.05 per report, capped at 0.2 (four or more reports)
            score += min(0.2, self.incident_reports_submitted * 0.05)
        if self.competitor_reclassified:
            score += 0.25  # Sector-level reclassification wave pattern
        if self.art84_data_exposure:
            score += 0.15
        return min(score, 1.0)

    def pre_compliance_recommendation(self) -> str:
        score = self.expansion_risk_score()
        if score >= 0.5:
            return (
                "HIGH: Begin Art.11 technical documentation immediately. "
                "Art.97 evaluation is likely to reclassify this application type. "
                "Pre-compliance now prevents emergency remediation post-reclassification."
            )
        elif score >= 0.25:
            return (
                "MEDIUM: Prepare lightweight technical documentation skeleton. "
                "Monitor AI Office Art.97 evaluation publications from 2027 onward."
            )
        return (
            "LOW: Maintain Art.95 voluntary compliance posture. "
            "Annual review of Art.97 Commission evaluation reports recommended."
        )
```
The Developer's Art.97 Timeline
2025-02-02: Prohibited practices (Art.5) apply
2025-08-02: GPAI obligations and Art.95-96 apply
2026-08-02: Full Act application — Annex III high-risk obligations begin
2027-08-02: Art.85 comprehensive review published
→ Legislative amendments begin (if recommended)
2028-08-02: Art.97 evaluation published
→ Delegated acts amending Annexes possible (no full Parliament vote)
→ GPAI threshold adjustment most likely here
→ New Annex III categories possible
2029-early: Delegated act amendments from Art.97 evaluation take effect
2030-08-02: Second Art.97 evaluation cycle begins
For developers building AI systems today, the compliance architecture decision window is approximately 2026–2027 — before the Art.85 review that will generate recommendations, and well before the Art.97 delegated acts that could rapidly expand Annex III.
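The timeline above can be turned into a simple countdown for a compliance calendar. This is a minimal sketch: the `MILESTONES` list copies three dates from the timeline as projected in this guide, and the function name and return shape are hypothetical choices.

```python
from datetime import date

# Milestone dates as listed in the timeline above (this guide's projection).
MILESTONES = [
    (date(2026, 8, 2), "Full Act application: Annex III high-risk obligations begin"),
    (date(2027, 8, 2), "Art.85 comprehensive review published"),
    (date(2028, 8, 2), "Art.97 evaluation published"),
]


def upcoming_milestones(today: date):
    """Return (days_remaining, label) pairs for milestones on or after `today`."""
    return [
        ((when - today).days, label)
        for when, label in sorted(MILESTONES)
        if when >= today
    ]


# Example with a fixed reference date so the output is reproducible.
for days, label in upcoming_milestones(date(2027, 1, 1)):
    print(f"{days:>4} days: {label}")
```

In practice you would pass `date.today()`; the fixed date here just keeps the example deterministic.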
30-Item Art.97 Future-Proofing Checklist
Category 1: Understanding Your Review Exposure (5 items)
- 1. Art.97 evaluation timeline documented in your compliance calendar (2028 primary, 2032 secondary)
- 2. Your AI system's sector mapped against Annex III reclassification risk table above
- 3. GPAI training compute logged — if within 100x of 10²⁵ FLOP, Art.97 threshold review is relevant
- 4. Art.84 data exposure assessed — do your Art.73 incident reports enter the Commission's Art.97 input pipeline?
- 5. Art.85 (2027) and Art.97 (2028) reviews distinguished in compliance roadmap
Category 2: Annex III Pre-Compliance (5 items)
- 6. If reclassification risk is HIGH: Art.9 risk management process initiated (framework in place, not just planned)
- 7. If reclassification risk is HIGH: Art.11 technical documentation skeleton prepared (can be expanded, not built from scratch)
- 8. If reclassification risk is HIGH: Art.12 logging infrastructure in place (EU-jurisdiction storage)
- 9. If reclassification risk is MEDIUM: lightweight documentation template prepared for rapid expansion
- 10. Art.5 prohibited practice audit complete regardless of current Annex III status (applies from 2 February 2025)
Category 3: GPAI Threshold Monitoring (5 items)
- 11. Training compute of any models you develop or deploy documented in internal registry
- 12. AI Office GPAI model register monitored for threshold guidance publications
- 13. If within threshold range: Art.55 documentation structures prepared voluntarily
- 14. If within threshold range: Art.95 voluntary code of conduct adoption considered as good-faith signal
- 15. Supply chain: GPAI models you integrate assessed for systemic risk classification status
Category 4: Art.84 Data Management (5 items)
- 16. Art.73 serious incident reports filed for all qualifying events (underreporting increases reclassification risk)
- 17. EUAIDB registration complete and up-to-date (MSA-accessible data feeds Art.84 annual reports)
- 18. Post-market monitoring records (Art.72) maintained and available for MSA review
- 19. Sector-level incident data monitored; if your sector shows high Art.84 incident concentration, begin proactive pre-compliance
- 20. Compliance documentation stored on infrastructure outside US CLOUD Act jurisdiction
Category 5: Infrastructure Future-Proofing (5 items)
- 21. Art.11 technical documentation and Annex IV records stored on EU-native infrastructure
- 22. Art.12 logs jurisdiction confirmed — EU MSA accessible without CLOUD Act compellability risk
- 23. Art.73 incident reporting system hosted in EU jurisdiction
- 24. GPAI model cards (if applicable) stored on EU infrastructure (Art.97 evaluation may mandate this)
- 25. Infrastructure jurisdiction change timeline assessed — migration before 2027 Art.85 review is lower risk than after
Category 6: Commission Monitoring and Response (5 items)
- 26. AI Office publication tracker in place — Art.97 evaluation pre-reports and consultations announced 6-12 months before publication
- 27. Industry association engagement in place — Art.97 consultation processes include stakeholder input
- 28. Legal counsel briefed on Art.97 delegated act mechanism — amendments can come without full Parliament vote
- 29. 2028 compliance budget reserved for potential Annex III reclassification costs
- 30. Board/executive briefing on Art.97 timeline and developer implications scheduled for Q4 2026
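Progress against the checklist can be tracked per category with a few lines of code. The sketch below relies on the checklist's own numbering (items 1-30, five per category); the function name and the idea of reporting a completion fraction are assumptions for illustration.

```python
from collections import defaultdict

# Category names follow the checklist above; items are numbered 1-30,
# five per category, so the category index is derived from the item number.
CATEGORIES = [
    "Understanding Your Review Exposure",
    "Annex III Pre-Compliance",
    "GPAI Threshold Monitoring",
    "Art.84 Data Management",
    "Infrastructure Future-Proofing",
    "Commission Monitoring and Response",
]
ITEMS_PER_CATEGORY = 5


def completion_by_category(completed_items):
    """Return {category: fraction complete} for a set of completed item numbers."""
    done = defaultdict(int)
    for item in set(completed_items):
        if not 1 <= item <= len(CATEGORIES) * ITEMS_PER_CATEGORY:
            raise ValueError(f"Item number out of range: {item}")
        done[CATEGORIES[(item - 1) // ITEMS_PER_CATEGORY]] += 1
    return {name: done[name] / ITEMS_PER_CATEGORY for name in CATEGORIES}


# Hypothetical progress: category 1 finished, plus items 10 and 16.
for name, fraction in completion_by_category({1, 2, 3, 4, 5, 10, 16}).items():
    print(f"{fraction:>4.0%}  {name}")
```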
See Also
- EU AI Act Art.85: The Review Clause — the 2027 comprehensive review that precedes Art.97
- EU AI Act Art.84: Annual Market Surveillance Reporting — MSA data that feeds Art.97 evaluation
- EU AI Act Art.99: Penalties and Fines — fine tiers subject to Art.97 calibration evaluation
- EU AI Act Art.96: SME Guidelines — SME-specific Art.97 implications (proportionality obligation)
- EU AI Act Art.95: Codes of Conduct — voluntary compliance as Art.97 good-faith signal