2026-04-12·12 min read·sota.io team

EU AI Act Art.97: Commission Evaluation — What Gets Reviewed and When — Developer Guide (2026)

The EU AI Act is not a static regulation. Its architecture includes multiple built-in feedback mechanisms that allow the framework to evolve as AI technology changes. Most developers know about the Art.85 comprehensive review scheduled for 2027. Fewer understand that Article 97 creates a parallel, more targeted evaluation mechanism with different scope, different timing, and different developer implications.

Article 97 is the Commission's ongoing evaluation engine. Where Art.85 is the legislative reset triggered by a 3-year review, Art.97 is the regulatory calibration instrument — it evaluates specific technical and operational parameters of the Act on a rolling basis, feeds those findings into Commission reports, and provides the legal basis for delegated act amendments without requiring full legislative procedure.

For developers, the Art.97 cycle matters because it is the mechanism most likely to change your specific compliance obligations: what Annex III categories cover, where the GPAI systemic risk threshold sits, and how enforcement levels are calibrated.

What Article 97 Actually Says

Article 97 mandates the Commission to evaluate and report on the following specific matters — separately and on a more technical basis than the comprehensive Art.85 review.

The core evaluation obligation:

The Commission shall, within 4 years of the entry into force of this Regulation (1 August 2024), carry out an assessment of and report to the European Parliament and to the Council on the following:

(a) Adequacy of the supervision framework — whether the governance structure under Chapter VI and VII is functioning effectively. This includes the AI Office's oversight of GPAI models, the European AI Board's coordination role, and the adequacy of national Market Surveillance Authority resources.

(b) Scope of prohibited practices under Art.5 — whether the list of absolutely forbidden AI applications remains appropriate, whether new capabilities have created gaps in the prohibition list, and whether any current prohibition has become technically obsolete or overly broad.

(c) Annex III high-risk classification — this is the most developer-relevant evaluation item. The Commission assesses whether the eight categories in Annex III remain appropriate, whether new AI application areas should be added, and whether any category should be removed or narrowed.

(d) Effectiveness of conformity assessment — whether the Art.43-49 conformity pathway is generating sufficient assurance, or whether new conformity mechanisms are needed for emerging AI architectures (multimodal models, foundation-model-based deployments).

(e) Penalty level calibration — whether Art.99 fine tiers are achieving deterrence. If enforcement data from MSA Art.84 annual reports shows systematic underfiling of violations or inadequate provider behaviour change, the Commission can propose penalty adjustments.

(f) GPAI threshold review — the 10²⁵ FLOP systemic risk threshold in Art.51 is explicitly subject to the Art.97 evaluation. As compute efficiency improves and more models approach or cross this threshold, the Commission evaluates whether the trigger level remains technically meaningful.

The Art.85 vs Art.97 Distinction

Developers building compliance roadmaps past 2027 need to understand the difference between these two review mechanisms, because they operate on different timelines and have different legislative consequences.

Dimension	Art.85 Review	Art.97 Evaluation
Trigger	3 years after entry into force (≈ August 2027)	4 years after entry into force (≈ August 2028)
Scope	Comprehensive — entire Act	Targeted — specific parameters
Output	Report to Parliament + Council → can trigger legislative amendment	Commission evaluation report → can trigger delegated acts
Annex III	Covered as part of full Act review	Primary focus — specific assessment
GPAI thresholds	Covered generally	Explicitly evaluated
Legislative mechanism	Full legislative procedure required	Delegated acts under Art.97(3) possible without full Parliament/Council vote
Developer impact	Broad potential changes	Surgical Annex III and threshold changes most likely

The critical distinction is the legislative mechanism. Art.85 amendments require the full EU legislative process — Commission proposal → Parliament debate → Council vote → potentially 18-24 months. Art.97 evaluation can result in delegated acts — Commission-only amendments to Annexes I-III, with a shorter timeline and less political friction.

This means Art.97-triggered Annex III expansions can happen faster than Art.85-triggered changes to the core text.

The Art.84 → Art.97 Data Pipeline

The Art.97 evaluation does not happen in a vacuum. The Commission's assessment of Annex III adequacy, enforcement effectiveness, and penalty calibration depends on data — and that data flows from MSA annual reports under Art.84.

The pipeline works as follows:

MSA investigations (Art.74-76)
         ↓
Serious incident reports (Art.73)
         ↓
MSA annual reports to Commission (Art.84)
         ↓ aggregated across all 27 member states
Commission evaluation database
         ↓
Art.97 evaluation report (Year 4)
         ↓ if Annex III scope issues identified
Delegated act amending Annex III (Art.97(3))

For developers, this pipeline has a specific implication: your Art.73 serious incident reports and your registration data in the EU database for high-risk AI systems (EUAIDB) enter the Commission's Art.97 evaluation input. Sectors with high incident reporting rates are the most likely to receive increased scrutiny in the Art.97 assessment, and potentially Annex III expansion or stricter conformity requirements.

Three data categories are particularly influential in Art.97 evaluations:

1. Incident concentration by sector: If Art.84 annual reports show that a particular AI application type (say, AI-assisted insurance underwriting) generates disproportionate serious incident reports, Art.97 evaluation may trigger reclassification of that application into Annex III, even if it was previously considered non-high-risk.

2. Enforcement gap patterns: If MSAs consistently fail to identify Annex III violations in certain sectors — because current definitions are ambiguous, or because MSAs lack technical capacity for specific AI types — the Art.97 evaluation can recommend conformity procedure improvements or new harmonized standards.

3. Fine effectiveness signals: If Art.84 reports show that maximum Art.99 fines are being imposed without changing provider behaviour, the Art.97 evaluation provides the formal basis for proposing penalty escalation through delegated acts or legislative amendment recommendation.
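
The incident-concentration signal in point 1 can be approximated with a simple share calculation. The sketch below is illustrative only: the function names, the sector labels, the sample data, and the 40% cut-off are all assumptions made for this example, not figures from the Act or from any published Commission methodology.

```python
from collections import Counter


def incident_share_by_sector(incident_sectors):
    """Return each sector's share of all reported incidents (0.0 to 1.0)."""
    counts = Counter(incident_sectors)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {sector: n / total for sector, n in counts.items()}


def concentrated_sectors(incident_sectors, share_cutoff=0.4):
    """List sectors whose incident share meets the assumed cut-off."""
    shares = incident_share_by_sector(incident_sectors)
    return sorted(s for s, share in shares.items() if share >= share_cutoff)


# Hypothetical sample: one sector label per serious incident report.
sample = ["insurance"] * 5 + ["employment"] * 3 + ["education"] * 2
print(concentrated_sectors(sample))
```

A share-based measure is only one possible proxy. A real assessment would presumably also normalise by the number of deployed systems per sector, which this sketch does not attempt.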

Annex III Expansion Risk: The Developer's Core Concern

The Art.97 evaluation of Annex III scope is the parameter most likely to affect your compliance obligations. Understanding which current AI applications are at risk of reclassification helps prioritize compliance investment.

Current Annex III categories (for reference):

Category	Description	Reclassification risk
1. Biometric	Remote identification, categorization, emotion recognition	Stable — already comprehensive
2. Critical Infrastructure	Utilities, transport, finance	Medium — IoT/edge AI gaps
3. Education	Access/admission decisions	Low — well-defined
4. Employment	Hiring, performance assessment, termination	Medium — gig economy AI gaps
5. Essential Services	Credit, health, insurance, social benefits	High — AI scoring expansion likely
6. Law Enforcement	Risk assessment, crime prediction, evidence evaluation	High — newer AI investigative tools
7. Migration/Asylum	Risk assessment, document verification	Medium — biometric AI expansion
8. Justice/Democracy	Electoral influence, court decisions	High — generative AI political risks

Applications currently outside Annex III that face reclassification risk by 2028:

AI-assisted insurance underwriting (currently non-high-risk if below direct benefit/exclusion trigger) — large incident volume in financial services Art.84 data likely
Autonomous HR performance scoring (continuous monitoring, not just hiring decisions) — coverage gap identified by Art.84 labor enforcement data
AI-driven content recommendation with political influence (Art.5(1)(a) prohibits manipulative techniques that cause significant harm, but influence below that manipulation threshold is in a gray zone)
Medical triage AI (below Art.6(2) threshold if not autonomous diagnosis) — Art.84 healthcare incident data may push toward Annex III

GPAI Threshold Compression

The 10²⁵ FLOP systemic risk threshold for GPAI models under Art.51 is explicitly within the Art.97 evaluation scope. The trajectory suggests compression — not expansion.

Why the threshold is likely to decrease:

In 2024 when the threshold was set, reaching 10²⁵ FLOP was a frontier-model-only achievement. By 2026, multiple model families are approaching or crossing this threshold. Algorithmic improvements mean equivalent capability can be achieved at lower compute. By 2028 (Art.97 evaluation year), the 10²⁵ FLOP threshold may apply to what are then considered mid-tier commercial models.

If the Commission's Art.97 evaluation concludes that the systemic risk designation applies only to a shrinking set of models that become progressively less representative of actual AI risk, it may adjust the threshold downward, bringing more models under Art.55 systemic risk obligations (adversarial testing, model evaluation, incident reporting, cybersecurity measures).

For GPAI model providers below the current threshold:

If your model is at 10²³–10²⁴ FLOP training compute, it is within the Art.97 evaluation range
Proactive, voluntary adoption of an Art.56 code of practice reduces regulatory risk if the threshold moves
Art.95 voluntary code of conduct adoption signals good faith to the Commission evaluation
Maintain documentation structures compatible with Art.55 requirements even if not currently mandatory
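
To make the "within 100x" range concrete, the arithmetic can be sketched as follows. The 10²⁵ FLOP figure is the Art.51 threshold cited above. The function names and the reduction factors are hypothetical scenarios chosen for illustration, not announced Commission positions.

```python
import math

CURRENT_THRESHOLD_FLOP = 1e25  # Art.51 systemic risk threshold cited in this article


def orders_of_magnitude_below(training_flop, threshold=CURRENT_THRESHOLD_FLOP):
    """How many powers of ten a model's training compute sits below the threshold."""
    if training_flop <= 0:
        raise ValueError("training_flop must be positive")
    return math.log10(threshold / training_flop)


def in_scope_after_cut(training_flop, reduction_factor):
    """Would the model cross the threshold if it were lowered by reduction_factor?

    reduction_factor is a hypothetical scenario, e.g. 10 means 1e25 -> 1e24.
    """
    return training_flop >= CURRENT_THRESHOLD_FLOP / reduction_factor


# Hypothetical model trained with 3e23 FLOP.
model_flop = 3e23
print(f"{orders_of_magnitude_below(model_flop):.2f} orders of magnitude below threshold")
for factor in (10, 100):
    print(f"threshold lowered {factor}x -> in scope: {in_scope_after_cut(model_flop, factor)}")
```

Running the scenarios for your own compute figure shows how large a downward adjustment would be needed before Art.55 obligations become relevant.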

CLOUD Act Implications in Commission Evaluation Reports

The Art.97 evaluation reports have an additional significance that is not obvious from the text of the article itself: they are one of the primary mechanisms by which the Commission can explicitly address CLOUD Act conflicts with EU AI Act compliance requirements.

The current AI Act text does not mention the CLOUD Act by name. But Art.97 evaluation reports are policy documents — the Commission can include findings and recommendations that go beyond the statutory text. If Art.84 data shows that CLOUD Act compellability of compliance documentation is creating systematic enforcement problems (MSA requests for documentation held on US infrastructure being contested or delayed by US government national security assertions), Art.97 is the vehicle for a formal Commission position.

What a CLOUD Act finding in an Art.97 evaluation report could mean:

Commission finding	Delegated act consequence
US cloud infrastructure creates "systematic risk of CLOUD Act compellability" for Annex III compliance docs	Annex I amendment requiring EU-jurisdiction documentation storage for high-risk AI
GPAI systemic risk assessments accessible under US DOJ compulsion	Art.55 amendment specifying storage jurisdiction requirements
Art.73 incident reports from providers on US cloud compromised by CLOUD Act	Art.73 implementing act requiring EU-native reporting systems

This is not speculative — the Commission's 2021 CLOUD Act assessment, the European Data Protection Board's 2023 guidelines, and multiple national DPA enforcement actions have all pointed toward this structural conflict. The Art.97 evaluation is the first formal legislative vehicle for addressing it.

For developers choosing infrastructure for AI compliance documentation today, this trajectory points in one direction: EU-native infrastructure now is insurance against regulatory mandate later.

Python Tooling: Regulatory Future-Proofing

from dataclasses import dataclass
from enum import Enum
from typing import Optional

class AnnexIIICategory(Enum):
    BIOMETRIC = "biometric_id_categorization"
    CRITICAL_INFRA = "critical_infrastructure"
    EDUCATION = "education_training"
    EMPLOYMENT = "employment_hr"
    ESSENTIAL_SERVICES = "essential_services_credit_health"
    LAW_ENFORCEMENT = "law_enforcement"
    MIGRATION_ASYLUM = "migration_asylum"
    JUSTICE_DEMOCRACY = "justice_democracy_elections"
    NOT_HIGH_RISK = "not_currently_high_risk"

class ReclassificationRisk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    IMMINENT = "imminent_2028"

@dataclass
class Art97ComplianceAssessment:
    current_category: AnnexIIICategory
    reclassification_risk: ReclassificationRisk
    gpai_threshold_exposure: bool
    cloud_act_documentation_risk: bool
    recommended_actions: list[str]
    art97_evaluation_year: int = 2028
    art85_review_year: int = 2027

def assess_art97_exposure(
    ai_application_type: str,
    training_compute_flop: Optional[float],
    documentation_on_us_cloud: bool,
    current_annex_iii: bool,
    sector: str
) -> Art97ComplianceAssessment:
    """
    Assess a developer's exposure to Art.97 Commission evaluation changes.
    
    Returns risk profile and recommended preparatory actions.
    """
    
    # GPAI threshold exposure
    gpai_exposed = (
        training_compute_flop is not None and 
        training_compute_flop >= 1e23  # Within 100x of current 10^25 threshold
    )
    
    # Reclassification risk by sector
    HIGH_RISK_SECTORS = {
        "financial_services", "insurance", "healthcare_triage", 
        "political_content", "gig_economy_hr", "autonomous_hr_scoring"
    }
    MEDIUM_RISK_SECTORS = {
        "employment_general", "essential_services", "law_enforcement_adjacent",
        "critical_infrastructure_iot"
    }
    
    if sector in HIGH_RISK_SECTORS and not current_annex_iii:
        reclass_risk = ReclassificationRisk.HIGH
        category = AnnexIIICategory.ESSENTIAL_SERVICES  # Most likely catch-all expansion
    elif sector in MEDIUM_RISK_SECTORS and not current_annex_iii:
        reclass_risk = ReclassificationRisk.MEDIUM
        category = AnnexIIICategory.EMPLOYMENT
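    elif current_annex_iii:
        reclass_risk = ReclassificationRisk.LOW  # Already covered
        # Placeholder: the actual Annex III category is not passed in, so this
        # defaults to ESSENTIAL_SERVICES; substitute the system's real category.
        category = AnnexIIICategory.ESSENTIAL_SERVICES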
    else:
        reclass_risk = ReclassificationRisk.LOW
        category = AnnexIIICategory.NOT_HIGH_RISK
    
    # Build action plan
    actions = []
    
    if reclass_risk in {ReclassificationRisk.HIGH, ReclassificationRisk.IMMINENT}:
        actions.extend([
            "Begin Art.11 technical documentation now (before reclassification mandate)",
            "Implement Art.9 risk management process in advance",
            "Establish Art.12 logging with EU-jurisdiction storage",
            "Register on Art.95 voluntary code of conduct as good-faith signal to Commission"
        ])
    
    if gpai_exposed:
        actions.extend([
            f"Training compute {training_compute_flop:.1e} FLOP: within Art.97 threshold review range",
            "Implement Art.55 documentation structures proactively",
            "Monitor AI Office GPAI register for threshold adjustment announcements",
            "Consider Art.95 voluntary code adoption before Art.55 becomes mandatory"
        ])
    
    if documentation_on_us_cloud:
        actions.extend([
            "CLOUD Act risk in Art.97 evaluation trajectory: migrate compliance docs to EU-native storage",
            "Review Art.73 incident report system jurisdiction before 2027 Art.85 review",
            "Document CLOUD Act exposure in DPO risk register for Art.97 evaluation window"
        ])
    
    actions.append(
        "Schedule Art.85 review prep for Q2 2027 and Art.97 evaluation prep for Q2 2028"
    )
    
    return Art97ComplianceAssessment(
        current_category=category,
        reclassification_risk=reclass_risk,
        gpai_threshold_exposure=gpai_exposed,
        cloud_act_documentation_risk=documentation_on_us_cloud,
        recommended_actions=actions
    )


@dataclass
class AnnexIIIExpansionTracker:
    """
    Tracks indicators that a non-high-risk AI system may be 
    reclassified under Art.97 Commission evaluation.
    """
    application_type: str
    sector: str
    incident_reports_submitted: int = 0
    msa_inquiry_received: bool = False
    competitor_reclassified: bool = False
    art84_data_exposure: bool = False
    
    def expansion_risk_score(self) -> float:
        """Returns 0.0 (no signals) up to 0.9 (all signals present; the weights sum to 0.9)."""
        score = 0.0
        if self.msa_inquiry_received:
            score += 0.3  # MSA attention = high signal
        if self.incident_reports_submitted > 0:
            score += min(0.2, self.incident_reports_submitted * 0.05)
        if self.competitor_reclassified:
            score += 0.25  # Sector-level reclassification wave pattern
        if self.art84_data_exposure:
            score += 0.15
        return min(score, 1.0)
    
    def pre_compliance_recommendation(self) -> str:
        score = self.expansion_risk_score()
        if score >= 0.5:
            return (
                "HIGH: Begin Art.11 technical documentation immediately. "
                "Art.97 evaluation is likely to reclassify this application type. "
                "Pre-compliance now prevents emergency remediation post-reclassification."
            )
        elif score >= 0.25:
            return (
                "MEDIUM: Prepare lightweight technical documentation skeleton. "
                "Monitor AI Office Art.97 evaluation publications from 2027 onward."
            )
        return (
            "LOW: Maintain Art.95 voluntary compliance posture. "
            "Annual review of Art.97 Commission evaluation reports recommended."
        )

The Developer's Art.97 Timeline

2024-08-01: EU AI Act enters into force
2025-02-02: Prohibited practices (Art.5) apply
2025-08-02: GPAI obligations and governance provisions apply
2026-08-02: General application of the Act; Annex III high-risk obligations begin
2027-08-02: Art.85 comprehensive review published
              → Legislative amendments begin (if recommended)
2028-08-02: Art.97 evaluation published
              → Delegated acts amending Annexes possible (no full Parliament vote)
              → GPAI threshold adjustment most likely here
              → New Annex III categories possible
Early 2029: Delegated act amendments from Art.97 evaluation take effect
2030-08-02: Second Art.97 evaluation cycle begins

For developers building AI systems today, the compliance architecture decision window is approximately 2026–2027 — before the Art.85 review that will generate recommendations, and well before the Art.97 delegated acts that could rapidly expand Annex III.
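
The decision window can be tracked with a few lines of date arithmetic. This is a minimal sketch that assumes the review dates given in this article's timeline. The `MILESTONES` mapping and the helper names are made up for the example and should be updated if official dates differ.

```python
from datetime import date

# Dates as given in this article's timeline (assumption: adjust if official dates differ).
MILESTONES = {
    "Art.85 review": date(2027, 8, 2),
    "Art.97 evaluation": date(2028, 8, 2),
}


def days_until(milestone, today):
    """Number of days from today to the milestone (negative if already past)."""
    return (milestone - today).days


def upcoming(today):
    """Milestones that have not yet passed, soonest first, with days remaining."""
    pending = [
        (name, days_until(when, today))
        for name, when in MILESTONES.items()
        if when >= today
    ]
    return sorted(pending, key=lambda pair: pair[1])


for name, days in upcoming(date.today()):
    print(f"{name}: {days} days remaining")
```

Passing `today` explicitly rather than calling `date.today()` inside the helpers keeps them easy to test against fixed dates.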

30-Item Art.97 Future-Proofing Checklist

Category 1: Understanding Your Review Exposure (5 items)

1. Art.97 evaluation timeline documented in your compliance calendar (2028 primary, 2032 secondary)
2. Your AI system's sector mapped against Annex III reclassification risk table above
3. GPAI training compute logged — if within 100x of 10²⁵ FLOP, Art.97 threshold review is relevant
4. Art.84 data exposure assessed — do your Art.73 incident reports enter the Commission's Art.97 input pipeline?
5. Art.85 (2027) and Art.97 (2028) reviews distinguished in compliance roadmap

Category 2: Annex III Pre-Compliance (5 items)

6. If reclassification risk is HIGH: Art.9 risk management process initiated (framework in place, not just planned)
7. If reclassification risk is HIGH: Art.11 technical documentation skeleton prepared (can be expanded, not built from scratch)
8. If reclassification risk is HIGH: Art.12 logging infrastructure in place (EU-jurisdiction storage)
9. If reclassification risk is MEDIUM: lightweight documentation template prepared for rapid expansion
10. Art.5 prohibited practice audit complete regardless of current Annex III status (applies from February 2025)

Category 3: GPAI Threshold Monitoring (5 items)

11. Training compute of any models you develop or deploy documented in internal registry
12. AI Office GPAI model register monitored for threshold guidance publications
13. If within threshold range: Art.55 documentation structures prepared voluntarily
14. If within threshold range: Art.95 voluntary code of conduct adoption considered as good-faith signal
15. Supply chain: GPAI models you integrate assessed for systemic risk classification status

Category 4: Art.84 Data Management (5 items)

16. Art.73 serious incident reports filed for all qualifying events (underreporting increases reclassification risk)
17. EUAIDB registration complete and up-to-date (MSA-accessible data feeds Art.84 annual reports)
18. Post-market monitoring records (Art.72) maintained and available for MSA review
19. Sector-level incident data monitored; if your sector shows high Art.84 incident concentration, start proactive pre-compliance
20. Compliance documentation stored on infrastructure outside US CLOUD Act jurisdiction

Category 5: Infrastructure Future-Proofing (5 items)

21. Art.11 technical documentation and Annex IV records stored on EU-native infrastructure
22. Art.12 logs jurisdiction confirmed — EU MSA accessible without CLOUD Act compellability risk
23. Art.73 incident reporting system hosted in EU jurisdiction
24. GPAI model cards (if applicable) stored on EU infrastructure (Art.97 evaluation may mandate this)
25. Infrastructure jurisdiction change timeline assessed — migration before 2027 Art.85 review is lower risk than after

Category 6: Commission Monitoring and Response (5 items)

26. AI Office publication tracker in place — Art.97 evaluation pre-reports and consultations announced 6-12 months before publication
27. Industry association engagement in place — Art.97 consultation processes include stakeholder input
28. Legal counsel briefed on Art.97 delegated act mechanism — amendments can come without full Parliament vote
29. 2028 compliance budget reserved for potential Annex III reclassification costs
30. Board/executive briefing on Art.97 timeline and developer implications scheduled for Q4 2026
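
Progress against the checklist can be tracked per category with a small helper. The sketch below is one possible structure, not a prescribed format: the `ChecklistItem` class, the category labels, and the sample statuses are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class ChecklistItem:
    number: int          # item number from the checklist above (1 to 30)
    category: str        # free-form label, e.g. "review_exposure"
    done: bool = False


def completion_by_category(items):
    """Return the fraction of completed items for each category."""
    totals, completed = {}, {}
    for item in items:
        totals[item.category] = totals.get(item.category, 0) + 1
        completed[item.category] = completed.get(item.category, 0) + int(item.done)
    return {category: completed[category] / totals[category] for category in totals}


# Hypothetical status for a handful of items.
status = [
    ChecklistItem(1, "review_exposure", done=True),
    ChecklistItem(2, "review_exposure"),
    ChecklistItem(6, "annex_iii_pre_compliance", done=True),
]
for category, fraction in completion_by_category(status).items():
    print(f"{category}: {fraction:.0%} complete")
```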