2026-06-03 · 5 min read · sota.io Team

EU AI Act Art.10 Data Governance Finale: Complete Training Data Compliance Checklist Before August 2026

Post #5 (Finale) in the sota.io EU AI Act Data Governance Sprint — August 2026 Deadline


Sixty days from now, high-risk AI systems operating in the EU must be fully compliant with Art.10 of the EU AI Act. The deadline, August 2, 2026, is not a filing date. It is the date from which non-compliant systems face market surveillance scrutiny, notified body audits, and penalties of up to €15 million or 3% of global annual turnover, whichever is higher, under Art.99(4).

This finale post consolidates the four earlier posts in the sprint into a single, actionable compliance checklist. Use it to audit your current data governance posture, identify gaps, and assign remediation tasks before the deadline arrives.

The Sprint So Far:


What Art.10 Covers: The Full Scope

Art.10 of the EU AI Act governs data and data governance for high-risk AI systems. It applies to providers — organisations that develop or place high-risk AI systems on the EU market — and by extension to the training, validation, and testing datasets those systems depend on.

The article's requirements span six functional areas:

Area | Article Reference | Core Obligation
Data governance practices | Art.10(1) | Appropriate management practices for all training, validation, and testing data
Documentation requirements | Art.10(2)(a)–(g) | Seven specific documentation categories covering design choices through gap analysis
Dataset quality standards | Art.10(3) | Relevant, representative, complete, error-free datasets to the extent possible
Bias monitoring with sensitive data | Art.10(4) | Special category data processing permitted solely for bias detection and correction
Regulatory sandbox access | Art.10(5) | Competent authority access to datasets in sandbox contexts
General purpose AI applicability | Art.10(6) | Art.10 requirements apply where GPAI systems are high-risk

The August 2, 2026 deadline applies to all high-risk AI systems under Annex III of the EU AI Act — including systems in biometric identification, critical infrastructure, education, employment, access to services, law enforcement, migration management, and administration of justice.


The Complete Compliance Checklist

Section 1: Data Governance Framework (Art.10(1))

Governance Structure

Dataset Registry

Governance Process Integration


Section 2: Documentation Requirements (Art.10(2)(a)–(g))

Art.10(2)(a) — Design Choices

Art.10(2)(b) — Data Collection Processes and Origin

Art.10(2)(c) — Data Preparation Operations

Art.10(2)(d) — Assumptions

Art.10(2)(e) — Availability, Quantity, and Suitability Assessment

Art.10(2)(f) — Bias Examination

Art.10(2)(g) — Data Gaps and Shortcomings
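
Taken together, the seven documentation categories in this section could be tracked as one record per dataset. The following is a minimal sketch, not a prescribed format: the class name, field names, and the example dataset are all hypothetical, and the (a)–(g) lettering simply follows the numbering used in this post.

```python
from dataclasses import dataclass, fields


# Hypothetical record type: one entry per dataset, one field per
# documentation category (a)-(g) as numbered in this post.
@dataclass
class Art10Documentation:
    dataset_id: str
    design_choices: str = ""          # (a)
    collection_and_origin: str = ""   # (b)
    preparation_operations: str = ""  # (c)
    assumptions: str = ""             # (d)
    suitability_assessment: str = ""  # (e)
    bias_examination: str = ""        # (f)
    gaps_and_shortcomings: str = ""   # (g)


def missing_categories(doc: Art10Documentation) -> list[str]:
    """Return the names of documentation categories that are still empty."""
    return [
        f.name
        for f in fields(doc)
        if f.name != "dataset_id" and not getattr(doc, f.name).strip()
    ]


# Illustrative entry with only (a) and (b) filled in.
doc = Art10Documentation(
    dataset_id="loans-2025-q4",  # hypothetical dataset name
    design_choices="Tabular features only; no free-text fields.",
    collection_and_origin="Internal CRM export, collected for credit servicing.",
)
print(missing_categories(doc))
```

A record like this makes an empty category visible as an explicit gap rather than an omission nobody noticed.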


Section 3: Dataset Quality Standards (Art.10(3))

Relevance

Representativeness

Freedom from Errors

Completeness
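
Two of the quality dimensions in this section, completeness and representativeness, lend themselves to simple recorded metrics. The sketch below is one possible way to compute them in plain Python. The function names, field names, and reference shares are illustrative assumptions, and what counts as an acceptable value is a judgment the provider has to document, not something the code decides.

```python
from collections import Counter


def completeness(rows: list[dict], required: list[str]) -> float:
    """Share of rows in which every required field is present and not None."""
    if not rows:
        return 0.0
    complete = sum(all(r.get(k) is not None for k in required) for r in rows)
    return complete / len(rows)


def representation_gap(
    rows: list[dict], attribute: str, reference_shares: dict[str, float]
) -> dict[str, float]:
    """Dataset share minus reference share for each group (positive = over-represented)."""
    if not rows:
        return {group: -expected for group, expected in reference_shares.items()}
    counts = Counter(r.get(attribute) for r in rows)
    return {
        group: counts.get(group, 0) / len(rows) - expected
        for group, expected in reference_shares.items()
    }


# Hypothetical sample and hypothetical reference population shares.
rows = [
    {"age_band": "18-34", "income": 32000},
    {"age_band": "18-34", "income": None},
    {"age_band": "35-64", "income": 51000},
    {"age_band": "65+", "income": 27000},
]
reference = {"18-34": 0.30, "35-64": 0.50, "65+": 0.20}

print(completeness(rows, ["age_band", "income"]))  # 0.75
print(representation_gap(rows, "age_band", reference))
```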


Section 4: Sensitive Data for Bias Monitoring (Art.10(4))

Art.10(4) creates a narrow exception: providers may process special category personal data under GDPR Art.9 solely for bias detection and correction in their high-risk AI system.
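
One way to keep that purpose limitation visible in code is to split each record into a training view and a separate bias-audit view, and to check that nothing sensitive reaches the training path. This is only a sketch under stated assumptions: the field names are hypothetical, and which attributes count as special category data is a legal determination made outside the code.

```python
# Hypothetical field names; deciding which attributes are special category
# data is a legal question, not something this code determines.
SENSITIVE_FIELDS = {"ethnicity", "health_status"}


def split_record(record: dict) -> tuple[dict, dict]:
    """Split one record into (training_view, bias_audit_view).

    The record_id appears in both views so bias metrics can be joined
    back to model outputs without copying sensitive values into training data.
    """
    training = {k: v for k, v in record.items() if k not in SENSITIVE_FIELDS}
    audit = {
        k: v
        for k, v in record.items()
        if k in SENSITIVE_FIELDS or k == "record_id"
    }
    return training, audit


def assert_no_sensitive_fields(rows: list[dict]) -> None:
    """Raise if any row destined for training still carries a sensitive field."""
    leaked = sorted({k for r in rows for k in r} & SENSITIVE_FIELDS)
    if leaked:
        raise ValueError(f"sensitive fields in training data: {leaked}")


record = {"record_id": 7, "income": 41000, "ethnicity": "x", "health_status": "y"}
training_view, audit_view = split_record(record)
assert_no_sensitive_fields([training_view])  # passes silently
```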


Section 5: CI/CD Integration Verification (Cross-Reference Post 4)

These items verify that your Art.10 compliance is automated and enforced at the pipeline level:
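
As a rough illustration of what pipeline-level enforcement can look like, the sketch below fails a build when a dataset manifest lacks required governance evidence. The manifest keys and the JSON file layout are assumptions made for this example; they are not taken from the earlier posts or from any specific CI system.

```python
import json
import sys

# Hypothetical manifest keys; adapt to whatever your registry actually records.
REQUIRED_KEYS = [
    "dataset_id",
    "governance_owner",
    "provenance",
    "bias_report",
    "error_rate",
]


def gate_failures(manifest: dict) -> list[str]:
    """Return the required keys that are absent or empty in the manifest."""
    return [k for k in REQUIRED_KEYS if manifest.get(k) in (None, "", [], {})]


def main(path: str) -> int:
    """Load a JSON manifest and return a process exit code (0 = pass, 1 = fail)."""
    with open(path, encoding="utf-8") as fh:
        manifest = json.load(fh)
    failures = gate_failures(manifest)
    for key in failures:
        print(f"Art.10 gate: missing '{key}'", file=sys.stderr)
    return 1 if failures else 0
```

In a pipeline step, ending the script with `sys.exit(main("dataset_manifest.json"))` (a hypothetical file name) would make the job fail whenever evidence is missing. A recorded error rate of `0.0` counts as present, since only absent or empty values are treated as failures.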


Section 6: Audit Readiness

These items ensure you can respond to a competent authority inquiry within a reasonable timeframe:


Scoring Your Current Posture

Score each section against your current state:

Section 1: Data Governance Framework       ___/10 items
Section 2: Documentation Requirements      ___/35 items
Section 3: Dataset Quality Standards       ___/14 items
Section 4: Sensitive Data (if applicable)  ___/7 items
Section 5: CI/CD Integration               ___/7 items
Section 6: Audit Readiness                 ___/6 items
                                           ___/79 total
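
The tally above is simple enough to script, which helps when several teams score themselves independently. A minimal sketch follows, assuming the per-section item counts shown in the scoring block; the function name and the example scores are made up for illustration.

```python
# Item counts per section, copied from the scoring block (they sum to 79).
SECTION_TOTALS = {1: 10, 2: 35, 3: 14, 4: 7, 5: 7, 6: 6}


def posture_score(
    checked: dict[int, int], include_section_4: bool = True
) -> tuple[int, int, float]:
    """Return (items checked, items possible, percentage rounded to one decimal).

    Section 4 can be excluded when no special category data is processed.
    """
    done = total = 0
    for section, max_items in SECTION_TOTALS.items():
        if section == 4 and not include_section_4:
            continue
        n = checked.get(section, 0)
        if not 0 <= n <= max_items:
            raise ValueError(f"section {section}: {n} is outside 0..{max_items}")
        done += n
        total += max_items
    return done, total, round(100 * done / total, 1)


# Hypothetical self-assessment.
checked = {1: 8, 2: 20, 3: 10, 4: 0, 5: 3, 6: 2}
print(posture_score(checked))                           # (43, 79, 54.4)
print(posture_score(checked, include_section_4=False))  # (43, 72, 59.7)
```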

Interpretation:


The 10 Highest-Risk Gaps (What Auditors Look for First)

The following gaps recur in early AI Act enforcement guidance and in analogous GDPR enforcement decisions:

  1. No provenance chain for third-party datasets: "We licensed it" is not sufficient unless the collection method and original purpose are documented
  2. Bias testing only on the training set, not the test set: Art.10(2)(f) requires examination before deployment, and bias results from the training set alone do not cover this
  3. Undocumented cleaning operations: if you dropped 15% of your dataset to remove outliers and did not document why, that is a gap
  4. Assumptions documented nowhere: temporal and geographic assumptions are the most commonly missing
  5. No gap register: Art.10(2)(g) requires a record that you looked for gaps, not just an assertion that none exist
  6. Art.10(4) data mixed with training data: sensitive data processed for bias detection must be kept strictly separate from the training data
  7. Design choices not version-controlled: if you cannot show which choices were made for a specific model version, you cannot demonstrate compliance at audit time
  8. No designated governance owner: "The team is responsible" does not satisfy the governance framework obligation
  9. Error rates not measured: "We cleaned the data" is not sufficient without error rates recorded before and after cleaning
  10. Art.10 documentation not linked to Annex IV: competent authorities review technical documentation first; if Art.10 evidence sits in a separate system with no pointer, it may as well not exist
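
Gaps 3 and 9 both come down to recording what a cleaning step did and what the error rate was on either side of it. The sketch below shows one way to capture that as a log entry. The record layout, function names, and validity rule are illustrative assumptions, not a required format.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class CleaningStep:
    """One documented cleaning operation with before/after measurements."""
    name: str
    reason: str
    rows_before: int
    rows_after: int
    error_rate_before: float
    error_rate_after: float

    @property
    def dropped_share(self) -> float:
        if self.rows_before == 0:
            return 0.0
        return (self.rows_before - self.rows_after) / self.rows_before


def error_rate(rows: list[dict], is_valid: Callable[[dict], bool]) -> float:
    """Share of rows failing the validity rule (0.0 for an empty dataset)."""
    if not rows:
        return 0.0
    return sum(not is_valid(r) for r in rows) / len(rows)


def apply_step(rows, name, reason, keep, is_valid):
    """Filter rows with `keep` and return (kept_rows, CleaningStep log entry)."""
    kept = [r for r in rows if keep(r)]
    step = CleaningStep(
        name, reason, len(rows), len(kept),
        error_rate(rows, is_valid), error_rate(kept, is_valid),
    )
    return kept, step


# Hypothetical data: ages outside 0..120 count as errors.
rows = [{"age": a} for a in (25, 40, -3, 31, 230, 58)]
kept, step = apply_step(
    rows,
    name="drop_negative_ages",
    reason="Negative ages are data-entry errors in the source system.",
    keep=lambda r: r["age"] >= 0,
    is_valid=lambda r: 0 <= r["age"] <= 120,
)
print(step)
```

In this example the step removes one of six rows and leaves one invalid row behind, so the log shows both what was fixed and what remains. That remainder is the kind of entry a gap register would pick up.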

Implementation Timeline: June 3 to August 2, 2026

June 3–15  (13 days): Audit current state using this checklist. Score each section.
                      Assign owners to each gap. Triage: Critical / Important / Nice-to-have.

June 16–30 (15 days): Close Critical gaps. Priority order:
                       1. Governance owner designation
                       2. Dataset registry creation
                       3. Provenance documentation for all training datasets
                       4. Bias testing with recorded metrics

July 1–15  (15 days): Close Important gaps:
                       1. Art.10(2)(a)–(g) documentation for all datasets
                       2. CI/CD gate implementation
                       3. Error rate measurement and recording

July 16–25  (10 days): Internal compliance audit.
                        Simulate a competent authority documentation request.
                        Fix remaining gaps identified.

July 26–31   (6 days): Final review. Lock documentation versions.
                        Confirm technical documentation package is complete.

August 2, 2026: Deadline. System must be compliant from this date.
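
The countdown and phase boundaries in this timeline reduce to simple date arithmetic, which can be handy for a status dashboard or a daily reminder. A minimal sketch, with the phase dates copied from the timeline and hypothetical function names:

```python
from datetime import date
from typing import Optional

DEADLINE = date(2026, 8, 2)

# Phase boundaries copied from the timeline (both dates inclusive).
PHASES = [
    (date(2026, 6, 3), date(2026, 6, 15), "Audit current state"),
    (date(2026, 6, 16), date(2026, 6, 30), "Close Critical gaps"),
    (date(2026, 7, 1), date(2026, 7, 15), "Close Important gaps"),
    (date(2026, 7, 16), date(2026, 7, 25), "Internal compliance audit"),
    (date(2026, 7, 26), date(2026, 7, 31), "Final review"),
]


def days_remaining(today: date) -> int:
    """Calendar days from `today` until the deadline (negative once passed)."""
    return (DEADLINE - today).days


def current_phase(today: date) -> Optional[str]:
    """Name of the phase containing `today`, or None if outside all phases."""
    for start, end, name in PHASES:
        if start <= today <= end:
            return name
    return None


print(days_remaining(date(2026, 6, 3)))   # 60
print(current_phase(date(2026, 7, 20)))   # Internal compliance audit
```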

Quick-Reference: Key Article Numbers for Art.10

For your documentation and during auditor discussions:

Reference | What It Covers
Art.10(1) | General data governance obligations
Art.10(2)(a) | Design choice documentation
Art.10(2)(b) | Data collection processes and origin
Art.10(2)(c) | Data preparation operations
Art.10(2)(d) | Relevant assumptions
Art.10(2)(e) | Dataset availability, quantity, suitability
Art.10(2)(f) | Bias examination
Art.10(2)(g) | Data gaps and shortcomings
Art.10(3) | Dataset quality: relevance, representativeness, error-free
Art.10(4) | Sensitive data exception for bias detection
Art.10(5) | Competent authority access in sandboxes
Art.10(6) | Application to general purpose AI in high-risk context
Art.11 | Technical documentation (links to Art.10 evidence)
Art.17 | Quality management system (governance framework home)
Art.99(4) | Penalties: €15M or 3% global annual turnover

Closing the Sprint

This five-post sprint covered the complete Art.10 compliance lifecycle: what the article requires, bias testing methodology, provenance logging architecture, CI/CD gate implementation, and now this consolidated checklist.

The August 2 deadline is fixed. The obligations are specific. The checklist above gives you a concrete audit surface: 79 items that, when checked, represent a defensible Art.10 compliance posture.

The organisations that will face enforcement action in the first wave after August 2 are not primarily the ones that tried and fell slightly short — they are the ones that have no documentation at all, no governance owner, no record of ever having considered Art.10. If you have completed this sprint, you are already in a significantly better position than that.


This post is part of the sota.io EU AI Act Compliance Series. Related: Art.10 Data Governance Foundations, Dataset Diversity and Bias Testing, Provenance Logging, CI/CD Data Governance Gates

sota.io is EU-native managed PaaS — deploy compliant AI infrastructure on Hetzner Germany, no CLOUD Act exposure. Get started →
