Blog — sota.io

2026-04-15

The Question Every High-Risk AI Developer Faces After Compliance

You built a CV screening tool. You spent three months completing your Annex VI internal control assessment — the Art.9 risk register, the Art.10 data governance statement, the Annex IV technical documentation package, the Declaration of Conformity. You filed the EU AI Database registration. Your system is compliant as of August 2, 2026.

Then your data team retrains the model on the past six months of hiring outcomes. Your product manager adds a salary benchmarking module. Your ML engineer upgrades the embedding model from a 110M to a 340M parameter architecture.

The question your legal counsel will ask: do any of these changes require a new conformity assessment?

Art.3(23) of the EU AI Act defines the threshold: "substantial modification." Getting it wrong in either direction is costly. Treat a substantial modification as routine and you are operating a non-compliant system. Treat every model update as a compliance reset and continuous deployment becomes unworkable.

This guide covers the two-trigger test Art.3(23) establishes, how to classify common AI system changes, the downstream obligations that trigger on a substantial modification finding, and Python tooling to build a change management gate into your deployment pipeline.

What Art.3(23) Actually Says

The EU AI Act defines substantial modification at Art.3(23):

"substantial modification means a change to the AI system after its placing on the market or putting into service which affects the compliance of the AI system with this Regulation or results in a modification to the intended purpose for which the AI system has been assessed"

Two independent triggers. A change is substantial if it satisfies either:

Trigger 1 — Compliance-affecting change: The modification affects how the AI system complies with any requirement in the Regulation. This includes changes that affect Art.9 (risk management), Art.10 (training data quality), Art.13 (transparency information), Art.14 (human oversight mechanisms), Art.15 (accuracy, robustness, cybersecurity), or Art.17 (quality management system outputs).

Trigger 2 — Intended purpose modification: The modification results in a change to the intended purpose for which the system was originally assessed. This is the cleaner trigger — if your Declaration of Conformity says "CV screening for English-language applications in the financial services sector" and you expand to healthcare hiring, the intended purpose has changed.

The Downstream Obligations That Follow

A substantial modification finding activates four mandatory obligations:

Art.43(4) — New Conformity Assessment Required:

"Where substantial modifications are made to high-risk AI systems already placed on the market or already put into service, a new conformity assessment procedure shall be conducted..."

This is the core consequence. The new assessment is not a delta-review of what changed — it is a full conformity assessment for the modified system. If your system previously qualified for Annex VI internal control self-assessment (no notified body required), you can use the same track again, provided the modified system still does not fall under Annex VII mandatory third-party assessment categories.

Art.16(f) — Technical Documentation Update:

Art.16 requires providers to keep the technical documentation referred to in Art.11 up to date. After a substantial modification, the entire Annex IV technical documentation package must be revised to reflect the new system state before the modified version is placed on the market or put into service.

Art.9(2) — Risk Management System Update:

Art.9's "living document" obligation applies explicitly to substantial modifications. The risk management system must be updated to assess risks introduced by the modification. If your model retraining changes the training data distribution, the Art.9(3) risk identification step must be re-executed for the new distribution profile.

Art.48 / Art.22 — New Declaration of Conformity and EU AI Database Update:

After a substantial modification, the provider must sign a new Declaration of Conformity reflecting the modified system and update the EU AI Database registration record. Art.48(4) requires that the Declaration of Conformity contain information about substantial modifications.

Classifying Changes: What Is and Is Not Substantial

Art.3(23) does not provide a change classification table. Recital 66 offers some guidance: "Minor changes should not constitute a substantial modification." The Commission's GPAI Code of Practice and market surveillance guidance are still developing. Based on the two-trigger test, the following classification framework applies:

Almost Always Substantial

New intended purpose or deployment context: If your system's Declaration of Conformity describes the intended purpose as "creditworthiness assessment for retail consumers in the Netherlands" and you begin using it for insurance underwriting or expand to Germany, the intended purpose has changed. Every expansion of the Annex III category scope — new industry, new population, new use type — is substantial.

Expansion to a new Annex III category: A system originally certified as Annex III(4) employment AI (CV screening) that gains an Annex III(5) function (credit scoring module) requires a new assessment for the combined system. The new Annex III category creates a separate compliance obligation.

Performance degradation past declared thresholds: Art.15 requires providers to declare accuracy metrics for their system. If your post-market monitoring data shows the model's accuracy has dropped below the Art.15-declared threshold, operating the degraded model may itself constitute non-compliance. A modification that formally revises the accuracy declaration downward is compliance-affecting and therefore satisfies Trigger 1.
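A minimal sketch of how post-market monitoring could flag this case, assuming the declared accuracy floor is stored next to the monitoring data. The names `DeclaredMetric` and `breaches_declared_threshold`, the 0.91 floor, the three-sample window, and the sample values are all illustrative assumptions, not taken from the Regulation:

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class DeclaredMetric:
    """A metric floor as declared in the technical documentation (hypothetical structure)."""
    name: str
    declared_minimum: float


def breaches_declared_threshold(
    metric: DeclaredMetric, observed: list[float], window: int = 3
) -> bool:
    """Return True if the mean of the last `window` observations is below the declared floor.

    Averaging over a window is a design assumption to avoid reacting to one noisy batch.
    """
    if len(observed) < window:
        return False  # not enough monitoring data to decide
    return mean(observed[-window:]) < metric.declared_minimum


accuracy = DeclaredMetric(name="accuracy", declared_minimum=0.91)
weekly_accuracy = [0.93, 0.92, 0.91, 0.90, 0.89]  # illustrative monitoring values

if breaches_declared_threshold(accuracy, weekly_accuracy):
    print("accuracy below declared floor: open a performance-degradation change record")
```

The check only raises the flag. Whether the provider restores performance or revises the declaration is still a decision for the assessment itself.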

Fundamental changes to human oversight mechanisms: Art.14 requires specific human oversight capabilities — override mechanisms, authority delegation, operator training documentation. A change that removes, restructures, or substantially alters these mechanisms affects compliance with Art.14 directly.

Context-Dependent — Apply the Two-Trigger Test

Model retraining on new data: The core question is whether the new training data changes the system's behavior materially beyond the declared accuracy envelope and risk register assumptions. Retraining on new data from the same distribution (additional months of historical hiring data from the same employer population) is typically not substantial. Retraining on data from a new population (expanding from EU to non-EU labor markets, or adding data from a new industry sector) may trigger Trigger 1 because Art.10(4) statistical properties and Art.9(3) risk assumptions may no longer hold.
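One way to make "same distribution" testable is a drift statistic computed between the assessed training set and the retraining set. The sketch below uses the population stability index (PSI) on a single categorical feature. PSI and the 0.2 review cutoff are a common industry rule of thumb, not something the Act prescribes, and the function names and sample data are hypothetical:

```python
import math
from collections import Counter


def category_shares(values: list[str], categories: list[str], eps: float = 1e-6) -> list[float]:
    """Share of each category, floored at eps so the log term below stays defined."""
    if not values:
        raise ValueError("cannot compute shares of an empty sample")
    counts = Counter(values)
    return [max(counts[c] / len(values), eps) for c in categories]


def population_stability_index(baseline: list[str], candidate: list[str]) -> float:
    """PSI = sum((q - p) * ln(q / p)) over categories; 0 means identical shares."""
    categories = sorted(set(baseline) | set(candidate))
    p = category_shares(baseline, categories)
    q = category_shares(candidate, categories)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Illustrative data: applicant sector in the assessed training set vs. two retraining sets.
assessed = ["finance"] * 80 + ["insurance"] * 20
same_population = ["finance"] * 78 + ["insurance"] * 22
new_population = ["finance"] * 40 + ["insurance"] * 10 + ["healthcare"] * 50

PSI_REVIEW_THRESHOLD = 0.2  # rule-of-thumb cutoff, an assumption rather than a legal test

for label, data in [("same population", same_population), ("new population", new_population)]:
    psi = population_stability_index(assessed, data)
    flag = "review Art.10 / Art.9 assumptions" if psi > PSI_REVIEW_THRESHOLD else "within tolerance"
    print(f"{label}: PSI={psi:.3f} -> {flag}")
```

A high PSI does not by itself make the retraining substantial. It is a signal that the Trigger 1 questions about data properties and risk assumptions need a documented answer.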

Model architecture change: Upgrading from a smaller to a larger parameter version of the same architecture family (110M → 340M embedding model with the same task head) is typically not substantial if the system's input-output behavior and risk profile remain within declared bounds. A switch to a fundamentally different architecture (BERT → GPT-based) that changes how the system reaches outputs may be substantial if it affects the Art.9(3) risk identification (e.g., different failure modes for protected characteristic proxies).
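Whether input-output behavior "remains within declared bounds" can be approximated by replaying a frozen reference set through both model versions and measuring how often the decisions agree. This is a sketch under assumptions: the `Model` interface, the 0.5 decision cutoff, the 0.95 tolerance, and the stand-in scoring functions are all hypothetical.

```python
from typing import Callable, Sequence

Model = Callable[[str], float]  # hypothetical interface: CV text in, score in [0, 1] out


def decision_agreement(
    old_model: Model, new_model: Model, reference_inputs: Sequence[str], cutoff: float = 0.5
) -> float:
    """Fraction of reference inputs on which both models land on the same side of the cutoff."""
    if not reference_inputs:
        raise ValueError("reference set must not be empty")
    same = sum(
        (old_model(x) >= cutoff) == (new_model(x) >= cutoff) for x in reference_inputs
    )
    return same / len(reference_inputs)


# Stand-in scoring functions so the sketch runs; real models would be loaded here.
def old_model(cv: str) -> float:
    return min(len(cv) / 40, 1.0)


def new_model(cv: str) -> float:
    return min(len(cv) / 36, 1.0)


reference_set = ["short cv", "a somewhat longer cv text", "x" * 19, "x" * 45]
agreement = decision_agreement(old_model, new_model, reference_set)

MIN_AGREEMENT = 0.95  # assumed internal tolerance, not a figure from the Regulation
if agreement < MIN_AGREEMENT:
    print(f"agreement {agreement:.2f} below tolerance: escalate to two-trigger review")
```

Aggregate agreement can hide shifts concentrated in one subgroup, so a real check would also break the comparison down by the groups covered in the risk register.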

Fine-tuning and adapter layers: Fine-tuning adds task-specific behavior without replacing the base model. If the fine-tuning uses Art.10-compliant data and does not expand the intended purpose, it is typically not substantial. The exception: if the fine-tuning introduces new capability domains (e.g., fine-tuning a hiring model to also evaluate performance reviews) that were not in the original scope, that is substantial.

New geographic market within the EU: Operating in a new EU member state does not change the intended purpose if the system's scope is declared as "EU market." However, if the Art.10(4) statistical properties of the new member state's labor market differ materially, a risk assessment update may be required, even if a full conformity assessment is not.

Almost Never Substantial

Security patches and bug fixes: Updates that correct software defects, update dependencies, or patch vulnerabilities without changing the AI system's decision logic or risk profile are not substantial. Recital 66 explicitly signals that such routine maintenance is excluded.

UI and interface changes: Changes to the user interface that do not affect the system's outputs, the Art.13 transparency information provided to users, or the Art.14 human oversight mechanisms are not substantial.

Infrastructure and deployment changes: Moving the system from one EU-sovereign datacenter to another, or upgrading server hardware, without changing software behavior is not substantial. Note: migrating from EU-sovereign to US-cloud hosting may raise separate Art.18 documentation retention jurisdiction issues independent of substantial modification.

Logging and monitoring enhancements: Adding more granular logging, improving Art.12 compliance, or enhancing post-market monitoring capabilities strengthens compliance and does not amount to a substantial modification.
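A pipeline can pre-sort changes by the files they touch before anyone fills in an assessment. The following is a rough sketch assuming a hypothetical repository layout (`frontend/`, `deploy/`, `observability/`, `model/`, and so on). The prefixes and category labels are illustrative, and the result is only a first-pass hint that a reviewer still confirms. Security patches are left out because a file path rarely identifies them.

```python
# Hypothetical repository layout; adjust the prefixes to the real project.
PRESUMPTIVE_CATEGORY_BY_PREFIX = {
    "frontend/": "ui_change",
    "deploy/": "infrastructure",
    "observability/": "logging_enhancement",
}
# Paths that always go to a full two-trigger review (also hypothetical).
ESCALATE_PREFIXES = ("model/", "training_data/", "oversight/", "docs/intended_purpose")

REVIEW = "needs_two_trigger_review"


def presort_change(changed_paths: list[str]) -> str:
    """Return a presumptive category, or REVIEW if anything is sensitive, unknown, or mixed."""
    categories = set()
    for path in changed_paths:
        if path.startswith(ESCALATE_PREFIXES):
            return REVIEW
        match = next(
            (cat for prefix, cat in PRESUMPTIVE_CATEGORY_BY_PREFIX.items()
             if path.startswith(prefix)),
            None,
        )
        if match is None:
            return REVIEW  # unknown paths are never waved through
        categories.add(match)
    if len(categories) == 1:
        return categories.pop()
    return REVIEW  # empty or mixed change sets get a human look


print(presort_change(["frontend/styles/app.css"]))
print(presort_change(["frontend/styles/app.css", "model/weights.bin"]))
```

The design choice is to fail closed: only change sets that sit entirely inside one known low-risk area get a presumptive label.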

The Substantial Modification Assessment Procedure

When a proposed change is identified, providers should conduct a formal substantial modification assessment before deployment. The assessment has four steps:

Step 1 — Change Documentation: Document the proposed modification in terms that map to the compliance framework: what in the system changes (model, data, logic, scope, infrastructure), and what declared properties the change affects (accuracy thresholds, risk register assumptions, intended purpose description, training data properties).

Step 2 — Two-Trigger Test: Apply each trigger independently. Trigger 1 (compliance-affecting): does the change affect any Art.9, 10, 13, 14, 15, or 17 requirement? Trigger 2 (intended purpose): does the change modify what was declared as the system's intended purpose in the original Declaration of Conformity? If either trigger is satisfied: substantial modification.

Step 3 — Consequence Mapping: For each substantial modification finding, identify which downstream obligations are activated: Art.43(4) new conformity assessment, Art.16(f) documentation update, Art.9(2) risk management update, Art.48 new Declaration of Conformity, Art.22 EU AI Database update.

Step 4 — Deployment Gate: The modified system must not be placed on the market or put into service until the new conformity assessment is complete and the Declaration of Conformity is signed. This is a hard gate, not a parallel process. Art.43(4) requires the new assessment before deployment, not after.

Python SubstantialModificationAssessor

The following implementation provides a structured approach to the two-trigger test suitable for integration into a CI/CD pipeline change management workflow:

from dataclasses import dataclass, field
from enum import Enum
import json
from datetime import date


class ChangeType(Enum):
    MODEL_RETRAIN = "model_retrain"
    MODEL_ARCHITECTURE = "model_architecture"
    FINE_TUNING = "fine_tuning"
    TRAINING_DATA_DISTRIBUTION = "training_data_distribution"
    INTENDED_PURPOSE_EXPANSION = "intended_purpose_expansion"
    NEW_ANNEX_III_CATEGORY = "new_annex_iii_category"
    HUMAN_OVERSIGHT_MECHANISM = "human_oversight_mechanism"
    ACCURACY_THRESHOLD_REVISION = "accuracy_threshold_revision"
    TRANSPARENCY_INFORMATION = "transparency_information"
    SECURITY_PATCH = "security_patch"
    UI_CHANGE = "ui_change"
    INFRASTRUCTURE = "infrastructure"
    LOGGING_ENHANCEMENT = "logging_enhancement"
    GEOGRAPHIC_EXPANSION_EU = "geographic_expansion_eu"
    PERFORMANCE_DEGRADATION = "performance_degradation"


@dataclass
class ProposedChange:
    change_type: ChangeType
    description: str
    # Trigger 1: compliance-affecting
    affects_art9_risk_assumptions: bool = False
    affects_art10_data_properties: bool = False
    affects_art13_transparency_info: bool = False
    affects_art14_human_oversight: bool = False
    affects_art15_accuracy_thresholds: bool = False
    affects_art17_qms_outputs: bool = False
    # Trigger 2: intended purpose
    modifies_declared_intended_purpose: bool = False
    modifies_annex_iii_category_scope: bool = False
    # Context
    population_change: bool = False
    industry_sector_change: bool = False
    proposed_by: str = ""
    proposed_date: str = field(default_factory=lambda: str(date.today()))


@dataclass
class SubstantialModificationResult:
    change: ProposedChange
    trigger_1_satisfied: bool
    trigger_2_satisfied: bool
    is_substantial: bool
    trigger_1_reasons: list[str]
    trigger_2_reasons: list[str]
    required_actions: list[str]
    deployment_blocked: bool

    def to_dict(self) -> dict:
        return {
            "change_type": self.change.change_type.value,
            "description": self.change.description,
            "trigger_1_satisfied": self.trigger_1_satisfied,
            "trigger_2_satisfied": self.trigger_2_satisfied,
            "is_substantial": self.is_substantial,
            "trigger_1_reasons": self.trigger_1_reasons,
            "trigger_2_reasons": self.trigger_2_reasons,
            "required_actions": self.required_actions,
            "deployment_blocked": self.deployment_blocked,
        }


PRESUMPTIVELY_NON_SUBSTANTIAL = {
    ChangeType.SECURITY_PATCH,
    ChangeType.UI_CHANGE,
    ChangeType.INFRASTRUCTURE,
    ChangeType.LOGGING_ENHANCEMENT,
}


def assess_substantial_modification(
    change: ProposedChange,
) -> SubstantialModificationResult:
    """
    Apply the Art.3(23) two-trigger test to a proposed change.
    Returns a SubstantialModificationResult with deployment gate status.
    """

    # Presumptively non-substantial changes (Recital 66)
    if change.change_type in PRESUMPTIVELY_NON_SUBSTANTIAL:
        # Return early only if no flag is set. The check covers every flag the
        # trigger logic below reads, including the two context flags; otherwise
        # a flagged change could be cleared here as non-substantial.
        if not any([
            change.affects_art9_risk_assumptions,
            change.affects_art10_data_properties,
            change.affects_art13_transparency_info,
            change.affects_art14_human_oversight,
            change.affects_art15_accuracy_thresholds,
            change.affects_art17_qms_outputs,
            change.modifies_declared_intended_purpose,
            change.modifies_annex_iii_category_scope,
            change.population_change,
            change.industry_sector_change,
        ]):
            return SubstantialModificationResult(
                change=change,
                trigger_1_satisfied=False,
                trigger_2_satisfied=False,
                is_substantial=False,
                trigger_1_reasons=[],
                trigger_2_reasons=[],
                required_actions=["Update change log in technical documentation"],
                deployment_blocked=False,
            )

    # Trigger 1: compliance-affecting
    trigger_1_reasons = []
    if change.affects_art9_risk_assumptions:
        trigger_1_reasons.append(
            "Art.9(3): change affects risk identification assumptions — new risk register evaluation required"
        )
    if change.affects_art10_data_properties:
        trigger_1_reasons.append(
            "Art.10(3)/(4): change affects training data properties — representativeness and statistical property re-assessment required"
        )
    if change.affects_art13_transparency_info:
        trigger_1_reasons.append(
            "Art.13: change modifies transparency information provided to users or deployers"
        )
    if change.affects_art14_human_oversight:
        trigger_1_reasons.append(
            "Art.14: change affects human oversight mechanisms — override authority and operator training documentation must be updated"
        )
    if change.affects_art15_accuracy_thresholds:
        trigger_1_reasons.append(
            "Art.15: change revises declared accuracy, robustness, or cybersecurity thresholds — new Declaration of Conformity required"
        )
    if change.affects_art17_qms_outputs:
        trigger_1_reasons.append(
            "Art.17: change affects Quality Management System outputs or documented procedures"
        )
    if change.change_type == ChangeType.PERFORMANCE_DEGRADATION:
        trigger_1_reasons.append(
            "Art.15: performance degradation past declared threshold constitutes non-compliance"
        )
    if change.population_change:
        trigger_1_reasons.append(
            "Art.10(4): new population group may require re-assessment of statistical properties and bias examination"
        )

    trigger_1_satisfied = len(trigger_1_reasons) > 0

    # Trigger 2: intended purpose
    trigger_2_reasons = []
    if change.modifies_declared_intended_purpose:
        trigger_2_reasons.append(
            "Art.3(23) Trigger 2: modification to intended purpose as declared in the original conformity assessment"
        )
    if change.modifies_annex_iii_category_scope:
        trigger_2_reasons.append(
            "Art.3(23) Trigger 2: expansion to new Annex III category requires separate conformity assessment for the new scope"
        )
    if change.industry_sector_change:
        trigger_2_reasons.append(
            "Art.3(23) Trigger 2: change of industry sector constitutes modification to intended purpose deployment context"
        )

    trigger_2_satisfied = len(trigger_2_reasons) > 0

    is_substantial = trigger_1_satisfied or trigger_2_satisfied

    # Required actions
    required_actions = []
    if is_substantial:
        required_actions.append(
            "Art.43(4): conduct new conformity assessment before deploying modified system"
        )
        required_actions.append(
            "Art.16(f): update Annex IV technical documentation to reflect modified system"
        )
        required_actions.append(
            "Art.9(2): update risk management system — re-execute risk identification for changed scope"
        )
        required_actions.append(
            "Art.48: sign new Declaration of Conformity for the modified system"
        )
        required_actions.append(
            "Art.22: update EU AI Database registration record with modification details"
        )
        if trigger_2_satisfied:
            required_actions.append(
                "Art.43(4): verify modified system still qualifies for Annex VI track (no new Annex VII mandatory assessment categories)"
            )
    else:
        required_actions.append(
            "Art.11 / Art.18: document change in technical documentation change log"
        )
        required_actions.append(
            "Art.9(2): review whether risk register requires supplementary annotation (no full re-execution required)"
        )

    return SubstantialModificationResult(
        change=change,
        trigger_1_satisfied=trigger_1_satisfied,
        trigger_2_satisfied=trigger_2_satisfied,
        is_substantial=is_substantial,
        trigger_1_reasons=trigger_1_reasons,
        trigger_2_reasons=trigger_2_reasons,
        required_actions=required_actions,
        deployment_blocked=is_substantial,
    )


# Example: CV screening tool expanding to healthcare hiring
change = ProposedChange(
    change_type=ChangeType.INTENDED_PURPOSE_EXPANSION,
    description="Add healthcare hiring module to CV screening platform",
    modifies_declared_intended_purpose=True,
    modifies_annex_iii_category_scope=False,  # Still Annex III(4) employment
    affects_art9_risk_assumptions=True,  # Healthcare data = new risk profile
    affects_art10_data_properties=True,  # New training data population
    industry_sector_change=True,
    population_change=True,
    proposed_by="product-team",
)

result = assess_substantial_modification(change)
print(json.dumps(result.to_dict(), indent=2))

# Output (reason strings shortened and required_actions elided here):
# {
#   "change_type": "intended_purpose_expansion",
#   "description": "Add healthcare hiring module to CV screening platform",
#   "trigger_1_satisfied": true,
#   "trigger_2_satisfied": true,
#   "is_substantial": true,
#   "trigger_1_reasons": [
#     "Art.9(3): change affects risk identification assumptions...",
#     "Art.10(3)/(4): change affects training data properties...",
#     "Art.10(4): new population group may require re-assessment..."
#   ],
#   "trigger_2_reasons": [
#     "Art.3(23) Trigger 2: modification to intended purpose...",
#     "Art.3(23) Trigger 2: change of industry sector..."
#   ],
#   "required_actions": [...],
#   "deployment_blocked": true
# }

The Deployment Gate Pattern for CI/CD Pipelines

High-risk AI systems in production need a formal change management gate that runs the substantial modification assessment before any deployment reaches staging or production. The pattern:

# Continues the module above: reuses json, date, ChangeType, ProposedChange,
# and assess_substantial_modification.
import os
import sys


def load_change_from_pr_description(path: str = "change-descriptor.json") -> ProposedChange:
    """
    Hypothetical loader: reads a JSON change descriptor committed with the PR.
    The default file name, the CHANGE_DESCRIPTOR environment variable, and the
    JSON layout (keys matching ProposedChange fields) are assumptions.
    """
    with open(os.environ.get("CHANGE_DESCRIPTOR", path)) as f:
        data = json.load(f)
    # The descriptor stores the enum value, e.g. "model_retrain"
    data["change_type"] = ChangeType(data["change_type"])
    return ProposedChange(**data)


def deployment_gate(change: ProposedChange, environment: str) -> bool:
    """
    CI/CD gate: blocks deployment if substantial modification requires
    new conformity assessment first. Logs assessment result as artifact.
    """
    result = assess_substantial_modification(change)

    assessment_record = {
        "pipeline_run": environment,
        "assessment_date": str(date.today()),
        "result": result.to_dict(),
    }

    # Write artifact for compliance audit trail
    with open("substantial-modification-assessment.json", "w") as f:
        json.dump(assessment_record, f, indent=2)

    if result.deployment_blocked:
        print("DEPLOYMENT BLOCKED: substantial modification detected")
        print("Required actions before deployment:")
        for action in result.required_actions:
            print(f"  - {action}")
        print("\nArt.43(4) requires new conformity assessment before deploying.")
        print("File: substantial-modification-assessment.json")
        return False

    print("DEPLOYMENT CLEARED: not a substantial modification")
    print("Record this change in the Annex IV documentation change log.")
    return True


if __name__ == "__main__":
    # Called from CI/CD with the target environment as the first argument
    if len(sys.argv) < 2:
        sys.exit("usage: <script> <environment>")
    change = load_change_from_pr_description()
    if not deployment_gate(change, environment=sys.argv[1]):
        sys.exit(1)

The assessment record produced by this gate functions as the compliance artifact demonstrating that the provider applied the Art.3(23) two-trigger test. Keep it in version control alongside the change that was assessed.
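To tie a record to the exact change it assessed, the pipeline could store the commit identifier and a digest with it. A minimal sketch, assuming the record is a JSON-serializable dict; the function names and the placeholder commit value are hypothetical:

```python
import hashlib
import json


def seal_assessment_record(record: dict, commit_sha: str) -> dict:
    """Attach the assessed commit and a SHA-256 digest of the canonical JSON form."""
    sealed = {"commit": commit_sha, "record": record}
    canonical = json.dumps(sealed, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return {**sealed, "sha256": hashlib.sha256(canonical).hexdigest()}


def verify_sealed_record(sealed: dict) -> bool:
    """Recompute the digest over everything except the stored digest itself."""
    body = {k: v for k, v in sealed.items() if k != "sha256"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest() == sealed.get("sha256")


sealed = seal_assessment_record(
    {"is_substantial": False, "change_type": "security_patch"},
    commit_sha="0000000",  # placeholder; a pipeline would pass the real commit hash
)
print(verify_sealed_record(sealed))
```

A plain digest catches accidental edits and mismatched files. It is not tamper-proof on its own, since anyone who can rewrite the record can also recompute the hash. Signing the digest would be the next step if that matters.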

Art.18 Documentation Retention for Substantial Modifications

Art.18(2) requires that technical documentation and quality management system documentation be kept for 10 years after the AI system is placed on the market or put into service. For systems that undergo substantial modifications, this means:

Each version of the system is a separate compliance record. The original Declaration of Conformity and technical documentation package for version 1.0 must be retained separately from the documentation for version 2.0 (post-substantial modification).
The 10-year clock restarts for each substantial modification. If version 2.0 is deployed on March 1, 2027, its documentation must be retained until March 1, 2037.
EU jurisdiction matters. Art.18 documentation kept in US-cloud storage is subject to CLOUD Act access — a separate risk issue from the retention obligation itself, but relevant when choosing where to store conformity records.

A versioned documentation tree with immutable archives for each substantial modification version is the recommended structure.
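A small sketch of how the per-version retention date and archive location could be derived. The directory layout, the system identifier, and the function names are hypothetical, and the 29 February fallback is a convention chosen for the example:

```python
from datetime import date
from pathlib import Path

RETENTION_YEARS = 10


def retention_end(placed_on_market: date, years: int = RETENTION_YEARS) -> date:
    """Same calendar day `years` later; 29 February falls back to 28 February."""
    try:
        return placed_on_market.replace(year=placed_on_market.year + years)
    except ValueError:
        return placed_on_market.replace(year=placed_on_market.year + years, day=28)


def archive_dir(root: Path, system_id: str, version: str) -> Path:
    """One directory per substantially modified version (hypothetical layout)."""
    return root / system_id / f"v{version}"


v2_path = archive_dir(Path("conformity-archive"), "cv-screening", "2.0")
v2_until = retention_end(date(2027, 3, 1))
print(v2_path.as_posix(), "retain until", v2_until.isoformat())
# prints: conformity-archive/cv-screening/v2.0 retain until 2037-03-01
```

Immutability itself would come from the storage layer, for example write-once object storage or read-only permissions, which this sketch does not cover.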

25-Item Substantial Modification Change Management Checklist

Before the Change (Change Assessment)

1. Document the proposed change in modification-log format with affected system components
2. Apply Trigger 1: identify which Art.9/10/13/14/15/17 requirements the change affects
3. Apply Trigger 2: determine if the change modifies intended purpose or Annex III category scope
4. Record substantial modification assessment result (substantial or not-substantial) with reasoning
5. If substantial: block deployment pipeline pending new conformity assessment completion
6. If not-substantial: document reasoning in technical documentation change log

Conformity Assessment for Substantial Modifications

7. Verify modified system still qualifies for Annex VI self-assessment track (no new Annex VII triggers)
8. Execute full Art.9 risk management re-assessment for the modified system scope
9. Update Art.10 data governance statement for any new or changed training data
10. Re-verify Art.13 transparency information is accurate for the modified system's outputs
11. Test Art.14 human oversight mechanisms on the modified system
12. Re-run Art.15 accuracy and robustness benchmarks — confirm modified system meets declared thresholds
13. Update Art.17 QMS documentation to reflect modified development procedures

Technical Documentation Update (Art.11 / Annex IV)

14. Update Annex IV Section 1 (general description) with modified intended purpose
15. Update Annex IV Section 2 (technical description) with new architecture or training details
16. Update Annex IV Section 3 (training datasets) if training data changed
17. Update Annex IV Section 4 (risk management lifecycle log) with new risk assessment
18. Update Annex IV Section 5 (human oversight documentation) with any mechanism changes
19. Update Annex IV Section 6 (logging) if logging configuration changed
20. Archive previous version of technical documentation as immutable record

Declarations and Registration

21. Sign new Declaration of Conformity under Art.48 for the modified system
22. Update CE marking documentation if applicable (Art.49)
23. Update EU AI Database registration under Art.22 with modification record
24. Notify market surveillance authority if required under Art.20 (corrective action obligation)

Post-Deployment

25. Restart post-market monitoring baseline metrics collection from the modified system's first production day

Common Mistakes

Treating the substantial modification assessment as the legal team's job: The two-trigger test requires technical inputs — which AI system properties changed, which compliance requirements are affected. Legal teams cannot make this assessment without engineers mapping the change to the compliance framework. The assessment must be a joint engineering-legal exercise.

Deploying while the new conformity assessment is in progress: Art.43(4) is clear: the new conformity assessment must be completed before the modified system is placed on the market or put into service. Running the assessment in parallel with a staged rollout is non-compliant.

Treating every CI/CD model update as presumptively non-substantial: Some teams attempt to exempt all automated retraining from substantial modification review by labeling it "routine maintenance." Automated retraining that changes the system's behavior materially — particularly if it introduces new training data distributions or moves accuracy metrics outside the declared envelope — is not routine maintenance.

Forgetting the EU AI Database update: Many providers correctly execute the new conformity assessment but forget that Art.22 requires the EU AI Database registration to be updated to reflect the modification. The database record must remain current.

Discarding the previous-version documentation: Version 1.0's technical documentation must be retained under Art.18 even after version 2.0 is deployed. Market surveillance authorities investigating a complaint about version 1.0 behavior need access to version 1.0's compliance record.

Infrastructure Note: Where You Store Modification Records Matters

Art.18 requires 10-year documentation retention. Conformity assessment packages, declarations of conformity, and modification assessment records are technical documentation within the Art.18 scope. Storing these on US-cloud infrastructure creates CLOUD Act jurisdiction exposure — US government entities can compel access to documentation held by US-incorporated cloud providers regardless of where the data physically resides.

EU-sovereign storage for conformity documentation eliminates this exposure. sota.io operates exclusively on EU infrastructure incorporated in the EU — your modification assessment records, risk registers, and declarations of conformity are not reachable via CLOUD Act subpoena.

This guide reflects the EU AI Act (Regulation (EU) 2024/1689) as amended by the Digital Omnibus (EU) 2025/1337 and associated Commission implementing guidance current as of April 2026. Art.43(4) obligations apply from August 2, 2026 for new systems and August 2, 2027 for existing systems undergoing substantial modification.

