# EU AI Act Art.15 Accuracy, Robustness & Cybersecurity: Developer Guide (High-Risk AI 2026)
EU AI Act Article 15 is the technical performance obligation in the high-risk AI compliance chain: providers must ensure high-risk AI systems achieve appropriate accuracy levels, remain robust under deployment conditions, and withstand adversarial attacks. Art.15 closes the compliance arc that begins with Art.9 risk management and runs through Art.10 (training data), Art.11 (documentation), Art.12 (logging), Art.13 (transparency), and Art.14 (human oversight).
This guide covers Art.15(1)-(5) implementation: accuracy declaration and metrics (Annex IV Section 6), robustness testing patterns including reproducibility and consistency requirements, failsafe design for errors and faults, the five cybersecurity vectors developers must address, static analysis tooling (Astrée, Frama-C), NIS2 × Art.15 intersection, the CLOUD Act attack surface argument for EU-native deployment, Python implementation for compliance validation, and the 40-item Art.15 compliance checklist.
## Art.15 in the High-Risk AI Compliance Chain
Art.15 is the final provider obligation in the high-risk AI technical compliance sequence:
| Article | Obligation | Art.15 Interface |
|---|---|---|
| Art.9 | Risk Management System | Cybersecurity risks are Art.9 measures; Art.15 is the implementation |
| Art.10 | Training Data Governance | Training data poisoning (Art.15(5)) requires Art.10 data integrity controls |
| Art.11 | Technical Documentation | Accuracy metrics documented in Annex IV Section 6 |
| Art.12 | Logging & Record-Keeping | Robustness failures generate Art.12 mandatory log events |
| Art.13 | Transparency & IFU | Accuracy declarations disclosed to deployers in Art.13(3)(b)(ii) |
| Art.14 | Human Oversight | Art.14 oversight compensates for Art.15 performance limitations |
| Art.15 | Accuracy, Robustness, Cybersecurity | This article |
Art.15 differs from Arts.9–14 in that it is an ongoing obligation rather than a pre-market documentation requirement. Accuracy and robustness must be maintained throughout the operational lifetime of the high-risk AI system, not just at the point of market placement.
## Art.15(1) — Scope: All High-Risk AI Systems

> "High-risk AI systems shall be designed and developed in such a way that they achieve, in the light of their intended purpose, an appropriate level of accuracy, robustness and cybersecurity, and perform consistently in those respects throughout their lifecycle."
Art.15(1) applies to all Annex III high-risk AI systems without carve-out. The standard is purposive — "appropriate level" is measured against the intended purpose and the risk profile of the specific application domain.
The three obligations of Art.15(1):
| Obligation | Legal Standard | Developer Implication |
|---|---|---|
| Accuracy | Appropriate level for intended purpose | Must declare and maintain accuracy metrics; failure to meet declared metrics = Art.15 breach |
| Robustness | Perform consistently throughout lifecycle | Degradation under operational drift = Art.15 breach even if initial accuracy was acceptable |
| Cybersecurity | Appropriate level for intended purpose | Must assess and mitigate adversarial attack vectors as part of pre-market development |
"Throughout their lifecycle" is architecturally significant:
Art.15(1) creates a temporal obligation. A model that meets accuracy thresholds at deployment but degrades six months later due to distribution shift has violated Art.15(1) — even if no code was changed. This drives the need for continuous monitoring, post-market performance tracking (Art.72), and accuracy logging (Art.12).
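This temporal obligation can be operationalised as a rolling post-market check. A minimal sketch follows — the class name, window size, and the 5% revalidation trigger are illustrative choices, not figures taken from the Act:

```python
from collections import deque


class LifecycleAccuracyMonitor:
    """Flags Art.15(1) revalidation when rolling post-deployment accuracy
    drifts too far below the declared level (illustrative sketch)."""

    def __init__(self, declared_accuracy: float,
                 max_drop_pct: float = 5.0, window: int = 500):
        self.declared = declared_accuracy
        self.max_drop_pct = max_drop_pct
        self.outcomes = deque(maxlen=window)  # 1 = correct, 0 = incorrect

    def record(self, correct: bool) -> None:
        self.outcomes.append(1 if correct else 0)

    def needs_revalidation(self) -> bool:
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # not enough post-deployment evidence yet
        rolling = sum(self.outcomes) / len(self.outcomes)
        drop_pct = (self.declared - rolling) / self.declared * 100
        return drop_pct > self.max_drop_pct
```

Ground-truth labels arrive with delay in most deployments, so in practice `record()` would be fed by the same outcome-collection pipeline that serves Art.72 post-market monitoring.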
## Art.15(2) — Accuracy Declaration & Metrics

> "The levels of accuracy and the relevant accuracy metrics for a given high-risk AI system shall be declared in the accompanying instructions for use."
Art.15(2) creates a transparency obligation that connects to Art.13: providers must declare the accuracy levels their systems achieve and the metrics used to measure them. These declarations become part of the Art.13 Instructions for Use and must be documented in Annex IV Section 6.
What must be declared:
| Declaration Element | Legal Basis | Documentation Location |
|---|---|---|
| Accuracy metric(s) chosen | Art.15(2) | Annex IV Section 6 + Art.13(3)(b)(ii) IFU |
| Accuracy level(s) achieved | Art.15(2) | Annex IV Section 6 + Art.13(3)(b)(ii) IFU |
| Evaluation methodology | Art.11 Annex IV | Annex IV Section 3 (test datasets) |
| Demographic disaggregation | Art.10(3) | Annex IV Section 3 + Art.13(3)(b)(v) |
| Limitations and known failure modes | Art.13(3)(b)(iii) | Art.13 IFU |
Accuracy metrics by application domain:
| High-Risk AI Domain | Annex III Category | Relevant Metrics |
|---|---|---|
| Biometric identification | Annex III Cat.1 | FAR (False Accept Rate), FRR (False Reject Rate), EER |
| Medical device AI | Annex III Cat.5 | Sensitivity, Specificity, AUC-ROC, PPV/NPV |
| Employment screening | Annex III Cat.4 | Disparate Impact Ratio, Equalized Odds, Precision@K |
| Credit scoring | Annex III Cat.5(b) | Gini Coefficient, KS Statistic, PSI (Population Stability Index) |
| Critical infrastructure AI | Annex III Cat.2 | MTTF, Availability, False Positive Rate for anomaly detection |
| Student assessment AI | Annex III Cat.3 | Predictive Parity, Calibration, Individual Fairness metrics |
These declaration elements can be modelled directly in Python:

```python
from dataclasses import dataclass, field
from enum import Enum


class AccuracyMetricType(Enum):
    ACCURACY = "accuracy"
    PRECISION = "precision"
    RECALL = "recall"
    F1 = "f1_score"
    AUC_ROC = "auc_roc"
    FAR = "false_accept_rate"
    FRR = "false_reject_rate"
    EER = "equal_error_rate"
    DISPARATE_IMPACT_RATIO = "disparate_impact_ratio"
    POPULATION_STABILITY_INDEX = "population_stability_index"


@dataclass
class AccuracyDeclaration:
    """
    Art.15(2) + Annex IV Section 6 compliant accuracy declaration.
    Must be included in Art.13 Instructions for Use.
    """
    system_id: str
    intended_purpose: str
    annex_iii_category: str

    # Primary metric (required)
    primary_metric: AccuracyMetricType
    primary_metric_value: float
    primary_metric_threshold: float  # minimum acceptable level

    # Evaluation dataset (Art.10 reference)
    test_dataset_size: int
    test_dataset_period: str  # e.g. "2025-01 to 2025-12"
    test_dataset_source: str

    # Demographic disaggregation (Art.10(3)), e.g.
    # {"gender": {"male": 0.87, "female": 0.85}, "age_group": {"18-30": 0.89, "60+": 0.82}}
    demographic_breakdown: dict = field(default_factory=dict)

    # Known limitations (Art.13(3)(b)(iii))
    known_limitations: list = field(default_factory=list)

    # Secondary metrics (optional but recommended)
    secondary_metrics: dict = field(default_factory=dict)

    # Lifecycle monitoring threshold (Art.15(1) ongoing obligation)
    revalidation_trigger_pct: float = 5.0  # revalidate if accuracy drops >5% from declared

    def validate_declaration(self) -> list:
        """Returns list of Art.15(2) compliance gaps."""
        gaps = []
        if self.primary_metric_value < self.primary_metric_threshold:
            gaps.append(f"PRIMARY METRIC BELOW THRESHOLD: {self.primary_metric.value} "
                        f"{self.primary_metric_value:.3f} < {self.primary_metric_threshold:.3f}")
        if not self.demographic_breakdown:
            gaps.append("MISSING: demographic_breakdown required by Art.10(3) + Art.13(3)(b)(v)")
        if not self.known_limitations:
            gaps.append("MISSING: known_limitations required by Art.13(3)(b)(iii)")
        if self.test_dataset_size < 100:
            gaps.append(f"SMALL TEST DATASET: {self.test_dataset_size} samples may be insufficient")
        return gaps

    def to_ifu_section(self) -> str:
        """Generate Art.13 IFU Section 2(b) content for accuracy declaration."""
        lines = [
            "## Accuracy Declaration (Art.15(2) / Annex IV Section 6)",
            "",
            f"**System:** {self.system_id}",
            f"**Intended Purpose:** {self.intended_purpose}",
            f"**Annex III Category:** {self.annex_iii_category}",
            "",
            f"**Primary Performance Metric:** {self.primary_metric.value.upper()}",
            f"**Declared Accuracy Level:** {self.primary_metric_value:.4f}",
            f"**Minimum Acceptable Level:** {self.primary_metric_threshold:.4f}",
            "",
            f"**Evaluation Dataset:** {self.test_dataset_size} samples ({self.test_dataset_period})",
        ]
        if self.demographic_breakdown:
            lines.append("\n**Performance by Demographic Group:**")
            for group, metrics in self.demographic_breakdown.items():
                for demographic, value in metrics.items():
                    lines.append(f"  - {group}/{demographic}: {value:.4f}")
        if self.known_limitations:
            lines.append("\n**Known Limitations:**")
            for lim in self.known_limitations:
                lines.append(f"  - {lim}")
        return "\n".join(lines)
```
## Art.15(3) — Robustness: Reproducibility & Consistency

> "The robustness of high-risk AI systems may be achieved through technical redundancy solutions, which may include backup or fail-safe plans."
Art.15(3) addresses operational robustness — the requirement that high-risk AI systems perform consistently across the range of conditions they may encounter, not just under ideal laboratory conditions.
Robustness dimensions under Art.15(3):
| Dimension | Definition | Testing Approach |
|---|---|---|
| Reproducibility | Same input → same output under identical conditions | Determinism testing, random seed control |
| Consistency across conditions | Accuracy maintained across input distribution shifts | Out-of-distribution (OOD) testing |
| Temporal consistency | Performance maintained over time (concept drift) | Monitoring with Population Stability Index (PSI) |
| Operational consistency | Performance maintained under concurrent load | Stress testing, load testing |
| Cross-deployment consistency | Same accuracy across deployment environments | Canary testing, A/B validation |
Reproducibility requirements for regulated AI:
For high-risk AI domains (biometric identification, medical, employment), non-deterministic outputs are a potential Art.15(3) violation. If the same patient data produces different risk scores on two consecutive runs, deployer oversight (Art.14) becomes meaningless.
A test suite covering these dimensions:

```python
import datetime
import hashlib
import json
from dataclasses import dataclass


@dataclass
class RobustnessTestResult:
    test_name: str
    passed: bool
    metric_before: float
    metric_after: float
    delta_pct: float
    threshold_pct: float  # max acceptable degradation


class RobustnessTestSuite:
    """
    Art.15(3) robustness validation suite for high-risk AI systems.
    Run before market placement (Art.15(1)) and at monitoring intervals (Art.72).
    (AccuracyDeclaration is defined in the Art.15(2) listing above.)
    """

    def __init__(self, model, accuracy_declaration: AccuracyDeclaration):
        self.model = model
        self.declaration = accuracy_declaration
        self.results: list[RobustnessTestResult] = []

    def test_reproducibility(self, test_inputs: list, n_runs: int = 10) -> RobustnessTestResult:
        """
        Verify deterministic output for identical inputs.
        Critical for Art.15(3) and Art.14 meaningful oversight.
        """
        outputs_per_input = {}
        for inp in test_inputs[:50]:  # sample at most 50 inputs
            input_hash = hashlib.sha256(json.dumps(inp, sort_keys=True).encode()).hexdigest()
            outputs = [self.model.predict(inp) for _ in range(n_runs)]
            unique_outputs = len(set(str(o) for o in outputs))
            outputs_per_input[input_hash] = unique_outputs
        reproducible = sum(1 for v in outputs_per_input.values() if v == 1)
        reproducibility_rate = reproducible / len(outputs_per_input)
        result = RobustnessTestResult(
            test_name="reproducibility",
            passed=reproducibility_rate >= 0.95,
            metric_before=1.0,
            metric_after=reproducibility_rate,
            delta_pct=(1.0 - reproducibility_rate) * 100,
            threshold_pct=5.0,
        )
        self.results.append(result)
        return result

    def test_ood_robustness(self, id_test_data, ood_test_data,
                            metric_fn) -> RobustnessTestResult:
        """
        Measure accuracy degradation under out-of-distribution data.
        Art.15(3): consistent performance across use conditions.
        """
        id_score = metric_fn(self.model, id_test_data)
        ood_score = metric_fn(self.model, ood_test_data)
        delta_pct = ((id_score - ood_score) / id_score) * 100 if id_score > 0 else 100.0
        threshold_pct = 15.0  # >15% degradation = robustness failure
        result = RobustnessTestResult(
            test_name="ood_robustness",
            passed=delta_pct <= threshold_pct,
            metric_before=id_score,
            metric_after=ood_score,
            delta_pct=delta_pct,
            threshold_pct=threshold_pct,
        )
        self.results.append(result)
        return result

    def test_population_stability(self, baseline_scores: list,
                                  current_scores: list) -> RobustnessTestResult:
        """
        Population Stability Index (PSI) for distribution drift.
        PSI > 0.2 = significant drift = Art.15(1) lifecycle obligation triggered.
        """
        import numpy as np
        baseline_hist, bins = np.histogram(baseline_scores, bins=10, range=(0, 1))
        current_hist, _ = np.histogram(current_scores, bins=bins)
        # Small epsilon avoids division by zero / log(0) in empty bins
        baseline_pct = (baseline_hist + 0.0001) / len(baseline_scores)
        current_pct = (current_hist + 0.0001) / len(current_scores)
        psi = float(np.sum((current_pct - baseline_pct) * np.log(current_pct / baseline_pct)))
        result = RobustnessTestResult(
            test_name="population_stability_index",
            passed=psi <= 0.2,
            metric_before=0.0,
            metric_after=psi,
            delta_pct=psi * 100,
            threshold_pct=20.0,  # PSI 0.2 expressed in percent
        )
        self.results.append(result)
        return result

    def generate_art15_robustness_report(self) -> dict:
        """Generate Annex IV Section 6 robustness evidence."""
        return {
            "system_id": self.declaration.system_id,
            "test_date": datetime.date.today().isoformat(),
            "tests_run": len(self.results),
            "tests_passed": sum(1 for r in self.results if r.passed),
            "tests_failed": sum(1 for r in self.results if not r.passed),
            "results": [
                {
                    "name": r.test_name,
                    "passed": r.passed,
                    "metric_before": r.metric_before,
                    "metric_after": r.metric_after,
                    "delta_pct": round(r.delta_pct, 2),
                    "threshold_pct": r.threshold_pct,
                }
                for r in self.results
            ],
            "art15_compliant": all(r.passed for r in self.results),
        }
```
## Art.15(4) — Resilience to Errors, Faults, and Inconsistencies

> "The high-risk AI systems shall be resilient as regards errors, faults or inconsistencies that may occur within the system or the environment in which the system operates, in particular due to their interaction with natural persons or other systems."
Art.15(4) addresses operational resilience — the system must handle unexpected conditions without catastrophic failure. For high-risk AI systems, "catastrophic failure" means producing outputs that cause the harm the system was designed to prevent.
Failure modes covered by Art.15(4):
| Failure Type | Source | Example | Required Response |
|---|---|---|---|
| Input errors | Human operator | Malformed data, missing fields | Graceful degradation, not silent failure |
| Sensor/data faults | Environment | Corrupted sensor feed, missing API data | Fallback to safe state |
| Inconsistencies | System interaction | Contradictory inputs from upstream systems | Conflict detection + human escalation |
| Distributional inconsistencies | Deployment drift | Inputs outside training distribution | OOD detection + low-confidence flag |
| Infrastructure faults | Hardware/software | Memory error, network timeout | Fail-safe to last-known-good state |
Fail-safe design patterns for Art.15(4):
Art.15(4) explicitly states resilience "may be achieved through technical redundancy solutions, which may include backup or fail-safe plans." The regulation suggests but does not mandate specific architectures. For high-risk AI, three fail-safe patterns are common:
- Safe state fallback: When fault detected → revert to predefined safe output (refuse to decide, flag for human)
- Redundant inference: Multiple model paths; if results diverge beyond threshold → escalate to human (Art.14)
- Graceful degradation: Reduce functionality rather than fail entirely; log degradation event (Art.12)
The three patterns combine into a single inference wrapper:

```python
import datetime
import hashlib
import json
import logging
from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional


class FailureMode(Enum):
    INPUT_ERROR = "input_error"
    SENSOR_FAULT = "sensor_fault"
    OOD_INPUT = "ood_input"
    INCONSISTENCY = "inconsistency"
    INFRASTRUCTURE_FAULT = "infrastructure_fault"
    ACCURACY_DEGRADATION = "accuracy_degradation"


@dataclass
class Art15FailsafeEvent:
    """Art.12 mandatory log event for Art.15(4) resilience actions."""
    system_id: str
    failure_mode: FailureMode
    input_hash: str           # Art.12 privacy-preserving input reference
    safe_state_action: str    # what the system did instead of proceeding
    escalated_to_human: bool  # Art.14 trigger
    timestamp_utc: str


class Art15ResilientInference:
    """
    Art.15(4)-compliant inference wrapper with fail-safe mechanisms.
    Logs all resilience events to the Art.12 audit trail.
    (AccuracyDeclaration is defined in the Art.15(2) listing above.)
    """

    def __init__(self, model, accuracy_declaration: AccuracyDeclaration,
                 ood_detector=None):
        self.model = model
        self.declaration = accuracy_declaration
        self.ood_detector = ood_detector
        self.logger = logging.getLogger(f"art15.{accuracy_declaration.system_id}")
        self._failsafe_events: list[Art15FailsafeEvent] = []

    def _hash_input(self, inp: Any) -> str:
        """Privacy-preserving input reference per Art.12 + GDPR Art.5(1)(f)."""
        return hashlib.sha256(json.dumps(inp, sort_keys=True,
                                         default=str).encode()).hexdigest()[:16]

    def safe_predict(self, inp: Any, context: Optional[dict] = None) -> dict:
        """
        Art.15(4)-compliant prediction with comprehensive fail-safe handling.
        Returns: {output, confidence, failsafe_triggered, escalate_to_human}
        """
        input_hash = self._hash_input(inp)
        timestamp = datetime.datetime.now(datetime.timezone.utc).isoformat()

        # Step 1: Input validation
        validation_error = self._validate_input(inp)
        if validation_error:
            event = Art15FailsafeEvent(
                system_id=self.declaration.system_id,
                failure_mode=FailureMode.INPUT_ERROR,
                input_hash=input_hash,
                safe_state_action="refused_to_decide_invalid_input",
                escalated_to_human=True,
                timestamp_utc=timestamp,
            )
            self._log_failsafe_event(event)
            return {"output": None, "confidence": 0.0,
                    "failsafe_triggered": True, "escalate_to_human": True,
                    "reason": f"input_validation_failed: {validation_error}"}

        # Step 2: OOD detection (Art.15(3) + Art.15(4))
        if self.ood_detector and self.ood_detector.is_ood(inp):
            event = Art15FailsafeEvent(
                system_id=self.declaration.system_id,
                failure_mode=FailureMode.OOD_INPUT,
                input_hash=input_hash,
                safe_state_action="low_confidence_flag_human_review",
                escalated_to_human=True,
                timestamp_utc=timestamp,
            )
            self._log_failsafe_event(event)
            return {"output": None, "confidence": 0.0,
                    "failsafe_triggered": True, "escalate_to_human": True,
                    "reason": "ood_input_detected"}

        # Step 3: Inference with error handling
        try:
            output = self.model.predict(inp)
            confidence = getattr(output, "confidence", 1.0)

            # Step 4: Low confidence → human escalation (Art.14)
            min_confidence = 0.70
            if confidence < min_confidence:
                event = Art15FailsafeEvent(
                    system_id=self.declaration.system_id,
                    failure_mode=FailureMode.ACCURACY_DEGRADATION,
                    input_hash=input_hash,
                    safe_state_action="low_confidence_escalated_to_human",
                    escalated_to_human=True,
                    timestamp_utc=timestamp,
                )
                self._log_failsafe_event(event)
                return {"output": output, "confidence": confidence,
                        "failsafe_triggered": True, "escalate_to_human": True,
                        "reason": f"confidence_below_threshold_{min_confidence}"}

            return {"output": output, "confidence": confidence,
                    "failsafe_triggered": False, "escalate_to_human": False}
        except Exception as e:
            event = Art15FailsafeEvent(
                system_id=self.declaration.system_id,
                failure_mode=FailureMode.INFRASTRUCTURE_FAULT,
                input_hash=input_hash,
                safe_state_action="inference_failed_safe_state",
                escalated_to_human=True,
                timestamp_utc=timestamp,
            )
            self._log_failsafe_event(event)
            return {"output": None, "confidence": 0.0,
                    "failsafe_triggered": True, "escalate_to_human": True,
                    "reason": f"inference_exception: {type(e).__name__}"}

    def _validate_input(self, inp: Any) -> Optional[str]:
        """Override in subclass for domain-specific input validation."""
        if inp is None:
            return "null_input"
        return None

    def _log_failsafe_event(self, event: Art15FailsafeEvent):
        """Log to Art.12 audit trail."""
        self.logger.warning(
            "ART15_FAILSAFE event_type=%s system=%s input_hash=%s action=%s human=%s",
            event.failure_mode.value, event.system_id, event.input_hash,
            event.safe_state_action, event.escalated_to_human,
        )
        self._failsafe_events.append(event)
```
## Art.15(5) — Cybersecurity Provisions

> "High-risk AI systems shall be resilient as regards attempts by unauthorised third parties to alter their use, outputs or performance by exploiting the system vulnerabilities."
Art.15(5) enumerates the specific adversarial attack vectors that high-risk AI providers must defend against. This is the most technically detailed cybersecurity obligation in the EU AI Act.
The five Art.15(5) attack vectors:
| Attack Vector | Technical Name | Description | Annex III Risk Level |
|---|---|---|---|
| Training data poisoning | Data poisoning attack | Adversary injects malicious samples into training dataset to corrupt model behavior | Existential for all high-risk AI |
| Adversarial examples | Adversarial attack | Imperceptible perturbations to input cause misclassification | High for biometric/medical/autonomous AI |
| Model corruption | Model backdoor / Trojan | Hidden behavior triggered by specific input pattern | High for pre-trained model use |
| Data integrity attacks | Runtime data manipulation | Manipulation of inference-time data pipeline | Medium for all inference systems |
| Confidentiality attacks | Model extraction / membership inference | Extraction of training data or model weights via query interface | High for sensitive domain AI |
Art.15(5) defense requirements by vector:
Training Data Poisoning (Art.15(5) + Art.10)
Art.10's data governance requirements (representativeness, bias examination, data gap documentation) are the foundational defense against training data poisoning. Art.15(5) adds the adversarial threat model: an attacker who controls even a small fraction (as few as 1-3%) of training samples can embed backdoor behavior.
Minimum controls:
- Data provenance tracking for all training samples (Art.10(2)(a))
- Anomaly detection on training data (statistical outlier detection)
- Clean-label attack detection (samples with correct labels but adversarial features)
- Training data integrity verification (cryptographic hash chain for dataset versions)
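The last control — a cryptographic hash chain over dataset versions — can be sketched in a few lines (the function name and chaining scheme are illustrative, not a standard API):

```python
import hashlib


def dataset_version_hash(samples: list, prev_version_hash: str = "") -> str:
    """Chain each dataset version to its predecessor: tampering with any
    historical sample invalidates every subsequent version hash."""
    h = hashlib.sha256(prev_version_hash.encode())
    for s in sorted(samples):            # sort for order-independent membership
        h.update(hashlib.sha256(s).digest())
    return h.hexdigest()
```

Storing each version hash in the Art.11 technical documentation (and, ideally, in an append-only log) gives auditors a cheap way to verify that the training data examined under Art.10 is the data the model was actually trained on.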
Adversarial Examples
For Annex III Cat.1 (biometric), Cat.2 (critical infrastructure), and Cat.5 (medical) systems, adversarial example robustness is a concrete Art.15(5) requirement. The EU AI Act does not specify which defense technique to use — it requires that the declared accuracy levels hold under adversarial conditions.
Defense techniques recognized in practice:
- Adversarial training (TRADES, PGD adversarial training)
- Certified defenses (randomized smoothing for ℓ2-bounded perturbations)
- Input preprocessing (feature squeezing, JPEG compression for image inputs)
- Ensemble defenses (multiple model paths; unanimous agreement required)
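As a rough probe of the evaluation side, accuracy under random ℓ∞-bounded noise can be measured with plain NumPy. This is only an optimistic smoke test — random perturbations are far weaker than gradient-based attacks, so any Annex IV robustness claim should rest on PGD or similar. The function below is a hypothetical helper, not a library API:

```python
import numpy as np


def linf_noise_robust_accuracy(predict, X, y, epsilon=0.03, n_draws=8, seed=0):
    """Fraction of samples classified correctly under every random L-inf
    perturbation drawn. An optimistic upper bound on true robust accuracy:
    gradient-based attacks (e.g. PGD) find much stronger perturbations."""
    rng = np.random.default_rng(seed)
    worst_correct = np.ones(len(X), dtype=bool)
    for _ in range(n_draws):
        delta = rng.uniform(-epsilon, epsilon, size=X.shape)
        preds = predict(np.clip(X + delta, 0.0, 1.0))  # inputs assumed in [0,1]
        worst_correct &= (preds == y)
    return float(worst_correct.mean())
```

If even this weak probe degrades the declared metric beyond the revalidation trigger, the declared accuracy level cannot plausibly hold under Art.15(5) adversarial conditions.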
A control-coverage checker for the five vectors:

```python
from dataclasses import dataclass


@dataclass
class CybersecurityThreatAssessment:
    """Single threat vector assessment for Art.15(5) + Art.9."""
    vector: str               # "training_data_poisoning", "adversarial_examples", etc.
    applicable: bool
    risk_level: str           # "critical", "high", "medium", "low"
    controls_implemented: list
    controls_gaps: list
    residual_risk: str
    art9_risk_management_measure: str  # reference to Art.9 RMS entry


class Art15CybersecurityChecker:
    """
    Art.15(5) cybersecurity compliance checker.
    Covers all five attack vectors mandated by Art.15(5).
    """

    REQUIRED_CONTROLS = {
        "training_data_poisoning": [
            "data_provenance_tracking",             # Art.10(2)(a)
            "training_data_integrity_hashing",      # Art.15(5) + Art.10
            "statistical_outlier_detection",
            "dataset_versioning",
            "access_control_training_pipeline",     # Art.9 measure
        ],
        "adversarial_examples": [
            "adversarial_robustness_evaluation",    # Art.15(2) accuracy under attack
            "adversarial_training_or_certified_defense",
            "input_validation_range_check",         # Art.15(4)
            "confidence_threshold_for_outputs",     # Art.14 escalation
            "robustness_testing_in_annex_iv",       # Art.11 Annex IV Section 6
        ],
        "model_corruption": [
            "model_integrity_verification",         # hash of trained weights
            "supply_chain_security_pretrained",     # Art.9 + NIS2 Art.21(2)(d)
            "backdoor_detection_testing",
            "model_source_documentation",           # Art.11 Annex IV Section 2
            "model_signing",
        ],
        "data_integrity": [
            "inference_pipeline_integrity",
            "input_validation_schema_enforcement",  # Art.15(4)
            "api_authentication_inference_endpoint",
            "tls_for_data_in_transit",
            "audit_log_inference_inputs",           # Art.12
        ],
        "confidentiality": [
            "rate_limiting_inference_api",
            "differential_privacy_if_sensitive_data",
            "model_output_perturbation",
            "monitoring_for_model_extraction",      # Art.12 anomaly detection
            "gdpr_compliant_training_data",         # GDPR Art.25
        ],
    }

    def __init__(self, system_id: str, annex_iii_category: str):
        self.system_id = system_id
        self.annex_iii_category = annex_iii_category
        self.assessments: list[CybersecurityThreatAssessment] = []

    def assess_vector(self, vector: str,
                      implemented_controls: list) -> CybersecurityThreatAssessment:
        """Assess one Art.15(5) threat vector."""
        required = self.REQUIRED_CONTROLS.get(vector, [])
        gaps = [c for c in required if c not in implemented_controls]
        # Count only required controls, so extras cannot inflate coverage
        coverage = (len(required) - len(gaps)) / len(required) if required else 1.0
        if coverage >= 0.9:
            residual_risk = "low"
        elif coverage >= 0.7:
            residual_risk = "medium"
        else:
            residual_risk = "high"
        assessment = CybersecurityThreatAssessment(
            vector=vector,
            applicable=True,
            risk_level="high" if gaps else "low",
            controls_implemented=implemented_controls,
            controls_gaps=gaps,
            residual_risk=residual_risk,
            art9_risk_management_measure=f"RMS-{vector.upper()[:10]}-001",
        )
        self.assessments.append(assessment)
        return assessment

    def full_art15_audit(self, implemented_controls_by_vector: dict) -> dict:
        """Run full Art.15(5) cybersecurity audit across all 5 vectors."""
        results = {}
        for vector in self.REQUIRED_CONTROLS:
            implemented = implemented_controls_by_vector.get(vector, [])
            results[vector] = self.assess_vector(vector, implemented)
        overall_gaps = sum(len(a.controls_gaps) for a in self.assessments)
        overall_pass = all(a.residual_risk in ("low", "medium")
                           for a in self.assessments)
        return {
            "system_id": self.system_id,
            "annex_iii_category": self.annex_iii_category,
            "vectors_assessed": len(self.assessments),
            "overall_compliance": "PASS" if overall_pass else "FAIL",
            "total_gaps": overall_gaps,
            "results": {
                v: {
                    "residual_risk": a.residual_risk,
                    "gaps": a.controls_gaps,
                }
                for v, a in results.items()
            },
        }
```
## Art.15 × Art.9 Risk Management: Cybersecurity as Mandatory Risk Measure
Art.15(5) does not exist in isolation — it is the specific technical manifestation of the Art.9 Risk Management System's cybersecurity dimension. Art.9 requires providers to identify, analyze, and mitigate risks to health, safety, and fundamental rights. For AI systems, adversarial attacks on the model are a recognized risk category.
Art.9 × Art.15 integration requirements:
| Art.9 RMS Element | Art.15 Implementation |
|---|---|
| Risk identification (Art.9(2)(a)) | Threat modeling for all 5 Art.15(5) vectors |
| Risk analysis (Art.9(2)(b)) | Adversarial robustness evaluation (Art.15(2)) |
| Risk management measures (Art.9(2)(c)) | Technical controls per Art.15(3)-(5) |
| Residual risk assessment (Art.9(5)) | Art15CybersecurityChecker.full_art15_audit() |
| Risk management monitoring (Art.9(7)) | Continuous accuracy monitoring (Art.72 PMM) |
| Testing procedures (Art.9(8)) | RobustnessTestSuite (Art.15(3)) |
Treating cybersecurity as a first-class Art.9 risk means the Art.9 RMS must document:
- Specific adversarial threat model (attacker capabilities, incentives, access)
- Technical controls and their effectiveness against each vector
- Residual risk acceptance criteria
- Monitoring plan for emerging adversarial techniques
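A sketch of how such an RMS entry might be structured — all field names are illustrative, not mandated by Art.9:

```python
from dataclasses import dataclass, field


@dataclass
class AdversarialThreatModel:
    """Illustrative Art.9 RMS entry for one Art.15(5) attack vector."""
    vector: str
    attacker_capability: str        # e.g. "query access to inference API"
    attacker_incentive: str
    controls: list = field(default_factory=list)  # controls + effectiveness notes
    residual_risk_accepted: bool = False
    residual_risk_rationale: str = ""
    monitoring_plan: str = ""       # how emerging adversarial techniques are tracked

    def rms_gaps(self) -> list:
        """Documentation gaps against the four bullet points above."""
        gaps = []
        if not self.controls:
            gaps.append("no technical controls documented")
        if self.residual_risk_accepted and not self.residual_risk_rationale:
            gaps.append("residual risk accepted without rationale")
        if not self.monitoring_plan:
            gaps.append("no monitoring plan for emerging adversarial techniques")
        return gaps
```

One such entry per Art.15(5) vector keeps the Art.9 RMS auditable: every vector either carries documented controls and a monitoring plan, or surfaces as an explicit gap.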
## Static Analysis for Art.15 Robustness: Astrée and Frama-C
For safety-critical high-risk AI systems (medical devices, critical infrastructure, autonomous systems), static analysis tools provide Art.15(3) robustness evidence that runtime testing alone cannot supply.
EU-developed static analysis tools:
| Tool | Origin | Art.15 Application |
|---|---|---|
| Astrée | CNRS/INRIA Paris | Proves absence of runtime errors in C/C++ embedded AI inference code; used for DO-178C aviation + IEC 62443 industrial certification |
| Frama-C | CEA/INRIA Paris | WP plugin for Hoare-logic verification of C code; ACSL annotations for inference bounds verification |
| Coq | INRIA France | Formal proof of classifier properties (monotonicity, fairness guarantees) — used in academic AI verification research |
| Why3 | INRIA/LRI Paris | Deductive verification platform for ML property proofs |
Astrée for AI inference robustness:
Astrée analyzes C/C++ inference code statically and can prove absence of:
- Array out-of-bounds access (critical for tensor operations)
- Integer overflow in fixed-point inference
- Null pointer dereferences
- Division by zero in normalization layers
For Annex III systems deployed in embedded or safety-critical environments, Astrée analysis outputs constitute Annex IV Section 6 evidence of Art.15(3) robustness — a level of assurance that probabilistic testing cannot provide.
Frama-C ACSL annotations for inference bounds:
```c
// Frama-C ACSL annotation for EU AI Act Art.15(3) robustness proof
// Proves: every softmax output lies in the [0,1] range
// The Art.15(2) accuracy declaration relies on this invariant
/*@
  requires n > 0;
  requires \valid_read(input + (0 .. n-1));
  requires \valid(output + (0 .. n-1));
  requires \forall integer i; 0 <= i < n ==> \is_finite(input[i]);
  assigns  output[0 .. n-1];
  ensures  \forall integer i; 0 <= i < n ==>
             0.0 <= output[i] <= 1.0;   // Art.15(2) range guarantee
*/
// Note: the probability-simplex identity (outputs summing to exactly 1.0)
// holds in real arithmetic only; for float it is provable only up to a
// rounding bound, so it is deliberately omitted from the contract.
void softmax(float *input, float *output, int n) {
    // ... implementation ...
}
```
## NIS2 × Art.15 Cybersecurity Overlap
For high-risk AI systems deployed by Essential or Important Entities under the NIS2 Directive (2022/2555), the Art.15(5) cybersecurity requirements overlap with the NIS2 Art.21 cybersecurity risk-management measures:
| Cybersecurity Domain | Art.15(5) Requirement | NIS2 Art.21 Measure |
|---|---|---|
| Supply chain security | Model integrity verification, pre-trained model source validation | Art.21(2)(d) Supply chain security |
| Access control | Training pipeline access control, inference API authentication | Art.21(2)(i) Access control policies |
| Incident detection | Monitoring for adversarial attack patterns, model extraction detection | Art.21(2)(b) Incident handling |
| Data integrity | Training data integrity hashing, inference pipeline integrity | Art.21(2)(h) Cryptography and encryption policies |
| Testing | Adversarial robustness testing in Annex IV | Art.21(2)(e) Security testing |
NIS2 × Art.15 unified control framework:
Organizations that are both AI providers under AI Act and Essential/Important Entities under NIS2 should implement a unified cybersecurity control framework that satisfies both obligations simultaneously. The AI Act's Annex IV Section 6 robustness documentation can incorporate NIS2 Art.21 measure evidence — avoiding duplicate documentation.
Dual incident reporting:
An adversarial attack on a high-risk AI system that causes a significant security incident triggers:
- NIS2 Art.23: 24-hour early warning to CSIRT + competent national authority
- AI Act Art.73: Serious incident report to market surveillance authority (MSA)
Two timelines, two authorities, two report formats — but the underlying security event is one.
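A small helper can pin both clocks to the moment of awareness. The durations reflect the commonly cited baselines (NIS2 Art.23: 24-hour early warning and 72-hour incident notification; AI Act Art.73: 15-day default) and should be verified against the provision text, since Art.73 shortens the deadline for certain incident categories:

```python
from datetime import datetime, timedelta


def dual_reporting_deadlines(awareness_utc: datetime) -> dict:
    """Reporting deadlines counted from awareness of a significant incident.
    Baseline durations only — Art.73 imposes shorter deadlines in some cases."""
    return {
        "nis2_early_warning": awareness_utc + timedelta(hours=24),
        "nis2_incident_notification": awareness_utc + timedelta(hours=72),
        "ai_act_art73_report": awareness_utc + timedelta(days=15),
    }
```

Wiring this into the incident-response runbook ensures neither clock is discovered late: the CSIRT early warning almost always expires first.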
## CLOUD Act × Art.15: US Infrastructure as Attack Vector
Art.15(5) positions adversarial attacks as a cybersecurity compliance obligation. For high-risk AI systems deployed on US cloud infrastructure, the CLOUD Act creates a legally sanctioned pathway for US government access to:
- Training datasets (Art.10 provenance + bias data): CLOUD Act-compellable → evidence base for Art.15(5) poisoning attacks is accessible to foreign governments
- Model weights and architecture (Art.11 Annex IV Section 2): CLOUD Act-compellable → model extraction via legal process
- Inference logs (Art.12 audit trail): CLOUD Act-compellable → attack reconnaissance and timing data
The CLOUD Act threat model is not adversarial in the traditional sense — it is not a vulnerability exploit but a legal mechanism. However, for Art.9 risk management purposes, it is a data integrity risk and confidentiality risk that must be assessed:
| CLOUD Act Exposure | Art.15(5) Vector | Risk |
|---|---|---|
| Training data access | Data poisoning reconnaissance | Attacker maps data gaps before poisoning |
| Model weight access | Model extraction (confidentiality attack) | Facilitates targeted adversarial example crafting |
| Inference log access | Attack timing and input pattern analysis | Enables adaptive adversarial attacks |
| Oversight records (Art.14) | Social engineering of oversight personnel | Attacker identifies human oversight weak points |
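The exposure table above maps directly onto Art.9 risk register entries (see checklist item 40 below). A minimal sketch of that encoding; the field names and mitigation text are illustrative assumptions, not prescribed register fields:

```python
from dataclasses import dataclass

# Hedged sketch: the CLOUD Act exposure table encoded as Art.9 risk
# register entries, so each Art.15(5) vector is tracked with a
# mitigation. Entries and mitigations are illustrative.
@dataclass(frozen=True)
class RiskEntry:
    exposure: str
    art15_vector: str
    mitigation: str

REGISTER = [
    RiskEntry("training data access", "data poisoning reconnaissance",
              "EU-jurisdiction storage + Art.10 integrity hashing"),
    RiskEntry("model weight access", "model extraction",
              "EU-native hosting + encrypted weights at rest"),
    RiskEntry("inference log access", "adaptive adversarial attacks",
              "EU-resident Art.12 log store + access control"),
    RiskEntry("oversight records access", "social engineering",
              "minimise Art.14 personnel data in hosted records"),
]

# Every exposure row in the table has a registered mitigation.
assert len(REGISTER) == 4 and all(e.mitigation for e in REGISTER)
```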
EU-native cybersecurity posture:
| Art.15 Dimension | US Cloud | EU-Native (sota.io) |
|---|---|---|
| Training data jurisdiction | CLOUD Act compellable | EU law governs exclusively |
| Model weights at rest | Dual-jurisdiction exposure | Single EU regime |
| Inference logs (Art.12) | US DOJ accessible | GDPR + EU MSA accessible only |
| Cybersecurity audit evidence | Dual compellability risk | Single EU jurisdiction |
| Art.11 10-year documentation retention | Decade of CLOUD Act exposure | Single EU regime for decade |
For Annex III providers deploying AI systems that process personal data — particularly healthcare, employment, biometric identification — EU-native infrastructure eliminates the CLOUD Act dimension from an Art.15 threat model that is already complex.
Art.15 Cross-Article Matrix
| Article | Obligation | Art.15 Interface |
|---|---|---|
| Art.9 Risk Management | Risk mitigation system | Art.15 cybersecurity controls ARE Art.9 risk measures |
| Art.10 Training Data | Bias examination + data integrity | Art.10 data integrity controls = Art.15(5) poisoning defense |
| Art.11 Annex IV | Technical documentation | Art.15 accuracy + robustness evidence in Annex IV Section 6 |
| Art.12 Logging | Automatic event logs | Art.15(4) failsafe events + Art.15(5) attack detection are Art.12 entries |
| Art.13 IFU | Instructions for use | Art.15(2) accuracy declaration disclosed in Art.13(3)(b)(ii) |
| Art.14 Human Oversight | Override + interrupt | Art.14 escalation triggered by Art.15(4) low-confidence failsafe |
| Art.47 Declaration of Conformity | Conformity self-declaration | Art.15 compliance is a conformity declaration element |
| Art.72 Post-Market Monitoring | PMM system | Accuracy drift = Art.15(1) lifecycle failure = Art.72 trigger |
| Art.73 Incident Reporting | Serious incidents | Successful adversarial attack on high-risk AI = Art.73 reportable |
| NIS2 Art.21 | ICT security measures | Art.15(5) cybersecurity overlaps with NIS2 Art.21(2)(d)(e)(i) |
| GDPR Art.25 | Privacy by design | Training data GDPR controls = Art.15(5) poisoning defense foundation |
| GDPR Art.32 | Security of processing | Inference pipeline security = Art.15(5) data integrity controls |
Art.15 Compliance Checklist (40-Item)
Accuracy (Art.15(2)) — 10 items:
- 1. Primary accuracy metric declared in Annex IV Section 6 and Art.13 IFU
- 2. Accuracy level achieved meets or exceeds declared threshold
- 3. Test dataset size, period, and source documented in Annex IV Section 3
- 4. Demographic disaggregation of accuracy metrics (Art.10(3) + Art.13(3)(b)(v))
- 5. Known accuracy limitations disclosed in Art.13 IFU
- 6. Secondary metrics documented where applicable (sensitivity/specificity/AUC)
- 7. Accuracy evaluation reproducible (same test split → same result)
- 8. Accuracy monitoring process defined for post-deployment (Art.72 PMM)
- 9. Revalidation trigger threshold defined (e.g., >5% accuracy drop)
- 10. Accuracy declaration updated after substantial modification (Art.11)
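Items 2 and 9 above combine naturally into a single drift check run by the Art.72 PMM process. A minimal sketch, assuming an illustrative declared accuracy of 0.92 and the >5% relative-drop trigger from item 9:

```python
# Hedged sketch of checklist item 9: a revalidation trigger that fires
# when production accuracy drops more than 5% (relative) below the
# declared Annex IV Section 6 level. Both values are illustrative.
DECLARED_ACCURACY = 0.92        # example value from the Art.13 IFU declaration
RELATIVE_DROP_THRESHOLD = 0.05  # >5% relative drop triggers revalidation

def needs_revalidation(observed_accuracy: float,
                       declared: float = DECLARED_ACCURACY,
                       threshold: float = RELATIVE_DROP_THRESHOLD) -> bool:
    """True when observed accuracy has drifted past the declared threshold."""
    relative_drop = (declared - observed_accuracy) / declared
    return relative_drop > threshold

assert not needs_revalidation(0.90)  # ~2.2% relative drop: within tolerance
assert needs_revalidation(0.85)      # ~7.6% relative drop: revalidate
```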
Robustness (Art.15(3)) — 10 items:
- 11. Reproducibility test results documented (same input → same output)
- 12. OOD robustness evaluated (accuracy degradation under distribution shift ≤15%)
- 13. Population Stability Index (PSI) baseline established for monitoring
- 14. Temporal consistency monitoring plan defined
- 15. Cross-environment consistency validated (dev ≈ staging ≈ production)
- 16. Load/stress testing performed for inference consistency under concurrent use
- 17. Robustness test results included in Annex IV Section 6 documentation
- 18. Robustness thresholds defined and accepted by designated responsible person (Art.14(3)(c))
- 19. Robustness re-testing schedule defined for operational lifetime
- 20. Robustness failures generate Art.12 audit log entries
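Item 13's PSI baseline can be computed without any ML tooling once inputs are bucketed into bins. A minimal sketch over pre-binned proportions; the 0.1 "no action" and 0.2 "revalidate" bands are industry convention, not regulatory thresholds:

```python
import math

# Hedged sketch of checklist item 13: Population Stability Index (PSI)
# between a training-time baseline distribution and a production window.
# Action bands (0.1 / 0.2) are conventional, not mandated by Art.15.
def psi(expected: list[float], actual: list[float], eps: float = 1e-6) -> float:
    """PSI over pre-binned proportions; eps guards against empty bins."""
    total = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)
        total += (a - e) * math.log(a / e)
    return total

baseline = [0.25, 0.25, 0.25, 0.25]  # training distribution per bin
stable   = [0.24, 0.26, 0.25, 0.25]  # mild production drift
shifted  = [0.10, 0.15, 0.25, 0.50]  # pronounced drift

assert psi(baseline, stable) < 0.1   # common "no action" band
assert psi(baseline, shifted) > 0.2  # common revalidation trigger
```

Crossing the upper band would feed the same revalidation path as the accuracy-drop trigger in item 9, and the event itself is an Art.12 log entry per item 20.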
Resilience / Failsafe (Art.15(4)) — 10 items:
- 21. Input validation implemented and tested
- 22. OOD detection implemented or OOD scope explicitly bounded
- 23. Low-confidence threshold defined → Art.14 human escalation triggered
- 24. Infrastructure fault handling implemented (graceful degradation not silent failure)
- 25. Safe state defined and documented for each critical failure mode
- 26. Failsafe events logged to Art.12 audit trail
- 27. Failsafe mechanisms tested under adversarial conditions
- 28. Failsafe behavior disclosed in Art.13 IFU (what system does when it cannot decide)
- 29. Recovery procedure documented after failsafe activation
- 30. Failsafe escalation path to designated human oversight person (Art.14(3)(c)) documented
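Items 23 and 26 can be implemented together as a thin wrapper around the model call. A minimal sketch, assuming an illustrative 0.75 confidence threshold and a JSON-lines style audit record; the field names are not prescribed by the Act:

```python
import json
from datetime import datetime, timezone

# Hedged sketch of checklist items 23 + 26: escalate low-confidence
# predictions to the Art.14 human overseer and write an Art.12-style
# log entry for every decision. Threshold and fields are illustrative.
CONFIDENCE_THRESHOLD = 0.75  # below this, the system must not decide alone

def failsafe_decide(prediction: str, confidence: float, audit_log: list) -> dict:
    escalate = confidence < CONFIDENCE_THRESHOLD
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prediction": prediction,
        "confidence": confidence,
        "outcome": "escalated_to_human" if escalate else "automated",
    }
    audit_log.append(json.dumps(event))  # append-only Art.12-style record
    return event

log: list = []
assert failsafe_decide("approve", 0.91, log)["outcome"] == "automated"
assert failsafe_decide("approve", 0.40, log)["outcome"] == "escalated_to_human"
assert len(log) == 2  # every decision, automated or escalated, is logged
```

The key design point is that logging happens on both branches: an automated decision without a log entry is as much a compliance gap as an unescalated low-confidence one.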
Cybersecurity (Art.15(5)) — 10 items:
- 31. Training data poisoning controls: provenance tracking + integrity hashing
- 32. Adversarial example robustness evaluation on test set
- 33. Model integrity verification: hash of trained weights stored and verified at deployment
- 34. Pre-trained model supply chain security assessed and documented
- 35. Inference pipeline access control and authentication implemented
- 36. Inference API rate limiting to prevent model extraction attacks
- 37. Adversarial attack monitoring in production (Art.12 anomaly detection)
- 38. Art.15(5) cybersecurity controls mapped to Art.9 Risk Management System
- 39. NIS2 Art.21 measures assessed for overlap (Essential/Important Entities only)
- 40. CLOUD Act risk assessment for infrastructure jurisdiction (Art.9 risk register)
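Item 33's integrity check is a deploy-time gate: hash the trained weights once, store the digest in the Annex IV technical file, and refuse to serve a model whose weights no longer match. A minimal sketch; file contents here are stand-ins for real weight files:

```python
import hashlib
import tempfile
from pathlib import Path

# Hedged sketch of checklist item 33: verify model weights at deployment
# against the hash recorded at training time (stored in the Annex IV
# technical file). The byte payloads below are illustrative stand-ins.
def sha256_of(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):  # stream large files
            h.update(chunk)
    return h.hexdigest()

def verify_model_integrity(weights: Path, recorded_hash: str) -> bool:
    """Refuse to serve a model whose weights no longer match the record."""
    return sha256_of(weights) == recorded_hash

with tempfile.TemporaryDirectory() as tmp:
    weights = Path(tmp) / "model.bin"
    weights.write_bytes(b"trained-weights-v1")
    recorded = sha256_of(weights)             # digest captured after training
    assert verify_model_integrity(weights, recorded)
    weights.write_bytes(b"tampered-weights")  # simulated integrity violation
    assert not verify_model_integrity(weights, recorded)
```

The same pattern extends to item 31 (hashing training data shards) and item 34 (pinning pre-trained upstream models by digest).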
Enforcement & Fines
Art.15 violations fall under the standard AI Act fine structure:
| Violation Category | Maximum Fine |
|---|---|
| Placing high-risk AI without adequate accuracy/robustness (Art.15) | €15,000,000 or 3% of global annual turnover |
| Providing false accuracy declarations (Art.15(2)) | €15,000,000 or 3% of global annual turnover |
| Insufficient cybersecurity controls leading to serious incident (Art.15(5) + Art.73) | €15,000,000 or 3% + Art.73 incident reporting obligations |
For SMEs and startups, Art.99(6) caps the fine at the lower of the absolute EUR amount and the percentage of global annual turnover; for all other companies the maximum is whichever is higher, so for any provider with global annual turnover above €500M the 3% figure governs.
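The crossover described above follows directly from the two caps. A minimal arithmetic sketch, assuming the €15M / 3% figures from the table:

```python
# Hedged sketch of the Art.99 fine-ceiling arithmetic: non-SME providers
# face the higher of EUR 15M and 3% of global annual turnover, SMEs the
# lower, so 3% governs non-SMEs above EUR 500M turnover.
ABS_CAP = 15_000_000
PCT = 0.03

def max_fine(turnover_eur: float, sme: bool = False) -> float:
    pick = min if sme else max  # SMEs get the lower of the two caps
    return pick(ABS_CAP, PCT * turnover_eur)

assert max_fine(400_000_000) == 15_000_000        # 3% = 12M < absolute cap
assert max_fine(1_000_000_000) == 30_000_000      # 3% governs above 500M
assert max_fine(1_000_000_000, sme=True) == 15_000_000
```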
See Also
- EU AI Act Art.14 Human Oversight: Developer Guide
- EU AI Act Art.13 Transparency Obligations: Developer Guide
- EU AI Act Art.12 Logging & Record-Keeping: Developer Guide
- EU AI Act Art.11 Technical Documentation: Annex IV Deep Dive Developer Guide
- EU NIS2 + AI Act: The Double Compliance Burden for Critical Infrastructure Developers