2026-06-09·5 min read·sota.io Team

EU AI Act Art.9 Risk Identification Methodology: How to Find and Document Every Risk in Your High-Risk AI System (2026)

Post #2 in the EU AI Act Art.9 Risk Management System 2026 Series

Figure: EU AI Act Art.9 risk identification methodology, shown as a structured risk tree with hazard analysis branches.

Art.9(2)(a) of the EU AI Act requires that your Risk Management System cover "the identification and analysis of the known and the reasonably foreseeable risks that the high-risk AI system can pose to health, safety or fundamental rights" when the system is used in accordance with its intended purpose. The language sounds straightforward. The execution is not.

Most teams interpret this as: create a risk register, list some things that could go wrong, assign a 1-5 likelihood and impact score, close the document. That interpretation will fail a conformity assessment. The regulation expects a structured, evidence-based identification process that is specific to your Annex III use case, your deployment context, and your user population — not a generic risk template repurposed from a software QA checklist.

This post covers how to actually run Art.9-compliant risk identification: which techniques to apply, what the outputs must contain, how to handle foreseeable misuse, and how to structure the documentation that goes to your conformity assessor. It is the second post in a five-part series. Post 1 covered the Art.9 architectural requirements and how to map the RMS to your SDLC. Post 3 covers testing requirements under Art.9(6)–(8).

The Scope Problem: What Counts as a Risk Under Art.9?

Before running any identification technique, you need clarity on the scope of what Art.9 treats as a risk. The regulation is deliberately broad, and teams that read it narrowly end up with incomplete risk identification.

Health and safety risks are the most obvious category. For a medical device AI system (high-risk under Art.6(1) via the Annex I product legislation, not Annex III), these include diagnostic errors, missed detections, false positives that trigger unnecessary treatment, and system failures during critical decision points. For employment screening tools (Annex III, point 4), health risks are less direct, but psychosocial harms from discriminatory hiring outcomes are within scope.

Fundamental rights risks require specific attention. Art.9(2)(a) explicitly names fundamental rights as a risk dimension, and recital 48 lists the rights most at stake, including non-discrimination (Art.21 EU Charter), privacy and data protection (Arts.7–8), dignity, fair trial rights, and the right to an effective remedy. For any Annex III system that processes personal data or makes or assists decisions about individuals, fundamental rights are not an edge case; they are a primary risk axis.

Reasonably foreseeable risks expand the scope beyond what has already gone wrong. The regulation says "reasonably foreseeable", not just "documented in incident logs." A credit scoring AI that has never produced a discriminatory outcome nonetheless carries a foreseeable risk of doing so if it is trained on data that reflects historical discrimination. You must document risks that have not yet materialised but that a competent developer in your domain would recognise as plausible.

Risks under foreseeable misuse must be assessed separately under Art.9(2)(b). Misuse is not just malicious misuse: it includes use outside the intended purpose (e.g., deploying an employment-screening tool for unintended professional categories), use by unintended operators, and use in contexts where the safeguards in the instructions for use are not followed. Each of these generates a distinct risk profile that must be separately documented.

Technique 1: Use-Case and Misuse-Case Analysis

The most fundamental Art.9 risk identification technique is systematic use-case analysis extended to misuse scenarios. It is the methodological foundation that makes every subsequent technique meaningful.

Start with the intended purpose. Art.9(2)(a) ties risk identification to use of the system "in accordance with its intended purpose." The intended purpose must therefore be defined before you can identify risks, because the same AI capability (e.g., facial recognition) generates completely different risk profiles depending on whether it is deployed for physical access control, law enforcement identification, or employee time-tracking.

Document the intended purpose at four levels of specificity:

System level: What does the system do? (e.g., "classifies CVs for initial screening in recruitment processes")
User level: Who are the intended operators and deployers? (e.g., "HR professionals in enterprises of 50+ employees")
Context level: In what organisational and technical context is it used? (e.g., "integrated via API into ATS platforms, decision made by HR before shortlisting")
Population level: Who are the affected persons? (e.g., "job applicants across all professional categories available in the deployer's ATS")

Map each intended use to risks. For each use-case scenario in your intended-purpose definition, identify what can go wrong. Use a structured format:

Use case: [description]
Risk ID: [unique identifier]
Risk description: [what harm occurs, to whom, under what condition]
Risk category: safety / fundamental rights / both
Affected party: operator / deployer / affected person / third party
Triggering condition: [what system state or user action leads to harm]
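The template above can be captured as a typed record so that entries stay uniform across the catalogue. The following is a minimal sketch only: the class name UseCaseRisk, the enum names, and the example entry are illustrative assumptions, not a format prescribed by the regulation.

```python
from dataclasses import dataclass
from enum import Enum


class RiskCategory(Enum):
    SAFETY = "safety"
    FUNDAMENTAL_RIGHTS = "fundamental rights"
    BOTH = "both"


class AffectedParty(Enum):
    OPERATOR = "operator"
    DEPLOYER = "deployer"
    AFFECTED_PERSON = "affected person"
    THIRD_PARTY = "third party"


@dataclass(frozen=True)
class UseCaseRisk:
    """One row of the use-case risk template (hypothetical schema)."""
    use_case: str
    risk_id: str
    description: str
    category: RiskCategory
    affected_party: AffectedParty
    triggering_condition: str

    def __post_init__(self) -> None:
        # Reject blank free-text fields so incomplete entries fail early.
        for name in ("use_case", "risk_id", "description", "triggering_condition"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be empty")


# Illustrative entry; the content is invented for the example.
example = UseCaseRisk(
    use_case="CV classification for initial screening",
    risk_id="R-001",
    description="Qualified applicants with career gaps are ranked lower",
    category=RiskCategory.FUNDAMENTAL_RIGHTS,
    affected_party=AffectedParty.AFFECTED_PERSON,
    triggering_condition="CV contains an employment gap longer than 12 months",
)
```

Using enums for the category and affected-party fields keeps the values limited to the options the template lists, which makes later filtering and reporting simpler.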

Generate misuse-case scenarios. For each intended use, systematically identify:

Out-of-scope deployment: Using the system for purposes adjacent to but outside the intended purpose (e.g., using a recruitment screener for contractor selection, which may involve different legal protections)
Misinterpretation of outputs: Treating probabilistic scores as deterministic verdicts; using confidence values without understanding calibration
Safeguard bypass: Deployers or operators who disable or ignore the human oversight mechanisms your system requires under Art.14
Adversarial input: Deliberate manipulation of inputs to achieve preferred outputs (e.g., keyword stuffing in CVs to game the classifier)
Context drift: Deploying a system trained on one population in a context where the population distribution has shifted (e.g., deploying a model trained on Western European applicants for a role with significant applicants from other regions)

For high-risk AI systems in critical infrastructure, essential public services, law enforcement, migration, or the administration of justice (Annex III, points 2 and 5 to 8), misuse-case analysis should also cover authorised-but-incorrect use: operators acting within their role definition but applying the system to purposes it is not validated for.
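One way to keep misuse-case generation systematic is to check that every intended use has at least one documented scenario in each of the five misuse categories listed above. The sketch below assumes scenarios are stored as (use case, category) pairs; the category keys and function name are hypothetical.

```python
MISUSE_CATEGORIES = {
    "out_of_scope_deployment",
    "misinterpretation_of_outputs",
    "safeguard_bypass",
    "adversarial_input",
    "context_drift",
}


def missing_misuse_categories(use_cases, scenarios):
    """Return {use_case: [categories with no documented scenario]}.

    use_cases: iterable of use-case names.
    scenarios: iterable of (use_case, category) pairs.
    """
    covered = {use_case: set() for use_case in use_cases}
    for use_case, category in scenarios:
        if category not in MISUSE_CATEGORIES:
            raise ValueError(f"unknown misuse category: {category}")
        if use_case not in covered:
            raise ValueError(f"scenario refers to unknown use case: {use_case}")
        covered[use_case].add(category)
    # Only report use cases that still have gaps.
    return {
        use_case: sorted(MISUSE_CATEGORIES - found)
        for use_case, found in covered.items()
        if MISUSE_CATEGORIES - found
    }


gaps = missing_misuse_categories(
    use_cases=["cv_screening"],
    scenarios=[
        ("cv_screening", "adversarial_input"),
        ("cv_screening", "context_drift"),
        ("cv_screening", "safeguard_bypass"),
    ],
)
```

A category that genuinely does not apply can still be recorded as a scenario with an explicit "not applicable" rationale, so the gap report stays empty only when every category was considered.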

Technique 2: Failure Mode and Effects Analysis (FMEA)

FMEA is a structured technique for identifying how components of a system can fail and what the downstream effects are. It is well-established in safety engineering and medical device regulation (ISO 14971), and it maps naturally onto Art.9 requirements for high-risk AI systems.

Why FMEA fits Art.9: FMEA forces you to think about failure modes at the component level, not just the system level. Art.9(2)(a) requires analysis of risks that the system "can pose", which means the regulation cares about failure pathways, not just end-state harms. A conformity assessor reviewing your RMS will want to see that you have traced risks back to identifiable failure modes in the system, not just listed harms abstractly.

Apply FMEA at three levels:

Data layer: Where can the training data, input data pipeline, or data preprocessing fail? Failure modes include: distribution shift between training and deployment populations (foreseeable risk for any demographic-sensitive application), label noise in training data (particularly relevant when ground truth was historically set by biased human decisions), missing values handled in ways that introduce systematic bias, and data preprocessing steps that inadvertently encode protected characteristics.

For each data failure mode, document:

Effect: What performance degradation or harm results?
Severity: How severe is the harm to affected persons?
Occurrence: How likely is this failure mode given the data pipeline design?
Detection: How easily can this failure be detected before or during deployment?

Model layer: Where can the model itself fail? Failure modes include: spurious correlations learned during training (the model uses a proxy for a protected characteristic), poor calibration (confidence scores do not reflect actual accuracy rates), brittleness on edge cases not represented in training data, and adversarial vulnerability where small input perturbations produce dramatically different outputs.

Document calibration as a failure mode. A poorly calibrated model that reports 90% confidence when it is actually correct 60% of the time creates a systematic risk of overreliance — operators and deployers will act on outputs with more confidence than is warranted. This is not a theoretical failure mode: overreliance on AI outputs is one of the primary documented harm pathways in deployed high-risk systems.
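Calibration can be quantified rather than asserted. A common summary is the expected calibration error (ECE): predictions are grouped into confidence bins, and the gap between mean confidence and observed accuracy is averaged across bins, weighted by bin size. The sketch below is one simple equal-width-bin variant, offered as an illustration rather than a required metric; the function name and the sample data are assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE for confidences in [0, 1] and boolean outcomes."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one prediction is required")

    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        if not 0.0 <= conf <= 1.0:
            raise ValueError(f"confidence out of range: {conf}")
        # A confidence of exactly 1.0 goes into the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(mean_conf - accuracy)
    return ece


# The scenario from the text: 90% reported confidence, 60% actual accuracy.
confidences = [0.9] * 10
correct = [True] * 6 + [False] * 4
ece = expected_calibration_error(confidences, correct)  # approximately 0.3
```

Recording a value like this per model version gives the FMEA calibration entry a measurable occurrence basis instead of a qualitative guess.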

Integration layer: Where can the system fail in the context of its deployment environment? Failure modes include: API rate limiting causing timeouts that result in default-pass or default-fail outcomes, failure to propagate system uncertainty to the operator interface (so operators cannot distinguish high-confidence from low-confidence outputs), integration configurations that disable or bypass required logging for Art.12 compliance, and deployment in contexts where the operator lacks the competence to interpret outputs correctly.

FMEA output format for the RMS:

Failure Mode ID: FM-[number]
Component: [data / model / integration]
Failure mode: [description of how the component fails]
Failure effect: [downstream harm to operators, deployers, or affected persons]
Severity (1-5): [1=negligible, 5=critical/irreversible]
Occurrence (1-5): [1=remote, 5=frequent]
Detection (1-5): [1=detectable immediately, 5=undetectable in normal operation]
Risk Priority Number: [S × O × D]
Linked mitigation: [reference to mitigation measure in RMS]
Residual RPN after mitigation: [recalculated]

Risk Priority Number (RPN) is a standard FMEA output. With three 1-5 scales it ranges from 1 to 125. It does not map directly onto Art.9's judgement of whether residual risk is acceptable, but it provides a structured basis for prioritising mitigation investment and for demonstrating to conformity assessors that risk analysis was systematic and quantified.
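The RPN calculation and ranking from the output format above can be sketched in a few lines. The class and field names are hypothetical, and the three sample failure modes are invented for illustration.

```python
from dataclasses import dataclass

COMPONENTS = {"data", "model", "integration"}


@dataclass(frozen=True)
class FailureMode:
    fm_id: str
    component: str
    severity: int    # 1 = negligible, 5 = critical/irreversible
    occurrence: int  # 1 = remote, 5 = frequent
    detection: int   # 1 = detectable immediately, 5 = undetectable

    def __post_init__(self) -> None:
        if self.component not in COMPONENTS:
            raise ValueError(f"unknown component: {self.component}")
        for name in ("severity", "occurrence", "detection"):
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")

    @property
    def rpn(self) -> int:
        # Risk Priority Number: S x O x D, ranging from 1 to 125.
        return self.severity * self.occurrence * self.detection


def prioritise(failure_modes):
    """Return failure modes ordered from highest to lowest RPN."""
    return sorted(failure_modes, key=lambda fm: fm.rpn, reverse=True)


modes = [
    FailureMode("FM-1", "data", severity=4, occurrence=3, detection=4),
    FailureMode("FM-2", "model", severity=5, occurrence=2, detection=5),
    FailureMode("FM-3", "integration", severity=3, occurrence=2, detection=1),
]
ranked = [fm.fm_id for fm in prioritise(modes)]
```

Because RPN is a product, a severity-5 failure with low occurrence can rank below a moderate one, so many teams additionally flag every severity-5 entry for review regardless of its RPN.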

Technique 3: Fundamental Rights Impact Assessment (FRIA)

Art.9(2) requires the RMS to consider risks to fundamental rights specifically. For many Annex III systems — particularly those in employment, education, credit scoring, public services, and law enforcement — fundamental rights risks are not subsidiary to safety risks; they are primary.

The EU AI Act does not define a specific FRIA methodology. ENISA guidance and the AI Office have aligned around a framework derived from Data Protection Impact Assessments (DPIAs) under GDPR Art.35, adapted for AI-specific risk vectors.

FRIA scope for Annex III systems:

Non-discrimination (EU Charter Art.21): Any system that makes or assists decisions about individuals across categories listed in Annex III must assess discrimination risk. Discrimination risk under Art.9 is not limited to intentional discrimination — disparate impact (where a facially neutral system produces systematically different outcomes for different demographic groups) is explicitly within scope. Document:

Protected characteristics relevant to the deployment context (age, sex, racial or ethnic origin, disability, religion, sexual orientation, as applicable under Art.21 and the Equal Treatment Directives)
Historical bias in training data that may encode past discrimination
Proxy variables that correlate with protected characteristics without being protected characteristics themselves (zip code as a proxy for race; job title as a proxy for gender)
Subpopulation performance metrics (accuracy, false positive rate, false negative rate) disaggregated by protected characteristic, where such data can be obtained
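The disaggregated metrics in the last item can be computed directly from labelled evaluation records. The sketch below assumes each record is a (group, true label, predicted label) triple with boolean labels; the group names and data are invented, and a real evaluation would need far larger samples per group.

```python
from collections import defaultdict


def rates_by_group(records):
    """Return accuracy, false positive rate and false negative rate per group.

    records: iterable of (group, y_true, y_pred) with boolean labels.
    A rate is None when its denominator is zero for that group.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for group, y_true, y_pred in records:
        # "t"/"f" says whether the prediction was right, "p"/"n" what it was.
        key = ("t" if y_true == y_pred else "f") + ("p" if y_pred else "n")
        counts[group][key] += 1

    result = {}
    for group, c in counts.items():
        negatives = c["fp"] + c["tn"]
        positives = c["tp"] + c["fn"]
        result[group] = {
            "accuracy": (c["tp"] + c["tn"]) / (positives + negatives),
            "fpr": c["fp"] / negatives if negatives else None,
            "fnr": c["fn"] / positives if positives else None,
        }
    return result


records = [
    ("A", True, True), ("A", False, False), ("A", False, True), ("A", True, True),
    ("B", True, False), ("B", True, False), ("B", False, False), ("B", True, True),
]
metrics = rates_by_group(records)
```

In this toy data, group B has a much higher false negative rate than group A, which is the kind of gap the FRIA should record together with its sample size and any mitigation.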

Privacy and data protection (Arts.7–8): Any system processing personal data — which includes virtually all Annex III systems — must assess privacy risks. These overlap with GDPR DPIA obligations but are not coextensive. Art.9 requires consideration of risks to the right to privacy, not just compliance with data protection processing rules. Document:

Inference risks (what does the system infer about individuals beyond what was directly provided?)
Re-identification risks (can outputs be used to re-identify individuals from ostensibly anonymised data?)
Aggregation risks (can multiple low-sensitivity outputs be combined to produce high-sensitivity inferences?)

Dignity and autonomy: Systems that make or assist decisions with significant life impact — employment, credit, education, housing, healthcare — create risks to human dignity and autonomy that are separate from and additional to discrimination risk. Document:

Whether the system supports meaningful explanations of decisions to affected persons (the Art.86 right to explanation, which depends on the transparency information provided to deployers under Art.13)
Whether affected persons have effective recourse to challenge decisions (linked to Art.14 and human oversight requirements)
Whether the system produces decisions at a scale or speed that effectively eliminates human consideration of individual circumstances

Structuring the FRIA output:

The FRIA output for the RMS should be structured as a rights-by-context matrix: for each relevant right (at minimum: non-discrimination, privacy, dignity/autonomy, fair process/effective remedy), document the specific risk, the affected population, the severity of the potential violation, and the mitigation measure. Link each FRIA finding back to the technical RMS (FMEA failure modes) where there is a technical root cause, and to the organisational mitigation measures (operator training, escalation procedures, human review thresholds) where the mitigation is procedural rather than technical.
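A rights-by-context matrix is easy to check mechanically for empty cells. The sketch below assumes the matrix is a dictionary keyed by (right, context) whose values hold the four documented fields; the key names, field names, and sample entries are illustrative assumptions.

```python
MINIMUM_RIGHTS = (
    "non-discrimination",
    "privacy",
    "dignity/autonomy",
    "fair process/effective remedy",
)
REQUIRED_FIELDS = ("risk", "affected_population", "severity", "mitigation")


def fria_gaps(matrix, contexts):
    """List (right, context, problem) for every missing or incomplete cell."""
    gaps = []
    for right in MINIMUM_RIGHTS:
        for context in contexts:
            cell = matrix.get((right, context))
            if cell is None:
                gaps.append((right, context, "no entry"))
                continue
            for field in REQUIRED_FIELDS:
                if not cell.get(field):
                    gaps.append((right, context, f"missing {field}"))
    return gaps


matrix = {
    ("non-discrimination", "cv_screening"): {
        "risk": "Lower shortlisting rate for older applicants",
        "affected_population": "Applicants aged 50+",
        "severity": "high",
        "mitigation": "FM-2",  # link to an FMEA failure mode
    },
    ("privacy", "cv_screening"): {
        "risk": "Inference of health status from career gaps",
        "affected_population": "All applicants",
        "severity": "medium",
        "mitigation": "",  # not yet assigned
    },
}
gaps = fria_gaps(matrix, contexts=["cv_screening"])
```

The mitigation field can hold either an FMEA reference for technical root causes or a procedure reference for organisational measures, matching the two linkage paths described above.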

Technique 4: Sector-Specific Foreseeable Risk Libraries

Annex III defines eight sectors. Each has documented foreseeable risk patterns drawn from prior deployment experience, academic research, and regulatory enforcement. Your risk identification should incorporate sector-specific risk patterns, not rely solely on generic analysis.

Annex III Point 1 — Biometric identification and categorisation: Foreseeable risks include: differential accuracy across demographic groups (documented across facial recognition systems — false match rates are typically highest for darker-skinned individuals and women in systems trained on non-representative datasets); function creep (systems deployed for one biometric purpose used for another); consent failures in public-space deployments.

Annex III Point 2 — Critical infrastructure: Foreseeable risks include: single-point-of-failure risks where AI-assisted control decision failures cascade; adversarial attack on AI inputs used to induce incorrect control actions; over-automation where operators lack the situational awareness to override incorrect AI outputs.

Annex III Point 3 — Education and vocational training: Foreseeable risks include: differential performance on assessments for students with different socioeconomic backgrounds, learning disabilities, or language profiles; lock-in effects where early-career categorisation by AI limits later educational opportunities; opacity in assessment rationale affecting students' ability to improve.

Annex III Point 4 — Employment, workers management, access to self-employment: Foreseeable risks include: disparate impact in CV screening across race, gender, age, and disability; performance monitoring that creates illegal working conditions (e.g., continuous monitoring that amounts to surveillance in violation of Art.8 EU Charter); gig-economy algorithmic management that lacks transparency required by Arts.13–14.

Annex III Point 5 — Access to essential private services and public benefits: Foreseeable risks include: systematic exclusion from credit, insurance, or housing for protected groups; opacity preventing individuals from understanding why they are denied; appeal processes that are formally available but not practically accessible.

Annex III Point 6 — Law enforcement: Foreseeable risks include: face recognition false matches leading to wrongful stops, searches, or arrests; risk scoring tools that produce discriminatory policing patterns; translation or communication AI that fails for speakers of minority languages, compromising fair trial rights.

Annex III Points 7–8 — Migration, asylum, border control; judicial and democratic processes: Foreseeable risks include: systemic errors in identity verification affecting asylum seekers; AI-assisted judicial scoring that violates EU Charter Art.47 (right to a fair trial); election-related AI that undermines democratic participation.

For each Annex III point, document which sector-specific foreseeable risks apply to your system and why. Even if a sector-level risk does not apply to your specific implementation, document the exclusion and the reasoning — this demonstrates that the risk identification was comprehensive rather than cherry-picked.
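The applies-or-excluded rule can be enforced with a small check: every risk in the sector library must have an assessment, and every assessment must carry reasoning whether the risk applies or not. The data layout and names below are assumptions made for this sketch.

```python
def undocumented_sector_risks(library, assessments):
    """Return problems for library risks that lack a reasoned assessment.

    library: iterable of sector risk names.
    assessments: dict of risk name -> {"applies": bool, "reasoning": str}.
    """
    problems = []
    for risk in library:
        entry = assessments.get(risk)
        if entry is None:
            problems.append(f"{risk}: not assessed")
        elif not isinstance(entry.get("applies"), bool):
            problems.append(f"{risk}: applicability not decided")
        elif not entry.get("reasoning", "").strip():
            problems.append(f"{risk}: no reasoning recorded")
    return problems


# Hypothetical excerpt of an Annex III point 4 (employment) library.
library = ["disparate_impact_cv_screening", "performance_monitoring", "gig_management"]
assessments = {
    "disparate_impact_cv_screening": {
        "applies": True,
        "reasoning": "System ranks CVs for shortlisting.",
    },
    "performance_monitoring": {
        "applies": False,
        "reasoning": "System has no access to post-hire data.",
    },
}
problems = undocumented_sector_risks(library, assessments)
```

Treating an exclusion without reasoning as a failure is what turns the library from a reading list into evidence of comprehensive identification.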

Documenting Risk Identification for Conformity Assessment

Risk identification documentation has two functions: it is an input to the rest of the RMS (feeding into mitigation selection, test design, and residual risk disclosure), and it is evidence for conformity assessment. How you document risk identification therefore needs to satisfy both consumers.

Structure the risk catalogue as a living artefact. Do not produce a static PDF risk register. Conformity assessors expect to see a version-controlled risk catalogue with a change history that demonstrates the RMS ran throughout the development lifecycle — not a document produced in the final weeks before submission.

Version control your risk catalogue alongside your codebase. Every sprint that changes a feature with risk implications should produce an updated risk catalogue version. Use semantic versioning: major version increment when new Annex III risks are identified, minor version for risk re-estimation updates, patch for documentation corrections.
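The versioning rule above is mechanical enough to automate. The sketch below assumes plain MAJOR.MINOR.PATCH strings and three hypothetical change-type labels.

```python
def bump_catalogue_version(version: str, change: str) -> str:
    """Apply the catalogue versioning rule to a MAJOR.MINOR.PATCH string.

    change: "new_risk" (major), "re_estimation" (minor),
            or "doc_correction" (patch).
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "new_risk":
        return f"{major + 1}.0.0"
    if change == "re_estimation":
        return f"{major}.{minor + 1}.0"
    if change == "doc_correction":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change}")


next_version = bump_catalogue_version("1.4.2", "new_risk")
```

Tagging the catalogue commit with the resulting version gives assessors a change history they can correlate with the code history.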

Each risk entry must include:

Unique risk identifier (stable across versions)
Risk source (which technique identified it: use-case analysis, FMEA, FRIA, sector library)
Risk description (harm, affected party, triggering condition)
Art.9(2)(a) compliance statement (is this a known risk or a foreseeable risk? what is the knowledge basis?)
Preliminary estimation (likelihood and severity before mitigation)
Linked failure mode (FMEA reference, if applicable)
Linked fundamental right (FRIA reference, if applicable)
Status (open / mitigated / accepted with rationale)

The completeness declaration. At the conclusion of each risk identification cycle, include a written completeness declaration: a statement that the identified risks cover all Annex III categories applicable to your system, all known risks from deployment experience, and all foreseeable risks identified through the applied techniques. The completeness declaration is not a guarantee — new risks can emerge — but it demonstrates that the identification process was bounded and systematic.

Linking risk identification to mitigation. For the conformity assessor, every identified risk should have a visible link to either (a) a mitigation measure in the RMS, (b) a residual risk acceptance with rationale, or (c) a risk transfer to the deployer via the instructions for use (Art.13). Risks that float in the catalogue without one of these three dispositions will fail conformity review.
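Floating risks are simple to detect automatically. The following sketch assumes each catalogue entry is a dictionary with three optional reference fields, one per disposition; the field names are hypothetical.

```python
DISPOSITION_FIELDS = (
    "mitigation_ref",        # (a) mitigation measure in the RMS
    "acceptance_rationale",  # (b) residual risk acceptance with rationale
    "ifu_disclosure_ref",    # (c) transfer via the instructions for use
)


def unlinked_risks(catalogue):
    """Return the IDs of risks that have none of the three dispositions."""
    return [
        entry["risk_id"]
        for entry in catalogue
        if not any(entry.get(field) for field in DISPOSITION_FIELDS)
    ]


catalogue = [
    {"risk_id": "R-001", "mitigation_ref": "M-12"},
    {"risk_id": "R-002", "acceptance_rationale": "Likelihood remote; see review 3"},
    {"risk_id": "R-003"},
    {"risk_id": "R-004", "ifu_disclosure_ref": ""},
]
floating = unlinked_risks(catalogue)
```

Running a check like this in CI means an unlinked risk blocks a merge instead of surfacing during conformity review.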

The Foreseeable Misuse Documentation Requirement

Art.9(2)(b) creates a specific obligation for foreseeable misuse: the RMS must include "the estimation and evaluation of the risks that may emerge when the high-risk AI system is used in accordance with its intended purpose, and under conditions of reasonably foreseeable misuse." This is not optional, and a general risk register does not cover it: it requires a separate, explicit misuse analysis.

Categorise foreseeable misuse scenarios at three levels:

Level 1 — Adjacent use: Uses that are outside the intended purpose but within the same general domain. An employment screening tool used for contractor selection, or a medical imaging AI used for screening populations outside the validated demographic range. These create risk because the system's validation does not cover the new use case but the system will behave similarly enough that operators may not recognise the gap.

Level 2 — Operator error: Foreseeable errors by intended operators acting in good faith. Misinterpreting confidence scores; applying manual override thresholds inconsistently; failing to follow human oversight procedures under time pressure. These risks are within your control — they can be mitigated through interface design, training requirements, and operator competency thresholds documented in the instructions for use.

Level 3 — Adversarial misuse: Deliberate manipulation of inputs or deployment context to produce desired outputs. For high-stakes decisions (employment, credit, law enforcement), adversarial input manipulation is a foreseeable risk in any system whose output affects individual interests. Document the attack surface: what inputs does the system process, which are controllable by affected persons, and what is the feasibility of manipulation?

Residual misuse risks and Art.13 coupling. After applying technical and organisational mitigations to foreseeable misuse scenarios, the remaining residual risks must be communicated to deployers in the instructions for use. This creates a direct coupling between the misuse analysis in your RMS and the Art.13 documentation. Every residual misuse risk should have a corresponding instruction in the Art.13 package that tells deployers how to operate the system in a way that manages that risk.
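The coupling can be verified in both directions: every residual misuse risk needs an instruction, and every misuse-related instruction should trace back to a risk that still exists. The sketch below assumes risks and instructions are identified by string IDs; the identifiers are invented.

```python
def coupling_report(residual_misuse_risks, instructions):
    """Check the link between residual misuse risks and instructions for use.

    residual_misuse_risks: set of risk IDs remaining after mitigation.
    instructions: dict of instruction ID -> set of risk IDs it addresses.
    """
    addressed = set().union(*instructions.values())
    return {
        # Risks the deployer is never told how to manage.
        "risks_without_instruction": sorted(residual_misuse_risks - addressed),
        # Instructions that no longer correspond to any residual risk.
        "instructions_without_risk": sorted(
            instruction_id
            for instruction_id, risk_ids in instructions.items()
            if not risk_ids & residual_misuse_risks
        ),
    }


report = coupling_report(
    residual_misuse_risks={"MR-01", "MR-02", "MR-03"},
    instructions={
        "IFU-4.1": {"MR-01"},
        "IFU-4.2": {"MR-02", "MR-09"},
        "IFU-4.3": {"MR-07"},
    },
)
```

An instruction without a matching risk is not necessarily wrong, but it usually means the catalogue and the Art.13 package have drifted apart and one of them needs updating.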

Practical Implementation: Risk Identification in a CI/CD Pipeline

Art.9 requires a continuous RMS, which means risk identification cannot be a once-at-launch exercise. For teams operating a CI/CD pipeline for a high-risk AI system, here is a practical integration pattern.

Gate 1 — Feature-level risk triage (at PR creation): Any PR that modifies model architecture, training data, feature engineering, output schema, or deployment configuration should trigger a risk triage step. This does not require a full FMEA run — it requires a structured answer to: "Does this change affect any identified risks in the risk catalogue? Does it introduce new failure modes or affect any fundamental rights risk dimensions?"

Automate this as a PR checklist enforced by your CI pipeline. The checklist is lightweight but mandatory: it creates the evidentiary trail that the RMS was active at the code-change level, not just at quarterly review milestones.
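A minimal sketch of such a gate follows. It assumes a repository layout in which risk-relevant code lives under a handful of directories and that the PR checklist arrives as a dictionary of yes/no answers; the path patterns and question keys are hypothetical and would need to match your own repository.

```python
from fnmatch import fnmatchcase

# Hypothetical paths covering model, data, features, output schema, deployment.
RISK_RELEVANT_PATTERNS = (
    "model/*",
    "data/*",
    "features/*",
    "schemas/output*",
    "deploy/*",
)
TRIAGE_QUESTIONS = (
    "affects_existing_risks",
    "introduces_failure_modes",
    "affects_fundamental_rights",
)


def triage_required(changed_files):
    """True if any changed path matches a risk-relevant pattern."""
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in RISK_RELEVANT_PATTERNS
    )


def gate_passes(changed_files, checklist):
    """Pass when no triage is needed or every question has an explicit answer."""
    if not triage_required(changed_files):
        return True
    # An answer must be an explicit True or False; missing or None fails.
    return all(isinstance(checklist.get(q), bool) for q in TRIAGE_QUESTIONS)


ok = gate_passes(
    ["model/arch/encoder.py", "docs/readme.md"],
    {
        "affects_existing_risks": True,
        "introduces_failure_modes": False,
        "affects_fundamental_rights": False,
    },
)
```

The gate only checks that the questions were answered, not that the answers are right; a "yes" answer would normally route the PR to whoever owns the risk catalogue.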

Gate 2 — Sprint-level risk catalogue update: At each sprint boundary, review the risk catalogue for:

New risks identified during the sprint (from testing, from operator feedback, from monitoring)
Changes to risk estimation based on sprint test results
New failure modes identified through integration testing

This produces the version history that conformity assessors expect to see.
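The sprint-level review can start from an automatic diff between two catalogue versions. This sketch assumes each version is a dictionary mapping risk ID to a (likelihood, severity) pair; the layout and sample values are illustrative.

```python
def catalogue_diff(previous, current):
    """Summarise changes between two catalogue versions.

    previous, current: dict of risk ID -> (likelihood, severity).
    """
    shared = previous.keys() & current.keys()
    return {
        "added": sorted(current.keys() - previous.keys()),
        "removed": sorted(previous.keys() - current.keys()),
        "re_estimated": sorted(
            risk_id for risk_id in shared if previous[risk_id] != current[risk_id]
        ),
    }


previous = {"R-001": (2, 4), "R-002": (3, 3), "R-003": (1, 5)}
current = {"R-001": (2, 4), "R-002": (2, 3), "R-004": (3, 2)}
diff = catalogue_diff(previous, current)
```

A removed risk deserves a written justification in the change history, since stable identifiers are meant to persist across versions.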

Gate 3 — Deployment-level risk snapshot: Before each production deployment, produce a risk snapshot: the current state of the risk catalogue, the current set of open risks, and the assertion that no deployment is proceeding with an unmitigated P1 (critical severity) risk. The risk snapshot becomes part of the deployment artefact, linked to the specific model version being deployed.
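The deployment gate can be sketched as a function that either refuses to proceed or returns a snapshot to attach to the release. The entry fields, the "P1" label, and the use of a SHA-256 digest to pin the catalogue state are assumptions made for this illustration.

```python
import hashlib
import json


def risk_snapshot(catalogue, catalogue_version, model_version):
    """Build a deployment risk snapshot, refusing if a P1 risk is still open.

    catalogue: list of dicts with "risk_id", "status" and "severity" keys.
    """
    open_risks = [entry for entry in catalogue if entry["status"] == "open"]
    blocking = sorted(e["risk_id"] for e in open_risks if e["severity"] == "P1")
    if blocking:
        raise RuntimeError(f"deployment blocked by unmitigated P1 risks: {blocking}")

    # Hash a canonical serialisation so the snapshot pins the exact catalogue state.
    payload = json.dumps(catalogue, sort_keys=True).encode("utf-8")
    return {
        "catalogue_version": catalogue_version,
        "model_version": model_version,
        "open_risk_ids": sorted(e["risk_id"] for e in open_risks),
        "catalogue_sha256": hashlib.sha256(payload).hexdigest(),
    }


catalogue = [
    {"risk_id": "R-001", "status": "mitigated", "severity": "P1"},
    {"risk_id": "R-002", "status": "open", "severity": "P3"},
]
snapshot = risk_snapshot(catalogue, catalogue_version="2.1.0", model_version="m-0.9.3")
```

Storing the snapshot next to the model artefact lets you answer later which risks were known and open at the moment a given version went live.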

Hosting and jurisdiction note: Risk catalogue documentation, FMEA outputs, and FRIA findings constitute technical documentation under Art.11 and must be maintained by the provider. They are subject to market surveillance access under Art.74 — national competent authorities can request them. If your documentation storage infrastructure is hosted on US-parent cloud providers, Art.74 access rights for EU authorities and CLOUD Act reach for US authorities may conflict. EU-hosted infrastructure (providers without a US parent entity, deployed on Hetzner Germany or equivalent) eliminates this conflict point.

What the Conformity Assessor Checks

Understanding what a Notified Body or conformity assessor reviews during risk management system evaluation helps you structure documentation more effectively. Based on the AI Office's published conformity assessment guidance and the standards ISO/IEC 23894:2023 and ISO/IEC 42001:2023, the main checks are:

Completeness check: Does the risk identification cover all applicable Annex III risk categories? Are sector-specific foreseeable risks addressed? Is foreseeable misuse analysed?

Process traceability: Can the assessor trace which risk identification technique produced each risk entry? Is the identification process documented (not just the outcomes)?

Lifecycle coverage: Does the version history of the risk catalogue demonstrate that identification ran throughout the development lifecycle? Or was the catalogue produced close to submission?

Fundamental rights specificity: Are fundamental rights risks identified with specificity — named rights, named affected populations, named scenarios — rather than general acknowledgement that fundamental rights exist?

Misuse completeness: Are foreseeable misuse scenarios documented with enough specificity to derive mitigation measures?

Linkage coherence: Is every identified risk linked to either a mitigation measure, a residual risk acceptance, or an Art.13 disclosure? Are any risks floating unlinked?

The most common gap in notified body reviews of RMS documentation is not that risks were missed (though that also occurs) — it is that the linkage between identified risks and subsequent RMS activities is incomplete or absent. Risks that appear in the catalogue but have no visible path to mitigation, testing, or residual risk disclosure are red flags.

What's Next in This Series

Post 3 covers the testing requirements under Art.9(6)–(8): what Art.9 requires from your test procedures (not just "testing was done"), how to structure subpopulation testing documentation, what Art.9(8) says about test timing, predefined metrics, and probabilistic thresholds, and how test results feed back into risk estimation in the RMS.

Post 4 covers continuous monitoring integration: how Art.9's "continuous and iterative process" requirement maps onto post-market monitoring under Art.72, what constitutes a monitoring-triggered risk review, and how to close the loop between production monitoring and RMS updates.

Post 5 covers the documentation package for conformity assessment: what a complete Art.9 documentation submission looks like, how to structure it for Notified Body review, and the Art.11 technical documentation coupling that makes or breaks the submission.

The August 2026 deadline for high-risk AI system compliance is 54 days away. Published as part of sota.io's EU AI Act compliance guide library, with 1,600+ articles covering every major EU digital regulation.
