2026-04-16 · 14 min read

NIS2 Art.21(2)(f): Effectiveness Assessment of Cybersecurity Measures — SaaS Developer Guide (2026)

A security control that has never been tested may as well not exist. NIS2 Directive Art.21(2)(f) encodes this principle into law: essential and important entities must maintain formal policies and procedures for assessing the effectiveness of cybersecurity risk-management measures. Not deploying them — assessing whether they actually work.

This is the feedback loop that makes the other nine Art.21(2) measures meaningful. Without it, organisations operate on assumption rather than evidence. Pentests are skipped because "nothing has happened." SIEM rules are never validated against real attack patterns. Backup restores are documented but untested. NCA auditors in June 2026 will probe exactly this gap.

This guide builds the Art.21(2)(f) framework for SaaS development teams: KPI definitions, testing cadence, measurement methodology, and audit-grade evidence collection.


1. Art.21(2)(f) in the Full NIS2 Context

NIS2 Art.21(2) mandates ten cybersecurity risk-management measures. Art.21(2)(f) is the sixth — and the only one explicitly focused on measuring the other nine.

The Ten Mandatory Measures

| Subparagraph | Requirement | Primary Owner |
|---|---|---|
| Art.21(2)(a) | Risk analysis and information system security policies | CISO / Management |
| Art.21(2)(b) | Incident handling (see incident handling guide) | SOC / DevSecOps |
| Art.21(2)(c) | Business continuity, backup management, disaster recovery (see BCM guide) | Ops / SRE |
| Art.21(2)(d) | Supply chain security | Procurement / DevSecOps |
| Art.21(2)(e) | Security in acquisition, development and maintenance (see SDL guide) | Engineering |
| Art.21(2)(f) | Policies to assess effectiveness of cybersecurity measures | Audit / GRC |
| Art.21(2)(g) | Basic cyber hygiene and training | HR / Security Awareness |
| Art.21(2)(h) | Cryptography and encryption policies (see cryptography guide) | Architecture |
| Art.21(2)(i) | HR security, access control and asset management (see IAM guide) | IT / HR / Engineering |
| Art.21(2)(j) | Multi-factor authentication and continuous authentication (see MFA guide) | IT / IAM / Engineering |

Art.21(2)(f) does not introduce new security controls — it requires proof that the other controls work. It is the audit engine of Art.21(2).

The Exact Regulatory Text

Art.21(2)(f) requires:

"policies and procedures to assess the effectiveness of cybersecurity risk-management measures"

ENISA's technical guidelines and the ENISA NIS2 Implementation Guidance (2023) expand this into four operational requirements:

  1. Defined metrics — measurable indicators for each major control category
  2. Testing cadence — scheduled and ad-hoc assessments (pentests, tabletops, restore tests)
  3. Evidence trail — documented results that survive NCA inspection
  4. Remediation tracking — findings linked to closure deadlines and owner accountability

"Policy exists" is not sufficient. Auditors ask for the last test date, the findings, and the remediation status.
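
The auditor's three questions — last test date, findings, remediation status — together with the four ENISA requirements suggest a minimal per-control record. A sketch with illustrative field names, not a regulatory schema:

```python
from dataclasses import dataclass, field

@dataclass
class AssessmentRecord:
    """Per-control answer to the auditor's three questions.
    Field names are illustrative, not a regulatory schema."""
    control: str
    metric: str                     # requirement 1: defined metric
    cadence_days: int               # requirement 2: testing cadence
    last_test_date: str             # requirement 3: evidence trail
    open_findings: list[str] = field(default_factory=list)
    remediation_due: dict[str, str] = field(default_factory=dict)  # requirement 4

rec = AssessmentRecord(
    control="Backup Integrity",
    metric="Restore success rate",
    cadence_days=90,
    last_test_date="2026-03-15",
    open_findings=["BRT-2026-Q1-F1: RTO margin below 25%"],
    remediation_due={"BRT-2026-Q1-F1": "2026-04-30"},
)
```

Anything not representable in a structure like this — an undated pentest, a finding with no deadline — is exactly what an auditor will flag.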


2. Why SaaS Developers Are the Primary Audience

SaaS organisations face a specific challenge with Art.21(2)(f): the controls being assessed span multiple layers — infrastructure (often cloud-provider managed), application code, CI/CD pipelines, identity systems, and operational processes. Effectiveness assessment must reach all layers.

The Cloud-Shared-Responsibility Gap

Cloud providers offer security features; they do not test your use of them. Your SIEM may ingest CloudTrail logs — but are the detection rules tuned? Your WAF may be enabled — but is it blocking OWASP Top 10? Your KMS may hold encryption keys — but are key rotation policies enforced and verified?

Art.21(2)(f) requires organisations to answer these questions with evidence, not assumptions.

The Developer's Role in Assessment

Developers create and own the controls that Art.21(2)(f) assesses: SIEM detection rules, CI/CD security gates, backup jobs, IAM configurations, and encryption settings.

Developers must understand what "effectiveness" means for each control they build — and instrument it to be measurable.


3. Building the KPI Framework

3.1 Control Categories and Measurement Dimensions

| Control Category | Art.21(2) Link | Primary KPI | Secondary KPI |
|---|---|---|---|
| Incident Detection | (b) | Mean Time to Detect (MTTD) | Alert-to-ticket conversion rate |
| Incident Response | (b) | Mean Time to Contain (MTTC) | Post-Incident Review completion rate |
| Backup Integrity | (c) | Restore success rate (last 90 days) | RTO/RPO met vs target |
| BCM Testing | (c) | Tabletop exercises per year | DR drill pass rate |
| Supply Chain | (d) | Dependency CVE resolution time | SBOM freshness (days since last update) |
| SDL Security | (e) | Critical/High findings per release | SAST/DAST pass rate in pipeline |
| Access Control | (i) | Access review completion rate | Privileged account ratio |
| Offboarding | (i) | Offboarding SLA met (<24h) | Orphaned account rate |
| MFA Coverage | (j) | MFA enrollment rate (privileged) | MFA bypass incident rate |
| Key Rotation | (h) | Key rotation compliance rate | Secrets expiry breach count |
| Pentest Coverage | (f) | Critical/High findings remediated | Time-to-remediate (P1: ≤14d, P2: ≤30d) |
| Vulnerability Management | (f) | Mean Time to Remediate (MTTR) | Open Critical CVEs >30d |
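
As an example of how a primary KPI is computed: MTTD is the mean gap between occurrence and detection timestamps. A minimal sketch, assuming incidents are exported as dicts with ISO-8601 timestamps (the field names are an assumption):

```python
from datetime import datetime

def mean_time_to_detect(incidents: list[dict]) -> float:
    """Average hours between occurrence and detection.
    Each incident: {'occurred_at': ISO-8601, 'detected_at': ISO-8601}."""
    hours = [
        (datetime.fromisoformat(i["detected_at"])
         - datetime.fromisoformat(i["occurred_at"])).total_seconds() / 3600
        for i in incidents
    ]
    return sum(hours) / len(hours)

incidents = [
    {"occurred_at": "2026-03-01T10:00:00", "detected_at": "2026-03-01T12:00:00"},
    {"occurred_at": "2026-03-05T08:00:00", "detected_at": "2026-03-05T14:00:00"},
]
# 2 h and 6 h gaps → MTTD = 4.0 hours, exactly at the ≤4 h target
```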

3.2 KPI Targets for June 2026 NCA Audit Readiness

| KPI | Target | NCA Audit Threshold |
|---|---|---|
| MTTD (critical incidents) | ≤4 hours | Documented SLA |
| MTTC (critical incidents) | ≤8 hours | Documented SLA |
| Backup restore success rate | 100% (last 3 tests) | Tested within 90 days |
| Dependency CVE P1 resolution | ≤7 days | Policy + evidence |
| Pentest P1 remediation | 100% within 14 days | Pentest report + fix commits |
| MFA enrollment (privileged) | 100% | Configuration export |
| Offboarding SLA | ≥95% within 24h | HRIS/IAM audit trail |
| Access review completion | 100% quarterly | Certification campaign logs |
| SAST critical findings per release | 0 (block gate) | Pipeline configuration |

These are the thresholds auditors consider baseline compliance. Organisations below them face recommendations or — for essential entities — enforcement measures.
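
Checking measured values against such thresholds is mechanical once the direction of each KPI is known (lower is better for MTTD, higher for MFA enrollment). A sketch with illustrative KPI names:

```python
def check_kpi_targets(
    measured: dict[str, float],
    targets: dict[str, float],
    higher_is_better: dict[str, bool],
) -> dict[str, bool]:
    """Per-KPI pass/fail against the audit target."""
    return {
        k: measured[k] >= t if higher_is_better[k] else measured[k] <= t
        for k, t in targets.items()
    }

status = check_kpi_targets(
    measured={"mttd_hours": 3.5, "mfa_privileged_pct": 97.0},
    targets={"mttd_hours": 4.0, "mfa_privileged_pct": 100.0},
    higher_is_better={"mttd_hours": False, "mfa_privileged_pct": True},
)
# {'mttd_hours': True, 'mfa_privileged_pct': False} — one KPI below baseline
```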

3.3 Metric Collection Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    Effectiveness KPI Pipeline                    │
│                                                                 │
│  Data Sources          Aggregation          Dashboard           │
│  ─────────────         ───────────          ─────────           │
│  SIEM alerts    ──►   Metric Store    ──►  GRC Dashboard        │
│  CI/CD pipeline ──►   (TimescaleDB    ──►  Monthly Reports      │
│  HRIS events    ──►    or similar)    ──►  NCA Evidence Pack     │
│  Pentest reports──►                   ──►  Trend Analysis        │
│  Backup logs    ──►                                              │
│  IAM audit logs ──►                                              │
└─────────────────────────────────────────────────────────────────┘

Each data source feeds structured events. The aggregation layer normalises them into KPI time series. The output is three artefacts: a live dashboard (internal), monthly reports (management), and an evidence pack (NCA audit).
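
The normalisation step can be sketched as bucketing heterogeneous source events into per-KPI monthly series. The event shape here is an assumption, not a fixed schema:

```python
from collections import defaultdict

def to_kpi_series(events: list[dict]) -> dict[tuple[str, str], float]:
    """Normalise source events into (kpi, YYYY-MM) -> monthly average.
    Assumed event shape: {'kpi': str, 'ts': ISO-8601, 'value': float}."""
    buckets: dict[tuple[str, str], list[float]] = defaultdict(list)
    for e in events:
        buckets[(e["kpi"], e["ts"][:7])].append(e["value"])
    # average per month; a real metric store would also keep raw points
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}

series = to_kpi_series([
    {"kpi": "mttd_hours", "ts": "2026-03-02T10:00", "value": 2.0},
    {"kpi": "mttd_hours", "ts": "2026-03-20T09:00", "value": 6.0},
])
# {('mttd_hours', '2026-03'): 4.0}
```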


4. Pentesting and Vulnerability Scanning Cadence

4.1 Minimum Cadence Requirements

Art.21(2)(f) does not specify test frequencies — but ENISA guidance, DORA Art.26 (for financial sector entities also under DORA), and NCA enforcement practice converge on practical minimums:

| Assessment Type | Minimum Frequency | Trigger Conditions |
|---|---|---|
| External penetration test | Annual | + After major infrastructure changes |
| Internal penetration test / red team | Annual | + After significant application releases |
| Vulnerability scan (authenticated) | Weekly | + After dependency updates |
| SAST pipeline scan | Every commit | — |
| DAST scan | Every release | — |
| Third-party/supply chain review | Annual | + After new critical vendors |
| BCM/DR tabletop exercise | Annual | — |
| Backup restore test | Every 90 days | + After backup system changes |
| Social engineering / phishing simulation | Semi-annual | — |

For essential entities (typically: DNS, TLD operators, cloud providers, datacentre operators, CDN, trust service providers, public electronic communications networks), NCA practice suggests annual external pentests are a minimum floor, not a maximum.
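
A simple scheduler check keeps this cadence enforceable: compare each assessment type's last run against its minimum interval. The mapping below mirrors a subset of the table and is illustrative:

```python
from datetime import date

# Minimum cadence in days — an illustrative subset of the table above
CADENCE_DAYS = {
    "external_pentest": 365,
    "vuln_scan": 7,
    "backup_restore_test": 90,
    "phishing_simulation": 182,
}

def overdue_assessments(last_run: dict[str, str], today: date) -> list[str]:
    """Assessment types never run, or whose last run exceeds the cadence."""
    return [
        name for name, max_days in CADENCE_DAYS.items()
        if name not in last_run
        or (today - date.fromisoformat(last_run[name])).days > max_days
    ]

gaps = overdue_assessments(
    {"external_pentest": "2025-03-01", "backup_restore_test": "2026-02-10"},
    today=date(2026, 4, 16),
)
# external pentest is 411 days old; two assessment types were never run
```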

4.2 Pentest Scope Definition

Effective pentests require documented scope. An undocumented scope prevents meaningful year-over-year comparison — auditors will notice.

Minimum scope definition fields:

pentest_scope:
  id: "PT-2026-01"
  date_range: "2026-03-01 to 2026-03-05"
  methodology: "PTES + OWASP WSTG v4.2"
  provider: "Accredited third party (BSI-certified / CREST / OSCP)"
  scope:
    in_scope:
      - "api.example.com (production)"
      - "admin.example.com"
      - "CI/CD pipeline (GitLab)"
      - "AWS account ID: 123456789012 (eu-west-1)"
    out_of_scope:
      - "Third-party SaaS (Stripe, Sendgrid)"
      - "DDoS simulation"
  rules_of_engagement:
    destructive_allowed: false
    data_exfiltration_sim_allowed: true
    social_engineering_allowed: false
  findings_classification:
    critical: "CVSS ≥9.0 OR direct RCE/data-breach path"
    high: "CVSS 7.0–8.9"
    medium: "CVSS 4.0–6.9"
    low: "CVSS <4.0"
  remediation_sla:
    critical: "14 calendar days"
    high: "30 calendar days"
    medium: "90 calendar days"

NCA auditors expect this level of documentation. An informal pentest with a PDF report and no tracking is insufficient.

4.3 Vulnerability Management Workflow

Discovery          Triage              Remediation          Verification
─────────          ──────              ───────────          ────────────
CVE feed     ──►  Severity score  ──►  Owner assign   ──►  Retest / scan
Pentest      ──►  Exploit assess  ──►  Fix commit     ──►  CVSS re-eval
SAST/DAST    ──►  Business risk   ──►  Deploy         ──►  Close ticket
Dep scanner  ──►  Priority rank   ──►  Rollout        ──►  Record KPI

Key process requirements:

  1. Every finding receives a severity score and a named owner at triage
  2. Remediation deadlines follow the documented SLA classification (e.g. P1 ≤14d, P2 ≤30d)
  3. Closure requires verification — a retest or rescan, not just a fix commit
  4. Each closed finding feeds the MTTR and SLA-compliance KPIs

4.4 Automated Dependency Vulnerability Tracking

# Example: Dependabot + Slack/Telegram integration for SLA monitoring
import datetime

def check_vuln_sla(findings: list[dict]) -> list[dict]:
    """
    Returns findings that have breached or are about to breach SLA.
    Each finding: {'id', 'severity', 'discovered_at', 'status', 'assignee'}
    """
    SLA_DAYS = {'critical': 7, 'high': 30, 'medium': 90, 'low': 180}
    now = datetime.datetime.utcnow()
    breaches = []
    for f in findings:
        sla = SLA_DAYS.get(f['severity'].lower(), 90)
        discovered = datetime.datetime.fromisoformat(f['discovered_at'])
        age_days = (now - discovered).days
        if f['status'] != 'closed' and age_days >= sla:
            breaches.append({**f, 'age_days': age_days, 'sla_days': sla})
    return breaches

This pattern — continuous SLA monitoring with alerting — is what auditors mean by "procedures to assess effectiveness." Manual spreadsheet reviews are not sufficient for organisations processing above NIS2 thresholds.


5. NIST CSF Integration and Measurement

NIST CSF 2.0 provides the most widely accepted measurement vocabulary for cybersecurity effectiveness. NCA auditors across DE, NL, and AT increasingly accept NIST CSF maturity scoring as evidence of structured effectiveness assessment.

5.1 NIST CSF 2.0 Functions and NIS2 Mapping

| NIST CSF 2.0 Function | Key Categories | NIS2 Art.21(2) Links |
|---|---|---|
| Govern (GV) | Organisational Context, Risk Management Strategy | (a) risk analysis |
| Identify (ID) | Asset Management, Risk Assessment, Improvement | (a), (f) |
| Protect (PR) | Identity Management, Data Security, Platform Security | (h), (i), (j) |
| Detect (DE) | Continuous Monitoring, Adverse Event Analysis | (b), (f) |
| Respond (RS) | Incident Management, Communication, Analysis | (b) |
| Recover (RC) | Incident Recovery, Communication | (c) |

For Art.21(2)(f) specifically, the relevant NIST categories are:

  1. ID.IM (Improvement) — lessons from assessments feed back into the security programme
  2. DE.CM (Continuous Monitoring) — controls are monitored for drift between point-in-time tests
  3. DE.AE (Adverse Event Analysis) — detection effectiveness is analysed against real and simulated events

5.2 Maturity Tiers for NCA Audit

| Tier | Description | NIS2 Compliance Implication |
|---|---|---|
| Tier 1 — Partial | Ad-hoc, reactive. No documented assessment cadence. | Non-compliant — findings expected |
| Tier 2 — Risk Informed | Some processes exist but not enterprise-wide. | Borderline — depends on entity classification |
| Tier 3 — Repeatable | Policies documented, assessment cadence defined, results tracked. | Compliant baseline for important entities |
| Tier 4 — Adaptive | Continuous improvement loop. KPIs trend over time. Lessons feed back into policy. | Expected for essential entities |

Most SaaS organisations subject to NIS2 as important entities should target Tier 3. Essential entities should target Tier 4.

5.3 Self-Assessment Scoring

class NIS2EffectivenessScorer:
    """
    Maps assessment results to NIST CSF tiers for NCA reporting.
    """
    WEIGHTS = {
        'pentest_cadence': 0.20,
        'vuln_sla_compliance': 0.20,
        'kpi_defined': 0.15,
        'kpi_measured': 0.15,
        'evidence_trail': 0.15,
        'remediation_tracking': 0.10,
        'management_review': 0.05,
    }

    def score(self, results: dict) -> dict:
        total = sum(
            results.get(k, 0) * w
            for k, w in self.WEIGHTS.items()
        )
        tier = 1 if total < 0.4 else 2 if total < 0.6 else 3 if total < 0.8 else 4
        return {
            'composite_score': round(total, 2),
            'nist_tier': tier,
            'nca_readiness': tier >= 3,
            'breakdown': {k: results.get(k, 0) for k in self.WEIGHTS}
        }

6. Continuous Control Monitoring (CCM)

Point-in-time assessments (annual pentest, quarterly access review) are necessary but insufficient for essential entities. Art.21(2)(f) implicitly requires continuous effectiveness monitoring for controls where drift between assessments is possible.

6.1 Controls That Require Continuous Monitoring

| Control | Drift Risk | Monitoring Method |
|---|---|---|
| MFA enrollment | New accounts skip MFA | IAM query, daily |
| Key expiry | Auto-rotation disabled | Secrets manager API, daily |
| Privileged account count | Role creep | IAM diff against baseline, weekly |
| Open Critical CVEs | New CVEs published daily | CVE feed integration, daily |
| Backup job success | Silent failures common | Backup API, every job |
| SIEM rule coverage | Rules disabled after false-positive tuning | SIEM API, weekly |
| Certificate expiry | Often missed until outage | Certificate transparency log / ACME, daily |
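
The first row — daily MFA enrollment checks — reduces to a filter over an account export. The account shape here is an assumption; in production this would query the IAM API:

```python
def mfa_enrollment_gap(accounts: list[dict]) -> list[str]:
    """IDs of privileged accounts without MFA — the list should be empty.
    Account shape is illustrative; a real check would query the IAM API."""
    return [a["id"] for a in accounts if a["privileged"] and not a["mfa_enrolled"]]

accounts = [
    {"id": "admin-1", "privileged": True, "mfa_enrolled": True},
    {"id": "admin-2", "privileged": True, "mfa_enrolled": False},
    {"id": "dev-1", "privileged": False, "mfa_enrolled": False},
]
# → ['admin-2']: a drift finding to record before the daily KPI snapshot
```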

6.2 SIEM Rule Validation

A common failure pattern: a detection rule is created after a security incident, then silenced because it generates too many false positives — and never re-tuned. The rule exists in the SIEM; it is not effective.

Art.21(2)(f) requires organisations to validate that detection rules actually detect. The industry term is "purple teaming" — controlled simulation of attack scenarios to verify detection coverage.

Minimum purple team cadence for Art.21(2)(f) compliance:

| Attack Category | Simulation Method | Minimum Frequency |
|---|---|---|
| Brute force / credential stuffing | Controlled login attempts | Quarterly |
| Impossible travel | Auth from two geographies | Quarterly |
| Privileged escalation | Role assumption simulation | Semi-annual |
| Data exfiltration | Controlled S3/blob download | Semi-annual |
| Ransomware precursor (lateral movement) | Internal port scan | Annual |

Results must be documented: test date, scenario, expected alert, actual alert (or gap), and remediation action.
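
The documented fields (scenario, expected alert, actual alert) make gap extraction easy to automate. A sketch, assuming results are collected as dicts with those fields:

```python
def detection_gaps(scenarios: list[dict]) -> list[dict]:
    """Purple-team scenarios where the expected alert never fired.
    Record shape mirrors the documentation fields above (illustrative)."""
    return [
        {"scenario": s["scenario"], "expected_alert": s["expected_alert"]}
        for s in scenarios
        if s["actual_alert"] is None
    ]

results = [
    {"scenario": "credential_stuffing", "expected_alert": "AUTH-BF-01",
     "actual_alert": "AUTH-BF-01"},
    {"scenario": "impossible_travel", "expected_alert": "AUTH-GEO-02",
     "actual_alert": None},  # rule silenced during false-positive tuning
]
# one gap: impossible_travel — a remediation action must be opened
```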

6.3 Backup Restore Testing Protocol

backup_restore_test:
  id: "BRT-2026-Q1"
  date: "2026-03-15"
  target_backup_date: "2026-03-10"
  tested_by: "ops-engineer@example.com"
  procedure:
    - "Identify backup from 5 days ago (2026-03-10 23:00 UTC)"
    - "Provision isolated restore environment (separate VPC)"
    - "Execute restore from snapshot"
    - "Validate data integrity via checksum"
    - "Verify application start-up in restored environment"
    - "Record RTO achieved vs RTO target"
    - "Destroy restore environment"
  results:
    restore_success: true
    rto_achieved_min: 47
    rto_target_min: 60
    data_integrity: "checksum_match"
    issues_found: []
  evidence:
    screenshot_restore_complete: "s3://evidence/BRT-2026-Q1-restore.png"
    log_export: "s3://evidence/BRT-2026-Q1-logs.txt"

Three consecutive successful restore tests — the most recent completed within the last 90 days — each documented to this standard, satisfy the Art.21(2)(c)/(f) intersection for backup effectiveness.
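
One reading of this criterion — the last three tests all succeeded and the most recent ran within 90 days — can be checked directly against the test records. A sketch assuming records shaped like the YAML results above:

```python
from datetime import date

def backup_evidence_ok(tests: list[dict], today: date) -> bool:
    """True when the last three restore tests succeeded and the most recent
    ran within 90 days. Record shape mirrors the YAML results above."""
    if len(tests) < 3:
        return False
    recent = sorted(tests, key=lambda t: t["date"], reverse=True)[:3]
    fresh = (today - date.fromisoformat(recent[0]["date"])).days <= 90
    return fresh and all(t["restore_success"] for t in recent)

tests = [
    {"date": "2025-09-15", "restore_success": True},
    {"date": "2025-12-14", "restore_success": True},
    {"date": "2026-03-15", "restore_success": True},
]
# on 2026-04-16: three passes, last test 32 days old → evidence holds
```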


7. NCA Audit Evidence Pack

7.1 Artefacts Auditors Expect

NCA auditors assess Art.21(2)(f) by requesting evidence of documented procedures and executed assessments. Verbal descriptions are insufficient.

| Artefact | Format | Minimum Content |
|---|---|---|
| Effectiveness Assessment Policy | PDF/Confluence | Scope, cadence, KPIs, owner, review cycle |
| Pentest Report (last 12 months) | PDF (accredited provider) | Scope, methodology, findings with severity, remediation status |
| Vulnerability Management Register | Jira/Linear export or CSV | All findings, severity, discovery date, status, SLA compliance |
| KPI Dashboard Export | PDF/screenshot | Last 3 months, all major control categories |
| Backup Restore Test Records | YAML/PDF | Date, success/fail, RTO achieved, issues |
| Purple Team / Detection Validation Records | PDF/markdown | Test scenarios, expected vs actual alerts |
| Management Review Minutes | PDF | Date, KPIs reviewed, decisions taken |

7.2 Evidence Pack Structure

evidence-pack-2026/
├── 01-policy/
│   └── effectiveness-assessment-policy-v2.1.pdf
├── 02-pentest/
│   ├── PT-2026-01-report.pdf
│   └── PT-2026-01-remediation-tracker.csv
├── 03-vulnerability-management/
│   ├── vuln-register-2026-Q1.csv
│   └── sla-compliance-report-2026-Q1.pdf
├── 04-kpi-dashboard/
│   ├── kpi-jan-2026.pdf
│   ├── kpi-feb-2026.pdf
│   └── kpi-mar-2026.pdf
├── 05-backup-restore-tests/
│   ├── BRT-2026-Q1.yaml
│   └── BRT-2026-Q4-2025.yaml
├── 06-detection-validation/
│   └── purple-team-2026-01.pdf
└── 07-management-review/
    └── grc-committee-2026-03-15.pdf

This structure can be uploaded to a shared drive or submitted directly to NCA during an on-site audit.
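
Before submission, the pack can be verified automatically: every required directory must exist and contain at least one artefact. A minimal sketch using the directory names shown above:

```python
from pathlib import Path

# Required sub-directories, matching the pack layout above
REQUIRED_DIRS = [
    "01-policy", "02-pentest", "03-vulnerability-management",
    "04-kpi-dashboard", "05-backup-restore-tests",
    "06-detection-validation", "07-management-review",
]

def missing_evidence(pack_root: str) -> list[str]:
    """Required directories that are absent or contain no artefacts."""
    root = Path(pack_root)
    return [
        d for d in REQUIRED_DIRS
        if not (root / d).is_dir() or not any((root / d).iterdir())
    ]
```

Wiring this into CI means an empty evidence category is caught months before an auditor asks for it.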


8. Python NIS2EffectivenessAssessor

The following tool evaluates an organisation's current Art.21(2)(f) posture across seven dimensions and produces a structured compliance gap report.

#!/usr/bin/env python3
"""
NIS2EffectivenessAssessor — Art.21(2)(f) compliance gap analysis.
Assesses effectiveness assessment maturity across KPIs, testing, evidence, and CCM.
"""
import json
import datetime
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ControlAssessment:
    control: str
    kpi_defined: bool = False
    kpi_measured_last_30d: bool = False
    test_performed_last_12m: bool = False
    test_date: Optional[str] = None
    findings_tracked: bool = False
    evidence_available: bool = False
    management_reviewed: bool = False

    def score(self) -> float:
        checks = [
            self.kpi_defined,
            self.kpi_measured_last_30d,
            self.test_performed_last_12m,
            self.findings_tracked,
            self.evidence_available,
            self.management_reviewed,
        ]
        return sum(checks) / len(checks)

    def gaps(self) -> list[str]:
        result = []
        if not self.kpi_defined:
            result.append(f"{self.control}: No KPI defined")
        if not self.kpi_measured_last_30d:
            result.append(f"{self.control}: KPI not measured in last 30 days")
        if not self.test_performed_last_12m:
            result.append(f"{self.control}: No test performed in last 12 months")
        if not self.findings_tracked:
            result.append(f"{self.control}: Findings not tracked with SLA")
        if not self.evidence_available:
            result.append(f"{self.control}: No audit-grade evidence available")
        if not self.management_reviewed:
            result.append(f"{self.control}: Not reviewed by management in last 12 months")
        return result


@dataclass
class NIS2EffectivenessAssessor:
    org_name: str
    assessment_date: str = field(
        default_factory=lambda: datetime.date.today().isoformat()
    )
    controls: list[ControlAssessment] = field(default_factory=list)

    def add_control(self, **kwargs) -> "NIS2EffectivenessAssessor":
        self.controls.append(ControlAssessment(**kwargs))
        return self

    def assess(self) -> dict:
        if not self.controls:
            return {"error": "No controls defined"}

        scores = [c.score() for c in self.controls]
        composite = sum(scores) / len(scores)
        all_gaps = []
        for c in self.controls:
            all_gaps.extend(c.gaps())

        tier = (
            1 if composite < 0.4 else
            2 if composite < 0.6 else
            3 if composite < 0.8 else
            4
        )

        return {
            "org": self.org_name,
            "assessment_date": self.assessment_date,
            "composite_score": round(composite, 3),
            "nist_csf_tier": tier,
            "nca_compliant": tier >= 3,
            "controls_assessed": len(self.controls),
            "gaps": all_gaps,
            "priority_actions": all_gaps[:5],
            "next_review": (
                datetime.date.today() + datetime.timedelta(days=90)
            ).isoformat(),
        }

    def report(self) -> str:
        result = self.assess()
        lines = [
            f"NIS2 Art.21(2)(f) Effectiveness Assessor",
            f"Organisation: {result['org']}",
            f"Date: {result['assessment_date']}",
            f"─" * 60,
            f"Composite Score: {result['composite_score']:.1%}",
            f"NIST CSF Tier: {result['nist_csf_tier']}/4",
            f"NCA Compliant: {'YES ✓' if result['nca_compliant'] else 'NO ✗'}",
            f"Controls Assessed: {result['controls_assessed']}",
            f"─" * 60,
            f"Gaps ({len(result['gaps'])}):",
        ]
        for gap in result["gaps"]:
            lines.append(f"  ✗ {gap}")
        lines += [
            f"─" * 60,
            f"Priority Actions:",
        ]
        for i, action in enumerate(result["priority_actions"], 1):
            lines.append(f"  {i}. {action}")
        lines.append(f"Next Review: {result['next_review']}")
        return "\n".join(lines)


# Example usage
if __name__ == "__main__":
    assessor = NIS2EffectivenessAssessor(org_name="ExampleSaaS GmbH")
    (
        assessor
        .add_control(
            control="Incident Detection",
            kpi_defined=True,
            kpi_measured_last_30d=True,
            test_performed_last_12m=False,
            findings_tracked=True,
            evidence_available=False,
            management_reviewed=True,
        )
        .add_control(
            control="Penetration Testing",
            kpi_defined=True,
            kpi_measured_last_30d=False,
            test_performed_last_12m=True,
            test_date="2025-11-12",
            findings_tracked=True,
            evidence_available=True,
            management_reviewed=False,
        )
        .add_control(
            control="Backup Restore Validation",
            kpi_defined=False,
            kpi_measured_last_30d=False,
            test_performed_last_12m=True,
            test_date="2026-01-20",
            findings_tracked=False,
            evidence_available=True,
            management_reviewed=False,
        )
        .add_control(
            control="Vulnerability Management",
            kpi_defined=True,
            kpi_measured_last_30d=True,
            test_performed_last_12m=True,
            findings_tracked=True,
            evidence_available=True,
            management_reviewed=True,
        )
        .add_control(
            control="MFA Coverage Monitoring",
            kpi_defined=True,
            kpi_measured_last_30d=False,
            test_performed_last_12m=False,
            findings_tracked=False,
            evidence_available=False,
            management_reviewed=False,
        )
    )
    print(assessor.report())
    print(json.dumps(assessor.assess(), indent=2))

9. 25-Item NCA Audit Checklist — Art.21(2)(f)

Use this checklist when preparing for a June 2026 NCA audit. Each item maps to a specific auditor question.

Policy and Governance

Penetration Testing

Vulnerability Management

Continuous Control Monitoring

Testing and Validation

Evidence and Reporting


10. 12-Week Implementation Timeline

| Week | Focus | Key Deliverables |
|---|---|---|
| 1–2 | KPI Definition | Define KPIs per control category; assign owners; set targets |
| 3–4 | Tool Integration | Connect SIEM, vulnerability scanner, backup monitor to metric store |
| 5–6 | Policy Drafting | Draft Effectiveness Assessment Policy; management review |
| 7–8 | Pentest Scheduling | Commission external pentest; prepare scope document |
| 9–10 | Evidence Collection | Collect existing test results; run backup restore test |
| 11 | Gap Remediation | Close P1/P2 gaps from pentest and KPI analysis |
| 12 | Audit Simulation | Internal mock audit; assemble NCA evidence pack |

June 2026 NCA audits are scheduled for essential entities. Important entities are likely to follow in H2 2026. Starting this timeline in April 2026 gives four weeks of buffer for the first audit cohort.


11. EU Sovereign Infrastructure and Art.21(2)(f)

One frequently overlooked intersection: Art.21(2)(f) requires effectiveness assessment of all cybersecurity measures — including cloud infrastructure controls. For organisations running on non-EU infrastructure subject to the US CLOUD Act, the question arises: can you fully assess controls on infrastructure where a third-party jurisdiction can request access without your knowledge?

This is one reason ENISA's guidance on NIS2 technical measures references data sovereignty as a factor in risk assessment. An EU-native infrastructure eliminates the CLOUD Act uncertainty — assessments of access controls, encryption, and data handling are not complicated by external legal override.

Platforms like sota.io — built on EU-only infrastructure — provide the clean audit boundary that Art.21(2)(f) effectiveness assessment requires: what you test is what runs, without extraterritorial jurisdictional overlap.


Key Takeaways

Art.21(2)(f) is the quality gate for all other NIS2 measures. It requires:

  1. Defined, measurable KPIs for each major control category
  2. A scheduled testing cadence — pentests, tabletops, restore tests, purple teaming
  3. An audit-grade evidence trail that survives NCA inspection
  4. Remediation tracking with SLAs and owner accountability

The Python NIS2EffectivenessAssessor above provides a structured gap analysis tool. The 25-item checklist maps directly to NCA audit questions for June 2026. The 12-week timeline is calibrated for organisations beginning compliance work in April 2026.

The NIS2 Art.21(2) series continues with the remaining measures: (a) Risk Analysis and (g) Cyber Hygiene and Training.

