2026-05-31·5 min read·sota.io Team

EU Data Act Cloud Switching 2026: Data Format Standards, Open APIs & Interoperability Requirements

Post #1417 in the sota.io EU Cloud Compliance Series


The EU Data Act (Regulation (EU) 2023/2854) does not specify a list of approved file formats for cloud switching exports. What it requires is something more demanding: data must be exported in "commonly used, machine-readable, open, interoperable formats" — and the receiving provider must be able to import that data without manual intervention. This post covers what that means in practice, which formats qualify, how to document your export schema, and what a compliant multi-format export pipeline looks like in TypeScript.

This is the second post in our EU-DATA-ACT-CLOUD-SWITCHING-2026 series. The first post covered the switching request API and 30-day state machine.


The Four Format Criteria

The Data Act Chapter IV switching provisions establish four format requirements. Each is a genuine constraint, not just aspirational language:

1. Commonly Used

The format must have meaningful adoption in the industry, which disqualifies vendor-specific export formats.

This criterion targets formats like "Salesforce export", "HubSpot export", or a custom binary database dump. If a data engineer at the receiving provider needs to write an ad-hoc parser, the format fails this test.

Formats that qualify: JSON, CSV, XML, Parquet, NDJSON/JSON Lines, YAML, TSV

Formats that don't qualify: proprietary binary blobs, undocumented custom encodings, base64-wrapped binary data without schema documentation

2. Machine-Readable

The format must be parseable by standard tools without ambiguity. This means:

PDFs, Word documents, and HTML pages generally fail this criterion for structured data (though they may be included as supplementary attachments for human-readable summaries).

Edge case: XLSX spreadsheets are borderline. They are machine-readable by tooling but are inherently cell-based rather than schema-based. XLSX may satisfy the criterion for tabular data, but JSON or CSV with a documented schema is strongly preferable.

3. Open

The format specification must be publicly available and royalty-free. This rules out any format whose parsing tools are subject to licensing restrictions.

All standard formats (JSON, CSV, XML, OpenAPI, Parquet) satisfy this criterion. The openness requirement is rarely the binding constraint in practice.

4. Interoperable

This is the most demanding criterion. The format must allow a receiving provider to import the data without custom integration work. Practically, this means:

The interoperability requirement is what separates a compliant export from a simple data dump.
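One way to make the four criteria operational is to record them as explicit properties of each candidate format and reject any format that fails one of them. The sketch below is illustrative only: `FormatDescriptor`, its field names, and `failedCriteria` are assumptions made for this example, not terms defined by the regulation.

```typescript
// Hypothetical descriptor: each flag mirrors one of the four criteria above.
interface FormatDescriptor {
  name: string;
  widelyAdopted: boolean;         // 1. commonly used
  standardParsersExist: boolean;  // 2. machine-readable
  specPublicRoyaltyFree: boolean; // 3. open
  schemaPublished: boolean;       // 4. interoperable (assumed proxy: a documented schema)
}

type Criterion = "commonly-used" | "machine-readable" | "open" | "interoperable";

// Returns the criteria a format fails; an empty array means it passes all four.
export function failedCriteria(format: FormatDescriptor): Criterion[] {
  const failed: Criterion[] = [];
  if (!format.widelyAdopted) failed.push("commonly-used");
  if (!format.standardParsersExist) failed.push("machine-readable");
  if (!format.specPublicRoyaltyFree) failed.push("open");
  if (!format.schemaPublished) failed.push("interoperable");
  return failed;
}

const ndjsonWithSchema: FormatDescriptor = {
  name: "NDJSON + JSON Schema",
  widelyAdopted: true,
  standardParsersExist: true,
  specPublicRoyaltyFree: true,
  schemaPublished: true,
};

const customBinaryDump: FormatDescriptor = {
  name: "custom binary database dump",
  widelyAdopted: false,
  standardParsersExist: false,
  specPublicRoyaltyFree: false,
  schemaPublished: false,
};
```

Note that a format such as CSV can pass the first three checks and still fail the fourth if no schema is published alongside it.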


Different data types have different optimal formats. The following matrix reflects common SaaS data categories:

| Data Type | Recommended Format | Schema Standard | Notes |
|---|---|---|---|
| Structured records (users, projects, items) | JSON Lines (NDJSON) | JSON Schema Draft-07+ | One record per line, streaming-friendly for large datasets |
| Relational data with joins | JSON (nested) or Parquet | JSON Schema + column metadata | Parquet preferred for >10MB |
| Binary assets (files, images, attachments) | ZIP with manifest | JSON Schema for manifest.json | Binary content in /files/, manifest indexes metadata |
| Time-series / audit logs | JSON Lines | JSON Schema | ISO 8601 timestamps, UTC timezone |
| Configurations and settings | YAML or JSON | JSON Schema | Single-file preferred |
| Financial records / billing history | JSON Lines | JSON Schema + FHIR-inspired profiles | Currency in ISO 4217, amounts as integers (cents) |
| Calendar / scheduling data | iCalendar (RFC 5545) + JSON summary | Standard iCal + JSON Schema | iCal for calendar apps, JSON for non-calendar targets |
| Contacts / identity data | vCard (RFC 6350) + JSON | Standard vCard + JSON Schema | vCard for addressbook apps, JSON for general targets |
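The billing row in the matrix (ISO 4217 currency codes, amounts as integer cents) is easy to get subtly wrong, because multiplying a floating-point amount by 100 can land just below the intended integer. The following is a minimal sketch of one way to produce such a JSON Lines record. The `BillingLine` shape and the function names are hypothetical, and the parser assumes a two-decimal currency.

```typescript
interface BillingLine {
  invoiceId: string;
  currency: string;    // ISO 4217 alphabetic code, e.g. "EUR"
  amountMinor: number; // integer minor units (cents)
  issuedAt: string;    // ISO 8601, UTC
}

// Parse a decimal string such as "19.99" into 1999 without floating-point math.
// Assumes a two-decimal currency; currencies such as JPY need a different exponent.
export function toMinorUnits(decimal: string): number {
  const match = /^(-?)(\d+)(?:\.(\d{1,2}))?$/.exec(decimal.trim());
  if (!match) throw new Error(`Unsupported amount: ${decimal}`);
  const [, sign, whole, fraction = ""] = match;
  const minor = Number(whole) * 100 + Number(fraction.padEnd(2, "0"));
  return sign === "-" ? -minor : minor;
}

// Serialize one billing record as a single NDJSON line.
export function toBillingLine(
  invoiceId: string,
  amount: string,
  currency: string,
  issuedAt: Date
): string {
  if (!/^[A-Z]{3}$/.test(currency)) {
    throw new Error(`Not an ISO 4217 alphabetic code: ${currency}`);
  }
  const line: BillingLine = {
    invoiceId,
    currency,
    amountMinor: toMinorUnits(amount),
    issuedAt: issuedAt.toISOString(),
  };
  return JSON.stringify(line) + "\n";
}
```

The currency check only validates the three-letter shape; a production exporter would check against the actual ISO 4217 code list.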

Schema Documentation: The OpenAPI Approach

The technical requirements in Chapter IV implicitly require that your export schema be machine-readable and documented. The receiving provider's import system must be able to validate imported data against your schema. We recommend using OpenAPI 3.1 to document your export format — the same tooling as your API.

Export Manifest Schema

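Every export package built for Data Act switching should include a manifest.json at the root. The regulation does not prescribe this file; the structure below is our recommended convention: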

interface ExportManifest {
  schemaVersion: string;          // Semantic version of the export schema, e.g. "1.0.0"
  exportId: string;               // UUID
  requestId: string;              // Matches the switch request ID
  exportedAt: string;             // ISO 8601, UTC
  provider: {
    name: string;
    legalEntity: string;          // Registered company name
    euRepresentative?: string;    // If provider is non-EU
    dataActContact: string;       // Email for portability questions
  };
  customer: {
    accountId: string;            // Your internal ID
    externalId?: string;          // Optional stable external identifier
    dataResidency: string[];      // ISO 3166-1 alpha-2 country codes
  };
  exportContents: ExportContent[];
  schemaDocumentation: string;    // URL to your OpenAPI/JSON Schema doc
  retentionExpiry: string;        // ISO 8601: 30 days post-export
  exportFormat: ExportFormatSpec;
}

interface ExportContent {
  scope: string;                  // "user_data" | "application_data" | etc.
  files: ExportFile[];
  recordCount: number;
  dataSizeBytes: number;
  dateRangeFrom?: string;
  dateRangeTo?: string;
}

interface ExportFile {
  path: string;                   // Relative to archive root
  format: "ndjson" | "json" | "csv" | "parquet" | "zip" | "ical" | "vcard";
  encoding: "utf-8";
  schemaRef: string;              // URL to JSON Schema for this file
  recordCount?: number;
  checksum: string;               // SHA-256 hex
}

interface ExportFormatSpec {
  primaryFormat: string;
  compressionAlgorithm: "zip" | "gzip" | "none";
  characterEncoding: "utf-8";
  lineEnding: "LF";
}
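As an illustration of how the time-related manifest fields could be filled consistently, the sketch below derives `retentionExpiry` from `exportedAt` using the 30-day window noted in the interface comment. The helper name `manifestTimestamps` and the constants are assumptions for this example.

```typescript
const RETENTION_DAYS = 30; // from the retentionExpiry comment in ExportManifest
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns both timestamps as ISO 8601 UTC strings so they cannot drift apart.
export function manifestTimestamps(exportedAt: Date): {
  exportedAt: string;
  retentionExpiry: string;
} {
  const expiry = new Date(exportedAt.getTime() + RETENTION_DAYS * MS_PER_DAY);
  return {
    exportedAt: exportedAt.toISOString(),
    retentionExpiry: expiry.toISOString(),
  };
}

// Sum per-file record counts so ExportContent.recordCount always matches its files.
export function totalRecordCount(files: { recordCount?: number }[]): number {
  return files.reduce((sum, file) => sum + (file.recordCount ?? 0), 0);
}
```

Deriving these values in one place, instead of setting them by hand, avoids manifests whose totals disagree with the files they describe.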

OpenAPI Export Schema Declaration

Publish your export schema as an OpenAPI document. This allows receiving providers to auto-generate import clients:

# openapi.export.yaml — published at https://api.yoursaas.com/data-act/export-schema
openapi: "3.1.0"
info:
  title: "YourSaaS EU Data Act Export Schema"
  version: "1.0.0"
  description: "Machine-readable schema for EU Data Act cloud switching exports. Compliant with Regulation (EU) 2023/2854 Chapter IV technical requirements."
  contact:
    name: "EU Data Portability Team"
    email: "data-portability@yoursaas.com"
  license:
    name: "CC0 1.0"             # Schema is public domain — receiving providers can use freely

paths: {}                       # No API paths — this is schema-only

components:
  schemas:
    UserRecord:
      type: object
      required: [id, email, createdAt, status]
      properties:
        id:
          type: string
          format: uuid
          description: "Stable user identifier. Unique within this export."
        email:
          type: string
          format: email
        displayName:
          type: string
        locale:
          type: string
          pattern: "^[a-z]{2}-[A-Z]{2}$"
          description: "BCP 47 language tag, restricted here to the language-region form (e.g. en-US)"
        timezone:
          type: string
          description: "IANA timezone identifier"
        createdAt:
          type: string
          format: date-time
        updatedAt:
          type: string
          format: date-time
        status:
          type: string
          enum: [active, suspended, deleted]
        metadata:
          type: object
          description: "Additional customer-specific fields. Keys use snake_case."
          additionalProperties:
            type: string

Avoiding Interoperability Failures

Most Data Act compliance failures will occur at the interoperability layer, not the format layer. Common patterns that produce technically compliant but practically non-interoperable exports:

Anti-Pattern 1: Internal ID References Without Context

// BAD: receiving provider cannot resolve teamId or roleId without your system
{
  "userId": "u_abc123",
  "teamId": "t_xyz789",
  "roleId": "r_owner"
}

// GOOD — self-contained, receiving provider can reconstruct relationships
{
  "userId": "550e8400-e29b-41d4-a716-446655440000",
  "email": "alice@example.com",
  "team": {
    "id": "t_xyz789",
    "name": "Engineering",
    "slug": "engineering"
  },
  "role": {
    "id": "r_owner",
    "name": "Owner",
    "permissions": ["admin", "billing", "member_management"]
  }
}
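A sketch of how the self-contained form could be produced at export time: look up each referenced entity and embed it, failing loudly on dangling references. The `MembershipRow`, `Team`, and `Role` shapes and the `embedReferences` helper are assumptions for illustration.

```typescript
interface Team { id: string; name: string; slug: string }
interface Role { id: string; name: string; permissions: string[] }
interface MembershipRow { userId: string; email: string; teamId: string; roleId: string }

interface ExportedMembership {
  userId: string;
  email: string;
  team: Team;
  role: Role;
}

// Replace internal foreign keys with the full referenced objects.
export function embedReferences(
  row: MembershipRow,
  teams: Map<string, Team>,
  roles: Map<string, Role>
): ExportedMembership {
  const team = teams.get(row.teamId);
  const role = roles.get(row.roleId);
  if (!team || !role) {
    // A reference that cannot be resolved now cannot be resolved by the importer either.
    throw new Error(`Dangling reference in membership for user ${row.userId}`);
  }
  return { userId: row.userId, email: row.email, team, role };
}
```

Embedding duplicates team and role data across records; for very large exports, a separate teams file referenced by stable ID is a reasonable alternative as long as that file is part of the same package.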

Anti-Pattern 2: Non-Standard Timestamps

// BAD: ambiguous date format, timezone unclear, epoch unit unstated
{
  "created": "05/31/26 8:03 PM",
  "updated": 1780257780
}

// GOOD — ISO 8601 with explicit UTC offset
{
  "createdAt": "2026-05-31T20:03:00Z",
  "updatedAt": "2026-05-31T20:03:00Z"
}
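A minimal normalization helper, assuming your internal values are either `Date` objects or epoch numbers whose unit you know. The name `toIsoUtc` is hypothetical. Locale-formatted strings are intentionally not accepted, since they cannot be parsed unambiguously.

```typescript
// Converts a Date or an epoch number to an ISO 8601 string in UTC.
// The unit must be stated explicitly because a bare number does not say
// whether it counts seconds or milliseconds.
export function toIsoUtc(value: Date | number, unit: "s" | "ms" = "s"): string {
  const ms =
    value instanceof Date ? value.getTime() : unit === "s" ? value * 1000 : value;
  const date = new Date(ms);
  if (Number.isNaN(date.getTime())) {
    throw new Error(`Invalid timestamp: ${String(value)}`);
  }
  return date.toISOString();
}
```

`toISOString()` always emits millisecond precision with a trailing `Z`, so the output is uniform across every exported field.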

Anti-Pattern 3: Opaque Enum Values

// BAD — receiving provider doesn't know what status=2 means
{
  "userId": "u_abc123",
  "status": 2,
  "plan": "p3",
  "tier": 4
}

// GOOD — human-readable, stable string values
{
  "userId": "u_abc123",
  "status": "active",
  "plan": "professional",
  "tier": "standard"
}
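One way to enforce this at export time is a lookup table per enum that throws on unmapped codes, so a newly introduced internal value can never leak into an export as a bare number. The code-to-label tables below are invented for illustration, apart from the `2` to `active` and `p3` to `professional` pairs taken from the example above.

```typescript
// Hypothetical internal codes mapped to stable, documented string values.
const STATUS_LABELS: Readonly<Record<string, string>> = {
  "1": "suspended",
  "2": "active",
  "3": "deleted",
};
const PLAN_LABELS: Readonly<Record<string, string>> = {
  p1: "free",
  p2: "starter",
  p3: "professional",
};

function labelFor(
  table: Readonly<Record<string, string>>,
  code: string | number,
  field: string
): string {
  const label = table[String(code)];
  if (typeof label !== "string") {
    throw new Error(`Unmapped ${field} code: ${String(code)}`);
  }
  return label;
}

export function exportEnums(row: { status: number; plan: string }): {
  status: string;
  plan: string;
} {
  return {
    status: labelFor(STATUS_LABELS, row.status, "status"),
    plan: labelFor(PLAN_LABELS, row.plan, "plan"),
  };
}
```

The same tables can be used to generate the `enum` lists in the published schema, which keeps documentation and export output in step.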

Anti-Pattern 4: Missing Pagination Context in Large Exports

// BAD — customer has 50,000 records but export is a single 500MB file with no structure
[{ "id": "..." }, ...]

// GOOD — chunked export with manifest
// manifest.json lists:
// - users_001.ndjson (10,000 records)
// - users_002.ndjson (10,000 records)
// - users_003.ndjson (10,000 records)
// etc.
// Each file is independently parseable
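The chunking itself can be sketched as a small generator that groups records and assigns sequential file names in the `users_001.ndjson` style shown above. The `chunkRecords` name and the `Chunk` shape are assumptions; the 10,000-record default simply matches the example.

```typescript
export interface Chunk<T> {
  fileName: string;
  records: T[];
}

// Lazily groups records into fixed-size chunks with zero-padded sequence numbers.
export function* chunkRecords<T>(
  records: Iterable<T>,
  prefix: string,
  chunkSize = 10000
): Generator<Chunk<T>> {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new Error(`chunkSize must be a positive integer, got ${chunkSize}`);
  }
  let index = 0;
  let current: T[] = [];
  const nextName = () => `${prefix}_${String(++index).padStart(3, "0")}.ndjson`;

  for (const record of records) {
    current.push(record);
    if (current.length === chunkSize) {
      yield { fileName: nextName(), records: current };
      current = [];
    }
  }
  // Emit the final partial chunk, if any.
  if (current.length > 0) {
    yield { fileName: nextName(), records: current };
  }
}
```

Each yielded chunk can be written as its own NDJSON file and listed in the manifest with its own record count and checksum.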

TypeScript: Multi-Format Export Pipeline

A production-ready export pipeline must support multiple output formats from the same data model. This allows your platform to satisfy both the Data Act's format requirements and individual customer preferences:

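import { createWriteStream } from "fs";
import { pipeline } from "stream/promises";
import { stringify } from "csv-stringify";

// UserRecord, SerializableUserRecord, FlatUserRecord, ExportPackage, USER_JSON_SCHEMA,
// CSV_COLUMNS, createBaseManifest, streamUsersForAccount, and computeSHA256 are assumed
// to be defined elsewhere in your codebase.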

type ExportFormat = "ndjson" | "csv" | "parquet";

interface ExportPipeline<T> {
  format: ExportFormat;
  outputPath: string;
  schema: Record<string, unknown>; // JSON Schema
  write(records: AsyncIterable<T>): Promise<ExportStats>;
}

export class UserExportPipeline implements ExportPipeline<UserRecord> {
  readonly format: ExportFormat;
  readonly outputPath: string;
  readonly schema = USER_JSON_SCHEMA;

  constructor(options: { format: ExportFormat; outputPath: string }) {
    this.format = options.format;
    this.outputPath = options.outputPath;
  }

  async write(records: AsyncIterable<UserRecord>): Promise<ExportStats> {
    const stats: ExportStats = { recordCount: 0, sizeBytes: 0, errors: [] };

    switch (this.format) {
      case "ndjson":
        return this.writeNDJSON(records, stats);
      case "csv":
        return this.writeCSV(records, stats);
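      case "parquet":
        // Parquet output needs a dedicated writer library and is not shown in this post.
        throw new Error("Parquet export is not implemented in this example");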
    }
  }

  private async writeNDJSON(
    records: AsyncIterable<UserRecord>,
    stats: ExportStats
  ): Promise<ExportStats> {
    const writeStream = createWriteStream(this.outputPath, {
      encoding: "utf8",
      flags: "w",
    });

    const normalize = (record: UserRecord) => this.normalizeRecord(record);

    // Serialize lazily: pipeline() only pulls the next line when the file stream
    // can accept it (backpressure) and rejects if the stream errors.
    async function* toLines(): AsyncGenerator<string> {
      for await (const record of records) {
        const line = JSON.stringify(normalize(record)) + "\n";
        stats.recordCount++;
        stats.sizeBytes += Buffer.byteLength(line, "utf8");
        yield line;
      }
    }

    await pipeline(toLines(), writeStream);
    return stats;
  }

  private async writeCSV(
    records: AsyncIterable<UserRecord>,
    stats: ExportStats
  ): Promise<ExportStats> {
    const csvStringifier = stringify({
      header: true,
      columns: CSV_COLUMNS, // flat column list for tabular export
    });

    const writeStream = createWriteStream(this.outputPath, {
      encoding: "utf8",
    });

    const flatten = (record: UserRecord) => this.flattenRecord(record);

    // Feed rows lazily so pipeline() applies backpressure end to end, instead of
    // buffering the whole export inside the stringifier before any bytes reach disk.
    async function* toRows(): AsyncGenerator<FlatUserRecord> {
      for await (const record of records) {
        stats.recordCount++;
        yield flatten(record);
      }
    }

    try {
      await pipeline(toRows(), csvStringifier, writeStream);
      stats.sizeBytes = writeStream.bytesWritten;
    } catch (error) {
      stats.errors.push(error instanceof Error ? error.message : String(error));
    }

    return stats;
  }

  private normalizeRecord(record: UserRecord): SerializableUserRecord {
    return {
      id: record.id,
      email: record.email,
      displayName: record.displayName ?? null,
      locale: record.locale ?? "en-US",
      timezone: record.timezone ?? "UTC",
      createdAt: record.createdAt.toISOString(),
      updatedAt: record.updatedAt.toISOString(),
      status: record.status,
    };
  }

  private flattenRecord(record: UserRecord): FlatUserRecord {
    return {
      id: record.id,
      email: record.email,
      display_name: record.displayName ?? "",
      locale: record.locale ?? "en-US",
      timezone: record.timezone ?? "UTC",
      created_at: record.createdAt.toISOString(),
      updated_at: record.updatedAt.toISOString(),
      status: record.status,
    };
  }
}

// Multi-format coordinator
export async function generateSwitchingExport(
  accountId: string,
  options: { formats: ExportFormat[]; outputDir: string }
): Promise<ExportPackage> {
  const results: ExportPackage = {
    files: [],
    manifest: createBaseManifest(accountId),
  };

  for (const format of options.formats) {
    const outputPath = `${options.outputDir}/users.${format}`;
    // Named "exporter" to avoid shadowing the pipeline() import from stream/promises.
    const exporter = new UserExportPipeline({ format, outputPath });
    const records = streamUsersForAccount(accountId);
    const stats = await exporter.write(records);

    results.files.push({
      path: `users.${format}`,
      format,
      encoding: "utf-8",
      schemaRef: "https://api.yoursaas.com/data-act/export-schema#/components/schemas/UserRecord",
      recordCount: stats.recordCount,
      checksum: await computeSHA256(outputPath),
    });
  }

  return results;
}

interface ExportStats {
  recordCount: number;
  sizeBytes: number;
  errors: string[];
}
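The coordinator above calls `computeSHA256` without showing it. One possible implementation, assuming Node's built-in `crypto` and `fs` modules, streams the file through a hash so that large exports never need to fit in memory. The `sha256Hex` helper for in-memory data is an addition for illustration.

```typescript
import { createHash } from "node:crypto";
import { createReadStream } from "node:fs";

// Hash a file incrementally; only one chunk is held in memory at a time.
export async function computeSHA256(path: string): Promise<string> {
  const hash = createHash("sha256");
  for await (const chunk of createReadStream(path)) {
    hash.update(chunk);
  }
  return hash.digest("hex");
}

// Convenience variant for small in-memory payloads such as manifest.json.
export function sha256Hex(data: string | Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}
```

Both functions return lowercase hex, which matches the "SHA-256 hex" comment on `ExportFile.checksum`.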

Interoperability Profiles: The GAIA-X Approach

The EU's GAIA-X initiative and the European Data Standards Body (EDSB) are developing sector-specific interoperability profiles for cloud switching. These profiles specify:

While mandatory profiles are not yet finalized for general SaaS, EDSB working groups have published guidance for specific sectors (financial services, health, mobility). Even where profiles are not yet mandatory, implementing them positions your platform ahead of the compliance curve.

The DCAT vocabulary (W3C Data Catalog Vocabulary) is the leading semantic layer for Data Act exports, particularly for B2B data spaces. For customer-facing exports, JSON Schema is sufficient.

Minimal DCAT-Compatible Export Manifest

{
  "@context": {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "spdx": "http://spdx.org/rdf/terms#",
    "xsd": "http://www.w3.org/2001/XMLSchema#"
  },
  "@type": "dcat:Dataset",
  "@id": "urn:export:550e8400-e29b-41d4-a716-446655440000",
  "dct:title": "YourSaaS Account Export — Account abc123",
  "dct:description": "EU Data Act cloud switching export for customer account abc123",
  "dct:issued": {
    "@type": "xsd:dateTime",
    "@value": "2026-05-31T20:00:00Z"
  },
  "dct:publisher": {
    "@id": "https://yoursaas.com",
    "@type": "foaf:Organization"
  },
  "dct:license": {
    "@id": "https://creativecommons.org/licenses/by/4.0/"
  },
  "dcat:distribution": [
    {
      "@type": "dcat:Distribution",
      "dcat:downloadURL": "urn:local:users.ndjson",
      "dcat:mediaType": "application/x-ndjson",
      "dct:format": "NDJSON",
      "dcat:byteSize": 1048576,
      "spdx:checksum": {
        "spdx:algorithm": "SHA-256",
        "spdx:checksumValue": "a3f5d..."
      }
    }
  ]
}
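If you already produce the file list for manifest.json, the `dcat:distribution` entries can be derived from it instead of being maintained by hand. The sketch below mirrors the property names used in the JSON-LD example above; the `DistributionSource` shape and `toDistribution` helper are assumptions.

```typescript
interface DistributionSource {
  path: string;      // relative to the archive root
  mediaType: string; // e.g. "application/x-ndjson"
  format: string;    // e.g. "NDJSON"
  byteSize: number;
  sha256: string;    // hex digest
}

// Build one dcat:Distribution node with the same keys as the example manifest.
export function toDistribution(file: DistributionSource) {
  return {
    "@type": "dcat:Distribution",
    "dcat:downloadURL": `urn:local:${file.path}`,
    "dcat:mediaType": file.mediaType,
    "dct:format": file.format,
    "dcat:byteSize": file.byteSize,
    "spdx:checksum": {
      "spdx:algorithm": "SHA-256",
      "spdx:checksumValue": file.sha256,
    },
  };
}
```

Generating both manifests from one file list keeps byte sizes and checksums identical across them.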

Validation: Testing Your Export for Interoperability

Before publishing your export feature, run it through this validation checklist:

Automated Validation Checklist

import Ajv from "ajv";
import addFormats from "ajv-formats";

export async function validateExportCompliance(
  exportDir: string
): Promise<ComplianceReport> {
  const report: ComplianceReport = { passed: [], failed: [], warnings: [] };
  const ajv = new Ajv({ strict: true, allErrors: true });
  addFormats(ajv);

  // 1. Manifest validation
  const manifestPath = `${exportDir}/manifest.json`;
  const manifest = await loadJSON(manifestPath);
  const manifestValid = ajv.validate(MANIFEST_SCHEMA, manifest);
  manifestValid
    ? report.passed.push("Manifest schema valid")
    : report.failed.push(`Manifest invalid: ${ajv.errorsText()}`);

  // 2. All referenced files exist
  for (const content of manifest.exportContents ?? []) {
    for (const file of content.files ?? []) {
      const exists = await fileExists(`${exportDir}/${file.path}`);
      exists
        ? report.passed.push(`File exists: ${file.path}`)
        : report.failed.push(`Missing file: ${file.path}`);
    }
  }

  // 3. Checksums match
  for (const file of getAllFiles(manifest)) {
    const computed = await computeSHA256(`${exportDir}/${file.path}`);
    computed === file.checksum
      ? report.passed.push(`Checksum OK: ${file.path}`)
      : report.failed.push(
          `Checksum mismatch: ${file.path} (expected ${file.checksum}, got ${computed})`
        );
  }

  // 4. Record count consistency
  for (const content of manifest.exportContents ?? []) {
    for (const file of content.files ?? []) {
      if (file.format === "ndjson" && file.recordCount !== undefined) {
        const count = await countNDJSONLines(`${exportDir}/${file.path}`);
        count === file.recordCount
          ? report.passed.push(`Record count OK: ${file.path}`)
          : report.failed.push(
              `Record count mismatch: ${file.path} (manifest says ${file.recordCount}, actual ${count})`
            );
      }
    }
  }

  // 5. Timestamp format validation (sample first 100 records)
  for (const file of getAllNDJSONFiles(manifest)) {
    const sample = await readFirstNRecords(`${exportDir}/${file.path}`, 100);
    for (const record of sample) {
      for (const [key, value] of Object.entries(record)) {
        if (key.endsWith("At") || key.endsWith("Date") || key === "timestamp") {
          const isISO = ISO_DATE_REGEX.test(String(value));
          if (!isISO) {
            report.failed.push(
              `Non-ISO timestamp in ${file.path}: ${key}=${value}`
            );
          }
        }
      }
    }
  }

  // 6. UTF-8 encoding check
  for (const textFile of getTextFiles(manifest)) {
    const isUTF8 = await checkUTF8(`${exportDir}/${textFile.path}`);
    isUTF8
      ? report.passed.push(`UTF-8 encoding OK: ${textFile.path}`)
      : report.failed.push(`Non-UTF-8 file: ${textFile.path}`);
  }

  return report;
}

const ISO_DATE_REGEX = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z$/;

interface ComplianceReport {
  passed: string[];
  failed: string[];
  warnings: string[];
}
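The checklist relies on helpers such as `countNDJSONLines` that are not shown. As a rough illustration of what such a helper has to handle (trailing newline, blank lines, lines that are not JSON objects), here is an in-memory variant. The `scanNdjson` name is hypothetical, and a production version would stream the file line by line instead of loading it whole.

```typescript
export interface NdjsonScan {
  recordCount: number;
  errors: string[];
}

// Counts valid records and reports every line that is not a JSON object.
export function scanNdjson(text: string): NdjsonScan {
  const result: NdjsonScan = { recordCount: 0, errors: [] };
  const lines = text.split("\n");
  // A trailing newline after the last record leaves one empty final element; drop it.
  if (lines[lines.length - 1] === "") lines.pop();

  lines.forEach((line, i) => {
    try {
      const value: unknown = JSON.parse(line);
      if (typeof value !== "object" || value === null || Array.isArray(value)) {
        result.errors.push(`Line ${i + 1}: expected a JSON object`);
      } else {
        result.recordCount++;
      }
    } catch {
      result.errors.push(`Line ${i + 1}: invalid JSON`);
    }
  });
  return result;
}
```

Counting parsed records instead of raw newline characters means a stray blank line shows up as an error, not as a silent off-by-one in the record count.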

Schema Versioning and Migration

Your export schema will evolve: fields get added, deprecated, and restructured. The Data Act's interoperability requirement includes an implicit versioning obligation: if you change your export format, receiving providers must be able to update their import pipeline.

Best practices for schema versioning under the Data Act:

  1. Semantic versioning in the manifest: include schemaVersion: "1.2.0" in every manifest.json
  2. Breaking change notice: minimum 90 days advance notice before breaking schema changes (analogous to the API change notice requirement)
  3. Migration guides: publish diff documentation at your schema URL (e.g. https://api.yoursaas.com/data-act/schema-changelog)
  4. Additive-only for minor versions: only add new optional fields in minor versions, never remove or rename required fields
  5. Schema URL stability: the schema URL referenced in your manifest must remain stable; use /data-act/export-schema/v1 not /data-act/export-schema

export const SCHEMA_VERSIONS = {
  "1.0.0": {
    released: "2025-09-12",
    deprecated: null,
    schemaUrl: "https://api.yoursaas.com/data-act/export-schema/v1.0",
    breakingChanges: [],
  },
  "1.1.0": {
    released: "2026-01-15",
    deprecated: null,
    schemaUrl: "https://api.yoursaas.com/data-act/export-schema/v1.1",
    breakingChanges: [],
    additions: ["user.metadata", "team.parentTeamId"],
  },
} as const;

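export function getCurrentSchemaVersion(): string {
  // Compare version components numerically; a plain string sort would rank
  // "1.10.0" before "1.9.0".
  const versions = Object.keys(SCHEMA_VERSIONS).sort((a, b) =>
    a.localeCompare(b, undefined, { numeric: true })
  );
  return versions[versions.length - 1];
}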

export function isDeprecatedSchema(version: string): boolean {
  const schema = SCHEMA_VERSIONS[version as keyof typeof SCHEMA_VERSIONS];
  return !!schema?.deprecated;
}
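The additive-only rule can be checked mechanically before a release. The sketch below compares two simplified schema snapshots and reports which version bump the change requires under the practices listed above. The `ObjectSchema` shape is a deliberately reduced stand-in for real JSON Schema, and `requiredBump` is a hypothetical helper.

```typescript
interface ObjectSchema {
  required: string[];
  properties: Record<string, { type: string }>;
}

type Bump = "major" | "minor" | "patch";

export function requiredBump(prev: ObjectSchema, next: ObjectSchema): Bump {
  // Removing, renaming, or retyping an existing property breaks importers.
  const removedOrRetyped = Object.keys(prev.properties).some(
    (name) => next.properties[name]?.type !== prev.properties[name]?.type
  );
  // Per the rule above, minor versions may only add optional fields.
  const newlyRequired = next.required.some((name) => !prev.required.includes(name));
  if (removedOrRetyped || newlyRequired) return "major";

  const added = Object.keys(next.properties).some((name) => !(name in prev.properties));
  return added ? "minor" : "patch";
}
```

Running a check like this in CI against the last published schema makes an accidental breaking change fail the build instead of reaching receiving providers unannounced.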

What's Coming from the EDSB

The European Data Standards Body (EDSB), established under the Data Act, is tasked with developing common specifications for cloud switching. As of mid-2026, the EDSB has published early-stage guidance on:

These specifications are not yet mandatory, but regulators are expected to adopt them via delegated acts once finalized. Implementing EDSB guidance now reduces future migration risk.

Practical recommendation: sign up for EDSB consultation alerts and subscribe to the ENISA cloud switching guidance publication RSS. When sector-specific profiles are published for your industry, your lead time to implement is typically 12-18 months from publication to enforcement.


Implementation Checklist

Before your Data Act export feature goes live, verify:


Next in This Series

The next post covers Art. 27 switching obstacle compliance — identifying and removing the hidden lock-in patterns that regulators will flag: service degradation during switching, removal of integrations, and contractual terms that restrict export scope. We'll provide an audit framework and the TypeScript patterns for safe switching window enforcement.

The EU Data Act (Regulation (EU) 2023/2854) text is available at EUR-Lex. EDSB guidance documents are published at the European Commission data economy portal.
