API Contracts

Summary

An API contract is a formal, machine-readable agreement between a service provider and its consumers that specifies exactly what requests are valid, what responses to expect, and what guarantees the API offers around versioning and compatibility. Contracts can describe synchronous HTTP/RPC APIs (OpenAPI 3.x, gRPC/Protobuf), asynchronous event-driven APIs (AsyncAPI 3.x), or data schemas flowing through pipelines (Avro, JSON Schema, Protobuf in Confluent Schema Registry).

This guide covers API contracts at intermediate depth for data engineers and backend engineers preparing for technical interviews. It targets OpenAPI 3.1, AsyncAPI 3.0, Avro 1.11, and Pact v4 — all current GA versions as of 2025.

Table of Contents

  - Core Concepts
  - Industry Use Cases
  - Code Examples
  - Comparison / When to Use
  - Gotchas & Anti-patterns
  - Exercises
  - Quiz
  - Further Reading

Core Concepts

1. Schema-First Design

Schema-first (also called design-first) means writing the contract document before any implementation code. The spec becomes the single source of truth: server stubs, client SDKs, mock servers, and documentation are all generated from it. The alternative — code-first — derives the spec from annotations on existing code, which often results in incomplete or inaccurate contracts.

Key artefacts produced from a schema-first workflow:

  - Server stubs and request/response validators
  - Client SDKs in each consumer language
  - Mock servers, so consumers can develop before the provider exists
  - Rendered reference documentation

2. Backward and Forward Compatibility

Schema evolution rules determine which changes are safe to deploy without coordinating a lockstep upgrade across all producers and consumers.

| Compatibility Mode | What it means | Allowed changes | Forbidden changes |
| --- | --- | --- | --- |
| Backward | New schema can read data written by the old schema | Delete fields; add fields with defaults | Add required fields (no default); rename/retype fields |
| Forward | Old schema can read data written by the new schema | Add fields; delete fields with defaults | Delete fields that have no default; rename/retype fields |
| Full | Both backward and forward at once | Add/delete fields with defaults only | Any other change |
| None | No compatibility checks enforced | Any change | (none) |

Confluent Schema Registry enforces compatibility modes per subject. The default is BACKWARD.

3. Breaking vs. Non-Breaking Changes

A breaking change causes existing consumers to fail without code changes on their side. Common examples:

  - Removing or renaming a field in a response
  - Changing a field's type (e.g. string to integer)
  - Adding a new required field to a request
  - Removing an endpoint, or changing its path or HTTP method
  - Tightening validation constraints on an existing request field
  - Changing authentication requirements or error status codes

Non-breaking changes include adding new optional fields, adding new endpoints, adding new enum values (with caution — see Gotchas), loosening constraints, and deprecating (not removing) endpoints.
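
The caution is warranted: an addition that is non-breaking on paper can still fail against a strictly-validating consumer. A minimal sketch with Pydantic v2 (assumed available; the model and field names are illustrative):

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class OrderView(BaseModel):
    # Strict consumer: rejects any field it does not recognise
    model_config = ConfigDict(extra="forbid")
    order_id: str
    status: str


def accepts(payload: dict) -> bool:
    try:
        OrderView(**payload)
        return True
    except ValidationError:
        return False


# The provider adds a new optional response field: "non-breaking" on paper
assert accepts({"order_id": "abc-123", "status": "confirmed"})
assert not accepts(
    {"order_id": "abc-123", "status": "confirmed", "coupon_code": "X1"}
)
```

The fix is on the consumer side: tolerate unknown fields (the default in Pydantic) unless you have a specific reason to forbid them.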

4. API Versioning Strategies

There is no universally correct versioning strategy; the best choice depends on your consumer base and deployment model. Common options:

  - URI path versioning (/v2/orders): explicit and easy to route; the most common choice for public REST APIs
  - Header or media-type versioning (Accept: application/vnd.example.v2+json): keeps URIs stable but is harder to discover and test by hand
  - Query parameter versioning (?api-version=2024-06-01): used by some large cloud platforms
  - Additive-only evolution with no explicit version: common for internal gRPC/Protobuf APIs, where fields are only ever added and never repurposed
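
As an illustration of media-type versioning, one of the common strategies, here is a tiny parser for a hypothetical `vnd.example` vendor media type (the media-type naming convention is an assumption for the example):

```python
import re


def parse_api_version(accept_header: str, default: int = 1) -> int:
    """Extract the major version from a vendor media type,
    e.g. 'application/vnd.example.v2+json' -> 2.

    Falls back to `default` when no versioned media type is present.
    """
    match = re.search(r"vnd\.[\w.-]+?\.v(\d+)\+json", accept_header)
    return int(match.group(1)) if match else default


assert parse_api_version("application/vnd.example.v2+json") == 2
assert parse_api_version("application/json") == 1  # falls back to default
```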

5. Consumer-Driven Contract Testing (CDCT)

In CDCT the consumer defines the subset of the provider's API it actually uses, and that subset becomes a binding contract the provider must not break. Pact is the dominant open-source framework for CDCT.

Workflow:

  1. Consumer writes a Pact test → generates a .json pact file describing interactions
  2. Pact file is published to PactFlow or a self-hosted Pact Broker
  3. Provider verifies against the pact in CI; if a change would break a consumer interaction, the build fails
  4. "Can I deploy?" webhook checks allow teams to move independently without breaking each other

6. Schema Registry

A Schema Registry is a centralised store for schemas referenced by Kafka messages (and other event streams). Producers serialise messages with an Avro/Protobuf/JSON Schema and register the schema; consumers look up the schema by ID embedded in the message wire format. Confluent Schema Registry and AWS Glue Schema Registry are the two dominant implementations. Benefits: schema enforcement at produce-time, evolution compatibility checks, single source of truth for data contracts in pipelines.
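
The wire format referenced above is simple: a magic byte 0x00, a 4-byte big-endian schema ID, then the serialised payload. A minimal sketch of a frame parser (for illustration only; the serializer classes in confluent-kafka handle this for you):

```python
import struct


def split_confluent_frame(message: bytes) -> tuple[int, bytes]:
    """Split a Confluent-framed Kafka message into (schema_id, payload).

    Wire format: magic byte 0x00, then a 4-byte big-endian schema ID,
    then the serialised payload.
    """
    if len(message) < 5:
        raise ValueError("message too short for Confluent framing")
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != 0:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[5:]


framed = b"\x00" + struct.pack(">I", 42) + b"\x02hi"
schema_id, payload = split_confluent_frame(framed)
assert schema_id == 42 and payload == b"\x02hi"
```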

↑ Back to top

Industry Use Cases

E-Commerce: Microservices Integration

A large e-commerce platform decomposes into Order, Inventory, Payment, and Notification services. Each service publishes its REST API as an OpenAPI 3.1 spec stored in a central spec repository. A CI gate runs oasdiff breaking on every PR to prevent accidental breaking changes. The Notification service publishes consumer-driven Pact tests, and provider verification runs in the Order service's pipeline, so the Order service can confidently add new webhooks without manual coordination calls.

Financial Services: Event-Driven Data Contracts

A trading firm streams trade execution events into Kafka topics consumed by risk, compliance, and analytics systems — each owned by a different team. All topics are registered in Confluent Schema Registry with FULL compatibility. A data contract YAML file (following the Open Data Contract Standard) is co-located with each Avro schema, documenting SLAs, data quality rules, and owner contact. Schema changes require an automated PR review from all registered consumers before merging.

Platform Engineering: Internal Developer Platform

A platform team exposes infrastructure APIs (provision a database, create a service account) via an internal API gateway. APIs are spec-first OpenAPI documents, used to generate Terraform provider schemas and CLI tool completions. Contract tests run in a staging environment after every deployment, using Dredd to replay the spec's example requests against the live service and assert expected responses.

Data Engineering: Pipeline Schema Management

A data team ingests clickstream events from a mobile app. The mobile app's schema evolves frequently. By registering the event schema in AWS Glue Schema Registry with BACKWARD_ALL compatibility, the Glue ETL job reading from Kinesis can handle old messages still in the stream while the mobile app ships a new schema version. Downstream Redshift tables are auto-updated via schema evolution in the Glue job.

↑ Back to top

Code Examples

Example 1: OpenAPI 3.1 Contract (YAML)

# openapi: 3.1.0 — uses JSON Schema 2020-12 dialect
openapi: "3.1.0"
info:
  title: Order Service API
  version: "2.1.0"
  description: Manages customer orders. Breaking changes follow SemVer.

paths:
  /v2/orders:
    post:
      operationId: createOrder
      summary: Create a new order
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/CreateOrderRequest"
      responses:
        "201":
          description: Order created
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/Order"
        "422":
          description: Validation error

components:
  schemas:
    CreateOrderRequest:
      type: object
      required: [customer_id, items]
      properties:
        customer_id: { type: string, format: uuid }
        items:
          type: array
          minItems: 1
          items: { $ref: "#/components/schemas/OrderItem" }
        coupon_code: { type: [string, "null"] }  # optional; 3.1 uses JSON Schema's null type, not "nullable"
    OrderItem:
      type: object
      required: [sku, quantity]
      properties:
        sku:      { type: string }
        quantity: { type: integer, minimum: 1 }

Example 2: Avro Schema Evolution (Backward-Compatible)

// v1 — original schema registered in Schema Registry
{
  "type": "record",
  "name": "TradeEvent",
  "namespace": "com.example.trading",
  "fields": [
    { "name": "trade_id",   "type": "string" },
    { "name": "symbol",     "type": "string" },
    { "name": "quantity",   "type": "int"    },
    { "name": "price",      "type": "double" },
    { "name": "timestamp_ms", "type": "long"  }
  ]
}

// v2 — adds optional "venue" field with a default: BACKWARD compatible
{
  "type": "record",
  "name": "TradeEvent",
  "namespace": "com.example.trading",
  "fields": [
    { "name": "trade_id",   "type": "string" },
    { "name": "symbol",     "type": "string" },
    { "name": "quantity",   "type": "int"    },
    { "name": "price",      "type": "double" },
    { "name": "timestamp_ms", "type": "long" },
    {
      "name": "venue",
      "type": ["null", "string"],  // union: null | string
      "default": null              // must default to the first union type
    }
  ]
}

Example 3: Consumer-Driven Contract Test with Pact (Python)

import requests
from pact import Consumer, Provider

# Consumer: Notification service expects Order service to return this shape
pact = Consumer("notification-service").has_pact_with(
    Provider("order-service"),
    pact_dir="./pacts",
)

def test_get_order_returns_expected_shape():
    expected_body = {
        "order_id": "abc-123",
        "status": "confirmed",
        "customer_id": "cust-456",
    }

    (
        pact
        .given("order abc-123 exists and is confirmed")
        .upon_receiving("a request for order abc-123")
        .with_request(method="GET", path="/v2/orders/abc-123")
        .will_respond_with(200, body=expected_body)
    )

    pact.start_service()          # start the local mock provider
    try:
        with pact:                # verifies the interaction on exit
            resp = requests.get(f"{pact.uri}/v2/orders/abc-123")
            assert resp.json()["status"] == "confirmed"
    finally:
        pact.stop_service()
    # Pact writes ./pacts/notification-service-order-service.json

Example 4: Registering and Evolving a Schema with Confluent Schema Registry (Python)

from confluent_kafka.schema_registry import SchemaRegistryClient, Schema

client = SchemaRegistryClient({"url": "http://schema-registry:8081"})

# Register v1 schema under subject "trades-value"
schema_v1 = Schema(
    schema_str=open("trade_event_v1.avsc").read(),
    schema_type="AVRO",
)
schema_id = client.register_schema("trades-value", schema_v1)
print(f"Registered schema ID: {schema_id}")

# Set subject-level compatibility to FULL before registering v2
client.set_compatibility("trades-value", level="FULL")

# Register v2 — Schema Registry rejects it if it violates FULL compatibility
schema_v2 = Schema(
    schema_str=open("trade_event_v2.avsc").read(),
    schema_type="AVRO",
)
schema_id_v2 = client.register_schema("trades-value", schema_v2)
print(f"Registered v2 schema ID: {schema_id_v2}")

# Check compatibility before registering (dry-run)
is_compatible = client.test_compatibility("trades-value", schema_v2)
print(f"Will v2 pass? {is_compatible}")

Example 5: Breaking Change Detection in CI with oasdiff

# .github/workflows/api-contract-check.yml
name: API Contract Check
on: [pull_request]

jobs:
  breaking-changes:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with: { fetch-depth: 0 }

      - name: Install oasdiff
        run: |
          curl -sSfL https://raw.githubusercontent.com/tufin/oasdiff/main/install.sh | sh

      - name: Check for breaking changes
        run: |
          # Compare base branch spec vs PR branch spec
          oasdiff breaking \
            origin/main:openapi/order-service.yaml \
            openapi/order-service.yaml \
            --fail-on ERR   # exit 1 on any breaking change
↑ Back to top

Comparison / When to Use

| Technology | Best For | Schema Format | Sync/Async | Code Generation | Streaming Support |
| --- | --- | --- | --- | --- | --- |
| OpenAPI 3.1 | REST HTTP APIs, public/partner APIs | JSON Schema 2020-12 | Sync (request/response) | Excellent (50+ languages via openapi-generator) | No (use AsyncAPI for events) |
| gRPC / Protobuf | Internal microservices, low-latency RPC, mobile backends | .proto files | Sync + streaming (HTTP/2) | Excellent (native) | Server-side, client-side, bidirectional streaming |
| AsyncAPI 3.0 | Event-driven APIs, Kafka topics, WebSocket channels, MQTT | JSON Schema / Avro / Protobuf | Async (pub/sub) | Good (asyncapi-generator) | Native — designed for events |
| Avro + Schema Registry | Kafka data pipelines, schema evolution in streams | .avsc JSON | Async (embedded in messages) | Moderate (avro-tools, fastavro) | Excellent — compact binary, schema referenced by ID |
| GraphQL | Client-driven queries, BFF layer, product APIs | SDL (Schema Definition Language) | Sync (+ subscriptions for events) | Good (graphql-codegen) | Subscriptions only |
| JSON Schema | Document validation, config validation, webhook payloads | JSON Schema | Any (schema only) | Moderate | No native transport |
↑ Back to top

Gotchas & Anti-patterns

  1. Treating "additive changes" as always safe. Adding a new required field to a response is backward-compatible for the provider but breaks consumers that use strict schema validation (e.g. Pydantic models with model_config = ConfigDict(extra="forbid")). Always check consumer-side strictness. Similarly, adding a new enum value is technically non-breaking in the spec but will crash deserializers that use exhaustive switch/match statements.
  2. No contract between internal services ("it's all in one repo"). Monorepos tempt teams to skip contracts because "we can change everything at once." In practice, services are deployed independently, and a fast-moving team can still break a slow-moving consumer. Even within a monorepo, contract tests provide regression safety.
  3. Forgetting to version the contract, not just the code. Bumping the service's Docker image tag is not versioning the API. The OpenAPI/Avro schema document itself needs a version field, and the compatibility history must be preserved in a registry or VCS. Without this history, you can't run compatibility checks in CI.
  4. Avro union default must match the first type. In Avro, if you declare "type": ["null", "string"], the default must be null (the first type). Declaring "default": "" causes a schema parse error that can be silent until runtime. Always put null first in nullable unions in Avro schemas.
  5. Using provider-driven contract tests only. Provider-driven tests (e.g., Dredd replaying OpenAPI examples) verify the provider's view of the contract. They do not catch cases where a consumer relies on a field that the provider considers optional and stops returning. Consumer-driven contract testing (Pact) is the complement — it tests what consumers actually depend on, not what the provider thinks it offers.
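
For gotcha 1's enum case, one defensive pattern on the consumer side is a catch-all member. A minimal sketch using Python's Enum `_missing_` hook (the OrderStatus values are illustrative):

```python
from enum import Enum


class OrderStatus(Enum):
    CONFIRMED = "confirmed"
    SHIPPED = "shipped"
    UNKNOWN = "unknown"  # catch-all for values this consumer predates

    @classmethod
    def _missing_(cls, value):
        # Called when the value lookup fails: tolerate unknown
        # values shipped by a newer producer instead of raising
        return cls.UNKNOWN


assert OrderStatus("shipped") is OrderStatus.SHIPPED
# A value added by a newer producer no longer crashes this consumer
assert OrderStatus("refunded") is OrderStatus.UNKNOWN
```
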
↑ Back to top

Exercises

  1. Break and fix an Avro schema. Take the TradeEvent v1 schema from Example 2 and register it in a local Confluent Schema Registry (Docker Compose). Then attempt to register a v2 schema that renames price to trade_price. Observe the compatibility error. Fix it by adding trade_price as a new optional field alongside the original and verify v2 registers successfully under BACKWARD mode.
  2. Write a Pact consumer test for a pagination contract. A consumer calls GET /v2/orders?page=1&limit=10 and expects a response with fields items (array), total (integer), and next_cursor (nullable string). Write the Pact interaction, run it to generate a pact file, then implement a minimal Flask provider that satisfies the pact and run provider verification.
  3. Add a breaking change gate to a CI pipeline. Create a small OpenAPI 3.1 spec for a User resource with a GET /users/{id} endpoint. Commit it to a git repo. Then create a branch that removes the email field from the response schema. Add an oasdiff breaking step to a GitHub Actions (or local shell) workflow that compares main vs the branch and fails the build. Confirm the gate triggers, then make a non-breaking change (add optional phone field) and confirm the gate passes.
↑ Back to top

Quiz

Q1: What is the difference between backward and forward compatibility in Avro?

Backward compatibility means the new schema can read data written by the old schema — consumers can upgrade independently. Forward compatibility means the old schema can read data written by the new schema — producers can upgrade independently. Full compatibility requires both. In practice: adding a field with a default (or deleting a field) satisfies Backward; adding a field (or deleting a field that has a default) satisfies Forward; only adding or deleting fields with defaults satisfies Full.

Q2: Why must the Avro default value match the first type in a union?

Avro's specification requires that the default value for a union field be valid for the first type listed in the union. So ["null", "string"] requires "default": null, while ["string", "null"] would require a string default. This is a common gotcha: pairing ["null", "string"] with "default": "" produces a schema parse error, and some tooling only surfaces it at registration time or at runtime. The convention is to always put null first in nullable unions and default to null.

Q3: What is the "Pact triangle" and why does it matter for microservices deployment?

The Pact triangle refers to the three-way relationship between consumer, provider, and the Pact Broker. Consumer tests generate pact files; the broker stores them; provider verification runs against the broker. The key operational benefit is the "can-i-deploy" query: before deploying service X to production, the CI pipeline asks the broker whether the currently deployed consumers are compatible with X's latest pact verification results. This lets teams deploy independently without manual coordination, as long as all pacts are verified.

Q4: What makes adding a new enum value potentially breaking, even though it's an additive change?

Many deserializers and code generators produce exhaustive switch/match/when statements over enum values. When a producer sends a new enum value that the consumer's generated code doesn't know about, the consumer throws an unhandled case exception. In Protobuf this is handled gracefully (unknown values are preserved as integers), but in Avro, JSON Schema, and OpenAPI-generated clients, new enum values can cause hard failures. The safe pattern is to always include an UNKNOWN or catch-all case in consumer logic.

Q5: When would you choose gRPC/Protobuf over OpenAPI for an internal microservice API?

Prefer gRPC/Protobuf when: (1) performance is critical — Protobuf binary encoding is 3–10× smaller than JSON and faster to serialize; (2) you need bidirectional streaming (not possible with REST); (3) the API is internal-only and all consumers can handle the binary protocol; (4) you want strong typing enforced at compile time via generated stubs. Prefer OpenAPI when: the API is public or partner-facing (HTTP+JSON is universally accessible), you need browser or curl accessibility, or your consumers can't run a gRPC stack.

↑ Back to top

Further Reading

↑ Back to top