API Design in System Design: REST, Versioning, Pagination & Idempotency (Visualized)

API design is the practice of defining the contract between a service and its clients — the resources it exposes, the operations allowed on them, the shape of requests and responses, and the rules for how that contract evolves over time. A well-designed API is predictable, hard to misuse, and stable enough that clients written today still work next year.

Most public APIs you have used — Stripe, GitHub, Twilio — succeed less because of clever code and more because of disciplined design: consistent naming, correct use of HTTP, careful versioning, and clear error semantics. This guide walks through those decisions in the order you actually make them when designing a service.

Resource Modeling & Naming

In a REST API, you model your domain as resources — nouns like users, orders, or invoices — and act on them with HTTP methods. The golden rules: use plural nouns for collections (/users), put the identifier in the path (/users/42), nest only to express ownership (/users/42/orders), and keep verbs out of URLs. POST /users creates a user; you never need /createUser.

The HTTP method carries the verb, and each one comes with a contract the whole web relies on. GET is safe (it does not change server state) and cacheable; PUT and DELETE are idempotent (repeating them has the same effect on the server as doing them once); POST and PATCH are neither safe nor guaranteed to be idempotent. Respecting these semantics is what lets proxies, browsers, and clients reason about your API.

Method	Purpose	Safe	Idempotent
GET	Read a resource or collection	Yes	Yes
POST	Create a resource / trigger an action	No	No
PUT	Replace a resource at a known URL	No	Yes
PATCH	Partially update a resource	No	No
DELETE	Remove a resource	No	Yes
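
The semantics in the table can be illustrated with a small in-memory sketch. The UserStore class and its handle method are hypothetical names invented for this example, not part of any framework, and the choice to have PUT only replace existing users is an assumption made to keep the example short.

```python
import itertools


class UserStore:
    """Toy in-memory /users collection (hypothetical; not a real framework)."""

    def __init__(self):
        self._users = {}
        self._next_id = itertools.count(1)

    def handle(self, method, path, body=None):
        """Return (status_code, payload) for one request."""
        parts = [p for p in path.split("/") if p]
        if not parts or parts[0] != "users" or len(parts) > 2:
            return 404, None

        if len(parts) == 1:  # collection: /users
            if method == "GET":
                return 200, list(self._users.values())
            if method == "POST":
                # Not idempotent: every call creates a new resource.
                user_id = next(self._next_id)
                self._users[user_id] = {**(body or {}), "id": user_id}
                return 201, self._users[user_id]
            return 405, None

        if not parts[1].isdigit():
            return 404, None
        user_id = int(parts[1])  # item: /users/{id}
        if user_id not in self._users:
            return 404, None
        if method == "GET":
            return 200, self._users[user_id]
        if method == "PUT":
            # Idempotent: replaying the same PUT leaves the same state.
            self._users[user_id] = {**(body or {}), "id": user_id}
            return 200, self._users[user_id]
        if method == "DELETE":
            # Idempotent: a second DELETE returns 404 but changes nothing.
            del self._users[user_id]
            return 204, None
        return 405, None
```

Note that idempotency describes the effect on server state, not the status code: the second DELETE above answers 404, yet the stored data is the same as after the first.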

The request/response cycle is the heartbeat of any API: the client sends a method and path, the server does work, and a status code communicates the outcome — 2xx success, 4xx the client's fault, 5xx the server's fault. The animation below traces that round trip.

[Animation: The request/response cycle. Each request carries a method and path, and the server replies with a status code; GET, POST, and DELETE round-trip through the API with 200, 201, and 404 responses.]

Versioning: Evolving Without Breaking Clients

Once external clients depend on your API, you can no longer break the contract without warning. Versioning lets you ship incompatible changes while old clients keep working. The most common approach is a version prefix in the URL (/v1/users, /v2/users); alternatives include a custom header or an Accept media type. Stripe takes a notable approach: versions are identified by date and each account is pinned to one by default, so existing integrations stay frozen in time while new ones opt into the latest behavior.

The router below shows how a gateway dispatches the same logical request to different backend implementations based on its version — old traffic to v1, new traffic to v2 — so both can run side by side during a migration.

[Animation: API version routing. The gateway reads the version prefix and routes each request to the matching backend; v1 and v2 run side by side so old clients never break.]
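
Prefix-based routing can be sketched in a few lines. The handler names, the BACKENDS table, and the two response shapes below are assumptions made up for illustration; a real gateway would forward the request to separate services instead of calling local functions.

```python
def users_v1():
    # Hypothetical v1 response shape: a single "name" field.
    return {"users": [{"name": "Ada Lovelace"}]}


def users_v2():
    # Hypothetical v2 shape with a breaking change: the name is split.
    return {"data": [{"first_name": "Ada", "last_name": "Lovelace"}]}


# One routing table per version, so both can run side by side.
BACKENDS = {
    "v1": {"/users": users_v1},
    "v2": {"/users": users_v2},
}


def route(path):
    """Dispatch '/v1/users' style paths; return (status_code, body)."""
    version, _, rest = path.lstrip("/").partition("/")
    backend = BACKENDS.get(version)
    if backend is None:
        return 404, {"error": f"unknown API version: {version!r}"}
    handler = backend.get("/" + rest)
    if handler is None:
        return 404, {"error": f"no such resource in {version}: /{rest}"}
    return 200, handler()
```

Because each version owns its routing table, retiring v1 later means deleting one entry rather than untangling conditionals inside shared handlers.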

Pagination: Offset vs Cursor

No endpoint should return an unbounded list. Pagination breaks large result sets into pages. Offset pagination (?limit=20&offset=40) is simple, but it slows down on deep pages because the database must scan and discard every skipped row, and it is unstable when rows are inserted mid-scroll. Cursor pagination instead returns an opaque pointer to the last item seen (?after=eyJpZCI6MTAwfQ); the next request resumes exactly there. It is stable under concurrent writes and, provided the sort key is indexed, stays fast at any depth, which is why Stripe's list endpoints and GitHub's GraphQL API use it for large collections.

[Animation: Cursor pagination walking a dataset. A cursor points at the last item returned, and each next page resumes from there; the window slides forward one page at a time, stable even as new rows arrive.]
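
A minimal sketch of cursor pagination follows, assuming the cursor is unpadded URL-safe base64 over a small JSON object holding the last id seen, which is how the example value eyJpZCI6MTAwfQ decodes. The function names and the response shape are illustrative assumptions, and a Python list stands in for the database.

```python
import base64
import json


def encode_cursor(last_id):
    """Pack the last id seen into an opaque, URL-safe string."""
    raw = json.dumps({"id": last_id}, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def decode_cursor(cursor):
    """Reverse encode_cursor, restoring the stripped base64 padding."""
    padded = cursor + "=" * (-len(cursor) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))["id"]


def list_page(rows, limit=20, after=None):
    """Return one page of rows, which must be sorted by ascending unique id.

    In SQL this would be roughly:
    WHERE id > :last_id ORDER BY id LIMIT :limit
    """
    last_id = decode_cursor(after) if after is not None else None
    matching = [r for r in rows if last_id is None or r["id"] > last_id]
    page = matching[:limit]
    has_more = len(matching) > limit
    next_cursor = encode_cursor(page[-1]["id"]) if has_more else None
    return {"data": page, "next_cursor": next_cursor}
```

Since the next page is defined by "ids greater than the last one seen" rather than by a row count, a row inserted ahead of the cursor does not shift later pages the way it would with an offset.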

Idempotency: Safe Retries

Networks fail mid-request, so clients retry — but retrying a POST /charges could bill a customer twice. Idempotency guarantees that making the same request many times has the same effect as making it once. The standard pattern, popularized by Stripe, is an idempotency key: the client generates a unique key per logical operation and sends it as a header. The server stores the result against that key, so a retry with the same key returns the original response instead of re-executing the action.

POST /v1/charges HTTP/1.1
Idempotency-Key: 7f3a9c12-...
Content-Type: application/json

{
  "amount": 4200,
  "currency": "usd",
  "customer": "cus_Qk29"
}

// A retry with the SAME Idempotency-Key returns the
// stored result instead of creating a second charge.
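
On the server side, the pattern might look like the sketch below. ChargeService, its in-memory dictionary, and the 422 returned when a key is reused with a different payload are all assumptions for illustration; a production version would also need an atomic, shared store, key expiry, and handling for two identical requests arriving at the same time.

```python
import itertools


class ChargeService:
    """Toy charge endpoint that deduplicates by idempotency key."""

    def __init__(self):
        self._seen = {}  # idempotency key -> (payload, stored response)
        self._next_id = itertools.count(1)
        self.charges = []

    def create_charge(self, idempotency_key, payload):
        """Return (status_code, body); replay the stored result on retries."""
        seen = self._seen.get(idempotency_key)
        if seen is not None:
            original_payload, response = seen
            if original_payload != payload:
                # Same key, different request: refuse rather than guess.
                # (The 422 status is this sketch's own choice.)
                return 422, {"error": {"code": "idempotency_key_reused"}}
            return response  # retry: no second charge is created

        charge = {**payload, "id": f"ch_{next(self._next_id)}"}
        self.charges.append(charge)
        response = (201, charge)
        self._seen[idempotency_key] = (dict(payload), response)
        return response
```

The key identifies the logical operation, so the client must generate it once and reuse it for every retry of that operation, and generate a fresh one for the next charge.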

Error Handling

Errors are part of your API contract, so design them deliberately. Use the right status code (400 bad input, 401 unauthenticated, 403 forbidden, 404 not found, 409 conflict, 422 validation, 429 rate limited), and return a consistent machine-readable body with a stable code, a human message, and ideally a field pointer. A predictable error shape lets clients branch on code rather than parsing prose.

HTTP/1.1 422 Unprocessable Entity

{
  "error": {
    "code": "invalid_email",
    "message": "The email address is not valid.",
    "field": "email",
    "request_id": "req_8Hq2"
  }
}
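
One way to keep that shape consistent is to build every error through a single helper, sketched here. The names error_response and validate_signup are hypothetical, and the email check is deliberately naive.

```python
import json


def error_response(status, code, message, field=None, request_id=None):
    """Build (status_code, json_body) in the one shape every error uses."""
    error = {"code": code, "message": message}
    if field is not None:
        error["field"] = field
    if request_id is not None:
        error["request_id"] = request_id
    return status, json.dumps({"error": error})


def validate_signup(payload, request_id):
    """Hypothetical handler: reject a payload whose email has no '@'."""
    email = payload.get("email", "")
    if "@" not in email:
        return error_response(
            422,
            "invalid_email",
            "The email address is not valid.",
            field="email",
            request_id=request_id,
        )
    return 201, json.dumps({"email": email})


# Client side: branch on the stable code, never on the message text.
status, body = validate_signup({"email": "not-an-email"}, "req_8Hq2")
if status >= 400 and json.loads(body)["error"]["code"] == "invalid_email":
    retry_hint = "ask the user to correct the email field"
```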

Authentication & Authorization

Authentication proves who the caller is; authorization decides what they may do. Server-to-server APIs commonly use API keys or OAuth2 client credentials; user-facing APIs use OAuth2 with short-lived bearer access tokens (often JWTs) that are renewed with a longer-lived refresh token. Always send credentials over TLS in the Authorization header, never in the URL, where they leak into logs and browser history. Scope tokens narrowly so a leaked key for reading invoices cannot also delete them.
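
The split between the two checks can be sketched as follows. The TOKENS table, the token values, and the scope names are invented for this example; a real service would verify a signed token or look up a hashed key instead of comparing plaintext strings, and would treat header names case-insensitively.

```python
# Hypothetical token -> granted scopes table, for illustration only.
TOKENS = {
    "tok_reader": {"invoices:read"},
    "tok_admin": {"invoices:read", "invoices:delete"},
}


def authenticate(headers):
    """Return the caller's scopes, or None if no valid bearer token is sent."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer":
        return None
    return TOKENS.get(token.strip())


def authorize(headers, required_scope):
    """Return 401 if the caller is unknown, 403 if known but not allowed."""
    scopes = authenticate(headers)
    if scopes is None:
        return 401  # authentication failed: we do not know who this is
    if required_scope not in scopes:
        return 403  # authenticated, but the token lacks this scope
    return 200
```

With narrow scopes, a leaked tok_reader can list invoices but gets 403 on any delete, which is the containment the paragraph above describes.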

Rate Limiting

Rate limiting protects your service from abuse and noisy neighbors by capping how many requests a client may make in a window, commonly with a token-bucket algorithm, which allows short bursts while enforcing an average rate. Communicate limits in response headers (X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset are a widely used convention rather than a formal standard) and return 429 Too Many Requests with a Retry-After header when a client exceeds them, so well-behaved clients can back off gracefully rather than hammering you.
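
A token bucket fits in a short class. This is a single-process sketch under stated assumptions: the class name and parameters are illustrative, the caller passes the current time explicitly so the behavior is deterministic, and X-RateLimit-Reset is omitted for brevity. In a real service the clock would typically be time.monotonic() and the state would live in a shared store.

```python
import math


class TokenBucket:
    """One bucket per client: holds up to `capacity` tokens, refilled steadily."""

    def __init__(self, capacity, refill_per_second, now=0.0):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)  # start full, so bursts are allowed
        self.updated_at = now

    def allow(self, now):
        """Spend one token if available; return (status_code, headers)."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(
            self.capacity, self.tokens + elapsed * self.refill_per_second
        )
        self.updated_at = now

        headers = {"X-RateLimit-Limit": str(self.capacity)}
        if self.tokens >= 1:
            self.tokens -= 1
            headers["X-RateLimit-Remaining"] = str(int(self.tokens))
            return 200, headers

        # Seconds until one whole token has been refilled, rounded up.
        wait = math.ceil((1 - self.tokens) / self.refill_per_second)
        headers["X-RateLimit-Remaining"] = "0"
        headers["Retry-After"] = str(wait)
        return 429, headers
```

The capacity sets the largest burst a client can send at once, while the refill rate sets the sustained request rate, so the two can be tuned independently.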

REST vs RPC vs GraphQL

REST is the default for public, resource-oriented HTTP APIs (Stripe, GitHub). gRPC, an RPC framework built on HTTP/2 and Protobuf, wins for high-throughput internal service-to-service traffic where latency and strong contracts matter. GraphQL lets clients request exactly the fields they need in one round trip, which avoids over- and under-fetching and shines for rich frontends with many entities (GitHub's v4 API is GraphQL). None is universally best; the right choice depends on who consumes the API and how.

	REST	GraphQL	gRPC
Style	Resources + HTTP verbs	Single endpoint, query language	Remote procedure calls
Transport	HTTP + JSON (typically)	HTTP + JSON	HTTP/2 + Protobuf
Fetching	Fixed responses per endpoint	Client picks exact fields	Fixed typed messages
Best for	Public, cacheable APIs	Rich frontends, many entities	Internal microservices
Caching	Easy (HTTP caching)	Harder (POST queries)	Manual

Frequently Asked Questions

What makes an API RESTful?

A RESTful API models the domain as resources identified by URLs, manipulates them with standard HTTP methods that respect safe/idempotent semantics, is stateless (each request carries its own context), and uses status codes and media types as the web intends. The aim is a uniform, predictable interface that any HTTP client can consume without bespoke logic.

When should I version my API, and how?

Version when you must make a breaking change — removing a field, changing a type, or altering behavior clients rely on. Additive changes (new optional fields, new endpoints) usually do not need a new version. URL prefixes like /v1 are the simplest and most visible; header or media-type versioning keeps URLs clean; Stripe's date-pinned versions freeze each integration in place. Whatever you pick, document a deprecation policy.

Why use cursor pagination instead of offset?

Offset pagination gets slow on deep pages because the database must scan and discard every skipped row, and it can show duplicates or gaps when rows are inserted or deleted mid-scroll. Cursor pagination resumes from an opaque pointer to the last item seen, so it stays fast at any depth (given an index on the sort key) and remains stable under concurrent writes. That is why large public APIs such as Stripe's list endpoints and GitHub's GraphQL API use it.

A good API is a promise: the same request, made the same way, behaves the same way today, tomorrow, and after a retry. Design the contract first, the code second.
— alokknight Engineering