A junior dev once shipped a checkout flow that accepted any JSON the client sent. One morning, a typo in the mobile app turned "quantity": 1 into "quantity": "1" — the string slipped past the API, hit the database as text, and silently broke inventory math for two days. The fix that prevents this entire class of bug is JSON Schema, and it takes about ten minutes to learn.
JSON Schema is a contract for JSON. You describe the shape your data must take (required keys, types, value ranges, regex patterns) and a validator either lets the payload through or tells you exactly where it broke. It is the spec underneath OpenAPI, a close relative of what Zod and Pydantic express in code, and the thing most APIs eventually re-invent badly before adopting.
The Mental Model: A Schema Is Just JSON
The first thing that throws people: a JSON Schema is itself a JSON document. There is no DSL, no separate file format, no compile step. You write a JSON object whose keys are keywords the validator understands, and that object describes the shape another JSON document must match.
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "required": ["email", "age"],
  "properties": {
    "email": { "type": "string", "format": "email" },
    "age": { "type": "integer", "minimum": 13, "maximum": 120 }
  },
  "additionalProperties": false
}
Feed that schema and a payload to a validator like Ajv, and you get back either valid: true or a list of errors with paths into the document. The same schema works in Node, Python, Go, Rust — every mainstream language has a validator that follows the spec.
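To make the mental model concrete, here is a minimal sketch of what a validator does with a schema and a payload. It is a toy, not Ajv and not spec-compliant: it understands only type (as a single string), required, properties, additionalProperties, minimum, and maximum. The names typeOf, validate, and orderSchema are made up for this illustration.

```javascript
// Toy validator for illustration only. It is NOT spec-compliant and handles
// just: type (single string), required, properties, additionalProperties,
// minimum, maximum.
function typeOf(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  if (typeof value === "number") {
    return Number.isInteger(value) ? "integer" : "number";
  }
  return typeof value; // "string", "boolean", or "object"
}

function validate(schema, value, path = "") {
  const errors = [];
  const actual = typeOf(value);

  if (schema.type !== undefined) {
    // An integer value also satisfies "type": "number".
    const matches =
      schema.type === actual ||
      (schema.type === "number" && actual === "integer");
    if (!matches) {
      // Stop here: range and property checks make no sense on the wrong type.
      return [{ path, message: `expected ${schema.type}, got ${actual}` }];
    }
  }

  if (typeof value === "number") {
    if (schema.minimum !== undefined && value < schema.minimum) {
      errors.push({ path, message: `must be >= ${schema.minimum}` });
    }
    if (schema.maximum !== undefined && value > schema.maximum) {
      errors.push({ path, message: `must be <= ${schema.maximum}` });
    }
  }

  if (actual === "object") {
    const properties = schema.properties ?? {};
    for (const key of schema.required ?? []) {
      if (!Object.hasOwn(value, key)) {
        errors.push({ path: `${path}/${key}`, message: "is required" });
      }
    }
    for (const [key, child] of Object.entries(value)) {
      if (Object.hasOwn(properties, key)) {
        errors.push(...validate(properties[key], child, `${path}/${key}`));
      } else if (schema.additionalProperties === false) {
        errors.push({ path: `${path}/${key}`, message: "is not allowed" });
      }
    }
  }
  return errors;
}

// Hypothetical order schema echoing the quantity bug from the introduction.
const orderSchema = {
  type: "object",
  required: ["sku", "quantity"],
  properties: {
    sku: { type: "string" },
    quantity: { type: "integer", minimum: 1 },
  },
  additionalProperties: false,
};

const goodErrors = validate(orderSchema, { sku: "A-1", quantity: 1 });
const badErrors = validate(orderSchema, { sku: "A-1", quantity: "1" });
console.log(goodErrors); // no errors for the well-formed order
console.log(badErrors);  // one error whose path points at /quantity
```

The error path is the useful part: a real validator reports the same kind of pointer into the payload, just with far more keywords covered.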
The $schema line declares which draft you are writing against. The current one is Draft 2020-12, but you will still see Draft-07 because a lot of existing tooling targets it, and OpenAPI 3.0 is based on an even older draft. Target 2020-12 for new work and Draft-07 only if you need to interop with older tooling. The full text lives in the JSON Schema spec.
The Keywords That Do 90% of the Work
The spec has dozens of keywords, but a small core covers almost every real validation need.
Type narrows the value to one of seven JSON types: string, number, integer, boolean, array, object, null. You can also pass an array — "type": ["string", "null"] — to allow either.
Strings: minLength, maxLength, pattern (a regex), format (named formats like email, uri, date-time, uuid). Note that format is annotation-only by default: Ajv checks it once the ajv-formats plugin is added, but some validators never do.
Numbers: minimum, maximum, exclusiveMinimum, exclusiveMaximum, multipleOf.
Arrays: items (a schema each element must match), minItems, maxItems, uniqueItems, plus prefixItems for tuples where each position has its own type.
Objects: properties, required, additionalProperties (a schema or false to forbid unknown keys), and patternProperties for keys matching a regex.
{
  "type": "object",
  "properties": {
    "tags": {
      "type": "array",
      "items": { "type": "string", "minLength": 1 },
      "uniqueItems": true,
      "maxItems": 10
    },
    "settings": {
      "type": "object",
      "patternProperties": {
        "^x-": { "type": "string" }
      }
    }
  }
}
That schema accepts a tags array of unique non-empty strings (max 10) and a settings object where any key starting with x- must have a string value. It is shockingly compact for what it expresses.
Composition: anyOf, oneOf, allOf
This is where JSON Schema starts feeling powerful instead of fiddly. You can build complex shapes by combining simpler ones.
{
  "type": "object",
  "properties": {
    "payment": {
      "oneOf": [
        {
          "type": "object",
          "properties": {
            "method": { "const": "card" },
            "last4": { "type": "string", "pattern": "^[0-9]{4}$" }
          },
          "required": ["method", "last4"]
        },
        {
          "type": "object",
          "properties": {
            "method": { "const": "paypal" },
            "account": { "type": "string", "format": "email" }
          },
          "required": ["method", "account"]
        }
      ]
    }
  }
}
oneOf means exactly one of the listed schemas must match. anyOf means at least one. allOf means all must match — useful for layering a base schema with extra constraints. There is also not, which inverts a schema, but reach for it sparingly because the error messages get cryptic.
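The difference between the combinators is easiest to see by counting matches. The sketch below uses plain predicate functions as stand-ins for subschemas, so it illustrates the counting rule only, not a real validator. All names in it are hypothetical.

```javascript
// Each "subschema" here is a stand-in predicate: (value) => boolean.
const countMatches = (checks, value) =>
  checks.filter((check) => check(value)).length;

const allOf = (checks) => (value) => countMatches(checks, value) === checks.length;
const anyOf = (checks) => (value) => countMatches(checks, value) >= 1;
const oneOf = (checks) => (value) => countMatches(checks, value) === 1;
const not = (check) => (value) => !check(value);

const isInteger = (value) => Number.isInteger(value);
const isPositive = (value) => typeof value === "number" && value > 0;

const results = {
  anyOfFive: anyOf([isInteger, isPositive])(5),   // both match: anyOf passes
  oneOfFive: oneOf([isInteger, isPositive])(5),   // both match: oneOf fails
  oneOfHalf: oneOf([isInteger, isPositive])(0.5), // only isPositive matches
  allOfFive: allOf([isInteger, isPositive])(5),   // both match: allOf passes
  notInteger: not(isInteger)(0.5),                // inverted check
};
console.log(results);
```

The oneOfFive case is the one that surprises people: when two branches overlap, a perfectly reasonable value fails oneOf. That is why the variants in a oneOf are usually made mutually exclusive with a const tag.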
A common pattern is the discriminated union above: a method constant tells you which variant you are dealing with, and the rest of the keys depend on that. This is how you model tagged unions in JSON. Tools like JSON to Zod Schema translate this exact pattern into TypeScript-friendly equivalents.
$ref and $defs: Don't Repeat Yourself
Real schemas describe nested, recursive, reused shapes. $ref lets one part of a schema point at another, and $defs is the conventional bag to stash reusable subschemas.
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "billing": { "$ref": "#/$defs/address" },
    "shipping": { "$ref": "#/$defs/address" }
  },
  "$defs": {
    "address": {
      "type": "object",
      "required": ["street", "country"],
      "properties": {
        "street": { "type": "string" },
        "country": { "type": "string", "minLength": 2, "maxLength": 2 }
      }
    }
  }
}
The # is the root of the current document; the rest is a JSON Pointer into it. Refs can also point at external URIs, which is how OpenAPI documents pull in shared component schemas across files. Recursive refs work too — a tree node can reference itself in children.
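Resolving a same-document ref is just a walk down the object, one pointer token at a time. The following is a minimal sketch that handles only refs beginning with #; external URIs, $id, and $anchor are out of scope. The function name resolveLocalRef and the sample document are assumptions made for illustration.

```javascript
// Resolve a same-document reference such as "#/$defs/address".
// Only refs that start with "#" are handled; external URIs are out of scope.
function resolveLocalRef(root, ref) {
  if (!ref.startsWith("#")) throw new Error(`not a local ref: ${ref}`);
  const pointer = decodeURIComponent(ref.slice(1)); // fragments are percent-encoded
  if (pointer === "") return root; // "#" alone means the whole document
  if (!pointer.startsWith("/")) {
    throw new Error(`fragment is not a JSON Pointer: ${ref}`);
  }

  let node = root;
  for (const rawToken of pointer.split("/").slice(1)) {
    // JSON Pointer escapes: "~1" means "/", "~0" means "~" (decode in that order).
    const token = rawToken.replace(/~1/g, "/").replace(/~0/g, "~");
    if (node === null || typeof node !== "object" || !Object.hasOwn(node, token)) {
      throw new Error(`cannot resolve ${ref} at "${token}"`);
    }
    node = node[token];
  }
  return node;
}

// Hypothetical document shaped like the address schema above.
const doc = {
  properties: { billing: { $ref: "#/$defs/address" } },
  $defs: { address: { type: "object", required: ["street", "country"] } },
};

const target = resolveLocalRef(doc, doc.properties.billing.$ref);
console.log(target === doc.$defs.address); // the ref points at the shared subschema
```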
When you start nesting schemas a few levels deep, run them through a JSON Formatter before debugging. The mental load drops sharply when the indentation is consistent.
How JSON Schema Powers OpenAPI and Zod
The reason JSON Schema matters in practice is that it is the substrate for tools you probably already use.
OpenAPI (formerly Swagger) describes HTTP APIs. The shapes of request and response bodies are written in a dialect of JSON Schema: when you define a Pet schema in an OpenAPI document, that's JSON Schema. The OpenAPI specification builds on JSON Schema's vocabulary rather than inventing its own.
Zod, Yup, Joi, Pydantic, Ajv: most modern validation libraries either target JSON Schema directly or model a similar vocabulary. Zod's z.object({ email: z.string().email() }) expresses the same intent as the email rule in the JSON Schema at the top of this post. Several of these libraries can export to or import from JSON Schema, natively or through companion packages, which is why interop is largely painless.
Code generation. From one schema you can produce TypeScript interfaces, Go structs, Python dataclasses, even SQL DDL. Run a JSON sample through JSON to TypeScript for the same shape inference — schema-first means the schema is authoritative and code is downstream.
Documentation. Redoc and Stoplight render JSON Schema into readable API docs. The constraints you wrote — minLength, format, enum — appear without you writing them twice.
This is the leverage: one document, many consumers. Server validation, client validation, types, docs, mocks, contract tests. It is rare for a piece of metadata to pay off this many times.
Validating in Production: Ajv in Two Lines
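For Node, Ajv is the de-facto validator. It is fast (it generates a JavaScript validation function for each schema when you compile it), spec-compliant, and integrates with TypeScript. The default export targets Draft-07; for schemas that declare Draft 2020-12, Ajv ships a separate class that you import from ajv/dist/2020.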
import Ajv from "ajv";
import addFormats from "ajv-formats";

// Build the validator once at module load, not per request.
const ajv = new Ajv({ allErrors: true });
addFormats(ajv); // enables format: email, uri, date-time, etc.

const validate = ajv.compile({
  type: "object",
  required: ["email"],
  properties: { email: { type: "string", format: "email" } },
  additionalProperties: false,
});

// Express-style handler; the name and the req/res shape are illustrative.
function handleSignup(req, res) {
  const ok = validate(req.body);
  if (!ok) {
    return res.status(400).json({ errors: validate.errors });
  }
  return res.status(204).end();
}
Three behaviors to know:
- Compile once, validate many. ajv.compile is the slow step; the returned function is the hot path. Don't compile inside a request handler.
- allErrors: true returns every failure instead of stopping at the first. Worth it for API responses.
- strict: true (default in newer Ajv) errors on unknown keywords. If you use format or vendor extensions, configure them or disable strict mode.
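The compile-once rule can be sketched as a small cache that hands back an already-compiled validator by name. This is an illustration under assumptions: compile stands in for ajv.compile (any function that takes a schema and returns a validate function), and createValidatorCache, fakeCompile, and the schema name are hypothetical.

```javascript
// Wrap any compile function so each named schema is compiled only once.
function createValidatorCache(compile) {
  const cache = new Map();
  return function getValidator(name, schema) {
    if (!cache.has(name)) {
      cache.set(name, compile(schema)); // slow step, runs once per name
    }
    return cache.get(name); // hot path: a plain Map lookup
  };
}

// Stand-in for ajv.compile so the sketch runs without dependencies.
// It only checks typeof against schema.type and counts its own calls.
let compileCalls = 0;
function fakeCompile(schema) {
  compileCalls += 1;
  return (data) => typeof data === schema.type;
}

const getValidator = createValidatorCache(fakeCompile);
const nameSchema = { type: "string" };

const first = getValidator("name", nameSchema);
const second = getValidator("name", nameSchema); // served from the cache
console.log(compileCalls, first === second, first("Ada"));
```

In a real service the same effect usually comes from compiling at module load, so a cache like this mostly helps when schemas are registered dynamically.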
For a checker without any setup, paste your schema and sample into the JSON Schema Validator — same error structure Ajv produces.
Schema From Sample, And The Pitfalls That Follow
Writing a schema from scratch for an existing API is tedious. Feed a representative payload into a generator like the JSON Schema Generator from Sample and edit the result. Smarter generators look at multiple samples and infer optional vs required fields by which keys appear in every example. After generation, always add required arrays, tighten strings with format or enum, set minimum/maximum on numbers with semantic ranges, and set additionalProperties: false if your API rejects unknown keys.
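The multi-sample inference described above can be sketched in a few lines. This is a deliberately small illustration, not how any particular generator works: it handles flat objects only, does not recurse into nested objects or arrays, and the names jsonType, inferObjectSchema, and the sample data are invented for the example.

```javascript
// Map a JavaScript value to a JSON Schema type name.
function jsonType(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  if (typeof value === "number") {
    return Number.isInteger(value) ? "integer" : "number";
  }
  return typeof value;
}

// Infer a flat object schema from several samples.
// A key is "required" only if it appears in every sample.
function inferObjectSchema(samples) {
  const typesByKey = new Map(); // key -> Set of observed type names
  const seenCount = new Map();  // key -> number of samples containing the key

  for (const sample of samples) {
    for (const [key, value] of Object.entries(sample)) {
      if (!typesByKey.has(key)) typesByKey.set(key, new Set());
      typesByKey.get(key).add(jsonType(value));
      seenCount.set(key, (seenCount.get(key) ?? 0) + 1);
    }
  }

  const properties = {};
  const required = [];
  for (const [key, types] of typesByKey) {
    if (types.has("number")) types.delete("integer"); // "number" covers both
    const list = [...types].sort();
    properties[key] = { type: list.length === 1 ? list[0] : list };
    if (seenCount.get(key) === samples.length) required.push(key);
  }
  return { type: "object", properties, required };
}

// Hypothetical samples: nickname is sometimes null and sometimes missing.
const inferred = inferObjectSchema([
  { id: 1, email: "a@example.com", nickname: "al" },
  { id: 2, email: "b@example.com", nickname: null },
  { id: 3, email: "c@example.com" },
]);
console.log(JSON.stringify(inferred, null, 2));
```

Note what the inference cannot know: that email should carry format, that id has a minimum, or that unknown keys should be rejected. Those are exactly the hand edits listed above.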
A few mistakes show up over and over once schemas hit production:
Forgetting required. Every property in properties is optional by default. Generators inherit this — if email must be present, it has to appear in the required array.
Loose additionalProperties. Without additionalProperties: false, any extra key is silently accepted. Fine for permissive APIs; a security smell for internal services where unexpected fields might mean a typo or an attack.
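number vs integer. number accepts floats and integers. For whole numbers only, use type: "integer". JSON itself does not distinguish the two, and JSON.parse returns the same Number for 42 and 42.0; JSON Schema counts both as integers and rejects only values with a fractional part, such as 42.5. The schema is your only guarantee.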
Regex dialect drift. JSON Schema's pattern uses ECMA-262 regex. Stick to the common subset; PCRE-only features behave inconsistently across validators. The Regex Tester covers JavaScript-compatible patterns.
Too strict on day one. A schema that rejects unknown keys will reject new fields you add later, breaking older clients. Version your schemas, or leave additionalProperties permissive on responses while keeping it strict on requests. Postel's law applies.
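Two of the pitfalls above are easy to see directly in JavaScript. The first half shows why 42 and 42.0 cannot be told apart after parsing. The second half shows a related regex trap: pattern is not implicitly anchored, so it behaves like a search unless you add ^ and $ yourself. The helper isSchemaInteger is a hypothetical stand-in for the integer check, not a library function.

```javascript
// Pitfall: number vs integer.
// JSON.parse produces the same Number for both spellings.
const fromInt = JSON.parse("42");
const fromFloat = JSON.parse("42.0");
const parsedIdentically = fromInt === fromFloat;

// Hypothetical helper mirroring what "type": "integer" accepts after parsing.
const isSchemaInteger = (value) =>
  typeof value === "number" && Number.isInteger(value);

// Pitfall: pattern is a search, not a full match, unless anchored.
const unanchored = new RegExp("[0-9]{4}");
const anchored = new RegExp("^[0-9]{4}$");

const checks = {
  parsedIdentically,                             // 42 and 42.0 are one value
  wholeFloat: isSchemaInteger(42.0),             // counts as an integer
  fraction: isSchemaInteger(42.5),               // has a fractional part
  numericString: isSchemaInteger("42"),          // a string is never an integer
  unanchoredLong: unanchored.test("card-12345"), // finds 1234 inside the string
  anchoredLong: anchored.test("card-12345"),     // whole string must be 4 digits
  anchoredExact: anchored.test("1234"),
};
console.log(checks);
```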
For broader API design that complements schema-driven validation, see REST API Design Best Practices and API Rate Limiting — together with JSON Schema, those three cover almost everything an HTTP API needs at the boundary.
When Not To Use It
JSON Schema is a hammer, but not every payload is a nail. Skip it for throwaway scripts where the data is yours and the script is the only consumer; for strongly-typed pipelines where the language's types already validate at compile time and the JSON never crosses an untrusted boundary; and for performance-critical hot paths where even compiled validators are too slow. That last case is rare: for almost all web traffic, validators are nowhere near the bottleneck.
For everything else — public APIs, webhook receivers, config files, message queues, anywhere data crosses a process boundary — a schema is cheap insurance against the kind of bug that surfaces at 2am two weeks after it shipped. The official JSON Schema getting-started guide is a worthwhile next read once the keywords above feel comfortable.
The quantity: "1" bug from the opening would have been a 400 Bad Request with a clear error path the second it was sent. That's the whole pitch. Write the schema, wire up the validator, and let the contract do the work.
FAQ
Should I use JSON Schema or Zod for new TypeScript projects?
For TypeScript-first stacks, Zod is more ergonomic — types and runtime validation are colocated, and z.infer derives types automatically. For polyglot or schema-first systems (OpenAPI-driven contracts, multi-language consumers, generated SDKs), JSON Schema is the right substrate because every language has a validator. The pragmatic answer: write Zod for internal TS code, generate JSON Schema from it (via zod-to-json-schema) when you need to share the contract.
What's the difference between Draft-07 and Draft 2020-12?
Mostly cleanup and consolidation. Compared with Draft-07, 2020-12 (building on Draft 2019-09) keeps reusable definitions under $defs instead of definitions, adds prefixItems for tuple validation (replacing the array form of items), and tightens the rules around $id. Draft-07 is still everywhere because so much tooling was built around it, and OpenAPI 3.0 uses an even older dialect; OpenAPI 3.1 aligns with 2020-12. For new schemas, target 2020-12; for OpenAPI integrations, check what your tooling supports, since many validators handle both.
How do I validate that exactly one of several variants matches?
Use oneOf with a discriminator pattern: each variant requires a type (or kind) field with a specific const value, plus the variant-specific properties. Example: {"oneOf": [{"properties": {"type": {"const": "card"}, "last4": {...}}, "required": ["type"]}, {"properties": {"type": {"const": "paypal"}, "email": {...}}, "required": ["type"]}]}. Without the required entries, a payload with no type field would match both branches and fail oneOf for a confusing reason. OpenAPI has an explicit discriminator keyword that maps cleanly to this pattern; some validators use it for better error messages.
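One way to get clearer errors out of a discriminated union is to read the tag first and then validate against only the matching branch. The sketch below shows that lookup step. It is an assumption-laden illustration, not how any particular validator implements discriminator support, and pickVariant and the variants array are hypothetical.

```javascript
// Pick the oneOf branch whose discriminator const matches the payload's tag.
// Returns undefined when the tag is missing or matches no branch.
function pickVariant(variants, discriminator, payload) {
  if (payload === null || typeof payload !== "object") return undefined;
  const tag = payload[discriminator];
  if (tag === undefined) return undefined;
  return variants.find(
    (variant) => variant.properties?.[discriminator]?.const === tag
  );
}

// Hypothetical variants following the card / paypal example.
const variants = [
  {
    properties: { type: { const: "card" }, last4: { type: "string" } },
    required: ["type", "last4"],
  },
  {
    properties: { type: { const: "paypal" }, email: { type: "string" } },
    required: ["type", "email"],
  },
];

const paypalBranch = pickVariant(variants, "type", { type: "paypal", email: "a@example.com" });
const unknownBranch = pickVariant(variants, "type", { type: "crypto" });
console.log(paypalBranch === variants[1], unknownBranch);
```

With the branch in hand, the error can say "paypal payment is missing email" instead of "payload matched zero of two schemas".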
Why doesn't `format: "email"` validate emails by default?
JSON Schema treats format as annotation by default — validators may or may not enforce it. Ajv's ajv-formats package opts in to actual format checking; some validators ignore format entirely. For strict email validation, either use ajv-formats and trust its rules, or write a pattern regex. Note that "validate an email by regex" is a famously hard problem — the only true validation is sending a confirmation email.
Can I generate types from a JSON Schema?
Yes — json-schema-to-typescript for TS interfaces, quicktype for cross-language code (TypeScript, Go, C#, Swift, Kotlin, Rust, Python), datamodel-code-generator for Python Pydantic models. The pattern: schema is the source of truth, types are derived. This works particularly well for OpenAPI documents where the schema lives in the API spec and clients regenerate types from it.
How do I version JSON Schemas?
Two common patterns. (1) Version the schema URL — https://example.com/schemas/user/v2.json — and let consumers reference specific versions explicitly. (2) Embed a version field in the document itself and use oneOf to dispatch to version-specific subschemas. Backward compatibility is hard either way; the rule of thumb is: never make a previously-optional field required, never tighten a constraint, never remove a property.
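Two of those rules of thumb can be checked mechanically by diffing the old and new schema. The following is a rough sketch under stated assumptions: it looks only at top-level required and properties, ignores nested schemas, and does not try to detect tightened constraints. The function name findBreakingChanges and both schemas are made up for illustration.

```javascript
// Compare two object schemas and list changes that would break old clients.
// Top-level only: newly required fields and removed properties.
function findBreakingChanges(oldSchema, newSchema) {
  const problems = [];

  const oldRequired = new Set(oldSchema.required ?? []);
  for (const key of newSchema.required ?? []) {
    if (!oldRequired.has(key)) problems.push(`"${key}" became required`);
  }

  const newProperties = newSchema.properties ?? {};
  for (const key of Object.keys(oldSchema.properties ?? {})) {
    if (!Object.hasOwn(newProperties, key)) problems.push(`"${key}" was removed`);
  }
  return problems;
}

// Hypothetical v1 and v2 of a user schema.
const userV1 = {
  properties: { id: { type: "integer" }, nickname: { type: "string" } },
  required: ["id"],
};
const userV2 = {
  properties: { id: { type: "integer" }, email: { type: "string" } },
  required: ["id", "email"],
};

const problems = findBreakingChanges(userV1, userV2);
console.log(problems);
```

A check like this fits naturally into CI, where it can fail the build whenever a schema change would need a new version URL.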
Should I validate API responses or just requests?
Both, with different strictness. Validate requests strictly (reject unknown fields with additionalProperties: false) — clients sometimes send wrong data. Validate responses loosely or only in dev/test (your own server should produce valid output, but contract tests catch drift). Validating production responses adds latency without much benefit; running schema validation in your test suite gives you the safety without the cost.
What's the right way to handle nullable fields?
In Draft 2020-12, use "type": ["string", "null"] to allow either. In OpenAPI 3.0, the convention is "type": "string", "nullable": true. OpenAPI 3.1 aligned with JSON Schema's array-of-types form. Don't use "type": "string" and rely on the field being optional (required: []) — that's "may be absent," not "may be null." The two are different states and your validator will treat them differently.
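The absent-versus-null distinction is visible as soon as the payload is parsed. A small sketch, with fieldState as a hypothetical helper name:

```javascript
// Classify a key as absent, explicitly null, or present with a value.
function fieldState(obj, key) {
  if (!Object.hasOwn(obj, key)) return "absent";
  return obj[key] === null ? "null" : "present";
}

const withoutNickname = JSON.parse('{"name": "Ada"}');
const withNullNickname = JSON.parse('{"name": "Ada", "nickname": null}');
const withNickname = JSON.parse('{"name": "Ada", "nickname": "ada"}');

const states = [
  fieldState(withoutNickname, "nickname"),  // key never sent
  fieldState(withNullNickname, "nickname"), // key sent with an explicit null
  fieldState(withNickname, "nickname"),     // key sent with a value
];
console.log(states);
```

Leaving a key out of required covers only the first state; allowing "null" in type covers only the second. A field that may be either needs both.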