XML vs JSON — Which Data Format to Use and When

A Brief History of Both Formats

XML (Extensible Markup Language) was finalized by the W3C in 1998 as a general-purpose data format for the web. The intent was ambitious: one format for structured data everywhere — configuration, documents, APIs, database exports, messages between enterprise systems. By the early 2000s it was everywhere, and SOAP web services, RSS feeds, and config files like Maven's pom.xml were central to how software communicated.

JSON arrived as a reaction. Douglas Crockford formalized it around 2001–2002, and the pitch was simple: JavaScript already has object and array literals, so let's use that syntax as a data format. JSON was initially dismissed as too limited. Then AJAX applications exploded in the mid-2000s, and developers discovered that turning JSON into objects in the browser was trivially easy (at first with eval() or Crockford's json2.js, later with the native JSON.parse()) while XML required the verbose, inconsistent DOM API. JSON won the web API wars fast.

Both formats are still alive, but they live in different places now.

Structural Comparison

The same data in both formats makes the difference in verbosity obvious:

<!-- XML -->
<person>
  <name>Alice</name>
  <age>30</age>
  <roles>
    <role>admin</role>
    <role>editor</role>
  </roles>
  <active>true</active>
</person>

// JSON
{
  "name": "Alice",
  "age": 30,
  "roles": ["admin", "editor"],
  "active": true
}

The JSON version is about half the size. But size is not the only difference: the two formats also carry different information. Notice that age in XML has no type information; it's just the string "30" inside tags. JSON's 30 is a number. JSON has six types: string, number, boolean, null, array, object. XML has one, string, with schema-defined types available as an add-on.
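
To make the typing difference concrete, here is a minimal Python sketch that parses the two documents above with the standard library (xml.etree.ElementTree and json). The variable names are illustrative, not part of any particular API.

```python
import json
import xml.etree.ElementTree as ET

XML_DOC = """<person>
  <name>Alice</name>
  <age>30</age>
  <roles>
    <role>admin</role>
    <role>editor</role>
  </roles>
  <active>true</active>
</person>"""

JSON_DOC = """{
  "name": "Alice",
  "age": 30,
  "roles": ["admin", "editor"],
  "active": true
}"""

person_xml = ET.fromstring(XML_DOC)
person_json = json.loads(JSON_DOC)

# XML element content is always text; converting it is the caller's job.
xml_age = person_xml.findtext("age")
xml_active = person_xml.findtext("active")

# JSON values arrive already typed.
json_age = person_json["age"]
json_active = person_json["active"]

print(type(xml_age).__name__, type(json_age).__name__)        # str int
print(type(xml_active).__name__, type(json_active).__name__)  # str bool
```

Without a schema, the XML consumer has to know that age should be passed through int() and that "true" means a boolean.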

XML's attributes add another dimension that JSON lacks:

<image src="photo.jpg" width="800" height="600" format="webp" />

Representing the same data in JSON means putting everything in an object with no structural distinction between "this describes the thing" and "this is the thing."

What XML Does Better

Mixed content. XML can mix text and markup inside the same element, which is essential for documents:

<para>This is <em>very</em> important and <strong>should not</strong> be ignored.</para>

There's no clean JSON equivalent. Markdown and HTML have largely taken over for this use case, but if you need a structured document format where prose can contain inline markup, XML is genuinely better suited.
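
As a rough illustration of why mixed content is awkward to model as plain objects, the following Python sketch parses the paragraph above with xml.etree.ElementTree. That library keeps the text before the first child in .text and the text after each child in that child's .tail; the pieces list is just one hypothetical way to flatten the result.

```python
import xml.etree.ElementTree as ET

para = ET.fromstring(
    "<para>This is <em>very</em> important and "
    "<strong>should not</strong> be ignored.</para>"
)

# Interleave text runs and inline elements in document order.
pieces = [("text", para.text)]
for child in para:
    pieces.append((child.tag, child.text))
    pieces.append(("text", child.tail))

# itertext() walks every text run in order, dropping the markup.
plain = "".join(para.itertext())

for kind, value in pieces:
    print(f"{kind:>6}: {value!r}")
print(plain)
```

A JSON encoding would need something like this ordered list of tagged fragments, which is legal JSON but no longer reads like a document.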

Namespaces. XML namespaces (xmlns:) let multiple vocabularies coexist in the same document without name collisions. An SVG file can embed MathML and XHTML in the same document with unambiguous element names. This matters in standards-heavy domains (healthcare HL7/FHIR, publishing DocBook, office formats) where multiple specifications need to compose cleanly.
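
A small sketch of how namespace-qualified names look to a parser, using Python's xml.etree.ElementTree. The document below is a made-up example (the "Chart" title and the single MathML identifier are illustrative); the two namespace URIs are the standard SVG and MathML ones.

```python
import xml.etree.ElementTree as ET

DOC = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:m="http://www.w3.org/1998/Math/MathML">
  <title>Chart</title>
  <foreignObject>
    <m:math><m:mi>x</m:mi></m:math>
  </foreignObject>
</svg>"""

root = ET.fromstring(DOC)

# Prefixes here are local to this lookup table; only the URIs have to match.
ns = {
    "svg": "http://www.w3.org/2000/svg",
    "m": "http://www.w3.org/1998/Math/MathML",
}

svg_title = root.find("svg:title", ns)
math_var = root.find(".//m:mi", ns)

# ElementTree exposes the fully qualified name as {uri}localname.
print(root.tag)
print(svg_title.text, math_var.text)
```

Because every element name is bound to a URI, an element called title in SVG can never be confused with a title from another vocabulary in the same file.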

Schema validation. XML Schema (XSD) and RELAX NG provide rich validation: required vs optional elements, element ordering, data types with patterns, mixed content rules. JSON Schema exists and is useful, but it started later, and the ecosystem of XML validators — command-line tools, IDE plugins, and library support across Java, .NET, and Python — predates it by a decade.

XSLT transforms. XSLT is a full transformation language that can convert XML into HTML, plain text, or other XML formats using templates. It's powerful for document pipelines. There's nothing with the same maturity in the JSON world — you use general-purpose JavaScript or tools like jq.

What JSON Does Better

Readability. JSON is easier to scan quickly. There's no closing tag repetition, no attribute vs child element ambiguity. A moderately nested JSON object is readable at a glance; an equivalently nested XML document requires tracking open and close tag matching mentally.

Size. JSON is usually smaller for the same data, mainly because it doesn't repeat element names in closing tags. The difference matters in high-volume API traffic.
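
One way to check the size claim for a given payload is to serialize the same record both ways and compare byte counts. This Python sketch does that for the Alice record used earlier, with whitespace stripped from both outputs; the element layout is an assumption carried over from that example, and real payloads will differ.

```python
import json
import xml.etree.ElementTree as ET

record = {"name": "Alice", "age": 30, "roles": ["admin", "editor"], "active": True}

# Compact JSON: no whitespace between tokens.
json_bytes = json.dumps(record, separators=(",", ":")).encode("utf-8")

# The same record as XML, one element per field.
person = ET.Element("person")
ET.SubElement(person, "name").text = record["name"]
ET.SubElement(person, "age").text = str(record["age"])
roles = ET.SubElement(person, "roles")
for role in record["roles"]:
    ET.SubElement(roles, "role").text = role
ET.SubElement(person, "active").text = str(record["active"]).lower()

# encoding="unicode" returns a str with no XML declaration.
xml_bytes = ET.tostring(person, encoding="unicode").encode("utf-8")

print(len(json_bytes), len(xml_bytes))
```

With HTTP compression enabled the gap typically narrows, since repeated tag names compress well, so it is worth measuring on representative data before treating size as the deciding factor.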

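Native browser parsing. JSON.parse() is built into every JavaScript runtime. Parsing XML requires DOMParser, followed by walking the resulting DOM tree (XMLSerializer handles the reverse direction, turning a DOM tree back into text). Both are widely available but far more cumbersome to use correctly.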

Native type information. JSON numbers, booleans, and nulls carry semantic type information without schema annotation. Small but persistent advantage: you don't need to know the schema to know that "active": true is a boolean, not the string "true".

Ubiquitous tooling. Nearly every mainstream language has solid JSON support in its standard library or a de facto standard package: JSON.stringify() in JavaScript, json.dumps() in Python, encoding/json in Go. These are first-class, fast, and simple. XML libraries tend to be heavier and more complex.

Where Each Format Lives Today

JSON is the default for public REST APIs, internal microservice communication, data serialized into browser storage (localStorage holds only strings, so objects are typically stored as JSON), configuration in modern ecosystems (npm's package.json, VS Code's settings), and anywhere that JavaScript is in the stack.

XML holds ground in:

SOAP and enterprise service buses (particularly in financial, healthcare, and government systems where SOAP contracts are embedded in existing infrastructure)
RSS and Atom feeds
Office document formats (.docx, .xlsx, .pptx are ZIP files containing XML)
SVG (vector graphics — XML is foundational here)
Maven and Ant build files
Android layouts and manifests
Configuration formats like Tomcat's server.xml, Spring's bean definitions, and many Java ecosystem tools

Converting Between Them

Conversion isn't lossless in either direction. XML's attributes, namespaces, mixed content, and processing instructions don't have JSON equivalents. Converting XML to JSON means making decisions: do attributes become a special @attributes key? Do text nodes become #text? Different libraries make different choices.

A common convention for XML-to-JSON conversion (used by libraries like xml2js and the online tools based on it):

<user id="42" active="true">
  <name>Alice</name>
</user>

{
  "user": {
    "$": { "id": "42", "active": "true" },
    "name": "Alice"
  }
}

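The $ or @attributes key captures XML attributes. Text content goes in a _ or #text key when an element carries attributes or child elements alongside its text. Note that xml2js wraps child elements in arrays by default, so the compact output above assumes its explicitArray option is turned off.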
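
The convention can be sketched in a few lines of Python using xml.etree.ElementTree. This is not how xml2js is implemented; element_to_dict is a hypothetical helper that loosely mimics the $ and _ keys, and it ignores namespaces, tail text in mixed content, comments, and processing instructions.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Convert an element to plain Python data using $ for attributes and _ for text."""
    children = list(elem)
    text = (elem.text or "").strip()

    # A leaf with no attributes collapses to its text.
    if not children and not elem.attrib:
        return text

    node = {}
    if elem.attrib:
        node["$"] = dict(elem.attrib)
    if text:
        node["_"] = text
    for child in children:
        value = element_to_dict(child)
        if child.tag in node:
            # Repeated tag: promote the existing value to a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    return node


root = ET.fromstring('<user id="42" active="true"><name>Alice</name></user>')
converted = {root.tag: element_to_dict(root)}
print(json.dumps(converted, indent=2))
```

Even this small sketch has to make lossy choices: attribute values stay strings, a tag that appears once becomes a scalar while the same tag appearing twice becomes a list, and an element literally named _ would collide with the text key.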

The XML to JSON tool handles this conversion in your browser without uploading anything to a server. For working with the resulting JSON, JSON Formatter lets you validate, pretty-print, and explore the structure.

When to Choose Which

Choose JSON when you're building a new API, storing data in a web app, or writing configuration for tools that already expect it, as much of the npm ecosystem does. The tooling is better, the size is smaller, and almost every consumer will find it easier to work with.

Choose XML when: you're integrating with an existing SOAP service that mandates it, you're working in a standards domain (healthcare, publishing, finance) where XML schemas are the industry baseline, you need namespace-qualified elements from multiple standards to coexist in one document, or you're building a document format where mixed text and markup is a requirement.

For a broader look at structured data formats and where CSV fits into the picture, see CSV and TSV: The Universal Data Exchange Format Explained. And if you need to understand JSON's syntax in depth first, JSON Basics and Syntax covers the fundamentals.

The W3C XML specification is the canonical reference for anything XML-specific, and it's more readable than you might expect for a W3C document.