
Working with XML, explained

XML is older than many of the web formats people reach for today, but it has not disappeared. You still find it in feeds, invoices, office files, SOAP services, Android resources, build files, configuration exports, sitemaps, banking files, government forms, and enterprise integrations.

The reason is simple. XML is strict, text-based, and good at representing structured documents with names, attributes, nesting, and namespaces. That power also makes it noisier than formats like JSON.

When XML is formatted well, the structure is readable. When it is crammed onto one line or copied from a system log, it can feel impossible to inspect. And when people convert XML to JSON, they often discover that the two formats do not line up as cleanly as they expected.

This guide explains the XML pieces that matter most in daily work: tags, attributes, nesting, namespaces, formatting, conversion limits, and schema-sensitive data.

What XML is

XML stands for Extensible Markup Language. It is a text format for structured data and documents.

A small XML fragment might look like this:

<book id="bk-1001">
  <title>Practical APIs</title>
  <author>Ada Smith</author>
  <price currency="USD">29.00</price>
</book>

That example has elements, attributes, text content, and nesting. A parser can read it in a predictable way because the structure is explicit.
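As a rough illustration of what "read it in a predictable way" means, the sketch below parses the same fragment with Python's standard-library xml.etree.ElementTree module. The variable names are only for illustration, and Python is just one possible choice of parser here.

```python
import xml.etree.ElementTree as ET

xml_text = """<book id="bk-1001">
  <title>Practical APIs</title>
  <author>Ada Smith</author>
  <price currency="USD">29.00</price>
</book>"""

book = ET.fromstring(xml_text)      # root element of the fragment

book_id = book.get("id")            # attribute on the root element
title = book.findtext("title")      # text content of a child element
price = book.find("price")          # the child element itself

summary = f"{book_id}: {title} costs {price.text} {price.get('currency')}"
print(summary)
```

Elements, attributes, and text each have their own accessor, which mirrors the distinction the fragment makes on the page.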

XML is not tied to one fixed set of tags. The creator of a format decides what tags mean. That is why XML can describe a book catalog, a sitemap, a spreadsheet document, a service message, or a configuration file.

Tags and elements

An XML element usually has an opening tag and a closing tag:

<title>Practical APIs</title>

The element name is title. The text content is Practical APIs.

Elements can also contain other elements:

<customer>
  <name>Sam</name>
  <email>sam@example.com</email>
</customer>

That nesting is one of XML's main strengths. It can describe document-like data and deeply nested structures clearly, as long as the formatting makes the hierarchy visible.

XML must be well-formed. Tags need to close, nesting cannot overlap, and there must be a single root element for a complete document.

This is not well-formed:

<name><strong>Sam</name></strong>

The strong tag opens inside name but closes after name, so the nesting overlaps. XML parsers reject that kind of structure.
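One way to see that rejection in practice is to hand the text to a parser and catch the error. The sketch below uses Python's xml.etree.ElementTree; the helper name is_well_formed is hypothetical, and the exact error message depends on the parser.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return (True, None) if the text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
        return True, None
    except ET.ParseError as err:
        return False, str(err)

ok, _ = is_well_formed("<name>Sam</name>")
bad, message = is_well_formed("<name><strong>Sam</name></strong>")

print(ok)       # the correctly nested element parses
print(bad)      # the overlapping tags do not
print(message)  # parser-specific description of the problem
```

The same check also fails for text with more than one root element, such as "<a/><b/>".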

Attributes

Attributes store extra information on an element:

<price currency="USD">29.00</price>

Here, currency is an attribute of price.

A common design question is whether data belongs as an element or an attribute. There is no single answer, but a useful habit is this:

  • elements are good for main content and repeatable child data
  • attributes are good for small metadata about an element

For example, currency="USD" works nicely as an attribute because it describes how to read the price. A long product description would be awkward as an attribute and better as an element.

Nesting and mixed content

XML can store regular data, but it can also store document-like content where text and tags mix.

For example:

<p>This guide covers <strong>XML</strong> formatting.</p>

The p element contains text, a strong element, and more text. That is called mixed content.

Mixed content is one reason XML does not always convert cleanly to JSON. JSON handles objects, arrays, strings, numbers, booleans, and null well, but it has no native way to represent text interleaved with child elements.
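To make the mixed-content problem concrete, here is a small sketch of how one parser, Python's xml.etree.ElementTree, exposes the pieces of that paragraph. Other libraries model text nodes differently, so treat the text and tail attributes as specific to this module.

```python
import xml.etree.ElementTree as ET

p = ET.fromstring("<p>This guide covers <strong>XML</strong> formatting.</p>")
strong = p.find("strong")

print(repr(p.text))        # text before the first child element
print(repr(strong.text))   # text inside <strong>
print(repr(strong.tail))   # text after </strong>, still inside <p>

# Joining every text piece in document order recovers the sentence.
full_text = "".join(p.itertext())
print(full_text)
```

The sentence is split across three places, and a JSON mapping has to decide where each piece goes and how to preserve their order.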

Namespaces, at a high level

Namespaces help avoid name collisions when XML combines vocabularies.

You might see something like this:

<feed xmlns:media="http://search.yahoo.com/mrss/">
  <media:thumbnail url="https://example.com/thumb.jpg" />
</feed>

The media: prefix marks a tag as belonging to a particular namespace. That matters when two systems might both have a tag called thumbnail, id, or title, but mean different things.

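For daily inspection, the key point is that the namespace is part of the element's identity. The prefix is only shorthand for the namespace URI it is bound to, so two documents can use different prefixes for the same namespace. If you convert XML, query it, or map it to another format, namespace handling can change the result.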
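The sketch below shows how namespace handling can affect a simple lookup, using Python's xml.etree.ElementTree. This module stores the namespace URI inside the tag name, which is one library's design rather than a universal rule. The short prefix "m" in the lookup is an arbitrary choice for illustration.

```python
import xml.etree.ElementTree as ET

xml_text = """<feed xmlns:media="http://search.yahoo.com/mrss/">
  <media:thumbnail url="https://example.com/thumb.jpg" />
</feed>"""

feed = ET.fromstring(xml_text)
thumb = feed[0]
print(thumb.tag)   # the URI is expanded into the tag name

# Lookups need a prefix-to-URI mapping; the prefix used here does not
# have to match the one written in the document.
ns = {"m": "http://search.yahoo.com/mrss/"}
found = feed.find("m:thumbnail", ns)

# A lookup without the namespace does not match the element.
missing = feed.find("thumbnail")
print(found is thumb, missing)
```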

Why formatting XML helps

XML is much easier to read when indentation shows the tree.

Compact XML:

<order><id>1007</id><customer><name>Sam</name></customer><total currency="USD">42.50</total></order>

Formatted XML:

<order>
  <id>1007</id>
  <customer>
    <name>Sam</name>
  </customer>
  <total currency="USD">42.50</total>
</order>

The second version makes the structure clear. You can see the root element, child elements, nested customer data, and the currency attribute.

Formatting is useful when you need to inspect an API response, debug a feed, check a sitemap, review a config file, or compare before-and-after output from a transformation.
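If you would rather format from a script, the following is a minimal sketch using Python's xml.etree.ElementTree. It assumes Python 3.9 or later, where ET.indent is available, and it re-indents the compact order example from above.

```python
import xml.etree.ElementTree as ET

compact = (
    '<order><id>1007</id><customer><name>Sam</name></customer>'
    '<total currency="USD">42.50</total></order>'
)

root = ET.fromstring(compact)
ET.indent(root, space="  ")    # rewrites whitespace in place (Python 3.9+)
formatted = ET.tostring(root, encoding="unicode")
print(formatted)
```

Indentation works by adding whitespace between elements, so it is safest on data-style XML. Text-heavy documents with mixed content deserve a closer look after formatting.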

XML and JSON do not map perfectly

XML-to-JSON conversion is useful, but it is not a perfect translation.

The formats have different models.

XML has:

  • elements
  • attributes
  • text nodes
  • comments
  • processing instructions
  • namespaces
  • ordered child nodes
  • mixed content

JSON has:

  • objects
  • arrays
  • strings
  • numbers
  • booleans
  • null

A converter has to make decisions. Should attributes become keys with a prefix? Should repeated elements become arrays? What happens when an element appears once in one file and many times in another? How should mixed content be represented? Should namespaces be kept, renamed, or dropped?

Those choices can change the shape of the result.

For example:

<item id="42">Notebook</item>

Could become something like:

{
  "item": {
    "@id": "42",
    "#text": "Notebook"
  }
}

Another converter might choose a different shape. Both can be reasonable. That is why converted JSON should be reviewed before code starts depending on it.
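To illustrate how two reasonable converters can disagree, the sketch below builds two JSON shapes from the same element in Python. Convention A follows the "@" and "#text" style shown above. Convention B, with its "attributes" and "value" keys, is a made-up alternative for comparison only.

```python
import json
import xml.etree.ElementTree as ET

item = ET.fromstring('<item id="42">Notebook</item>')

# Convention A: attributes get an "@" prefix, text goes under "#text".
shape_a = {item.tag: {"@" + name: value for name, value in item.attrib.items()}}
shape_a[item.tag]["#text"] = item.text

# Convention B (hypothetical): attributes and text under separate named keys.
shape_b = {item.tag: {"attributes": dict(item.attrib), "value": item.text}}

print(json.dumps(shape_a))
print(json.dumps(shape_b))
```

Code written against one shape will not find the id or the text in the other, even though both carry the same information.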

Schema-sensitive data

Some XML workflows depend on a schema. A schema describes which elements are allowed, where they may appear, what types they carry, and which attributes are required.

This matters in serious integrations. An invoice, payment file, feed, or government form may look readable but still fail validation because one element is missing, the order is wrong, a namespace is wrong, or a value does not match the expected type.

Formatting can help you inspect the file. Conversion can help you move data into another system. Neither one replaces schema validation when the receiving system expects a precise XML format.

A worked example: format XML and convert the shape

Start with compact XML:

<product sku="A100"><name>Desk lamp</name><price currency="USD">34.50</price><tags><tag>home</tag><tag>lighting</tag></tags></product>

Formatted, it becomes:

<product sku="A100">
  <name>Desk lamp</name>
  <price currency="USD">34.50</price>
  <tags>
    <tag>home</tag>
    <tag>lighting</tag>
  </tags>
</product>

Now the structure is clear. The product has an sku attribute, a name, a price with currency metadata, and a list of tags.

A JSON conversion might represent it as:

{
  "product": {
    "@sku": "A100",
    "name": "Desk lamp",
    "price": {
      "@currency": "USD",
      "#text": "34.50"
    },
    "tags": {
      "tag": ["home", "lighting"]
    }
  }
}

That is useful, but notice the converter had to invent conventions for attributes and text content. Those conventions need to match the code that will read the JSON later.
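The conversion above can be sketched in a few lines of Python. This is an illustrative converter under assumed conventions ("@" for attributes, "#text" for text, repeated child names collected into a list), not the behavior of any particular tool. The function name element_to_value is hypothetical.

```python
import json
import xml.etree.ElementTree as ET

def element_to_value(elem):
    """Convert one element using "@attr" and "#text" keys (assumed convention)."""
    children = list(elem)
    text = (elem.text or "").strip()

    # A leaf with no attributes collapses to a plain string.
    if not children and not elem.attrib:
        return text

    result = {"@" + name: value for name, value in elem.attrib.items()}
    if text:
        result["#text"] = text

    for child in children:
        value = element_to_value(child)
        if child.tag in result:
            # Second or later occurrence of the same name: collect into a list.
            existing = result[child.tag]
            if isinstance(existing, list):
                existing.append(value)
            else:
                result[child.tag] = [existing, value]
        else:
            result[child.tag] = value
    # Note: text that follows a child element (mixed content) is dropped here.
    return result

compact = (
    '<product sku="A100"><name>Desk lamp</name>'
    '<price currency="USD">34.50</price>'
    '<tags><tag>home</tag><tag>lighting</tag></tags></product>'
)
root = ET.fromstring(compact)
converted = {root.tag: element_to_value(root)}
print(json.dumps(converted, indent=2))

# The same rules applied to a product with a single tag.
single = element_to_value(ET.fromstring("<tags><tag>home</tag></tags>"))
print(json.dumps(single))
```

With these rules, a product that has only one tag yields a string under "tag" instead of a one-item list. That is the kind of shape change the reading code needs to handle, or the converter needs to be told which names are always lists.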

Try the browser tools

These tools cover the two most common XML jobs:

  • XML Formatter - turn compact or messy XML into an indented structure that is easier to read.
  • XML JSON Converter - convert between XML and JSON when you need to inspect, migrate, or adapt structured data.

They run in your browser, which is useful for private config files, internal feeds, sample payloads, or client data you do not want to upload just to reformat.

Common mistakes

Treating XML as just HTML with custom tags. XML has stricter parsing rules and different goals.

Ignoring attributes during conversion. Attributes can carry important data, not just decoration.

Assuming repeated elements will always become arrays. Converter behavior can vary, especially when an element appears once in one file and many times in another.

Dropping namespaces without checking the impact. Prefixes and namespace URIs can be meaningful to the receiving system.

Using conversion instead of validation. Schema-sensitive XML needs validation against the rules expected by the target system.

FAQ

Is XML still used?

Yes. It remains common in enterprise systems, feeds, document formats, sitemaps, configuration files, and older APIs.

Is XML better than JSON?

Not universally. XML is strong for document-like structures, attributes, namespaces, and schema-heavy workflows. JSON is lighter for many web APIs and application data shapes.

Can XML always be converted to JSON?

It can often be converted, but not always without trade-offs. Attributes, namespaces, mixed content, and repeated elements need mapping choices.

Does formatting XML change the data?

It should not change the structure, but whitespace can matter in some text-heavy documents. Review important output before sending it onward.

What does well-formed mean?

It means the XML follows the core syntax rules: proper nesting, closed tags, valid attribute quoting, and one root element.
