HTML entities, explained
HTML uses characters like <, >, and & as part of its own syntax. That creates a small but important problem: what if you want those characters to appear as normal text on a page?
HTML entities are one answer. They let you write a character in a form the browser reads as text instead of markup. So &lt; displays as <, &gt; displays as >, and &amp; displays as &.
That sounds tiny, but it shows up everywhere: documentation, blog posts, code examples, CMS fields, template output, comments, product descriptions, and user-generated content. Get it wrong and text can break the surrounding page, display incorrectly, or be interpreted as markup you did not intend.
When you'd actually use HTML entities
You need HTML entities when plain text contains characters that mean something special in HTML.
If you are writing a tutorial about tags, you want readers to see <strong>bold</strong> as text, not have the browser treat it as an actual element. If a product name contains an ampersand, like Tom & Jerry Mug, the ampersand should not look like the start of an entity. If a quote appears inside an attribute value, the quote may need escaping so it does not close the attribute early.
Entities also show up when you want characters that are awkward to type directly, such as a nonbreaking space, written &nbsp;. A nonbreaking space looks like a normal space, but it tells the browser not to wrap the line at that point. That is useful for things like 10 MB (written 10&nbsp;MB in the source), where splitting the number and unit across two lines would look odd.
The pattern is the same in each case: you are telling the browser, "show this as text, not as HTML syntax."
Named and numeric entities
HTML has two common styles of entities.
A named entity uses a readable name:
&lt;
&gt;
&amp;
&quot;
A numeric entity uses a character number:
&#60;
&#62;
&#38;
&#34;
&#160;
There are also hexadecimal numeric forms, such as &#x3C; for <.
Named entities are easier to recognize for common characters. Numeric entities are useful when there is no familiar name or when a tool outputs numeric references by default. In modern UTF-8 pages, many characters can be written directly, but entities are still needed for characters that would collide with HTML syntax.
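To make the three spellings concrete, here is a minimal Python sketch using the standard library's html module. It decodes a named, a decimal, and a hexadecimal reference to the same character, then builds numeric references from code points. The helper name numeric_reference is made up for this illustration.

```python
import html

# Three spellings of the same character: named, decimal, and hexadecimal.
forms = ["&lt;", "&#60;", "&#x3C;"]
decoded = [html.unescape(form) for form in forms]
print(decoded)  # ['<', '<', '<']


def numeric_reference(char: str, hexadecimal: bool = False) -> str:
    """Build a numeric character reference for one character (illustrative helper)."""
    code_point = ord(char)
    if hexadecimal:
        return f"&#x{code_point:X};"
    return f"&#{code_point};"


print(numeric_reference("&"))        # &#38;
print(numeric_reference("\u00a0"))   # &#160; (nonbreaking space)
print(numeric_reference("<", True))  # &#x3C;
```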
The characters people escape most
The characters that matter most in normal HTML text are:
< as &lt;
> as &gt;
& as &amp;
Those three handle most visible text cases. The less-than sign starts a tag, the greater-than sign closes one, and the ampersand starts an entity reference.
Quotes depend on context. In normal text content, a double quote can usually appear as itself. Inside an attribute value wrapped in double quotes, it should be escaped as &quot;. Inside an attribute wrapped in single quotes, a single quote may need special handling too, depending on the templating rules.
That is why escaping is not just about the character. It is about where the character lands.
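As a rough illustration of "where the character lands", Python's html.escape has a quote flag that switches between the two cases. The sample string is invented for the example, and other languages and template engines expose this choice differently.

```python
import html

label = 'Say "hi" & <wave>'

# Text between tags: only <, >, and & are escaped.
text_node = html.escape(label, quote=False)
print(text_node)  # Say "hi" &amp; &lt;wave&gt;

# Attribute value: quotes are escaped too (quote=True is the default).
attribute_value = html.escape(label, quote=True)
print(attribute_value)  # Say &quot;hi&quot; &amp; &lt;wave&gt;
```

With quote=True, html.escape also rewrites single quotes as a numeric reference, so the result can sit inside either kind of quoted attribute.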
Text context versus HTML context
Context is the part that causes mistakes.
Text between tags is one context:
<p>Tom &amp; Jerry</p>
Attribute values are another:
<a title="Tom &amp; Jerry">Cartoon</a>
JavaScript strings, CSS, URLs, and JSON embedded inside a page all have their own rules. HTML entities solve HTML text and HTML attribute problems. They do not automatically make text safe for every other language that might appear inside a page.
A practical rule helps: escape for the exact place the value is going. Text node output, attribute output, URL parameters, CSS strings, and JavaScript strings should not be treated as one shared bucket.
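The sketch below runs one value through three different encoders from the Python standard library to show how far the outputs diverge. The value is made up, and json.dumps stands in for "JavaScript string escaping" only loosely: embedding data inside a script element needs extra care that this example does not cover.

```python
import html
import json
from urllib.parse import quote

value = 'Tom & Jerry "classic"'

# HTML attribute: entities for &, <, >, and quotes.
as_html_attribute = html.escape(value)
print(as_html_attribute)  # Tom &amp; Jerry &quot;classic&quot;

# URL query parameter: percent-encoding, no entities at all.
as_url_parameter = quote(value, safe="")
print(as_url_parameter)  # Tom%20%26%20Jerry%20%22classic%22

# JSON / JavaScript string literal: backslash escapes, & left alone.
as_js_string = json.dumps(value)
print(as_js_string)  # "Tom & Jerry \"classic\""
```

None of the three outputs is a valid substitute for another, which is the reason to pick the encoder by destination and not by habit.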
Escaping is not the same as sanitization
Escaping changes how characters are represented so they display as text. Sanitization is a different job: it removes or filters markup you do not want to allow.
That distinction matters for user-generated content.
If a comment field should accept only plain text, escaping the user input before rendering is the right kind of move. The visible text remains visible, but it is not treated as page structure.
If a rich-text editor allows a limited set of tags, such as links and bold text, the system may need sanitization instead. It must decide which tags and attributes are allowed, and remove the rest.
Entities are part of the safety story, but they are not a full content-security policy by themselves.
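To see the two jobs side by side, here is a toy Python comparison. The escaping half uses html.escape. The sanitizing half is a deliberately tiny allowlist built on html.parser, with a made-up tag list and made-up names (ALLOWED_TAGS, AllowlistSanitizer, sanitize). It is a sketch of the idea only; production code should rely on a maintained sanitization library.

```python
import html
from html.parser import HTMLParser

ALLOWED_TAGS = {"b", "strong", "em"}  # hypothetical allowlist for this sketch


class AllowlistSanitizer(HTMLParser):
    """Keep allowlisted tags (without attributes), drop other tags, escape text."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"<{tag}>")  # attributes are always discarded

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(html.escape(data, quote=False))


def sanitize(fragment: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.parts)


comment = "<b>Nice</b> post<img src=x onerror=alert(1)>"

# Escaping: everything becomes visible text, nothing is removed.
print(html.escape(comment))
# &lt;b&gt;Nice&lt;/b&gt; post&lt;img src=x onerror=alert(1)&gt;

# Sanitizing: allowed markup survives, the rest is dropped.
print(sanitize(comment))
# <b>Nice</b> post
```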
A worked example: escape a short snippet
Suppose you want a page to show this exact text:
Use <button>Save & close</button> for the label "Done".
If you place that text directly into an HTML page, the browser may read <button> and </button> as real tags. The ampersand can also cause parsing trouble.
Escaped for display, it becomes:
Use &lt;button&gt;Save &amp; close&lt;/button&gt; for the label "Done".
The browser displays the original text to the reader, but the markup characters are no longer treated as active HTML.
This is exactly what you want in documentation, examples, support replies, and any place where code-like text needs to be shown literally.
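The same conversion can be reproduced in a few lines of Python. This assumes the snippet is headed for text content between tags, so quotes are left as they are (quote=False), and it checks that decoding returns the original text.

```python
import html

snippet = 'Use <button>Save & close</button> for the label "Done".'

# Escape for display inside element text content.
escaped = html.escape(snippet, quote=False)
print(escaped)
# Use &lt;button&gt;Save &amp; close&lt;/button&gt; for the label "Done".

# Decoding brings back exactly what the reader should see.
round_trip = html.unescape(escaped)
print(round_trip == snippet)  # True
```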
Try it in your browser
Our HTML Entity Codec runs in your browser. You can encode text into entities, decode entities back to readable characters, and test small snippets without sending the text to a server.
That local workflow is helpful when the text contains draft content, internal labels, customer-facing copy, or code examples you do not want to paste into a random online box.
Common mistakes
Escaping in the wrong context. HTML entities are for HTML contexts. A URL query string or JavaScript string has different escaping rules.
Double-encoding text. If & becomes &amp;, and then that output is encoded again, you end up with &amp;amp; in the markup, and the reader sees a literal &amp; on the page.
Forgetting the ampersand. People remember < and >, but & is just as important because it begins entity references.
Treating entities as full sanitization. Escaping text and filtering allowed HTML are related, but they are not the same job.
Using nonbreaking spaces everywhere. &nbsp; is useful in small doses. Too many nonbreaking spaces can make text wrap badly on small screens.
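The double-encoding mistake is easy to reproduce. In this small Python sketch, the product name is invented and html.unescape stands in for what the browser does when it renders the markup.

```python
import html

name = "Tom & Jerry"

once = html.escape(name)
print(once)   # Tom &amp; Jerry

# Encoding already-encoded output escapes the ampersand of the entity itself.
twice = html.escape(once)
print(twice)  # Tom &amp;amp; Jerry

# The browser decodes one level only, so the reader sees a stray entity.
shown_to_reader = html.unescape(twice)
print(shown_to_reader)  # Tom &amp; Jerry
```

The usual fix is structural: keep raw text in storage and escape once, at the point of output.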
FAQ
What is the difference between < and &lt;?
< is the literal less-than character. &lt; is an entity that tells the browser to display a less-than character as text.
Do I need to escape every special character?
No. Focus on the characters that collide with HTML syntax in the current context. In text nodes, <, >, and & matter most.
Why do entities show up as visible text on my page?
Usually because the text was encoded but not decoded before display, or because it was encoded more than once.
Is &nbsp; the same as a regular space?
No. It displays like a space, but it prevents a line break at that point.
Do HTML entities make user-generated content safe?
No. Correct escaping helps a lot for plain-text output, but rich HTML needs proper sanitization and context-aware handling.
Related guides
- Markdown and HTML, explained - helpful when Markdown output turns into HTML and text needs the right escaping.
- URL encoding and parsing, explained - a different escaping problem for links and query strings.
- Working with XML, explained - useful because XML has its own rules for markup-sensitive characters.