Markdown and HTML, explained
Markdown exists because writing raw HTML for everyday text is more work than most people need. You want a heading, a link, a list, a quote, or a code block. You do not want to type every tag by hand just to draft a README, a blog post, a changelog, or a help page.
HTML is still what the browser understands. Markdown is the author-friendly layer. A parser reads the Markdown text and generates HTML, which the browser can render as structured content.
That conversion sounds simple until the content moves back the other way. Markdown can express common document structure, but HTML can express far more. When you convert HTML back to Markdown, some detail can be lost, simplified, or rewritten. This guide explains how the two formats relate, where sanitization fits, and how to think about conversion without pretending that both directions are perfect mirrors.
What Markdown is
Markdown is a lightweight markup format. It uses plain-text characters to describe document structure.
For example:
# Release notes
Version **2.4** fixes the upload bug and adds:
- CSV import
- JSON export
- clearer error messages
Read the [migration guide](/guides/working-with-json/).
A human can read that even before it is converted. That is the main appeal. The source text stays close to normal writing, while still carrying enough structure for software to turn it into HTML.
Markdown is common in:
- README files
- documentation pages
- blog drafts
- issue trackers
- static-site content
- notes and knowledge bases
Different Markdown parsers support different extensions. Tables, task lists, footnotes, frontmatter, and fenced code blocks are common, but not every environment handles them the same way.
What generated HTML does
When Markdown is rendered, a parser turns Markdown patterns into HTML tags.
A heading becomes an h1, h2, or another heading level. Bold text becomes strong. A list becomes ul or ol. A link becomes an a element with an href attribute. A code block becomes a pre and code pair, often with a language hint if the Markdown source included one.
The browser does not know that the author typed asterisks around a word to make it bold. By the time the content reaches the page, the browser sees only structured HTML.
That split is useful because it lets writers work in plain text while developers and publishing systems still get semantic output.
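The mapping from Markdown patterns to tags can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any real parser works: the function names `render` and `render_inline` are made up for illustration, only headings, bold, italic, links, and unordered lists are handled, and every non-blank line becomes its own paragraph.

```python
import html
import re


def render_inline(text: str) -> str:
    """Convert bold, italic, and links inside one line of text."""
    # Escape first so characters like < and & in the source cannot become markup.
    text = html.escape(text)
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)
    # No URL scheme checks happen here; that is the sanitizer's job.
    text = re.sub(r"\[([^\]]+)\]\(([^)\s]+)\)", r'<a href="\2">\1</a>', text)
    return text


def render(markdown_text: str) -> str:
    """Render a tiny Markdown subset line by line."""
    out: list[str] = []
    in_list = False
    for raw_line in markdown_text.splitlines():
        line = raw_line.rstrip()
        heading = re.match(r"(#{1,6})\s+(.*)", line)
        is_item = line.startswith("- ")
        # Any line that is not a list item closes an open list.
        if in_list and not is_item:
            out.append("</ul>")
            in_list = False
        if not line:
            continue
        if heading:
            level = len(heading.group(1))
            out.append(f"<h{level}>{render_inline(heading.group(2))}</h{level}>")
        elif is_item:
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append(f"<li>{render_inline(line[2:])}</li>")
        else:
            out.append(f"<p>{render_inline(line)}</p>")
    if in_list:
        out.append("</ul>")
    return "\n".join(out)


sample = "# Release notes\nVersion **2.4** adds:\n- CSV import\n- JSON export"
print(render(sample))
```

Real parsers handle nesting, multi-line paragraphs, escapes, and many edge cases this sketch ignores, which is one reason to use a maintained library rather than regular expressions.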
Markdown syntax people use most
The core syntax is small:
# Heading 1
## Heading 2
This is **bold** and this is *italic*.
- item one
- item two
1. first
2. second
[Link text](https://example.com)
> quoted text
`inline code`
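Code fences are common in developer docs. A fence is a line of three backticks before the code and another after it, and the opening line can name a language for syntax highlighting.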
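As a rough illustration of how a fenced block might become a pre and code pair, here is a small Python sketch. The function name `render_fenced_block` is hypothetical, and the `language-` class prefix is a widespread convention rather than something every parser emits.

```python
import html

# Three backticks, built this way so the example can itself sit inside a fence.
FENCE = "`" * 3


def render_fenced_block(lines: list[str]) -> str:
    """Render one fenced block: opening fence line, code lines, closing fence line."""
    if len(lines) < 2 or not lines[0].startswith(FENCE) or lines[-1].strip() != FENCE:
        raise ValueError("expected an opening and a closing fence")
    # Anything after the opening backticks is treated as the language hint.
    language = lines[0][len(FENCE):].strip()
    # Code is escaped, never interpreted, so < and & show up literally on the page.
    code = html.escape("\n".join(lines[1:-1]), quote=False)
    class_attr = f' class="language-{html.escape(language)}"' if language else ""
    return f"<pre><code{class_attr}>{code}\n</code></pre>"


print(render_fenced_block([FENCE + "js", "if (a < b) run();", FENCE]))
```

The important detail is the escaping step: whatever sits between the fences is shown as text, not treated as markup.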
Markdown is not meant to replace HTML completely. It is meant to make the common writing cases easy.
Sanitization matters
Converting Markdown produces HTML that a browser will render and, if it contains scripts, run. That means the usual HTML security rules apply to the output.
If the Markdown came from a trusted author inside your own publishing system, you may allow more features. If it came from a public comment box, user profile, forum post, support ticket, or shared document, you need to think much more carefully.
Some Markdown parsers allow raw HTML inside Markdown. For example, an author might write a custom span element or an embedded iframe. That can be useful in a controlled documentation system, but it is dangerous when the content is untrusted.
Sanitization is the step that removes or blocks risky HTML before it reaches readers. A sanitizer might remove scripts, dangerous attributes, unknown tags, inline event handlers, or links with unsafe protocols.
The rule is simple: Markdown is not automatically safe just because it looks like plain text. Once it becomes HTML, the output needs the same care you would give any other HTML created from user input.
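To make the idea concrete, here is a minimal allowlist sanitizer sketched with Python's standard `html.parser` module. Everything in it is an assumption chosen for illustration: the class name `AllowlistSanitizer`, the allowed tags, the allowed URL schemes, and the decision to drop script, style, and iframe content entirely. It shows the shape of the technique and is not a substitute for a maintained, audited sanitizer library.

```python
import html
import re
from html.parser import HTMLParser

# Illustrative policy only; a real policy depends on what your content needs.
ALLOWED_TAGS = {"h1", "h2", "h3", "p", "strong", "em", "ul", "ol", "li",
                "a", "code", "pre", "blockquote"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_SCHEMES = {"http", "https", "mailto"}
DROP_WITH_CONTENT = {"script", "style", "iframe"}
SCHEME = re.compile(r"([A-Za-z][A-Za-z0-9+.\-]*):")


def is_safe_url(url: str) -> bool:
    """Allow relative URLs and a short list of schemes."""
    # Browsers strip some whitespace and control characters from URLs,
    # so remove them before looking for a scheme.
    cleaned = "".join(ch for ch in url if ch > " ")
    match = SCHEME.match(cleaned)
    return match is None or match.group(1).lower() in SAFE_SCHEMES


class AllowlistSanitizer(HTMLParser):
    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.out: list[str] = []
        self.skip_depth = 0  # > 0 while inside an element whose content is dropped

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tags are removed, their text is kept
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue  # drops event handlers, style, class, and the rest
            if name == "href" and not is_safe_url(value):
                continue
            kept.append(f' {name}="{html.escape(value)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(html.escape(data, quote=False))


def sanitize(fragment: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)


print(sanitize('<p onclick="steal()">Hi <strong>there</strong></p><script>alert(1)</script>'))
```

The design choice worth noting is the allowlist: anything not explicitly permitted is removed, so a tag or attribute the policy never anticipated fails closed instead of slipping through.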
HTML to Markdown is not always clean
Markdown to HTML is usually more predictable than HTML to Markdown.
The reason is that Markdown has a much smaller vocabulary. It covers common document shapes. HTML can represent precise layout, attributes, nested elements, inline styles, classes, data attributes, forms, tables, media embeds, scripts, and many other details that Markdown has no syntax for.
When you convert HTML back to Markdown, the converter has to choose what to keep, what to simplify, and what to drop.
Common lossy cases include:
- custom classes on elements
- inline styles
- complex tables
- nested blocks inside list items
- embedded widgets
- forms and inputs
- custom data attributes
- unusual link attributes
- spacing that mattered visually but not semantically
A converter can often preserve the main text, headings, links, lists, and code blocks. It may not preserve the exact structure that a designer or frontend component expected.
That does not make HTML-to-Markdown conversion useless. It is useful for cleanup, migration, and first drafts. You just need to review the result before treating it as final source content.
A worked example: Markdown to HTML and back
Start with this Markdown fragment:
## Account setup
Before you start, create a **strong password** and save your recovery codes.
- Use at least 14 characters
- Avoid reused passwords
- Store the password in a manager
See the [password guide](/guides/hashing-password-storage/).
A Markdown-to-HTML converter would produce output shaped like this:
<h2>Account setup</h2>
<p>Before you start, create a <strong>strong password</strong> and save your recovery codes.</p>
<ul>
<li>Use at least 14 characters</li>
<li>Avoid reused passwords</li>
<li>Store the password in a manager</li>
</ul>
<p>See the <a href="/guides/hashing-password-storage/">password guide</a>.</p>
That conversion keeps the meaning cleanly. The heading, paragraph, bold text, list, and link all have direct HTML equivalents.
Now imagine the HTML is edited later:
<h2 class="section-title">Account setup</h2>
<p data-source="cms">Before you start, create a <strong>strong password</strong>.</p>
If you convert that back to Markdown, you might get:
## Account setup
Before you start, create a **strong password**.
The text is fine, but the class and data-source attributes are gone. For a normal document, that may be acceptable. For a component-driven page, it may matter a lot.
That is the key lesson. Conversion can preserve content while losing presentation or metadata.
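The loss can be made visible with a small converter sketch, again using only Python's standard library. The names `MarkdownWriter` and `to_markdown` are hypothetical, the handled tag set is deliberately tiny, and the `dropped` list exists only to show which attributes had nowhere to go. Real converters make many more decisions than this.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class MarkdownWriter(HTMLParser):
    """Convert a small HTML subset to Markdown and record what was lost."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: list[str] = []
        self.dropped: list[str] = []  # attributes Markdown could not carry
        self.hrefs: list[str] = []
        self.lists: list[list] = []   # one [tag, item_count] entry per open list

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "a":
            # href is the one attribute Markdown link syntax can express.
            self.hrefs.append(attr_map.pop("href", None) or "")
        self.dropped.extend(f"{tag}[{name}]" for name in attr_map)
        if tag in HEADINGS:
            self.parts.append("#" * int(tag[1]) + " ")
        elif tag in ("strong", "b"):
            self.parts.append("**")
        elif tag in ("em", "i"):
            self.parts.append("*")
        elif tag == "a":
            self.parts.append("[")
        elif tag in ("ul", "ol"):
            self.lists.append([tag, 0])
        elif tag == "li":
            if self.lists and self.lists[-1][0] == "ol":
                self.lists[-1][1] += 1
                self.parts.append(f"{self.lists[-1][1]}. ")
            else:
                self.parts.append("- ")

    def handle_endtag(self, tag):
        if tag in HEADINGS or tag == "p":
            self.parts.append("\n\n")
        elif tag in ("strong", "b"):
            self.parts.append("**")
        elif tag in ("em", "i"):
            self.parts.append("*")
        elif tag == "a":
            href = self.hrefs.pop() if self.hrefs else ""
            self.parts.append(f"]({href})")
        elif tag == "li":
            self.parts.append("\n")
        elif tag in ("ul", "ol"):
            if self.lists:
                self.lists.pop()
            self.parts.append("\n")

    def handle_data(self, data):
        # Skip the line breaks that sit between block tags in the source.
        if data.strip() or "\n" not in data:
            self.parts.append(data)


def to_markdown(fragment: str) -> tuple[str, list[str]]:
    writer = MarkdownWriter()
    writer.feed(fragment)
    writer.close()
    return "".join(writer.parts).strip(), writer.dropped


edited = ('<h2 class="section-title">Account setup</h2>\n'
          '<p data-source="cms">Before you start, create a <strong>strong password</strong>.</p>')
markdown, lost = to_markdown(edited)
print(markdown)
print("Dropped:", lost)
```

Reporting the dropped attributes, instead of discarding them silently, is one way to make the review step easier when migrating real pages.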
Try the browser tools
These tools are useful when you want to check both directions without wiring up a build step first:
- Markdown to HTML converts Markdown into generated HTML so you can inspect the rendered structure.
- HTML to Markdown turns HTML back into Markdown when you are cleaning up copied content, migrating docs, or starting from an existing page fragment.
Both run in your browser. That is useful for draft documentation, internal snippets, and private notes you do not want to paste into a remote editor just to test a conversion.
Common mistakes
Treating Markdown as one universal standard. Parser differences matter, especially around tables, task lists, HTML blocks, and code fences.
Assuming Markdown output is safe HTML. If untrusted people can write the Markdown, sanitize the generated HTML before rendering it to readers.
Expecting HTML-to-Markdown conversion to preserve everything. HTML has more expressive detail than Markdown. Some detail will be simplified or dropped.
Editing generated HTML and expecting the original Markdown to stay in sync. Once people edit both sides by hand, the source of truth gets blurry.
Using Markdown where full layout control is needed. Markdown is great for documents. It is not a full page-layout language.
FAQ
Markdown and HTML are not the same thing. Markdown is author-friendly source text. HTML is the structured markup a browser renders.
Raw HTML inside Markdown is allowed by many parsers, but that depends on the environment. It can be useful for trusted docs and risky for user-generated content.
Tables are not part of the original Markdown syntax or the CommonMark core. They depend on parser extensions and may differ across systems.
Attributes disappear in HTML-to-Markdown conversion because Markdown has no native place for most HTML attributes. The converter keeps the content it can represent and drops details it cannot express cleanly.
For mostly text-based docs, Markdown is often easier to maintain. For custom page layout, interactive components, or precise markup, HTML or a component system may be a better fit.
Related guides
- Working with JSON, explained - useful when documentation examples include API payloads.
- Minifying and beautifying code, explained - helpful when you need to inspect generated HTML, CSS, or JavaScript.
- HTML entities, explained - a companion topic when text needs to be displayed safely inside HTML.