Hashing and password storage, explained

People use the word "hash" for several different jobs, and that is where the confusion starts.

A hash can help you check whether a file changed. It can help software identify duplicate content. It can help a version control system point to an exact object. It can also sit at the center of password storage. Those are related ideas, but they are not interchangeable. The algorithm that is fine for a checksum is often the wrong choice for passwords, and a password hash is solving a different problem than encryption.

If you keep those differences straight, a lot of security advice that sounds fuzzy becomes practical. This guide lays out the basics in plain language: what hashing is, when checksums make sense, why salts matter, where bcrypt fits, and why MD5 or plain SHA-256 are not enough for password storage by themselves.

What a hash actually is

A hash function takes input data of any length and produces a fixed-length output. Feed it "hello" and you get one digest. Change one character and the digest changes too.

For many common algorithms, good hash behavior means:

  • the same input always produces the same output
  • a tiny input change produces a very different digest
  • the output length stays fixed
  • it is fast to compute

That fixed output is useful because you can compare digests instead of comparing the original data byte by byte. If two large files produce different hashes, the files are different. If a stored password hash does not match the hash of the password a user just entered, the password is wrong.
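The properties above are easy to check by hand. Here is a minimal sketch using SHA-256 from Python's standard hashlib module; the helper name sha256_hex and the sample inputs are made up for illustration.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as hex."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


first = sha256_hex("hello")
again = sha256_hex("hello")
changed = sha256_hex("hellp")  # one character differs
long_input = sha256_hex("hello" * 100_000)

# Same input, same digest.
print(first == again)
# One changed character gives a different digest.
print(first != changed)
# SHA-256 hex digests are always 64 characters, whatever the input size.
print(len(first), len(long_input))
```

Printing first and changed side by side shows how little the two digests have in common, even though the inputs differ by a single character.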

The important part is context. "A hash is one-way" is directionally true, but it does not tell you which kind of system you are building. The right follow-up question is always: what job is this hash doing?

Hashing is not encryption

Encryption is meant to be reversed by someone with the right key. Hashing is not.

If you encrypt a document, the goal is to get the original document back later. If you hash a document, the goal is usually comparison, lookup, or tamper detection. You are not planning to "decrypt" the hash back into the original text.

That distinction matters for passwords. You should not store passwords in encrypted form just because encryption feels more serious. If a database breach exposes both the encrypted passwords and the keys, the attacker can recover the originals. With password hashing, the server stores a one-way representation and checks future login attempts by hashing the submitted password and comparing the result.

Hashing is not magic, though. If the hashing setup is too fast or too predictable, attackers can still guess passwords offline at high speed. That is where salts and slow password-hashing algorithms come in.

Checksums and integrity hashes

One common use of hashing is integrity checking.

Say you download a Linux ISO or a backup archive. The publisher might also post a SHA-256 hash. After the download finishes, you hash the file yourself. If your result matches the published digest, you can be confident the file arrived intact, provided the published digest itself came from a source you trust.

This is the checksum style of use. The main question is, "Did the bytes change?" Speed is good here. Determinism is good here. The output should change if the input changes at all.
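A download check like this might look as follows in Python. This is a sketch, not a definitive tool: the function names and the chunk size are assumptions, and the published digest is whatever the publisher actually posts.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large downloads do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, published_hex: str) -> bool:
    """Compare the local digest with the publisher's, ignoring case and surrounding whitespace."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

You would call it as matches_published(Path("download.iso"), digest_from_publisher), where both the file name and the digest variable are placeholders.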

Checksums also show up in everyday developer work:

  • confirming a file copy did not corrupt data
  • spotting duplicate files
  • caching build artifacts by content
  • identifying immutable objects in systems like Git

For this job, algorithms like SHA-256 or SHA-512 are normal choices. Older ones such as MD5 and SHA-1 still appear in the wild, but they are no longer good choices when collision resistance matters. For casual duplicate detection you may still see MD5, but for security-sensitive integrity signals, pick stronger modern hashes.

Password storage is a different problem

Password storage asks a nastier question: what happens if an attacker steals the database?

If you store passwords in plain text, the damage is immediate. If you store a fast unsalted hash of each password, the attacker still has a pretty easy offline guessing setup. They can take a huge list of common passwords, hash each guess, and compare the results at scale.

That is why password storage uses specialized hashing schemes that are intentionally slow and salted.
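To see why a fast unsalted hash offers so little protection, here is a toy version of the offline guessing described above. The leaked table, the usernames, and the wordlist are all invented for the example.

```python
import hashlib


def sha256_hex(password: str) -> str:
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Hypothetical leaked table: username -> unsalted SHA-256 of the password.
leaked = {
    "alice": sha256_hex("password123"),
    "bob": sha256_hex("qwerty"),
    "carol": sha256_hex("Rain!Cedar!Puzzle!29"),
}

# Hypothetical wordlist of common passwords.
wordlist = ["123456", "password123", "qwerty", "letmein"]

# Hash each guess once, then look up every leaked digest in that table.
guess_by_digest = {sha256_hex(guess): guess for guess in wordlist}
recovered = {
    user: guess_by_digest[digest]
    for user, digest in leaked.items()
    if digest in guess_by_digest
}

print(recovered)  # {'alice': 'password123', 'bob': 'qwerty'}
```

Because nothing is salted, one pass over the wordlist is enough to test every account at once. A real attacker would use a far larger list and far faster hardware.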

Salt

A salt is a random value generated for each password. The salt is stored alongside the resulting hash and mixed into the hashing process.

Salts solve two big problems:

1. Two users who pick the same password should not end up with the same stored hash.
2. Attackers should not be able to rely on huge precomputed rainbow tables that map common passwords to known digests.

The salt does not need to be secret. Its job is uniqueness, not secrecy.
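The effect of a salt can be shown in a few lines. The sketch below mixes the salt into plain SHA-256 only to make the salt's role visible; it is not a password storage scheme, and the function name and sample password are assumptions.

```python
import hashlib
import secrets


def salted_digest(password: str, salt: bytes) -> str:
    """Illustration only: shows what a salt changes. Not a storage scheme."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


password = "correct horse"  # hypothetical password shared by two users
salt_a = secrets.token_bytes(16)
salt_b = secrets.token_bytes(16)

# Each record keeps the salt in the clear, right next to the digest.
record_a = (salt_a.hex(), salted_digest(password, salt_a))
record_b = (salt_b.hex(), salted_digest(password, salt_b))

# Same password, different salts, different stored digests.
print(record_a[1] != record_b[1])
# Verification still works because the salt is stored with the digest.
print(salted_digest(password, bytes.fromhex(record_a[0])) == record_a[1])
```

An attacker who steals both records sees the salts, but a precomputed table built without those salts no longer matches anything.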

Slow password hashing

For password storage, fast is bad.

If an attacker can try billions of guesses quickly, weak user passwords fall fast. Password-specific algorithms such as bcrypt, scrypt, and Argon2 are designed to raise the cost of each guess. They are slower on purpose, and some of them add memory hardness too.

bcrypt remains common because it is widely available, battle-tested, and simple to deploy correctly when paired with sane defaults and regular review of the cost factor.
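As a rough sketch of salted, deliberately slow hashing, the example below uses scrypt, because Python ships it in the standard library as hashlib.scrypt (available when Python is built against OpenSSL 1.1 or newer). The parameter values and function names are assumptions for illustration, not recommendations from this guide. A bcrypt or Argon2 library would follow the same hash-then-verify shape.

```python
import hashlib
import hmac
import secrets

# Assumed parameters for illustration; tune them on your own hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "maxmem": 64 * 1024 * 1024, "dklen": 32}


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) for a new password."""
    salt = secrets.token_bytes(16)
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(key, expected_key)
```

At signup you would store the salt and key returned by hash_password. At login you would pass the submitted password and the stored pair to verify_password.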

bcrypt and the cost factor

bcrypt takes a password and a salt, then runs a deliberately expensive process to produce the stored hash string. That string typically includes the algorithm marker, cost, salt, and resulting hash.

A bcrypt hash looks something like this:

$2b$12$C6UzMDM.H6dfI/f/IKcEeO5uA9p8uGv9s6v6iP0M6VnQ0VJQw0WwK

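The 12 there is the cost factor. Higher cost means more work per guess: bcrypt's work grows as 2 to the power of the cost, so each step up roughly doubles the time per hash.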
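The layout of that string can be pulled apart with plain string handling. This sketch only splits the fields; it does not verify anything, and the names BcryptParts and split_bcrypt_hash are invented here.

```python
from typing import NamedTuple


class BcryptParts(NamedTuple):
    version: str  # algorithm marker, e.g. "2b"
    cost: int     # work factor; bcrypt runs 2**cost expansion rounds
    salt: str     # 22 characters of bcrypt-style base64
    digest: str   # 31 characters of bcrypt-style base64


def split_bcrypt_hash(hash_string: str) -> BcryptParts:
    """Split a 60-character bcrypt string into its fields without verifying it."""
    _, version, cost, payload = hash_string.split("$")
    if len(payload) != 53:
        raise ValueError("expected 22 salt characters plus 31 digest characters")
    return BcryptParts(version, int(cost), payload[:22], payload[22:])


parts = split_bcrypt_hash(
    "$2b$12$C6UzMDM.H6dfI/f/IKcEeO5uA9p8uGv9s6v6iP0M6VnQ0VJQw0WwK"
)
print(parts.version, parts.cost)  # 2b 12
print(parts.salt)                 # C6UzMDM.H6dfI/f/IKcEeO
```

Since the salt and cost travel inside the string, a verifier needs nothing beyond the stored string and the submitted password.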

Choosing a cost factor is not about chasing the largest number you can type. It is about the time budget you can afford at login and during account creation, measured on the hardware that actually runs your app. If a bcrypt check is too cheap, attackers benefit. If it is too expensive, your own service slows down and users feel it.

Teams often revisit this over time. Hardware gets faster, so a cost factor that felt right years ago may be too low now.
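Because bcrypt's work doubles with each cost step, one timing measurement on your real hardware lets you estimate the others. The sketch below is only the arithmetic: the function names, the 0.06 second measurement, and the 0.30 second budget are hypothetical.

```python
def estimated_seconds(measured_cost: int, measured_seconds: float, target_cost: int) -> float:
    """Scale a measured bcrypt timing to another cost; each +1 doubles the work."""
    return measured_seconds * 2 ** (target_cost - measured_cost)


def highest_cost_within(budget_seconds: float, measured_cost: int,
                        measured_seconds: float, max_cost: int = 31) -> int:
    """Return the largest cost whose estimated time fits the budget."""
    best = 4  # bcrypt's minimum cost
    for cost in range(4, max_cost + 1):
        if estimated_seconds(measured_cost, measured_seconds, cost) <= budget_seconds:
            best = cost
    return best


# Hypothetical measurement: cost 10 took 0.06 s on the production host.
print(estimated_seconds(10, 0.06, 12))      # 0.24
print(highest_cost_within(0.30, 10, 0.06))  # 12
```

Treat the estimate as a starting point and confirm it by timing the chosen cost directly, since real hosts vary under load.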

Why MD5 and plain SHA are not enough for passwords

MD5 is too weak for modern security work. That part is widely understood.

The more subtle mistake is thinking, "Fine, I'll use SHA-256 for passwords instead." Plain SHA-256 is stronger than MD5 in many contexts, but it is still a fast general-purpose hash. Fast general-purpose hashing is exactly what you do not want for password storage.

Here is the problem:

  • SHA-256 is built to be computed quickly
  • attackers can run it at enormous scale on GPUs and specialized hardware
  • if you skip salting, matching passwords produce matching hashes
  • even with a salt, a fast algorithm still helps attackers burn through guesses

That is why people say MD5 and SHA are "not enough" for password storage. It is not because SHA-256 is broken in the same way MD5 is broken. It is because the job has different requirements. Password storage needs slowness and, ideally, memory cost.
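The gap in cost per guess is easy to measure. This sketch times plain SHA-256 against scrypt, used here as a standard-library stand-in for a slow password hash. The scrypt parameters and repeat counts are assumptions, and the numbers will differ on every machine.

```python
import hashlib
import time

password = b"Rain!Cedar!Puzzle!29"
salt = b"\x00" * 16  # fixed salt only so the timing runs are comparable


def seconds_per_call(func, repeats: int) -> float:
    """Average wall-clock time of one call to func."""
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    return (time.perf_counter() - start) / repeats


fast = seconds_per_call(lambda: hashlib.sha256(salt + password).digest(), 10_000)
slow = seconds_per_call(
    lambda: hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1,
                           maxmem=64 * 1024 * 1024, dklen=32),
    5,
)

print(f"SHA-256: {fast * 1e6:.2f} microseconds per guess")
print(f"scrypt:  {slow * 1e3:.2f} milliseconds per guess")
print(f"scrypt is roughly {slow / fast:,.0f}x more expensive per guess here")
```

This is a single CPU thread. Attackers with GPUs widen the gap for SHA-256 further, which is the point of the bullets above.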

A worked example: checksum hash versus password hash

Imagine you have two separate tasks on the same day.

First, you download a 2 GB backup file from a storage provider. They publish this SHA-256 digest:

4c1a9d4f...<rest of digest>

You hash the downloaded file with SHA-256 and compare the result. If it matches, the file probably arrived intact. This is a checksum-style use. Speed is helpful. A fixed deterministic digest is exactly what you want.

Now switch to a signup form for a new user whose password is Rain!Cedar!Puzzle!29.

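You do not store a plain SHA-256 digest of that password and call it done. Instead, the server:

1. generates a random salt for this password
2. runs bcrypt over the password and salt at the chosen cost factor
3. stores the resulting bcrypt hash string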

At login time, the server runs bcrypt again on the submitted password using the stored salt and cost embedded in the hash string, then compares the result.

Both tasks involve hashing, but the goals are different:

  • file checksum: did these bytes change?
  • password storage: how hard is it for an attacker to test guesses after a breach?

Once you see that split clearly, the tool choice becomes much easier.

Try the browser tools

These tools run in your browser, so you can inspect digests, generate bcrypt strings for testing, or create stronger passwords without sending your input to a server.

  • Hash Generator — useful for checksums, quick comparisons, and seeing how different algorithms represent the same input.
  • Bcrypt Generator — helpful when you want to understand the shape of bcrypt output, experiment with cost values, or test development workflows.
  • Password Generator — good for creating long random passwords so users and admins are not stuck inventing weak ones by hand.

These tools are great for learning and debugging. For production password storage, keep the final hashing and verification logic on the server side, using a mature library in your app's language.

Common mistakes

Using one hash algorithm for every job. A checksum, a content ID, and a password database do not all want the same properties. Start from the problem, not the algorithm name you remember first.

Skipping salts. Without a unique random salt per password, matching passwords create matching stored hashes and precomputed attacks get easier.

Using fast hashes for passwords. SHA-256 has many good uses, but storing passwords with it alone is not one of them.

Thinking a hash proves authenticity by itself. A plain hash only says, "this digest matches these bytes." If an attacker can swap both the file and the posted digest, you need a signed checksum or a trusted channel, not just any hash value on a web page.

Forgetting the human part. Even strong storage can be undermined by weak passwords. Good password generation, password managers, and rate limiting all matter.
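For the password generation piece, a minimal sketch with Python's secrets module could look like this. The alphabet, the default length of 20, and the function name are assumptions, not a policy this guide prescribes.

```python
import secrets
import string

# Assumed character set for illustration; adjust to whatever your systems accept.
ALPHABET = string.ascii_letters + string.digits + "!@#$%^&*-_"


def generate_password(length: int = 20) -> str:
    """Draw each character from the operating system's cryptographic random source."""
    if length < 1:
        raise ValueError("length must be positive")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(generate_password())
```

The secrets module matters here: the general-purpose random module is not designed for security-sensitive values.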

FAQ

Is hashing the same as encryption?

No. Encryption is designed to be reversed with the right key. Hashing is mainly for comparison, integrity, indexing, and password verification.

Is SHA-256 broken?

Not in the way MD5 is broken. But plain SHA-256 is still the wrong tool for password storage because it is too fast and does not solve the salting problem on its own.

Does a salt need to be secret?

No. Salts need to be random and unique per password. They are usually stored right next to the password hash.

Why is bcrypt so slow?

Because that is the point. Password hashing should make each offline guess more expensive for an attacker.

Should I use bcrypt or Argon2?

Both are legitimate password-hashing choices when used correctly. bcrypt is older and very common. Argon2 is newer and designed with memory hardness in mind. The right answer depends on your platform, library support, and security requirements.

Related guides

  • JWTs, explained — another common security format that people often confuse with encryption.
  • Base64 encoding, explained — useful background when you need to distinguish encoding from hashing and encryption.
  • Working with JSON — handy when your APIs carry digests, password-policy settings, or auth metadata in JSON payloads.