UUIDs, explained

UUIDs exist for a boring but important reason: systems need identifiers that can be generated in many places without central coordination and still be extremely unlikely to collide.

That makes them useful in databases, APIs, queues, client-generated objects, offline-first apps, logs, and import/export workflows. When different services or devices can create records independently, a plain auto-incrementing integer is often not enough.

The confusing part is that "UUID" is not one single generation strategy. There are multiple versions, and they make different trade-offs. Some include time information. Some are mostly random. Some sort better in databases. Some reveal more than you want.

This guide explains what UUIDs solve, compares v1, v4, and v7 in plain language, talks through collision risk and ordering, and shows how to pick a version for a real product scenario.

What problem a UUID solves

A UUID is a Universally Unique Identifier. In practice, that means a 128-bit identifier represented in a standard text form, usually like this:

550e8400-e29b-41d4-a716-446655440000

The point is not that collision is mathematically impossible. The point is that collisions are so unlikely, with sane generation, that independent systems can create IDs without checking a central counter first.

That is useful when:

  • clients create records before they sync
  • many services insert into the same logical data space
  • IDs need to be generated outside the database
  • you want opaque public identifiers instead of sequential numbers

UUIDs are not magic, but they are a very practical way to decouple ID generation from one single machine.

The basic shape of a UUID

A UUID is 128 bits long. The familiar text form breaks those bits into hexadecimal groups separated by hyphens.

The canonical form has 32 hex digits in five groups, with these lengths:

8-4-4-4-12

Inside those bits are fields that vary by version. The "version" nibble, which is the first hex digit of the third group, is one of the important clues, because it tells you how the UUID was generated. In the sample value above, that digit is 4.

That is why v1, v4, and v7 matter. They are not cosmetic labels. They describe the generation strategy.
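As a minimal sketch of reading that clue, Python's standard uuid module can parse the sample value shown earlier and report its size, group lengths, and version. The variable names here are illustrative only.

```python
import uuid

# The sample identifier used earlier in this guide.
sample = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")

# 16 bytes = 128 bits, regardless of version.
bit_length = len(sample.bytes) * 8

# Number of hex digits in each hyphen-separated group of the text form.
group_lengths = [len(group) for group in str(sample).split("-")]

# The version is the first hex digit of the third group ("41d4" -> "4").
version_digit = str(sample).split("-")[2][0]

print(bit_length)       # 128
print(group_lengths)    # [8, 4, 4, 4, 12]
print(version_digit)    # 4
print(sample.version)   # 4
```

The same check works on any UUID string you find in a log or a database row, which is a quick way to learn which generation strategy produced it.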

UUID v1: time-based, with privacy baggage

UUID v1 is based largely on timestamp information plus node-related data.

Historically, that node component was often tied to a MAC address or something derived from it. That gave v1 a useful property: the IDs were time-related and could be generated with a low chance of collision across machines.

The downside is obvious once you say it out loud. If the identifier carries information tied to creation time and machine identity, it may reveal more than you want.

That creates two practical issues:

1. Information leakage. A v1 ID can expose roughly when it was created and which machine created it.
2. Sort behavior is only part of the story. v1 has time components, so it can be more ordered than random UUIDs, but it is not the version people now reach for first when they want time-ordering without the old privacy trade-offs.

v1 still exists and still has valid uses, but teams choosing a fresh default often look elsewhere.
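To make the leakage concrete, here is a hedged sketch that reads the timestamp and node fields back out of a v1 UUID using Python's standard uuid module. The helper name inspect_v1 and the node value 0x001122334455 are made up for illustration.

```python
import uuid
from datetime import datetime, timezone

# Number of 100-nanosecond intervals between the UUID epoch (1582-10-15)
# and the Unix epoch (1970-01-01).
GREGORIAN_TO_UNIX_OFFSET = 0x01B21DD213814000


def inspect_v1(value: uuid.UUID):
    """Return (creation time in UTC, node as 12 hex digits) for a v1 UUID."""
    if value.version != 1:
        raise ValueError("expected a version 1 UUID")
    unix_seconds = (value.time - GREGORIAN_TO_UNIX_OFFSET) / 10_000_000
    created = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
    return created, f"{value.node:012x}"


# A made-up node value keeps the example predictable. Without the argument,
# uuid1() tries to use a hardware address from the machine.
example = uuid.uuid1(node=0x001122334455)
created, node = inspect_v1(example)
print(example)
print(created.isoformat(), node)
```

Anyone who sees the ID can run the same decoding, which is the point of the privacy concern.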

UUID v4: random and widely used

UUID v4 is the one many developers know best.

It is mostly random. That gives it a few nice properties:

  • simple mental model
  • no embedded timestamp
  • no sequence to guess
  • widely supported in libraries and platforms

If you need a public-facing identifier and you want it to reveal as little as possible about record creation order, v4 is a solid default.

The trade-off is ordering. Random UUIDs do not sort in creation order, which can be awkward for some storage engines, indexes, or debugging workflows. If you insert v4 values as primary keys in a database, the randomness may lead to less locality than a time-ordered scheme.

That does not make v4 bad. It just means "best default" depends on what you care about.
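A short sketch with Python's standard uuid module shows both sides of that trade-off: generation is a single call, and the resulting values carry no usable order. The batch size of 100 is arbitrary.

```python
import uuid

# Generate a batch of version 4 UUIDs, kept in creation order.
created_in_order = [uuid.uuid4() for _ in range(100)]

# Show a few of them along with the version each one reports.
for value in created_in_order[:3]:
    print(value, value.version)

# Sorting by value does not reproduce creation order, because the bits
# carry no timestamp or counter.
already_sorted = sorted(created_in_order) == created_in_order
print(already_sorted)
```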

UUID v7: time-ordered and modern

UUID v7 is designed to keep the standard UUID format while improving time-ordering behavior.

At a high level, v7 combines timestamp-based ordering with randomness in a way that is friendlier for modern systems. The big appeal is that newer IDs tend to sort after older ones, which is useful for database inserts, log inspection, and data pipelines that benefit from chronological locality.

Compared with v1, v7 aims to keep the ordering benefits without dragging along the same node-identity baggage. Compared with v4, it gives up a little "pure randomness" in exchange for much better sort characteristics.

That makes v7 appealing for new systems that want:

  • collision-resistant generation without central coordination
  • IDs that sort roughly by creation time
  • less privacy leakage than v1

If you are starting fresh and you care about sortability, v7 is worth serious consideration.
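The sketch below illustrates the v7 bit layout as described in RFC 9562: a 48-bit Unix timestamp in milliseconds, the 4-bit version, 12 random bits, the 2-bit variant, and 62 more random bits. It exists only to show why v7 values sort by time. The function name uuid7_sketch is hypothetical, and for real systems a maintained library is the better choice (Python 3.14 adds uuid.uuid7() to the standard library).

```python
import os
import time
import uuid


def uuid7_sketch(unix_ms=None) -> uuid.UUID:
    """Build a v7-shaped UUID for illustration only, not for production."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000

    random_bits = int.from_bytes(os.urandom(10), "big")  # 80 bits, 74 used
    rand_a = (random_bits >> 62) & 0xFFF                 # 12 random bits
    rand_b = random_bits & ((1 << 62) - 1)               # 62 random bits

    value = (unix_ms & ((1 << 48) - 1)) << 80  # timestamp in the top 48 bits
    value |= 0x7 << 76                         # version nibble = 7
    value |= rand_a << 64
    value |= 0b10 << 62                        # RFC variant bits
    value |= rand_b
    return uuid.UUID(int=value)


# Two IDs created one millisecond apart sort in creation order,
# both as UUID objects and as text.
older = uuid7_sketch(1_700_000_000_000)
newer = uuid7_sketch(1_700_000_000_001)
print(older)
print(newer)
print(older < newer, str(older) < str(newer))
```

In this sketch, two IDs generated within the same millisecond are ordered only by their random bits, so ordering is by time at millisecond granularity rather than strict creation order.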

Collision risk in plain language

People ask, "Can two UUIDs collide?" The honest answer is yes, in theory. The useful answer is that with correct generation, the risk is so low that it is usually dominated by other engineering concerns.

For random-style UUIDs such as v4, the space is huge: 122 of the 128 bits are random, which gives about 5.3 × 10^36 possible values. You do not treat collisions as impossible, but in ordinary applications they are not the thing that keeps you awake at night.
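For a rough sense of scale, the standard birthday-bound approximation estimates the chance of at least one collision among n random IDs. The sketch below assumes 122 random bits, as in v4, and a perfectly uniform random source, which real systems only approximate. The function name is illustrative.

```python
import math


def collision_probability(count: int, random_bits: int = 122) -> float:
    """Birthday-bound estimate: p ~ 1 - exp(-n(n-1) / (2 * 2**bits))."""
    pairs = count * (count - 1) / 2
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-pairs / 2**random_bits)


for count in (10**6, 10**9, 10**12):
    print(f"{count:>17,} IDs -> p ~ {collision_probability(count):.2e}")
```

Under these assumptions, even a billion IDs give a probability on the order of 10^-19.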

That said, collision risk is not just math on paper. It also depends on:

  • whether the generator is correct
  • whether the random source is good
  • whether the implementation follows the version rules
  • whether IDs are being generated in strange constrained environments

So the practical advice is:

  • use mature libraries
  • do not hand-roll UUID generation unless you have to
  • think about version choice in terms of system behavior, not just collision fear

Ordering and database behavior

The biggest non-collision question around UUIDs is often ordering.

Sequential IDs like 1, 2, 3, 4 naturally group new rows together in index order. Random UUIDs like v4 do not. If your storage engine cares about insertion locality, random UUIDs can produce more scattered writes.

That is part of why time-ordered UUIDs attract attention. They can preserve many of the distributed-generation benefits while behaving more nicely in sorted indexes and chronological listings.

You still need to test your actual database and workload. UUID choices are not purely theoretical. They affect storage patterns, debugging readability, and how easy it is to infer creation order from the IDs alone.
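As a toy model of insertion locality, the sketch below keeps a sorted list as a stand-in for an index and counts how often a new key lands at the very end. It is not a benchmark of any real storage engine, and the function name append_fraction is made up for this example.

```python
import bisect
import uuid


def append_fraction(keys) -> float:
    """Fraction of inserts that land at the end of a sorted 'index'."""
    index = []
    appended = 0
    for key in keys:
        position = bisect.bisect_right(index, key)
        if position == len(index):
            appended += 1
        index.insert(position, key)
    return appended / len(keys)


sequential_keys = list(range(1000))
random_keys = [uuid.uuid4().int for _ in range(1000)]

sequential_result = append_fraction(sequential_keys)
random_result = append_fraction(random_keys)

print(f"sequential keys appended at the end: {sequential_result:.1%}")
print(f"random v4 keys appended at the end:  {random_result:.1%}")
```

Sequential keys always append, while random keys land all over the list. Time-ordered IDs behave much more like the sequential case, which is the locality argument in miniature.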

Privacy trade-offs

Not all UUID versions reveal the same thing.

v1 can leak timing and node-related information. v4 reveals much less about sequence because it is random. v7 intentionally carries ordering information through its time-based structure, which can be useful operationally but also means a reader may infer rough creation timing from the IDs.

So the privacy question is not "Are UUIDs private?" The question is "What can someone infer from this particular version?"

If public guess-resistance and low information leakage matter most, v4 often feels safer. If sortability and modern operational convenience matter more, v7 may be the better fit. If you are stuck with legacy v1 usage, be aware of what that version exposes.
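To see what a reader can infer from a v7 value, this sketch decodes the millisecond timestamp stored in the top 48 bits. The helper name v7_creation_time is hypothetical, and the sample value is just a v7-shaped identifier chosen for the example.

```python
import uuid
from datetime import datetime, timezone


def v7_creation_time(value: uuid.UUID) -> datetime:
    """Decode the millisecond timestamp stored in the top 48 bits of a v7 UUID."""
    if value.version != 7:
        raise ValueError("expected a version 7 UUID")
    unix_ms = value.int >> 80
    return datetime.fromtimestamp(unix_ms / 1000, tz=timezone.utc)


sample_v7 = uuid.UUID("017f22e2-79b0-7cc3-98c4-dc0c0c07398f")
decoded = v7_creation_time(sample_v7)
print(decoded.isoformat())  # 2022-02-22T19:22:22+00:00
```

No secret is needed for this decoding, so treat the creation time of any v7 ID you publish as public information.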

A worked example: choosing a version for a product

Imagine you are building a project-management app with:

  • a web client
  • a mobile client with offline drafts
  • a server API
  • a Postgres database

Users can create tasks while offline, so clients need to generate IDs before the server responds. The app also shows recent tasks in chronological order, and database write patterns matter because tasks are created constantly.

What should you choose?

Option 1: v4

v4 is easy and safe from an information-leak perspective. Clients can generate IDs locally and the chances of collision are tiny.

The downside is ordering. Newly created task IDs will not sort by creation time, so database locality and chronological debugging are less friendly.

Option 2: v1

v1 gives you time-related ordering, but it carries the old privacy concerns. For a new app, that is probably not the direction you want.

Option 3: v7

v7 fits this scenario well:

  • clients can generate IDs without central coordination
  • IDs sort more naturally by creation time
  • the app avoids the older v1 node-identity baggage

In this case, v7 is a strong choice.
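Here is a hedged sketch of the offline-draft flow from this scenario. All names (new_task_id, TaskDraft, FakeServer) are hypothetical, and the server is an in-memory stand-in for the API and Postgres. The ID helper uses uuid.uuid7() when the running Python provides it (3.14 and later) and falls back to v4 purely so the example runs anywhere; a real project would pick one version deliberately.

```python
import uuid
from dataclasses import dataclass, field


def new_task_id() -> uuid.UUID:
    """Use the stdlib v7 generator if present; otherwise fall back to v4."""
    make_v7 = getattr(uuid, "uuid7", None)
    return make_v7() if make_v7 is not None else uuid.uuid4()


@dataclass
class TaskDraft:
    title: str
    # The client assigns the ID at creation time, before any sync.
    id: uuid.UUID = field(default_factory=new_task_id)


class FakeServer:
    """In-memory stand-in for the server API and its database."""

    def __init__(self):
        self.tasks = {}

    def upsert(self, draft: TaskDraft) -> bool:
        """Store the draft and return True only if the ID was new."""
        is_new = draft.id not in self.tasks
        self.tasks[draft.id] = draft.title
        return is_new


draft = TaskDraft("Write release notes")  # created while offline
server = FakeServer()
first_sync = server.upsert(draft)   # True: the task is stored
retry_sync = server.upsert(draft)   # False: same ID, no duplicate
print(draft.id, first_sync, retry_sync, len(server.tasks))
```

Because the ID is fixed on the client, a retried sync after a dropped connection maps to the same record instead of creating a second one.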

Now change the scenario. Suppose you are generating public invitation codes where you do not want outsiders to infer creation timing from the identifier. Then v4 may be the better fit even if the database likes ordered IDs less.

That is the real decision pattern. Start from how the ID will be used, what it may reveal, and how your storage layer behaves.

Try it in your browser

Our UUID Generator is useful when you want to generate sample IDs, compare version shapes, and get a feel for how v1, v4, and v7 differ before you wire a choice into code or docs.

That is especially handy when:

  • you are choosing a default for a new system
  • you need examples for tests or documentation
  • you want to check how time-ordered output differs from fully random output

It is a quick way to make the version differences concrete instead of abstract.

Common mistakes

Treating every UUID version as interchangeable. The format is shared, but the trade-offs are not.

Picking v4 by habit when ordering matters. v4 is a fine default, but it is not the answer to every storage pattern.

Picking v1 without thinking about leakage. Time and node-related details are not free.

Assuming UUIDs solve every ID problem. They solve distributed uniqueness very well. They do not replace authorization checks, access control, or good schema design.

Hand-rolling the generator. Use a mature library unless you have a strong reason not to.

FAQ

Are UUIDs guaranteed to be unique?

Not in the mathematical sense, but with correct generation the collision risk is tiny enough for ordinary system design.

Which UUID version is the safest?

It depends on what you mean by safest. If you care most about minimizing information leakage, v4 is a strong default. If you care about ordering and modern storage behavior, v7 is often more attractive.

Why would anyone still use v1?

Legacy systems, compatibility, or older infrastructure choices. It is not automatically wrong, but for new designs people often prefer v4 or v7.

Can you tell creation order from a UUID?

With v4, not meaningfully. With v1 or v7, ordering is more informative because time plays a role in generation.

Are UUIDs safe to expose publicly?

They often can be, but public exposure is still a product choice. Think about what the chosen version reveals and whether users or attackers can infer patterns from it.
