How Random Number Generation Works in Computing

How Random Number Generation Works in Computing

Computers are deterministic machines. Given the same inputs and the same state, they produce the same outputs every time. That's what makes randomness hard — a program running on predictable hardware can't spontaneously produce unpredictable numbers without some external input. Grasping this tension is the key to using random numbers correctly.

Why "Random" Is Harder Than It Sounds

A deterministic processor executing a fixed sequence of instructions will always produce the same result. There's no quantum uncertainty in a for-loop. Any function that returns "random" numbers is either using an algorithm that mimics randomness (pseudo-random), or it's reading entropy from some genuinely unpredictable physical source (truly random).

The distinction matters enormously. For statistical simulations and shuffling a playlist, pseudo-random is fine. For generating a password or a session token, you need the real thing — and using a pseudo-random generator in a security context has caused serious vulnerabilities.

Pseudo-Random Number Generators (PRNGs)

A PRNG is an algorithm. It takes a seed (an initial value) and produces a sequence of numbers that pass statistical tests for randomness — they look random, but they're entirely determined by the seed. Give it the same seed twice, and you get the exact same sequence.

Linear Congruential Generators

The simplest PRNG is a linear congruential generator (LCG). The formula is:

X(n+1) = (a × X(n) + c) mod m

Where X(0) is the seed, a is the multiplier, c is the increment, and m is the modulus. The output cycles through at most m values before repeating. With well-chosen constants, the cycle length equals m.

def lcg(seed, a=1664525, c=1013904223, m=2**32):
    state = seed
    while True:
        state = (a * state + c) % m
        yield state / m  # normalize to [0, 1)

LCGs are fast and simple, but they have low-dimensional structure. If you plot pairs (X(n), X(n+1)) they fall on parallel lines. This periodicity is obvious to statistical tests — and to attackers. C's old rand() function used an LCG. Don't use LCGs for anything beyond toy examples.

Mersenne Twister

The Mersenne Twister (MT19937) is the default PRNG in Python (random.random()), Ruby, PHP, and many other languages. It was designed in 1997 and has a period of 2^19937 − 1 — far longer than you'll ever need.

MT19937 passes most statistical tests and is fast. But it's not cryptographically secure. After observing 624 consecutive 32-bit outputs, you can reconstruct the full internal state and predict all future outputs. This is a known attack.

import random
random.seed(42)
print(random.random())  # deterministic: 0.6394267984578837

For games, simulations, and procedural generation, Mersenne Twister is excellent. For anything security-related, it's dangerously wrong.

Cryptographically Secure PRNGs (CSPRNGs)

A CSPRNG is a PRNG with two additional properties: the output is computationally indistinguishable from true randomness (you can't tell the difference even with a powerful computer), and knowing the past output doesn't help you predict future output.

The source of security is entropy — genuine unpredictability harvested from the physical environment.

OS Entropy Pool

Modern operating systems maintain an entropy pool built from:

  • Hardware interrupt timing (keyboard, mouse, network packets arriving at irregular intervals)
  • CPU hardware RNG instructions (Intel's RDRAND, ARM's TRNG)
  • Disk I/O timing variations
  • Network packet arrival times

On Linux, this pool is accessible through two interfaces: /dev/random and /dev/urandom. The historic difference was that /dev/random blocked when the entropy estimate was low; /dev/urandom never blocked. Modern Linux (kernel 5.6+) has eliminated this distinction — both are now equivalent once the pool has been seeded at boot. The recommendation is to use /dev/urandom (or the getrandom() syscall, which is preferred in new code).

#include <sys/random.h>

// Preferred on Linux 3.17+
unsigned char buf[32];
getrandom(buf, sizeof(buf), 0);  // fills buf with secure random bytes

On macOS, /dev/random is backed by a CSPRNG seeded from hardware entropy sources including the Secure Enclave (T2 chip on Intel Macs, Secure Enclave on Apple Silicon). On Windows, BCryptGenRandom is the system CSPRNG.

Hardware RNGs

CPUs since around 2012 include a true hardware random number generator. Intel's RDRAND instruction samples thermal noise from on-chip circuitry to produce random bits. This isn't a PRNG — it's sampling physical randomness directly.

Operating systems use RDRAND as one input to their entropy pool, but security-conscious implementations don't rely on it exclusively. An implementation flaw in RDRAND (or a malicious hardware design) would be undetectable without independent entropy sources. Defense in depth means mixing multiple sources.

Math.random() vs crypto.getRandomValues()

This is the JavaScript version of the PRNG vs CSPRNG choice, and it's one of the most important things to get right in web applications.

// DO NOT use for security purposes
Math.random()            // PRNG, implementation defined, not secure

// Use for security purposes
crypto.getRandomValues(new Uint32Array(1))[0]  // CSPRNG, uses OS entropy

// Or for convenience in modern environments
const { randomBytes, randomUUID } = require('crypto')  // Node.js
randomBytes(32).toString('hex')  // 64-char hex string, cryptographically secure
randomUUID()  // 'f47ac10b-58cc-4372-a567-0e02b2c3d479'

Math.random() is appropriate for: game mechanics, test data, procedural generation, random visual effects, randomly ordering UI elements. It is never appropriate for: session tokens, password reset links, CSRF tokens, cryptographic keys, or any value an attacker would benefit from predicting.

The V8 engine (used by Chrome and Node.js) implements Math.random() using xorshift128+ — a good statistical PRNG that runs fast, but with a known internal state that can theoretically be recovered from output.

Seeds and Why Predictable Seeds Are Dangerous

Every PRNG is initialized from a seed. If you control the seed, you control all the output. This is why predictable seeds are a vulnerability.

A classic mistake is seeding with the current time:

import random, time
random.seed(int(time.time()))  # predictable within a second
token = random.randint(0, 1000000)  # attacker can brute-force the seed

An attacker who knows approximately when your server generated a token can try all seed values in a small window (say, ±30 seconds = 60 attempts). If the token space is small, they find the token trivially.

The correct approach: don't manage seeds for security-sensitive code. Use the system CSPRNG, which seeds itself from the entropy pool automatically.

For generating random test data or picking sample items, the Random Line Picker works well for everyday use cases where statistical randomness is all you need. When your use case involves cryptographic operations, compare what we've covered here with Encoding vs Encryption vs Hashing to understand where random number generation fits in the broader security picture.

Quick Decision Guide

Use Math.random() / Mersenne Twister when:

  • You need speed and statistical uniformity
  • The application is a game, simulation, or visual effect
  • Predictability wouldn't give an attacker any advantage
  • You need reproducible results from a known seed (unit tests, procedural generation)

Use a CSPRNG when:

  • Generating any token, key, or identifier that must be secret
  • Building password reset flows, session management, or OAuth state parameters
  • Implementing any cryptographic primitive
  • When an attacker correctly guessing the output would cause harm

The CSPRNG is slightly slower — but on modern hardware, crypto.getRandomValues() generates millions of bytes per second. The performance difference only matters if you're generating gigabytes of random data, which is unusual. When in doubt, use the CSPRNG. The cost is negligible. The security benefit is real.

For more on how randomness connects to hashing and security — particularly why hash functions need to be unpredictable — see Public Key Cryptography Explained.