Cryptographic Failures

Cryptographic algorithms are rarely broken by attacking the mathematics. Far more often they are broken by attacking the implementation, the protocol, the randomness source, or the assumptions made when the pieces are assembled. This lesson catalogs the failure modes that have destroyed real systems built on mathematically sound primitives.

Timing Attacks

A timing attack exploits the fact that different code paths take different amounts of time. The most common example: comparing a received MAC tag to the expected tag byte by byte, returning false at the first mismatch.

# VULNERABLE: early-exit comparison
def verify_mac(received, expected):
    if len(received) != len(expected):
        return False
    for i in range(len(received)):
        if received[i] != expected[i]:
            return False  # leaks position of first difference
    return True

An attacker submits candidate MACs and measures response times with nanosecond precision. If the first byte is correct, the comparison takes slightly longer before returning false. The attacker learns the correct MAC one byte at a time, converting a brute-force search over 2^128 possible values into a linear search of at most 256 values per byte position. For a 16-byte tag that is at most 16 × 256 = 4,096 guesses, each repeated often enough to average out measurement noise.
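
The byte-at-a-time recovery described above can be sketched as a small simulation. This is an illustration under stated assumptions, not a working exploit: the step counter returned by leaky_verify stands in for elapsed time (a real attacker sees noisy timings and must repeat each guess many times), and SECRET_KEY, expected_tag, leaky_verify, and forge_tag are hypothetical names introduced only for this example.

```python
import hashlib
import hmac

SECRET_KEY = b"server-side key (hypothetical)"

def expected_tag(message: bytes) -> bytes:
    """The 16-byte tag the server would accept for this message."""
    return hmac.new(SECRET_KEY, message, hashlib.sha256).digest()[:16]

def leaky_verify(message: bytes, tag: bytes):
    """Early-exit check. Returns (ok, steps); steps is a stand-in for time."""
    steps = 0
    for got, want in zip(tag, expected_tag(message)):
        steps += 1
        if got != want:
            return False, steps  # exits earlier the sooner a byte is wrong
    return True, steps

def forge_tag(message: bytes, tag_len: int = 16):
    """Recover the valid tag one byte at a time using only the step count."""
    guess = bytearray(tag_len)
    queries = 0
    for pos in range(tag_len):
        for candidate in range(256):
            guess[pos] = candidate
            queries += 1
            ok, steps = leaky_verify(message, bytes(guess))
            # A correct byte lets the comparison run past position pos.
            if ok or steps > pos + 1:
                break
    return bytes(guess), queries

forged, queries = forge_tag(b"transfer=100")
print(forged == expected_tag(b"transfer=100"), queries)
```

The attacker never sees the key, only how far each comparison got, and the query count stays within 16 × 256.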

The fix is constant-time comparison: every byte is compared regardless of whether a mismatch has already been found, and the result is accumulated without branching.

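# SAFE: constant-time comparison
import hmac

def verify_mac(received, expected):
    # compare_digest's running time does not depend on where the inputs differ
    return hmac.compare_digest(received, expected)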
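
To show what "accumulated without branching" means, here is a minimal sketch of the technique. It is for illustration only: the function name constant_time_equal is hypothetical, and a pure-Python loop carries no real timing guarantee because the interpreter adds its own variability. Production code should call hmac.compare_digest.

```python
def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Illustrative only: compare every byte and fold differences into one value."""
    if len(a) != len(b):
        return False  # the length is treated as public information
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y  # stays 0 only if every byte pair matches
    return diff == 0   # single decision, made after all bytes were examined

print(constant_time_equal(b"\x01\x02\x03", b"\x01\x02\x03"))
print(constant_time_equal(b"\x01\x02\x03", b"\x09\x02\x03"))
```

The loop always runs to the end, so a mismatch in the first byte costs the same number of iterations as a mismatch in the last.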

Remote Timing Attacks

Early timing attack research assumed the attacker needed local access. Subsequent work demonstrated that timing differences as small as 15 microseconds are measurable over a network, and statistical techniques (collecting thousands of samples and computing the median) can resolve differences below 1 microsecond. Cloudflare, Google, and other major providers have documented and mitigated remote timing attacks against their TLS implementations. Every cryptographic operation that depends on secret data — comparison, decryption, signing — must be constant-time.
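
The statistical step can be sketched with simulated data. This is a toy model, not a measurement tool: Gaussian jitter, the 200-microsecond base latency, and the names simulated_latency_us and estimate_gap_us are all assumptions made for illustration, and real network jitter is typically heavier-tailed.

```python
import random
import statistics

def simulated_latency_us(processing_us, rng, jitter_sd_us=5.0, base_us=200.0):
    """One simulated round trip: fixed base latency + processing time + jitter."""
    return base_us + processing_us + rng.gauss(0.0, jitter_sd_us)

def estimate_gap_us(fast_us, slow_us, samples=20000, seed=0):
    """Estimate the processing-time gap between two inputs from noisy samples."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    fast = [simulated_latency_us(fast_us, rng) for _ in range(samples)]
    slow = [simulated_latency_us(slow_us, rng) for _ in range(samples)]
    # The median is robust to outliers; the gap emerges as noise averages out.
    return statistics.median(slow) - statistics.median(fast)

# A 1 microsecond difference hidden under 5 microseconds of jitter per sample.
print(round(estimate_gap_us(10.0, 11.0), 2))
```

No single sample reveals the difference, but the medians of many samples do.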

Side-Channel Attacks

Timing is one side channel. Physical implementations leak information through several others:

Power analysis: a cryptographic chip’s power consumption varies depending on the operations it performs and the data it processes. Simple Power Analysis (SPA) can distinguish operations (multiplication vs. squaring in RSA) from a single power trace. Differential Power Analysis (DPA) correlates power consumption across many operations to extract key bytes. Smartcards, embedded devices, and HSMs must include power analysis countermeasures — random delays, constant-power circuits, or masking (randomizing intermediate values).

Cache timing: modern CPUs use caches to speed up memory access. A table lookup that hits the cache is faster than one that misses. AES implementations that use lookup tables (T-tables) leak information about which table entries are accessed, which correlates with key bytes. The Flush+Reload and Prime+Probe attacks can extract AES keys from a co-located virtual machine. The fix: use constant-time AES implementations (bitsliced or AES-NI hardware instructions) that avoid data-dependent memory access.
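
The idea of avoiding data-dependent memory access can be sketched as a lookup that touches every table entry and selects the wanted one with a mask. This is a conceptual illustration only: ct_lookup and TABLE are hypothetical, Python gives no control over cache behavior, and real implementations do this in C or assembly or avoid tables entirely through bitslicing or AES-NI.

```python
def ct_lookup(table: bytes, secret_index: int) -> int:
    """Read every entry; keep only the one at secret_index (table length <= 256)."""
    result = 0
    for i, value in enumerate(table):
        diff = i ^ secret_index
        # mask is 0xFF when diff == 0 and 0x00 otherwise, computed without a branch
        mask = ((diff - 1) >> 8) & 0xFF
        result |= value & mask
    return result

# Hypothetical 256-entry byte table standing in for an S-box.
TABLE = bytes((i * 7 + 3) % 256 for i in range(256))
print(ct_lookup(TABLE, 42) == TABLE[42])
```

The access pattern is identical for every index, so it carries no information about the secret, at the cost of reading the whole table on each lookup.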

Electromagnetic emanations: processors emit electromagnetic radiation that correlates with their operations. TEMPEST-class attacks can reconstruct cryptographic keys from EM emissions measured meters away. Shielding and noise injection are the primary countermeasures.

Spectre and Meltdown

Speculative execution attacks demonstrated that even constant-time code can leak secrets through microarchitectural side channels. A CPU speculatively executes a branch that reads secret data and uses it to select a memory address, pulling a secret-dependent line into the cache. The CPU then rolls back the speculative execution, but the cache state change persists and is measurable. Mitigations (retpolines, kernel page table isolation, microcode updates) have been deployed across operating systems and processors, but the broader lesson is clear: side-channel security requires reasoning about hardware behavior, not just software logic.

Protocol Downgrade Attacks

A downgrade attack tricks two parties into using a weaker protocol version or cipher suite than both actually support. The attacker modifies the negotiation messages to remove strong options, forcing a fallback to a vulnerable configuration.

POODLE (2014): an attacker forces a TLS connection to fall back from TLS 1.2 to SSL 3.0, then exploits a padding oracle in SSL 3.0’s CBC mode to decrypt traffic one byte at a time. The attack was practical — a JavaScript payload on a malicious page could trigger fallback and decrypt HTTP cookies within minutes.

FREAK (2015): a man-in-the-middle attacker modifies the ClientHello to request “export-grade” RSA keys (512-bit), which the server accepts because it still supports the deliberately weakened cipher suites mandated by 1990s US export regulations. The attacker factors the 512-bit key (feasible in hours on modern hardware) and decrypts the session.

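Logjam (2015): similar to FREAK but targeting Diffie-Hellman. A man-in-the-middle attacker downgrades the connection to 512-bit export-grade Diffie-Hellman. Many servers used a common 512-bit or 1024-bit DH group, and a one-time precomputation against a common group makes each individual discrete logarithm in that group cheap, so any session using that group can be decrypted. The researchers also argued that a precomputation against a common 1024-bit group was within reach of a nation-state and would allow passive decryption. An estimated 8.4% of the top million HTTPS domains were vulnerable.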

How TLS 1.3 Prevents Downgrade

TLS 1.3 includes a downgrade sentinel: when a TLS 1.3-capable server negotiates a lower version (because the client or an attacker requested it), the server embeds a specific value in the ServerHello random field. A TLS 1.3-capable client checks for this sentinel and aborts the handshake if it detects a downgrade. Additionally, TLS 1.3 eliminated all cipher suites vulnerable to POODLE, FREAK, and Logjam. The protocol cannot be downgraded to a weak configuration because weak configurations no longer exist.
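
The client-side sentinel check can be sketched as follows. The two sentinel values are the ones RFC 8446 defines for the last 8 bytes of the ServerHello random; the function name downgrade_detected and its calling convention are hypothetical, and a real TLS stack performs this check inside its handshake state machine rather than in application code.

```python
# Sentinel values from RFC 8446, placed in the last 8 bytes of ServerHello.random.
DOWNGRADE_TO_TLS12 = b"DOWNGRD\x01"           # server negotiated TLS 1.2
DOWNGRADE_TO_TLS11_OR_BELOW = b"DOWNGRD\x00"  # server negotiated TLS 1.1 or older

def downgrade_detected(server_random: bytes) -> bool:
    """Call when a TLS 1.3-capable client was offered a version below TLS 1.3.

    Returns True if the handshake should be aborted.
    """
    if len(server_random) != 32:
        raise ValueError("ServerHello.random must be 32 bytes")
    return server_random[24:] in (DOWNGRADE_TO_TLS12, DOWNGRADE_TO_TLS11_OR_BELOW)

print(downgrade_detected(bytes(24) + b"DOWNGRD\x01"))  # sentinel present
print(downgrade_detected(bytes(32)))                   # no sentinel
```

Because the random value is covered by the handshake signature, an attacker who strips TLS 1.3 from the negotiation cannot also remove the sentinel without invalidating the handshake.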

Poor Randomness

Cryptographic systems are only as strong as their randomness. When the random number generator is flawed, everything built on it collapses.

Debian OpenSSL (2008): a maintainer removed entropy sources from OpenSSL’s random number generator, reducing the seed space to 15 bits (the process ID). Every SSL key generated on Debian-based systems for nearly two years was from a set of roughly 32,768 possibilities. Remote attackers could try all possible keys in seconds.
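
A toy model shows why a 15-bit seed space is fatal. This sketch does not reproduce the OpenSSL code path: toy_keygen uses Python's random module as a stand-in for a generator seeded only by the process ID, the SHA-256 fingerprint stands in for a public key, and all names are hypothetical.

```python
import hashlib
import random

PID_SPACE = 32768  # 15 bits of seed material

def toy_keygen(pid: int) -> bytes:
    """Toy stand-in: the 'private key' depends on nothing but the process ID."""
    return random.Random(pid).getrandbits(128).to_bytes(16, "big")

def fingerprint(key: bytes) -> str:
    """Stand-in for the public key an attacker can observe."""
    return hashlib.sha256(key).hexdigest()

def build_lookup_table() -> dict:
    """Enumerate every possible key once, indexed by its public fingerprint."""
    table = {}
    for pid in range(PID_SPACE):
        key = toy_keygen(pid)
        table[fingerprint(key)] = key
    return table

victim_key = toy_keygen(4242)  # the victim's PID is unknown to the attacker
table = build_lookup_table()
print(table[fingerprint(victim_key)] == victim_key)
```

The table is built once and then breaks every key from the weak generator by lookup.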

PlayStation 3 ECDSA (2010): Sony signed PS3 firmware updates with ECDSA but used the same random nonce k for every signature. ECDSA’s security requires a fresh, secret k for each signature. With two signatures sharing the same k, the private key is recoverable with simple algebra. The “fail0verflow” team extracted Sony’s private signing key and published it, enabling unsigned code to run on every PS3.

Dual EC DRBG (2006-2013): a NIST-standardized random number generator that contained a suspected NSA backdoor. The generator’s constants, if chosen by someone who knew a related secret, would allow that party to predict the generator’s output after observing 32 bytes. RSA Security used Dual EC DRBG as the default in its BSAFE toolkit, and it was deployed in VPN products and other security software.

The Nonce Reuse Catastrophe

ECDSA nonce reuse is not unique to Sony. Any system that reuses a nonce in ECDSA leaks the private key. The mathematics are unforgiving: given two signatures (r1, s1) and (r2, s2) over message hashes m1 and m2, made with the same nonce k (so r1 = r2), the private key d can be computed as d = (s1*m2 - s2*m1) / (s2*r1 - s1*r2) mod n, where n is the group order and division means multiplying by the modular inverse. This is not a theoretical attack; it is arithmetic. Nonce generation for ECDSA should follow RFC 6979 (a deterministic nonce derived from the private key and message), which removes the dependence on a random number generator at signing time.
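
The recovery can be checked end to end with a small, unoptimized ECDSA signer. This is a sketch for illustration, assuming the secp256k1 curve and SHA-256 (the source does not say which curve Sony used); the private key, nonce, and messages are made-up values, and the code is neither constant-time nor suitable for real signing. It requires Python 3.8 or later for pow(x, -1, m).

```python
import hashlib

# secp256k1 domain parameters: y^2 = x^3 + 7 over GF(P), base point G of order N.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFF_FFFFFFFF_FFFFFFFF_FFFFFFFE_BAAEDCE6_AF48A03B_BFD25E8C_D0364141
G = (0x79BE667E_F9DCBBAC_55A06295_CE870B07_029BFCDB_2DCE28D9_59F2815B_16F81798,
     0x483ADA77_26A3C465_5DA4FBFC_0E1108A8_FD17B448_A6855419_9C47D08F_FB10D4B8)

def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return x3, (slope * (x1 - x3) - y1) % P

def scalar_mult(k, point):
    """Double-and-add scalar multiplication (not constant-time)."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result

def digest(message: bytes) -> int:
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(d: int, message: bytes, k: int):
    """Textbook ECDSA with a caller-supplied nonce k (the source of the bug)."""
    r = scalar_mult(k, G)[0] % N
    s = pow(k, -1, N) * (digest(message) + r * d) % N
    return r, s

def recover_private_key(sig1, msg1, sig2, msg2) -> int:
    """d = (s1*m2 - s2*m1) / (s2*r1 - s1*r2) mod N, valid when the nonce repeats."""
    (r1, s1), (r2, s2) = sig1, sig2
    m1, m2 = digest(msg1), digest(msg2)
    numerator = (s1 * m2 - s2 * m1) % N
    denominator = (s2 * r1 - s1 * r2) % N
    return numerator * pow(denominator, -1, N) % N

private_key = 0xC0FFEE1234   # hypothetical signing key
reused_nonce = 0xBADC0DE5EED  # the same k used for both signatures
sig_a = sign(private_key, b"update A", reused_nonce)
sig_b = sign(private_key, b"update B", reused_nonce)
print(recover_private_key(sig_a, b"update A", sig_b, b"update B") == private_key)
```

The attacker needs only the two signatures and the two messages; the repeated r value is the visible sign that the nonce was reused.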

Crypto Agility

Crypto agility is the ability to swap cryptographic algorithms in a deployed system without redesigning the system. Every algorithm will eventually be broken, deprecated, or superseded. Systems that hard-code a single algorithm face expensive, disruptive migrations when that day comes.

Designing for agility: cryptographic algorithms should be selected via configuration, not embedded in code. Key formats should include algorithm identifiers. Protocol negotiation (like TLS cipher suite selection) should prefer the strongest mutually supported option and reject known-weak ones. Data formats should version their encryption scheme.
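
These design rules can be sketched with a MAC tag that carries its own format version and algorithm identifier. The "v1$algorithm$hex" layout, the allowlist contents, and the names make_tag and verify_tag are assumptions chosen for illustration, not a standard format.

```python
import hashlib
import hmac

# Policy: the set of algorithms currently accepted, changed by configuration.
ALLOWED_ALGORITHMS = {"sha256": hashlib.sha256, "sha512": hashlib.sha512}
DEFAULT_ALGORITHM = "sha256"

def make_tag(key: bytes, message: bytes, algorithm: str = DEFAULT_ALGORITHM) -> str:
    """Produce a self-describing tag: format version, algorithm ID, MAC."""
    mac = hmac.new(key, message, ALLOWED_ALGORITHMS[algorithm]).hexdigest()
    return f"v1${algorithm}${mac}"

def verify_tag(key: bytes, message: bytes, tag: str) -> bool:
    """Verify a tag, rejecting unknown versions and algorithms outside policy."""
    parts = tag.split("$")
    if len(parts) != 3:
        return False
    version, algorithm, mac = parts
    if version != "v1" or algorithm not in ALLOWED_ALGORITHMS:
        return False
    expected = hmac.new(key, message, ALLOWED_ALGORITHMS[algorithm]).hexdigest()
    return hmac.compare_digest(mac.encode(), expected.encode())

key = b"hypothetical shared key"
tag = make_tag(key, b"payload")
print(tag.split("$")[:2], verify_tag(key, b"payload", tag))
```

Migrating to a new algorithm means adding an entry to the allowlist and changing the default; tags written under the old algorithm remain verifiable until policy removes it.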

The migration track record: MD5 to SHA-1 to SHA-2. DES to 3DES to AES. SSL 3.0 to TLS 1.0 to TLS 1.2 to TLS 1.3. RSA-1024 to RSA-2048 to ECDSA to (soon) ML-DSA. Each migration took years and broke systems that assumed the old algorithm would last forever.

The current migration: post-quantum algorithms are the next transition. Systems designed with crypto agility can add ML-KEM alongside X25519 and ML-DSA alongside ECDSA through configuration changes. Systems without crypto agility face code rewrites, protocol redesigns, and extended vulnerability windows.

The Danger of Too Much Agility

Agility introduces its own risks. Supporting many algorithms increases the attack surface — each algorithm is a potential target for downgrade attacks, implementation bugs, or configuration errors. TLS 1.3’s approach is instructive: it provides agility (new cipher suites can be added) but ruthlessly removes weak options. Crypto agility does not mean supporting everything — it means having the mechanism to switch, while policy restricts the active set to known-good choices.
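
In Python's standard ssl module, restricting the active set looks roughly like the following sketch. The function name strict_client_context is hypothetical, and the TLS 1.2 floor is an example policy rather than a recommendation taken from the text.

```python
import ssl

def strict_client_context() -> ssl.SSLContext:
    """Client context that keeps negotiation but refuses legacy protocol versions."""
    ctx = ssl.create_default_context()  # certificate and hostname checks enabled
    # Policy: the library may support older versions, but this context will not use them.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx

ctx = strict_client_context()
print(ctx.minimum_version)
```

The mechanism for negotiating versions stays in place, so raising the floor later is a one-line policy change rather than a code rewrite.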

Practical Lessons

Use vetted libraries: OpenSSL, libsodium, BoringSSL, and the crypto packages in Go’s standard library have been audited, fuzzed, and battle-tested. A hand-rolled AES implementation, even if mathematically correct, will almost certainly lack constant-time guarantees, side-channel protections, and the hardening that comes from years of adversarial scrutiny.

Never roll custom crypto: this is not a suggestion. Designing a cryptographic protocol requires expertise in side channels, protocol composition, formal verification, and decades of attack literature. TLS handles transport security, libsodium’s crypto_secretbox handles symmetric encryption, and established libraries handle digital signatures. The apparent need for a custom cryptographic protocol almost never withstands scrutiny.

Constant-time everything: every operation on secret data (comparison, table lookup, branching) must execute in constant time. The appropriate comparison primitives are hmac.compare_digest() in Python, subtle.ConstantTimeCompare() from Go’s crypto/subtle package, and CRYPTO_memcmp() in OpenSSL. A custom comparison loop should never be written for security-sensitive values.

Test the randomness: a system must use the OS CSPRNG (/dev/urandom, getrandom(), BCryptGenRandom()). The deployment environment needs adequate entropy — containers and VMs at boot time are known to have low entropy pools. Tools like rng-tools or haveged help in environments with limited hardware entropy sources.
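
In Python, the OS CSPRNG is reached through the standard secrets module (or os.urandom). A minimal sketch, with the helper names new_key and new_token being hypothetical:

```python
import secrets

def new_key(num_bytes: int = 32) -> bytes:
    """Key material drawn from the operating system's CSPRNG."""
    return secrets.token_bytes(num_bytes)

def new_token() -> str:
    """URL-safe random token, e.g. for a session identifier."""
    return secrets.token_urlsafe(32)

# The random module is a Mersenne Twister meant for simulations;
# it must not be used for keys, nonces, or tokens.
print(len(new_key()), len(new_token()) > 0)
```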

Key Takeaways

This lesson establishes:

How a timing attack on MAC verification works and how constant-time comparison prevents it
Three types of side-channel attacks and how each leaks information
At least two protocol downgrade attacks and how TLS 1.3 prevents them
Why the PlayStation 3 ECDSA failure occurred and how RFC 6979 prevents nonce reuse
What crypto agility is and why it is essential for long-lived systems
