These three terms get mixed up constantly, sometimes in ways that cause real security failures. Someone stores a password as Base64 thinking it's protected. Someone uses SHA-256 for passwords thinking it's "strong." Someone calls an API that returns JWTs and assumes the data is encrypted. Here's how they actually differ.
The Three Concepts at a Glance
Encoding transforms data into a different representation. It's reversible, requires no key, and the purpose is compatibility — not secrecy.
Encryption transforms data to protect confidentiality. It's reversible, but only with the correct key. The purpose is to keep data secret from anyone without that key.
Hashing transforms data into a fixed-length digest. It's one-way — you can't reconstruct the original input from the output. No key involved. The purpose is verification and fingerprinting.
Those three definitions contain everything. The rest is detail.
Encoding: Changing Shape, Not Protecting
Encoding is a lossless format transformation. Anyone who knows the encoding scheme can reverse it with zero additional information. There is no secret, no key, and no security implication.
Base64 takes binary data (or any bytes) and represents it as printable ASCII characters using an alphabet of 64 characters. It was designed so binary data could pass through systems that only handle text — email attachments, JSON fields, HTTP headers.
btoa('hello') // → 'aGVsbG8='
atob('aGVsbG8=') // → 'hello'
This is completely reversible by anyone. If you see Base64 in a JWT, the payload is not encrypted — it's just encoded. You can decode it in your browser console right now.
URL encoding (percent-encoding) represents characters that aren't safe in a URL context as %XX hex sequences. %20 is a space. Again, reversible by anyone, no key needed.
UTF-8 is an encoding of Unicode code points into bytes. It's not a security mechanism — it's how text gets stored as bytes.
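As a small sketch of the same idea in Python (standard library only; the sample strings are arbitrary choices for illustration), both percent-encoding and UTF-8 round-trip with no key involved:

```python
from urllib.parse import quote, unquote

# Percent-encoding: characters that are unsafe in a URL become %XX sequences.
raw_query = "a b&c=d"
encoded_query = quote(raw_query)        # space -> %20, & -> %26, = -> %3D
decoded_query = unquote(encoded_query)  # anyone can reverse it

# UTF-8: Unicode code points become bytes, and the bytes decode back to the same text.
text = "h\u00e9llo"                     # "héllo", 5 code points
utf8_bytes = text.encode("utf-8")       # 'é' takes two bytes in UTF-8
round_tripped = utf8_bytes.decode("utf-8")

print(encoded_query)                    # a%20b%26c%3Dd
print(len(text), len(utf8_bytes))       # 5 6
print(decoded_query == raw_query, round_tripped == text)  # True True
```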
The Base64 Encoder and URL Encoder on UtilityKit handle these transformations. Use them when you need to move data between systems with different format constraints, not when you need to protect data.
Encryption: Protecting Confidentiality
Encryption takes plaintext and a key and produces ciphertext. Without the key, the ciphertext is computationally infeasible to reverse. With the key, decryption is trivial.
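Symmetric encryption uses the same key for both operations. AES-256-GCM is a widely used standard choice for symmetric encryption (ChaCha20-Poly1305 is a common alternative). AES-256 means 256-bit keys; GCM is an authenticated mode that also verifies integrity, so tampered ciphertext fails to decrypt instead of yielding garbage.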
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
import os
key = AESGCM.generate_key(bit_length=256)
aesgcm = AESGCM(key)
nonce = os.urandom(12) # 96-bit nonce, must be unique per message
ciphertext = aesgcm.encrypt(nonce, b"hello world", None)
plaintext = aesgcm.decrypt(nonce, ciphertext, None) # → b"hello world"
Asymmetric encryption uses a key pair: a public key (can be shared freely) and a private key (must stay secret). RSA and Elliptic Curve Cryptography (ECC) are the main asymmetric algorithms. Data encrypted with the public key can only be decrypted with the private key. Older TLS versions used RSA this way for HTTPS key exchange; TLS 1.3 instead derives the session key with ephemeral Diffie-Hellman and uses the server's key pair for signatures.
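The public/private relationship can be seen in a toy RSA calculation. This is a sketch with tiny textbook primes chosen purely for illustration (the values of `p`, `q`, `e`, and `message` are assumptions, and nothing here is secure); real code should use a vetted library with 2048-bit or larger keys and a padding scheme such as OAEP.

```python
# Toy RSA with tiny primes. For illustration only, never for real data.
p, q = 61, 53
n = p * q                    # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                       # public exponent: the public key is (n, e)
d = pow(e, -1, phi)          # private exponent: the private key is (n, d); needs Python 3.8+

message = 65                 # a message represented as an integer smaller than n
ciphertext = pow(message, e, n)    # anyone holding the public key can encrypt
recovered = pow(ciphertext, d, n)  # only the private exponent undoes it

print(ciphertext, recovered)       # 2790 65
```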
The critical point: encryption is for confidentiality. It answers the question "can only the right person read this?" It's the right tool when you need to store or transmit sensitive data that must be read again later.
Hashing: One-Way Fingerprinting
A hash function maps an input of any length to a fixed-length output (the digest). The same input always produces the same output. But given only the output, you cannot reconstruct the input — that's what "one-way" means.
SHA-256 produces a 256-bit (64 hex character) digest:
echo -n "hello" | sha256sum
# → 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
echo -n "hello!" | sha256sum
# → ce06092fb948d9af2d6f72641bb2f652f6926f82490a40d47e2d2f8f90a78ee0
One character change, completely different output. This property (the avalanche effect) makes hashes useful for detecting tampering. The Hash Generator computes SHA-256, SHA-512, MD5, and others client-side.
Hashing answers: "is this the same data I saw before?" It's used for file integrity verification, content addressing, and (with purpose-built slow algorithms) password storage, which is covered under the common mistakes below.
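A minimal integrity check might look like the following sketch. The helper name `matches_digest` is hypothetical, and the expected digest is the SHA-256 of "hello" from the `sha256sum` example above.

```python
import hashlib

# SHA-256 digest of b"hello", as printed by sha256sum above.
published_digest = "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"

def matches_digest(data: bytes, expected_hex: str) -> bool:
    """Return True if the SHA-256 digest of data equals expected_hex."""
    return hashlib.sha256(data).hexdigest() == expected_hex

print(matches_digest(b"hello", published_digest))   # True
print(matches_digest(b"hello!", published_digest))  # False: one extra character
```

Note that this only detects accidental or unauthenticated changes: anyone who can replace the data can usually replace a plain digest too.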
The Comparison Table
| Property | Encoding | Encryption | Hashing |
|---|---|---|---|
| Reversible? | Yes | Yes (with key) | No |
| Requires key? | No | Yes | No |
| Output size | Usually larger than input | Same or larger | Fixed length |
| Purpose | Format compatibility | Confidentiality | Verification / fingerprinting |
| Examples | Base64, URL encoding, UTF-8 | AES, RSA, ChaCha20 | SHA-256, SHA-512, MD5 |
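To make the output-size row concrete, here is a short sketch for the encoding and hashing columns (the input sizes are arbitrary). Encryption is left out because it needs a third-party library.

```python
import base64
import hashlib

small = b"x"
large = b"x" * 300

# Base64 maps every 3 input bytes to 4 output characters, so output grows by about a third.
print(len(large), len(base64.b64encode(large)))  # 300 400

# SHA-256 digests are always 32 bytes, regardless of input length.
print(len(hashlib.sha256(small).digest()), len(hashlib.sha256(large).digest()))  # 32 32
```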
The Most Common Mistakes
Using MD5 or SHA-256 for Passwords
This is one of the most dangerous misconceptions in everyday development. Storing sha256(password) does not protect your passwords.
The problem isn't the algorithm — it's that SHA-256 is designed to be fast. On modern hardware, you can compute billions of SHA-256 hashes per second. An attacker who gets your database can brute-force an 8-character lowercase password in seconds.
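The arithmetic behind that claim can be sketched in a few lines. Both rates below are assumptions picked for illustration, not benchmarks: 10 billion fast-hash guesses per second for the attacker, and 100 ms per guess for a deliberately slow password hash on one core.

```python
# Assumed figures for illustration only (not measured benchmarks).
FAST_GUESSES_PER_SECOND = 10_000_000_000  # attacker speed against raw SHA-256
SLOW_SECONDS_PER_GUESS = 0.1              # cost of one deliberately slow hash

keyspace = 26 ** 8  # every 8-character lowercase password

fast_seconds = keyspace / FAST_GUESSES_PER_SECOND
slow_years = keyspace * SLOW_SECONDS_PER_GUESS / (365 * 24 * 3600)

print(f"{keyspace:,} candidates")
print(f"fast hash: about {fast_seconds:.0f} seconds to try them all")
print(f"slow hash: about {slow_years:.0f} years to try them all on one core")
```

An attacker can parallelize the slow case across many cores, but the gap between the two totals stays many orders of magnitude wide.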
Password storage requires a deliberately slow algorithm: bcrypt, Argon2id, or scrypt. These are tunable to take 100–500ms per hash, which makes brute-force attacks orders of magnitude harder. See the OWASP Password Storage Cheat Sheet for current recommendations.
import bcrypt
# Correct: bcrypt handles salting automatically
hashed = bcrypt.hashpw(b"hunter2", bcrypt.gensalt(rounds=12))
bcrypt.checkpw(b"hunter2", hashed) # → True
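If a third-party package is not an option, scrypt is available in Python's standard library as `hashlib.scrypt` (when Python is built against a recent OpenSSL). The following is a sketch, not a recommendation: the cost parameters are illustrative assumptions, and the helper names `hash_password` and `verify_password` are hypothetical. Unlike bcrypt, the salt has to be generated and stored explicitly.

```python
import hashlib
import hmac
import os

# Illustrative scrypt cost parameters; tune them for your own hardware and guidance.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "maxmem": 64 * 1024 * 1024, "dklen": 32}

def hash_password(password: bytes) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) for a new password, using a fresh random salt."""
    salt = os.urandom(16)
    key = hashlib.scrypt(password, salt=salt, **SCRYPT_PARAMS)
    return salt, key

def verify_password(password: bytes, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password, salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected_key)

salt, stored_key = hash_password(b"hunter2")
print(verify_password(b"hunter2", salt, stored_key))  # True
print(verify_password(b"hunter3", salt, stored_key))  # False
```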
Treating Base64 as Encryption
A JWT looks like random characters: eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJ1c2VyMTIzIn0.xyz.... It isn't encrypted. The header and payload are Base64url-encoded JSON, so any Base64 decoder will show you their contents immediately.
JWTs in their common form (JWS) are signed, not encrypted: the signature verifies they haven't been tampered with, but anyone holding the token can read it. If you're storing sensitive data in a JWT, use JWE (JSON Web Encryption) instead of JWS, or don't put the sensitive data in the token at all.
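To see how little stands between a token and its contents, here is a sketch that reads the header and payload from the example token using only the standard library. The helper name `decode_segment` is hypothetical, and the signature segment is a placeholder because the original one is elided; nothing here verifies the signature.

```python
import base64
import json

def decode_segment(segment: str) -> dict:
    """Decode one Base64url JWT segment into a dict. No key, no signature check."""
    padded = segment + "=" * (-len(segment) % 4)  # JWTs strip the '=' padding
    return json.loads(base64.urlsafe_b64decode(padded))

# Header and payload from the example token; the signature is a placeholder.
token = "eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJ1c2VyMTIzIn0.placeholder-signature"
header_b64, payload_b64, _signature = token.split(".")

print(decode_segment(header_b64))   # {'alg': 'HS256'}
print(decode_segment(payload_b64))  # {'sub': 'user123'}
```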
Using Raw SHA-256 Instead of HMAC for Integrity
If you want to verify that a message came from someone who holds a shared secret (an API signature, a webhook payload hash), SHA-256 alone isn't sufficient. Use HMAC-SHA-256 instead.
import hmac, hashlib
mac = hmac.new(b"secret-key", b"message body", hashlib.sha256).hexdigest()
HMAC combines the key with the message in a way that prevents length extension attacks, which can affect raw SHA-256 when used naively for authentication.
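On the receiving side, the tag should be recomputed and compared in constant time. A minimal sketch, assuming a hex-encoded tag and using hypothetical helper names `sign` and `verify`:

```python
import hashlib
import hmac

def sign(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC-SHA-256 tag for body under secret."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()

def verify(secret: bytes, body: bytes, received_tag: str) -> bool:
    """Recompute the tag and compare in constant time to avoid timing leaks."""
    return hmac.compare_digest(sign(secret, body), received_tag)

secret = b"secret-key"  # placeholder; use a randomly generated secret in practice
tag = sign(secret, b"message body")

print(verify(secret, b"message body", tag))        # True
print(verify(secret, b"message b0dy", tag))        # False: body was altered
print(verify(b"wrong-key", b"message body", tag))  # False: wrong secret
```

Using `hmac.compare_digest` rather than `==` keeps the comparison time independent of how many leading characters match.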
When to Use Each
Use encoding when you need data to survive transmission through a system that has character set or format restrictions — sending binary in JSON, embedding data in a URL, writing bytes to a text file.
Use encryption when you need to store or transmit sensitive data that must be readable again later — storing API keys in a database, encrypting a file at rest, securing a communication channel.
Use hashing when you need to verify something without storing the original — passwords (with bcrypt/Argon2), file integrity checks, content deduplication, digital signature verification.
The overlap that trips people up: both encryption and hashing can "protect" data, but in completely different ways. Encryption protects the value (you can get it back). Hashing proves the value (you can check it). If you need the original value back later, you need encryption; if you only need to check a value against what you saw before, hashing is enough.
For a deeper look at the hashing side, see Hashing Algorithms Guide. For how encoding schemes like Base64 and UTF-8 relate to each other, Base64 Encoding Explained walks through the specifics.