Base64 is one of those things developers use constantly but rarely understand deeply. You see it in email attachments, data URIs in CSS, HTTP Basic Authentication headers, JWT tokens, and API responses carrying binary data. It shows up so often because it solves a specific, fundamental problem: how do you transmit arbitrary binary data through a channel that only handles printable ASCII text?
This guide explains how Base64 encoding works at the bit level, why the standard alphabet is what it is, when Base64 is the right choice (and when it is not), the URL-safe variant and when you need it, and the most common mistakes developers make with Base64 in practice.
The encoder and decoder referenced here run entirely in your browser — no files are uploaded.
The Problem Base64 Solves
Many text-based protocols — SMTP (email), HTTP headers, HTML, XML, JSON — were designed to carry text. "Text" in practice meant ASCII-printable characters: letters, digits, and common punctuation. Binary data (image bytes, encrypted data, compressed files) contains byte values across the full 0–255 range, including bytes that are not printable ASCII and bytes that have special meaning in text protocols (null bytes, newlines, carriage returns, form feeds).
When binary data passes through a system that interprets byte values, corruption happens. A null byte (0x00) may terminate a string. A carriage return (0x0D) may be treated as a line ending and split a header or value. A byte that coincides with a markup character in the surrounding protocol may break the document structure.
Base64 solves this by encoding every 3 bytes of binary data into 4 ASCII characters drawn from a 64-character alphabet that consists entirely of printable characters safe in virtually every text context: A–Z, a–z, 0–9, +, and /. The output is longer (33% larger), but it is guaranteed to pass through any ASCII-safe text channel without corruption.
How Base64 Encoding Works
The mechanism operates on 3-byte groups:
1. Take 3 bytes (24 bits) of input.
2. Split the 24 bits into four 6-bit groups.
3. Map each 6-bit value (0–63) to a character in the Base64 alphabet.
The Base64 alphabet:
| Value | Char | Value | Char | Value | Char | Value | Char |
|---|---|---|---|---|---|---|---|
| 0–25 | A–Z | 26–51 | a–z | 52–61 | 0–9 | 62 | + |
| 63 | / | — | — | — | — | — | — |
Example: Encoding the ASCII string "Man":
- M = 77 = 01001101, a = 97 = 01100001, n = 110 = 01101110
- 24 bits: 010011 010110 000101 101110
- Mapped: 19 → T, 22 → W, 5 → F, 46 → u
- Result: "TWFu"
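The three steps above can be sketched in a few lines of Python. This is a minimal illustration that handles only complete 3-byte groups; the names `ALPHABET` and `encode_full_group` are invented for this example, and the standard library's `base64.b64encode` is used only to cross-check the result.

```python
import base64

# The standard Base64 alphabet: index = 6-bit value, character = output.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_full_group(group: bytes) -> str:
    """Encode exactly 3 bytes into 4 Base64 characters (no padding logic)."""
    if len(group) != 3:
        raise ValueError("this sketch only handles complete 3-byte groups")
    # Step 1: join the three bytes into one 24-bit integer.
    bits = (group[0] << 16) | (group[1] << 8) | group[2]
    # Step 2: cut the 24 bits into four 6-bit values, most significant first.
    sextets = [(bits >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    # Step 3: look each 6-bit value up in the alphabet.
    return "".join(ALPHABET[value] for value in sextets)


encoded = encode_full_group(b"Man")
print(encoded)  # TWFu
print(encoded == base64.b64encode(b"Man").decode("ascii"))  # True
```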
Padding: If the input is not a multiple of 3 bytes, padding characters (=) are added to make the output length a multiple of 4:
- 1 remaining byte → 2 Base64 chars + ==
- 2 remaining bytes → 3 Base64 chars + =
Padding allows decoders to know how many bytes the last group represents. Some implementations omit padding for URLs; decoders must handle both.
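To see the padding rule in action, the sketch below encodes 1, 2, and 3 bytes with Python's standard `base64` module. It also shows one way a decoder might accept input with or without padding; `decode_lenient` is a hypothetical helper name, not a library function.

```python
import base64


def decode_lenient(text: str) -> bytes:
    """Decode standard Base64 whether or not the trailing '=' is present."""
    # Restore the padding so the length is a multiple of 4 again.
    padded = text + "=" * (-len(text) % 4)
    return base64.b64decode(padded)


for data in (b"M", b"Ma", b"Man"):
    print(len(data), base64.b64encode(data).decode("ascii"))
# 1 TQ==
# 2 TWE=
# 3 TWFu

print(decode_lenient("TQ"))    # b'M'
print(decode_lenient("TWE="))  # b'Ma'
```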
Decoding reverses the process: convert each Base64 character to its 6-bit value, concatenate the bits, split into 8-bit bytes, remove padding-implied zeros.
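The decoding steps can be sketched the same way. This is an illustrative decoder, not a production one: it assumes well-formed standard Base64 input and does no validation, and the names `VALUES` and `decode` are invented for the example.

```python
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)
# Reverse lookup: character -> 6-bit value.
VALUES = {char: index for index, char in enumerate(ALPHABET)}


def decode(text: str) -> bytes:
    """Decode well-formed standard Base64 (assumes valid input)."""
    bits = 0        # accumulator holding bits not yet emitted
    bit_count = 0   # number of valid bits currently in the accumulator
    out = bytearray()
    for char in text.rstrip("="):
        # Append the next 6-bit value to the accumulator.
        bits = (bits << 6) | VALUES[char]
        bit_count += 6
        if bit_count >= 8:
            # Emit the top 8 bits as one byte and keep the remainder.
            bit_count -= 8
            out.append((bits >> bit_count) & 0xFF)
            bits &= (1 << bit_count) - 1
    # Any bits left over here are the zero bits implied by padding.
    return bytes(out)


print(decode("TWFu"))  # b'Man'
print(decode("TWE="))  # b'Ma'
print(decode("TQ=="))  # b'M'
```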
Standard vs URL-Safe Base64
The standard Base64 alphabet includes + and /. Both characters have special meanings in URLs:
- + means space in URL-encoded query strings (application/x-www-form-urlencoded)
- / is the URL path separator
A standard Base64 string used in a URL (as a query parameter or path segment) must be percent-encoded: + → %2B, / → %2F. This works but is verbose and breaks some systems that decode the URL before passing parameters to the application.
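The following sketch shows what can go wrong when a standard Base64 value is placed in a query string without percent-encoding. It uses Python's `urllib.parse`; the parameter name `token` and the sample value are made up for illustration.

```python
from urllib.parse import parse_qs, urlencode

token = "+/+/"  # a standard Base64 string containing both special characters

# Naive: paste the value straight into the query string.
naive_query = "token=" + token
# Safer: let urlencode percent-encode the value.
safe_query = urlencode({"token": token})

naive_value = parse_qs(naive_query)["token"][0]
safe_value = parse_qs(safe_query)["token"][0]

print(repr(naive_value))  # ' / /'  (each + was read as a space)
print(safe_query)         # token=%2B%2F%2B%2F
print(repr(safe_value))   # '+/+/'
```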
URL-safe Base64 (RFC 4648 §5) uses an alternate alphabet: - replaces +, and _ replaces /. These two characters are safe in URLs without percent-encoding. Padding is often omitted in URL-safe Base64 to avoid the = character (which is also special in URLs).
When each is used:
- Standard Base64: email attachments (MIME), data URIs, JSON values, general binary-in-text embedding
- URL-safe Base64: JWT tokens (header + payload), OAuth tokens, API keys, signed URLs, cookie values
JWTs specifically use URL-safe Base64 without padding. If you decode a JWT header or payload and see - and _ characters, that is why.
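As a rough sketch of what that looks like in code, the helpers below encode and decode a single JWT segment using URL-safe Base64 without padding. The function names and the sample claims are hypothetical, and nothing here verifies a signature; it only shows the encoding layer.

```python
import base64
import json


def encode_segment(obj: dict) -> str:
    """Serialize to compact JSON, then URL-safe Base64 with padding stripped."""
    raw = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_segment(segment: str) -> dict:
    """Restore the padding, then decode URL-safe Base64 and parse the JSON."""
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


claims = {"sub": "user-123", "admin": False}  # hypothetical payload
segment = encode_segment(claims)

print("=" in segment)                    # False
print(decode_segment(segment) == claims)  # True
```

Python's `base64.urlsafe_b64decode` expects padded input, which is why `decode_segment` adds the `=` characters back before decoding.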
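Conversion: Converting between standard and URL-safe is a character substitution: replace + ↔ - and / ↔ _. The encoded data is the same length as long as padding is kept; if the URL-safe form omits padding, restore the = characters before passing it to a decoder that requires them. Only two characters in the alphabet change.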
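The substitution can be sketched with `str.translate`. The helper names `to_url_safe` and `to_standard` are invented for this example, and the three input bytes are chosen only because they produce both special characters.

```python
import base64

TO_URL_SAFE = str.maketrans("+/", "-_")
TO_STANDARD = str.maketrans("-_", "+/")


def to_url_safe(standard: str) -> str:
    """Swap + for - and / for _ (padding is left untouched)."""
    return standard.translate(TO_URL_SAFE)


def to_standard(url_safe: str) -> str:
    """Swap - for + and _ for / (padding is left untouched)."""
    return url_safe.translate(TO_STANDARD)


raw = bytes([0xFB, 0xFF, 0xBF])  # encodes to the 6-bit values 62, 63, 62, 63
standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")

print(standard)  # +/+/
print(url_safe)  # -_-_
print(to_url_safe(standard) == url_safe)  # True
print(to_standard(url_safe) == standard)  # True
```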
URL Encoding vs Base64
Base64 and URL encoding (percent-encoding) both convert data into text-safe representations. They solve overlapping problems in different ways and are not interchangeable.
URL encoding (percent-encoding) converts each byte that is not URL-safe into a %XX hex sequence, where XX is the byte value in hexadecimal. Safe characters (A–Z, a–z, 0–9, -, _, ., ~) pass through unchanged. A space becomes %20 or +. URL encoding is designed specifically for URLs — it preserves the readability of ASCII text while escaping special characters.
Base64 encodes every 3 input bytes into 4 output characters regardless of what the bytes represent. It does not try to preserve ASCII readability. "Hello" as Base64 is "SGVsbG8=" — nothing recognisable. Base64 is designed for arbitrary binary data.
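A short comparison makes the difference visible. The sketch below runs the same strings through percent-encoding (`urllib.parse.quote`) and Base64; `as_base64` is a small helper defined here for convenience, and the second sample string is made up.

```python
import base64
from urllib.parse import quote


def as_base64(text: str) -> str:
    """Base64-encode the UTF-8 bytes of a string."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


for sample in ("Hello", "a b&c=d"):
    # safe="" makes quote escape every reserved character, including "/".
    print(repr(sample), "->", quote(sample, safe=""), "|", as_base64(sample))

print(quote("Hello", safe=""))    # Hello  (plain ASCII passes through)
print(as_base64("Hello"))         # SGVsbG8=
print(quote("a b&c=d", safe=""))  # a%20b%26c%3Dd
```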
When to use which:
- Passing text with special characters in a URL query string → URL encode it
- Embedding binary data (an image, an encrypted blob, a hash) in a JSON field or HTTP header → Base64 encode it
- Embedding a URL as a value inside another URL → URL encode the inner URL
- JWT token → URL-safe Base64 (no percent-encoding needed)
A common mistake: double-encoding. If you URL-encode data and then Base64-encode the result (or vice versa), you get a doubly-encoded value that requires two decoding steps to recover the original. Always apply exactly one encoding for any given transport boundary.
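The double-encoding problem is easy to reproduce. In this sketch, a made-up string is percent-encoded and then Base64-encoded; a single Base64 decode returns the percent-encoded text rather than the original.

```python
import base64
from urllib.parse import quote, unquote

original = "a b&c"

# Mistake: percent-encode first, then Base64-encode the result.
percent_encoded = quote(original, safe="")
double_encoded = base64.b64encode(percent_encoded.encode("ascii")).decode("ascii")

# One decoding step is not enough to get the original back.
after_one_step = base64.b64decode(double_encoded).decode("ascii")
after_two_steps = unquote(after_one_step)

print(after_one_step)   # a%20b%26c
print(after_two_steps)  # a b&c
```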
Base64 in Practice: Common Use Cases
Data URIs: Embed small images or fonts directly in HTML or CSS without an extra HTTP request. Format: data:[mediatype][;base64],<data>. Example: <img src="data:image/png;base64,iVBORw0KGgo...">. Practical for icons under ~10 KB; above that, a separate file with caching benefits is more efficient.
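Building a data URI is a matter of string formatting around the encoded bytes. The sketch below uses a hypothetical helper, `to_data_uri`, and feeds it only the 8-byte PNG file signature (not a complete image) to show why PNG data URIs start with the familiar `iVBORw0KGgo` prefix.

```python
import base64


def to_data_uri(data: bytes, media_type: str) -> str:
    """Wrap raw bytes in a base64 data URI for the given media type."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{media_type};base64,{encoded}"


# The fixed 8-byte signature that begins every PNG file.
PNG_SIGNATURE = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

uri = to_data_uri(PNG_SIGNATURE, "image/png")
print(uri)  # data:image/png;base64,iVBORw0KGgo=
```

For a real image, pass the full file contents (for example, the result of reading the file in binary mode) instead of the signature alone.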
HTTP Basic Authentication: The Authorization: Basic <credentials> header encodes the username:password pair as Base64. This is not encryption — anyone who intercepts the header can decode it instantly. Basic Auth must always be used over HTTPS.
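The sketch below builds and then decodes a Basic Authentication header value, which demonstrates how little protection the encoding offers. The function names are invented for this example; the credentials are the well-known sample pair from RFC 7617.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return the value for an Authorization header using the Basic scheme."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def decode_basic_auth(header_value: str) -> tuple[str, str]:
    """Recover (username, password) from a Basic header value."""
    scheme, _, token = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    # Only the first colon separates the fields; passwords may contain colons.
    username, _, password = decoded.partition(":")
    return username, password


header = basic_auth_header("Aladdin", "open sesame")
print(header)                     # Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
print(decode_basic_auth(header))  # ('Aladdin', 'open sesame')
```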
Email attachments (MIME): SMTP was designed for ASCII text. MIME (the standard for email attachments) uses Base64 Content-Transfer-Encoding to embed binary files (PDFs, images, spreadsheets) in the email body. Each attachment is Base64-encoded and wrapped in a MIME boundary.
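Python's standard `email` package applies this encoding automatically when binary data is attached. The following is a minimal sketch; the subject, body text, filename, and attachment bytes are all placeholders.

```python
from email.message import EmailMessage

attachment_bytes = bytes(range(256))  # stand-in for a PDF or image

message = EmailMessage()
message["Subject"] = "Report"           # placeholder header value
message.set_content("See attached.")    # plain-text body part
message.add_attachment(
    attachment_bytes,
    maintype="application",
    subtype="octet-stream",
    filename="report.bin",
)

# The attachment is stored as Base64 text inside its own MIME part.
part = next(message.iter_attachments())
transfer_encoding = str(part["Content-Transfer-Encoding"])
encoded_lines = part.get_payload().splitlines()
restored = part.get_content()

print(transfer_encoding)                    # base64
print(max(len(line) for line in encoded_lines))
print(restored == attachment_bytes)         # True
```

Printing `message.as_string()` shows the full message, including the MIME boundary lines and the wrapped Base64 body of the attachment.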
JSON API responses with binary fields: Checksums, cryptographic signatures, encryption keys, and small images are sometimes included in JSON API responses as Base64 strings. The alternative — returning a separate URL for each binary resource — requires more requests. Which approach is better depends on the size and access frequency of the binary data.
API keys and tokens: Many API keys are random bytes (for cryptographic security) that are Base64-encoded for transmission. The encoding is presentational; the underlying value is binary. When you call openssl rand -base64 32, you generate 32 random bytes and encode them as a 44-character Base64 string.
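A comparable key can be produced in Python with the standard `secrets` module. This is a sketch of the same idea rather than a reimplementation of the openssl command; the variable names are illustrative.

```python
import base64
import secrets

# 32 cryptographically strong random bytes, as with `openssl rand 32`.
key_bytes = secrets.token_bytes(32)

# Standard Base64: 32 bytes -> 44 characters, ending in one "=".
key_text = base64.b64encode(key_bytes).decode("ascii")

# URL-safe, unpadded variant provided directly by the secrets module.
url_safe_key = secrets.token_urlsafe(32)

print(len(key_text))      # 44
print(len(url_safe_key))  # 43
print(base64.b64decode(key_text) == key_bytes)  # True
```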
What Base64 is not: Base64 is not encryption. It is not a hash. It provides no secrecy and no integrity guarantee. Anyone can decode it with a standard library call. Do not use Base64 to "obscure" sensitive data — use proper encryption for confidentiality and HMAC for integrity.
File Encoding with Base64
Encoding file contents to Base64 follows the same algorithm — the input is just treated as raw bytes rather than a text string.
Text files: Read the file as bytes, encode. Important: specify the text encoding (UTF-8) before converting to bytes, or you may get different results on different systems. btoa() in JavaScript only accepts binary strings (each character representing one byte); use TextEncoder + Uint8Array for proper UTF-8 handling.
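The effect of the text encoding is easy to demonstrate. The sketch below uses Python rather than JavaScript, but the principle is the same: the string must be turned into bytes with an explicit encoding before Base64 is applied, and different encodings give different output.

```python
import base64

text = "café"

# Same string, two different byte representations.
utf8_bytes = text.encode("utf-8")      # é becomes two bytes: C3 A9
latin1_bytes = text.encode("latin-1")  # é becomes one byte:  E9

utf8_b64 = base64.b64encode(utf8_bytes).decode("ascii")
latin1_b64 = base64.b64encode(latin1_bytes).decode("ascii")

print(utf8_b64)    # Y2Fmw6k=
print(latin1_b64)  # Y2Fm6Q==

# Decoding must use the same text encoding to recover the string.
print(base64.b64decode(utf8_b64).decode("utf-8"))  # café
```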
Binary files: Read the file as a Uint8Array or similar byte buffer, encode. The encoding is the same regardless of file type — a JPEG and a PDF and a ZIP file all encode the same way.
Size overhead: Base64 increases file size by approximately 33% (4 output characters per 3 input bytes), plus line breaks if the output is wrapped (MIME wraps at 76 characters per line and ends each line with CRLF). A 1 MB binary file becomes approximately 1.33 MB as unwrapped Base64, or approximately 1.37 MB with MIME line wrapping.
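The overhead can be checked with a little arithmetic. In the sketch below, `encoded_length` is a helper written for this example, and the 57,000-byte input is chosen because 57 input bytes fill exactly one 76-character line. Python's `base64.encodebytes` wraps with a bare newline, so the CRLF variant is derived from it to approximate MIME output.

```python
import base64


def encoded_length(byte_count: int) -> int:
    """Length of padded, unwrapped Base64 for a given number of input bytes."""
    return 4 * ((byte_count + 2) // 3)


raw = bytes(57_000)  # 1,000 full lines of 57 input bytes each

unwrapped = base64.b64encode(raw)                  # no line breaks
wrapped_lf = base64.encodebytes(raw)               # 76 chars + "\n" per line
wrapped_crlf = wrapped_lf.replace(b"\n", b"\r\n")  # 76 chars + CRLF per line

print(encoded_length(32))                       # 44
print(round(len(unwrapped) / len(raw), 3))      # 1.333
print(round(len(wrapped_lf) / len(raw), 3))     # 1.351
print(round(len(wrapped_crlf) / len(raw), 3))   # 1.368
```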
Decoding back to a file: Decode the Base64 string to a byte array, then write the byte array as a file with the correct extension. The decoder output is the exact original bytes if the encoding was correct. Verify using a checksum (MD5 or SHA-256) of the original file versus the decoded file if integrity matters.
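A round trip with checksum verification might look like the sketch below. The sample bytes, the temporary directory, and the `restored.bin` filename are placeholders; `validate=True` asks Python's decoder to reject characters outside the Base64 alphabet instead of silently skipping them.

```python
import base64
import hashlib
import tempfile
from pathlib import Path


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


original = bytes(range(256)) * 4  # stand-in for a binary file's contents
encoded = base64.b64encode(original).decode("ascii")

with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "restored.bin"
    # Decode strictly and write the raw bytes back to disk.
    out_path.write_bytes(base64.b64decode(encoded, validate=True))
    restored = out_path.read_bytes()

checksums_match = sha256_hex(restored) == sha256_hex(original)
print(checksums_match)  # True
```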
Frequently Asked Questions
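Is Base64 the same as encryption?
No. Base64 is a reversible encoding with no key, so anyone can decode it with a standard library call. It provides neither secrecy nor integrity.
Why does Base64 end with == signs?
The = characters are padding. When the input length is not a multiple of 3 bytes, one or two = are appended so the output length is a multiple of 4: two for 1 leftover byte, one for 2 leftover bytes.
What is the difference between Base64 and Base64url?
Base64url (RFC 4648 §5) replaces + with - and / with _ so the output is safe in URLs without percent-encoding, and it often omits the = padding.
Can I use Base64 to send images in an API?
Yes. Small images are sometimes embedded in JSON responses as Base64 strings. The payload grows by about 33%, so for larger or frequently accessed images a separate URL may be the better choice.
Why does my Base64 decoder give garbage output?
Common causes covered in this guide are a variant mismatch (URL-safe input fed to a standard decoder, or the reverse), double encoding, and interpreting the decoded bytes with a different text encoding than the one used before encoding.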
Summary
Base64 is a simple algorithm with a specific job: making binary data safe for text-based channels. Once you understand the 3-bytes-to-4-characters mechanism and the distinction between standard and URL-safe alphabets, the encoding is fully predictable. The two rules to remember: Base64 is not encryption, and always match encoder and decoder variants. Everything else follows from those two constraints.