URL Encoding: The Complete Guide

Have you ever clicked a link and seen strange sequences like %20 or %3A in your browser's address bar? Or perhaps you've struggled with a URL that works in your browser but breaks in your code? These situations all relate to URL encoding, and understanding it thoroughly will save you countless hours of debugging.

In this guide, we'll explore everything you need to know about URL encoding. We'll start with why it exists, work through the mechanics of how it works, examine the key differences between JavaScript's encoding functions, and finish with practical debugging techniques. Let's begin.

Why URL Encoding Exists

URLs were designed with a limited character set in mind. The specification (originally RFC 1738, now RFC 3986) permits only a subset of ASCII characters that were considered safe and unambiguous across different systems and contexts.

Think about it: a URL needs to be parsed by browsers, servers, and various network equipment. It might be transmitted over different protocols, stored in databases, or embedded in HTML. At each step, certain characters could cause problems.

Consider this URL:

https://example.com/search?q=coffee & tea

Those spaces create ambiguity. Where does the URL end? Is "tea" part of the query or something else? And the ampersand would normally be interpreted as the start of a new parameter.

URL encoding solves this by replacing problematic characters with safe representations. The properly encoded version looks like this:

https://example.com/search?q=coffee%20%26%20tea

Now there's no ambiguity. Every system along the way knows exactly what this URL means.

Understanding Reserved and Unreserved Characters

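RFC 3986 sorts the characters that may appear literally in a URL into two categories, reserved and unreserved, and understanding this distinction is fundamental. Anything outside both sets, such as a space or a non-ASCII letter, must always be percent-encoded.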

Unreserved Characters

These characters can appear anywhere in a URL without encoding:

Letters: A-Z and a-z
Digits: 0-9
Four special characters: hyphen (-), period (.), underscore (_), tilde (~)

These 66 characters will never cause problems and never need encoding.

Reserved Characters

These characters have special meaning in URLs:

: / ? # [ ] @ ! $ & ' ( ) * + , ; =

Each serves a specific purpose:

: separates the scheme from the rest (https://example.com)
/ separates path segments (/users/profile/settings)
? marks the start of the query string
# marks the start of the fragment
& separates query parameters (?page=1&sort=name)
= separates parameter names from values (page=1)

When these characters appear in data rather than as structural delimiters, they must be encoded. This is where many developers run into trouble.
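
To make the distinction concrete, here is a minimal sketch that classifies a single character against the two lists above. The helper name classifyChar and the 'other' bucket (for characters in neither list, such as a space) are assumptions made for illustration.

```javascript
// Hypothetical helper: classify one character against the two lists above.
const UNRESERVED = /^[A-Za-z0-9\-._~]$/;
const RESERVED = /^[:\/?#\[\]@!$&'()*+,;=]$/;

function classifyChar(ch) {
  if (UNRESERVED.test(ch)) return 'unreserved';
  if (RESERVED.test(ch)) return 'reserved';
  return 'other'; // in neither list: always needs percent-encoding
}

console.log(classifyChar('a')); // unreserved
console.log(classifyChar('&')); // reserved
console.log(classifyChar(' ')); // other
```

A reserved character only needs encoding when it is data; an 'other' character needs it everywhere.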

Percent Encoding: The Mechanism

URL encoding is technically called percent encoding because it uses the percent sign followed by two hexadecimal digits representing the character's byte value.

Here's how it works:

Take the character you need to encode
Convert it to its UTF-8 byte sequence
Represent each byte as %XX where XX is the hexadecimal value

Let's trace through some examples:

Space character:

ASCII/UTF-8 value: 32 (decimal) = 20 (hexadecimal)
Encoded form: %20

Ampersand (&):

ASCII/UTF-8 value: 38 (decimal) = 26 (hexadecimal)
Encoded form: %26

Equals sign (=):

ASCII/UTF-8 value: 61 (decimal) = 3D (hexadecimal)
Encoded form: %3D
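
The three steps can be sketched by hand. This is an illustration of the mechanism rather than a replacement for the built-in functions: the helper name percentEncodeChar is hypothetical, and unlike a real encoder it escapes every character it is given, including unreserved ones.

```javascript
// Hypothetical helper that follows the three steps literally.
function percentEncodeChar(ch) {
  const bytes = new TextEncoder().encode(ch); // step 2: UTF-8 byte sequence
  // step 3: each byte becomes % plus two uppercase hex digits
  return Array.from(bytes, b => '%' + b.toString(16).toUpperCase().padStart(2, '0')).join('');
}

console.log(percentEncodeChar(' ')); // %20
console.log(percentEncodeChar('&')); // %26
console.log(percentEncodeChar('=')); // %3D
```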

Encoding Non-ASCII Characters

What about characters outside the basic ASCII range? This is where UTF-8 encoding comes in. Characters like accented letters or emoji are first converted to their UTF-8 byte sequence, then each byte is percent-encoded.

Take the word "cafe" in its French spelling, "café" (with an acute accent on the e):

The character é is encoded in UTF-8 as two bytes: C3 A9. So in a URL, the word becomes:

caf%C3%A9

A symbol such as the check mark (✓, U+2713) is encoded as three bytes:

%E2%9C%93

This multi-byte encoding is important to understand when debugging encoding issues with international text.
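
You can confirm both byte sequences in a console. The sketch below writes the characters as Unicode escapes (an assumption made so the result does not depend on how the source file itself is saved).

```javascript
const eAcute = '\u00e9';    // é
const checkMark = '\u2713'; // ✓

console.log(encodeURIComponent('caf' + eAcute)); // caf%C3%A9
console.log(encodeURIComponent(checkMark));      // %E2%9C%93

// The raw UTF-8 bytes behind the first result:
const hexBytes = [...new TextEncoder().encode(eAcute)].map(b => b.toString(16)).join(' ');
console.log(hexBytes); // c3 a9
```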

JavaScript's Encoding Functions

JavaScript provides several functions for URL encoding, and using the wrong one is one of the most common mistakes I see. Let's examine each carefully.

encodeURI()

The encodeURI() function is designed to encode an entire URI. It preserves characters that have special meaning in URLs because it assumes you're encoding a complete, valid URL.

const url = 'https://example.com/path?query=hello world';
console.log(encodeURI(url));
// Output: https://example.com/path?query=hello%20world

Notice that the space was encoded, but the colon, slashes, question mark, and equals sign were preserved. That's intentional because those characters are part of the URL structure.

Characters NOT encoded by encodeURI:

A-Z a-z 0-9 - _ . ~ ! # $ & ' ( ) * + , / : ; = ? @

When to use encodeURI: Use it when you have a complete URL and just need to ensure any spaces or special characters in it are properly encoded. This is relatively rare in practice.
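
One caveat is worth seeing in action: the percent sign is not in the preserved set, so encodeURI() re-encodes escapes that are already present. The URLs in this sketch are made up for illustration.

```javascript
// Spaces in the path and query are escaped; the delimiters (including #) stay intact.
const withSpaces = encodeURI('https://example.com/my docs/a.txt?tag=x y#top');
console.log(withSpaces);
// https://example.com/my%20docs/a.txt?tag=x%20y#top

// An existing %20 has its % escaped again, producing %2520.
const alreadyEscaped = encodeURI('https://example.com/my%20docs');
console.log(alreadyEscaped);
// https://example.com/my%2520docs
```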

encodeURIComponent()

The encodeURIComponent() function is more aggressive. It encodes everything except the unreserved characters and five punctuation marks (! ' ( ) *), because it's designed for encoding data that will become part of a URL.

const searchTerm = 'coffee & tea';
console.log(encodeURIComponent(searchTerm));
// Output: coffee%20%26%20tea

The ampersand was encoded because in the context of a query string component, it's data, not a delimiter.

Characters NOT encoded by encodeURIComponent:

A-Z a-z 0-9 - _ . ~ ! ' ( ) *

When to use encodeURIComponent: This is what you'll use most often. Use it whenever you're constructing a URL and need to encode:

Query parameter values
Query parameter names (if they might contain special characters)
Path segments
Any piece of data that will be embedded in a URL
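
If a receiving system expects the five leftover punctuation marks to be escaped as well, a thin wrapper can handle them. This is a sketch of a commonly used pattern; the name encodeStrict is hypothetical.

```javascript
// Hypothetical helper: encodeURIComponent plus the five characters it leaves alone.
function encodeStrict(value) {
  return encodeURIComponent(value).replace(
    /[!'()*]/g,
    ch => '%' + ch.charCodeAt(0).toString(16).toUpperCase()
  );
}

console.log(encodeURIComponent("it's (50*2)!")); // it's%20(50*2)!
console.log(encodeStrict("it's (50*2)!"));       // it%27s%20%2850%2A2%29%21
```

Most servers accept either form, so the plain built-in is usually enough.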

The Critical Difference

Let me illustrate why choosing the right function matters:

const userInput = 'Tom & Jerry';

// Building a search URL incorrectly:
const badUrl = 'https://example.com/search?q=' + encodeURI(userInput);
console.log(badUrl);
// Output: https://example.com/search?q=Tom%20&%20Jerry
// Problem! The & wasn't encoded, so we now have two parameters

// Building it correctly:
const goodUrl = 'https://example.com/search?q=' + encodeURIComponent(userInput);
console.log(goodUrl);
// Output: https://example.com/search?q=Tom%20%26%20Jerry
// Correct! The entire input is preserved as one parameter value

In the first case, a server parsing this URL would see two parameters: q with the value "Tom " (note the trailing space) and a second parameter named " Jerry" with an empty value. That's not what we intended at all.
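
You can check what a receiver would see by parsing both URLs back. The sketch below uses the standard URL API as a stand-in for a server-side parser, which is an assumption; a particular server framework may differ in details.

```javascript
const badUrl = 'https://example.com/search?q=' + encodeURI('Tom & Jerry');
const goodUrl = 'https://example.com/search?q=' + encodeURIComponent('Tom & Jerry');

// Array.from(searchParams) yields decoded [name, value] pairs.
const badPairs = Array.from(new URL(badUrl).searchParams);
const goodPairs = Array.from(new URL(goodUrl).searchParams);

console.log(JSON.stringify(badPairs));  // [["q","Tom "],[" Jerry",""]]
console.log(JSON.stringify(goodPairs)); // [["q","Tom & Jerry"]]
```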

decodeURI() and decodeURIComponent()

For completeness, here are the decoding counterparts:

decodeURI('https://example.com/search?q=hello%20world');
// Output: https://example.com/search?q=hello world

decodeURIComponent('hello%20world');
// Output: hello world

These reverse the encoding process. Use decodeURIComponent() for decoding parameter values and decodeURI() for complete URLs.
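
Two details are easy to miss: decodeURI() leaves escapes for reserved characters in place, and both functions throw a URIError on malformed input. A minimal sketch, where the helper name tryDecode is hypothetical:

```javascript
// decodeURI keeps %26 because & is a reserved character.
console.log(decodeURI('a%20b%26c'));          // a b%26c
console.log(decodeURIComponent('a%20b%26c')); // a b&c

// Hypothetical helper: fall back to the raw string when the input is malformed.
function tryDecode(value) {
  try {
    return decodeURIComponent(value);
  } catch (err) {
    if (err instanceof URIError) return value;
    throw err;
  }
}

console.log(tryDecode('50% off'));    // 50% off ("% o" is not a valid escape)
console.log(tryDecode('50%25%20off')); // 50% off
```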

What About escape() and unescape()?

You might encounter these older functions in legacy code:

escape('hello world');    // "hello%20world"
unescape('hello%20world'); // "hello world"

Do not use these functions. They are deprecated and don't handle Unicode correctly. They use a non-standard encoding scheme that can cause problems with international characters. Always use encodeURIComponent() and decodeURIComponent() instead.
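
The difference shows up as soon as the input leaves ASCII. This sketch is only meant for recognizing legacy output while debugging; the sample strings are written as Unicode escapes for the accented é and the check mark.

```javascript
const word = 'caf\u00e9'; // café

console.log(escape(word));             // caf%E9  (a single byte, not UTF-8)
console.log(encodeURIComponent(word)); // caf%C3%A9
console.log(escape('\u2713'));         // %u2713  (non-standard %u form)
```

If you see %E9 or %uXXXX in a URL, an escape() call somewhere upstream is a likely suspect.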

Building URLs Safely

Now that we understand the encoding functions, let's look at patterns for building URLs safely.

Manual Construction

The traditional approach involves string concatenation with encoding:

function buildSearchUrl(baseUrl, params) {
  const queryString = Object.keys(params)
    .map(key => {
      const encodedKey = encodeURIComponent(key);
      const encodedValue = encodeURIComponent(params[key]);
      return `${encodedKey}=${encodedValue}`;
    })
    .join('&');

  return `${baseUrl}?${queryString}`;
}

const url = buildSearchUrl('https://api.example.com/search', {
  q: 'coffee & tea',
  category: 'hot drinks',
  sort: 'price:asc'
});
// Output: https://api.example.com/search?q=coffee%20%26%20tea&category=hot%20drinks&sort=price%3Aasc

Using the URL API

Modern JavaScript provides the URL and URLSearchParams APIs, which handle encoding automatically:

const url = new URL('https://api.example.com/search');
url.searchParams.set('q', 'coffee & tea');
url.searchParams.set('category', 'hot drinks');
url.searchParams.set('sort', 'price:asc');

console.log(url.toString());
// Output: https://api.example.com/search?q=coffee+%26+tea&category=hot+drinks&sort=price%3Aasc

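Notice something interesting? URLSearchParams encodes spaces as + instead of %20. Form-aware parsers understand both in query strings (the + convention comes from the HTML form specification), but the two are not interchangeable everywhere: decodeURIComponent() leaves + untouched, and in a path segment + is a literal plus sign.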

The URL API is my recommended approach for most cases. It handles encoding automatically and provides a clean interface for manipulating URL parts.
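
Reading values back shows why it pays to decode with the same API that encoded them. A short sketch, using a made-up query string:

```javascript
const query = 'q=coffee+%26+tea';

// URLSearchParams applies the form convention: + becomes a space.
const viaParams = new URLSearchParams(query).get('q');
console.log(viaParams); // coffee & tea

// decodeURIComponent does not: the + signs survive.
const viaDecode = decodeURIComponent('coffee+%26+tea');
console.log(viaDecode); // coffee+&+tea

// A literal plus sign set through the API is escaped as %2B, so it round-trips.
const params = new URLSearchParams();
params.set('expr', '1+1');
console.log(params.toString()); // expr=1%2B1
```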

Encoding Path Segments

Path segments need encoding too, but be careful not to encode the slashes that separate them:

// Wrong: encoding the whole path
const badPath = encodeURIComponent('/users/John Doe/profile');
// Output: %2Fusers%2FJohn%20Doe%2Fprofile (slashes are encoded!)

// Right: encoding each segment separately
const segments = ['users', 'John Doe', 'profile'];
const goodPath = '/' + segments.map(encodeURIComponent).join('/');
// Output: /users/John%20Doe/profile
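
The same per-segment rule also protects you when a segment itself contains a slash as data. A minimal sketch; the helper name buildPath and the sample segments are hypothetical.

```javascript
// Hypothetical helper: encode each segment, then join with literal slashes.
function buildPath(...segments) {
  return '/' + segments.map(s => encodeURIComponent(String(s))).join('/');
}

console.log(buildPath('files', 'reports/2024', 'Q1 summary.pdf'));
// /files/reports%2F2024/Q1%20summary.pdf
```

The slash inside 'reports/2024' becomes %2F, so the path still has exactly three segments.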

Common Mistakes and How to Avoid Them

Let me walk you through the mistakes I see most often, along with their solutions.

Mistake 1: Double Encoding

This happens when you encode a value that's already encoded:

const alreadyEncoded = 'hello%20world';
const doubleEncoded = encodeURIComponent(alreadyEncoded);
// Output: hello%2520world
// The % became %25, so %20 became %2520

When you decode this, you get hello%20world instead of hello world.

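Solution: Only encode once, at the point where you construct the URL. If you genuinely can't tell whether a value is already encoded, you can decode it first as a last resort, but be aware that this heuristic alters raw text that happens to contain a valid percent sequence: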

function safeEncode(value) {
  // Decode first in case it's already encoded, then encode
  try {
    return encodeURIComponent(decodeURIComponent(value));
  } catch {
    // If decoding fails, the value might not be encoded
    return encodeURIComponent(value);
  }
}
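
To see where a decode-first helper goes wrong, compare it with plain encoding on input that only looks encoded. The function is repeated here so the sketch runs on its own; the sample input is hypothetical.

```javascript
function safeEncode(value) {
  try {
    return encodeURIComponent(decodeURIComponent(value));
  } catch {
    return encodeURIComponent(value);
  }
}

// The two easy cases behave as intended:
console.log(safeEncode('hello world'));   // hello%20world
console.log(safeEncode('hello%20world')); // hello%20world

// Raw text where the user literally typed "%20" is altered:
const raw = 'use %20 for spaces';
console.log(safeEncode(raw));         // use%20%20%20for%20spaces
console.log(encodeURIComponent(raw)); // use%20%2520%20for%20spaces
```

Only the last line preserves what the user typed, which is why tracking whether a value is encoded beats guessing.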

Mistake 2: Using encodeURI for Query Parameters

We covered this earlier, but it's worth emphasizing:

// Wrong
const badUrl = 'https://api.example.com?name=' + encodeURI('A & B');
// Result: https://api.example.com?name=A%20&%20B (& not encoded!)

// Right
const goodUrl = 'https://api.example.com?name=' + encodeURIComponent('A & B');
// Result: https://api.example.com?name=A%20%26%20B

Mistake 3: Forgetting to Encode User Input

Any data that comes from users must be encoded:

// Dangerous if userName contains special characters
const unsafeUrl = `https://api.example.com/users/${userName}`;

// Safe
const safeUrl = `https://api.example.com/users/${encodeURIComponent(userName)}`;

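This isn't just about preventing errors; it's also a security consideration. Unencoded user input can change the structure of the URL itself, for example by adding extra path segments or query parameters, which opens the door to injection attacks.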
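
Here is a small sketch of how that can play out, using a hypothetical hostile value for userName and the URL API to show which path would actually be requested:

```javascript
const userName = '../admin'; // hypothetical hostile input

// Unencoded: the ".." segment is resolved and the request leaves /users/ entirely.
const unsafe = new URL(`https://api.example.com/users/${userName}`);
console.log(unsafe.pathname); // /admin

// Encoded: the slash becomes %2F, so the input stays inside a single segment.
const safe = new URL(`https://api.example.com/users/${encodeURIComponent(userName)}`);
console.log(safe.pathname); // /users/..%2Fadmin
```

How a given server treats %2F after routing is up to that server, so encoding complements server-side validation rather than replacing it.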

Mistake 4: Encoding the Entire URL

Sometimes developers encode an entire URL, including its structure:

// Wrong
const encoded = encodeURIComponent('https://example.com/path?query=value');
// Result: https%3A%2F%2Fexample.com%2Fpath%3Fquery%3Dvalue
// This no longer works as a URL on its own (it is only correct as a value embedded in another URL)

// Right: only encode the parts that need encoding
const url = `https://example.com/path?query=${encodeURIComponent('value with spaces')}`;
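
The one case where encoding a whole URL is right is when that URL is itself a parameter value, such as a redirect target. A sketch, with a made-up login endpoint and parameter name next:

```javascript
const target = 'https://example.com/path?query=value&page=2';
const loginUrl = `https://example.com/login?next=${encodeURIComponent(target)}`;

console.log(loginUrl);
// https://example.com/login?next=https%3A%2F%2Fexample.com%2Fpath%3Fquery%3Dvalue%26page%3D2

// The receiver gets the inner URL back intact, including its own & and =.
console.log(new URL(loginUrl).searchParams.get('next') === target); // true
```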

Debugging Encoded URLs

When things go wrong with URL encoding, here's my systematic approach to debugging.

Step 1: Visually Inspect the URL

Look at the URL in your browser's address bar or in your logs. Common patterns to look for:

%25 followed by two hex digits (for example %2520) usually indicates double encoding (the % itself was encoded)
Missing encoding for special characters (&, =, ? in data)
Garbled text for international characters

Step 2: Decode and Compare

Use the browser console to decode the URL and see what you actually have:

const suspiciousUrl = 'https://example.com/search?q=hello%2520world';
const decoded = decodeURIComponent(suspiciousUrl);
console.log(decoded); // "https://example.com/search?q=hello%20world" - still contains %20 after one decode, so it was double encoded!
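
When the number of layers is unclear, you can decode repeatedly until the string stops changing. This is a diagnostic sketch only; the helper name countEncodingLayers and the cap of five passes are assumptions.

```javascript
// Hypothetical helper: decode until the value is stable, counting the passes.
function countEncodingLayers(value, maxLayers = 5) {
  let current = value;
  let layers = 0;
  while (layers < maxLayers) {
    let next;
    try {
      next = decodeURIComponent(current);
    } catch {
      break; // malformed escape: stop and report what we have so far
    }
    if (next === current) break; // nothing left to decode
    current = next;
    layers += 1;
  }
  return { layers, value: current };
}

console.log(countEncodingLayers('hello%2520world')); // layers: 2, value: 'hello world'
console.log(countEncodingLayers('hello%20world'));   // layers: 1, value: 'hello world'
```

Use the count to find the bug, not to fix it at runtime: data that legitimately contains percent sequences would be over-decoded.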

Step 3: Trace the Encoding Path

Walk through your code and identify every place where encoding happens. Look for:

Libraries that might encode automatically
Server-side encoding that adds to client-side encoding
Middleware that transforms URLs

Step 4: Test with Problematic Characters

When testing URL handling, always test with characters that commonly cause problems:

const testCases = [
  'hello world',        // space
  'Tom & Jerry',        // ampersand
  'price=100',          // equals sign
  'path/to/file',       // slash
  '50% off',            // percent sign
  'search?q=test',      // question mark
  'user+tag@example.com', // plus sign (placeholder address)
  'über',               // non-ASCII
];

testCases.forEach(input => {
  const encoded = encodeURIComponent(input);
  const decoded = decodeURIComponent(encoded);
  console.log(`${input} -> ${encoded} -> ${decoded}`);
  console.assert(input === decoded, 'Round trip failed!');
});
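
If your code builds URLs with the URL API rather than encodeURIComponent(), it is worth running the same kind of round trip through that API. A sketch with a shortened, made-up sample list:

```javascript
const samples = ['hello world', 'Tom & Jerry', '50% off', 'a+b', '\u00fcber'];

// Serialize each value into a URL, then parse that URL again as a receiver would.
function roundTrip(input) {
  const url = new URL('https://example.com/search');
  url.searchParams.set('q', input);
  return new URL(url.toString()).searchParams.get('q');
}

for (const input of samples) {
  console.assert(roundTrip(input) === input, `Round trip failed for ${input}`);
}
```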

Quick Reference

Let me leave you with a quick reference for common scenarios:

Scenario	Function
Encoding a query parameter value	`encodeURIComponent()`
Encoding a path segment	`encodeURIComponent()`
Encoding a complete URL with spaces	`encodeURI()`
Building a URL from scratch	Use the `URL` API
Decoding a parameter value	`decodeURIComponent()`

And here's a handy encoding reference for common characters:

Character	Encoded
Space	`%20` (or `+` in form-encoded query strings)
`&`	`%26`
`=`	`%3D`
`?`	`%3F`
`/`	`%2F`
`#`	`%23`
`%`	`%25`
`+`	`%2B`

Conclusion

URL encoding might seem like a small detail, but getting it wrong can cause frustrating bugs that are hard to diagnose. The key principles to remember are:

Reserved characters have special meaning in URLs and must be encoded when used as data
Use encodeURIComponent() for encoding data that becomes part of URLs
Use encodeURI() only for encoding complete URLs (rare)
Encode once, at the point of URL construction
Test with special characters early and often

With these principles in mind, you'll be well-equipped to handle any URL encoding challenge that comes your way. The next time you see a %20 in your address bar, you'll know exactly what's going on and why it's there.