Query String Parsing
How to parse URL query parameters in JavaScript, Python, Go, PHP, and Ruby, handle edge cases like arrays and nested objects, and protect against injection attacks.
What is a Query String?
A query string is the part of a URL that comes after the question mark (?). It consists of one or more key-value pairs separated by ampersands (&), with each key and value connected by an equals sign (=). Query strings are the most common way to pass parameters in HTTP GET requests and are used extensively in web applications, APIs, and analytics tracking.
https://example.com/search?q=javascript&page=2&sort=relevance
                           └───────── query string ─────────┘
Key-value pairs:
q = javascript
page = 2
sort = relevance
Query strings originated in the earliest days of the web as a way to send form data to servers. When an HTML form uses the GET method, the browser serializes the form fields into a query string and appends it to the action URL. This mechanism remains fundamental to how the web works today, powering everything from search engines to REST APIs.
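The serialization step can be approximated outside a browser. The sketch below assumes a hypothetical form with two fields (q and page) and a hypothetical action URL; it illustrates the mechanism rather than reproducing the browser's exact algorithm.

```javascript
// Hypothetical form fields and action URL, for illustration only.
const fields = [
  ['q', 'hello world'],
  ['page', '2'],
];
const action = 'https://example.com/search';

// URLSearchParams applies application/x-www-form-urlencoded rules,
// so the space in "hello world" is written as "+".
const query = new URLSearchParams(fields).toString();
const requestUrl = `${action}?${query}`;

console.log(requestUrl);
// https://example.com/search?q=hello+world&page=2
```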
Query String Syntax Rules
While query strings appear simple, the details matter. Here are the rules that govern their syntax:
- The query string begins immediately after the ? character.
- Parameters are separated by & (though some systems accept ; as an alternative).
- Keys and values are separated by =.
- Keys and values must be percent-encoded if they contain reserved characters.
- In form encoding (application/x-www-form-urlencoded), spaces are represented as + instead of %20.
- A key may appear without a value (?debug or ?debug=).
- The same key may appear multiple times (?color=red&color=blue).
- The order of parameters is generally not significant, though some APIs may depend on it.
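A few of these rules can be observed directly with URLSearchParams. This is a small sketch using made-up parameter names (debug, verbose, redirect); it only shows how one parser treats these cases, and other parsers may differ.

```javascript
// "+" and "%20" both decode to a space under form-encoding rules.
const a = new URLSearchParams('q=hello+world');
const b = new URLSearchParams('q=hello%20world');
console.log(a.get('q') === b.get('q')); // true

// A key with no "=" and a key with an empty value both yield "".
const flags = new URLSearchParams('debug&verbose=');
console.log(flags.get('debug'));   // ""
console.log(flags.get('verbose')); // ""

// Reserved characters in a value are percent-encoded on output.
const out = new URLSearchParams();
out.set('redirect', 'https://example.com/?a=1&b=2');
console.log(out.toString());
// redirect=https%3A%2F%2Fexample.com%2F%3Fa%3D1%26b%3D2
```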
The query string ends at the # character (which begins the fragment identifier) or at the end of the URL, whichever comes first. Everything between ? and # (or the end of the URL) is the query string.
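The boundary rule above can be sketched as a small function. extractQuery is a hypothetical helper written only to make the rule concrete; in practice the search property of the built-in URL object gives the same information.

```javascript
// Return the raw query string of a URL: the text after the first "?"
// and before the first "#". Returns "" when there is no query.
function extractQuery(rawUrl) {
  const hashAt = rawUrl.indexOf('#');
  const withoutFragment = hashAt === -1 ? rawUrl : rawUrl.slice(0, hashAt);
  const queryAt = withoutFragment.indexOf('?');
  return queryAt === -1 ? '' : withoutFragment.slice(queryAt + 1);
}

console.log(extractQuery('https://example.com/search?q=js&page=2#results'));
// q=js&page=2

// A "?" that appears after "#" belongs to the fragment, not the query.
console.log(extractQuery('https://example.com/docs#install?x=1') === '');
// true
```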
Parsing in JavaScript
JavaScript provides the URLSearchParams API, which is the recommended way to parse and manipulate query strings in modern applications. It handles encoding, decoding, and iteration automatically.
URLSearchParams API
// Parse from a query string
const params = new URLSearchParams('q=hello+world&page=2&sort=relevance');
// Get a single value
params.get('q'); // "hello world"
params.get('page'); // "2"
params.get('missing'); // null
// Check if a key exists
params.has('sort'); // true
params.has('limit'); // false
// Iterate over all parameters
for (const [key, value] of params) {
console.log(`${key}: ${value}`);
}
// q: hello world
// page: 2
// sort: relevance
// Convert back to string
params.toString(); // "q=hello+world&page=2&sort=relevance"
Using with the URL API
// Parse from a full URL
const url = new URL('https://example.com/search?q=javascript&page=2');
const params = url.searchParams;
params.get('q'); // "javascript"
params.get('page'); // "2"
// Modify parameters
params.set('page', '3');
params.append('lang', 'en');
params.delete('q');
url.toString();
// "https://example.com/search?page=3&lang=en"
Legacy Approach (Manual Parsing)
Before URLSearchParams, developers often wrote manual parsers. While you should prefer the built-in API, understanding the manual approach helps when debugging:
// Manual parsing (not recommended for production)
function parseQueryString(qs) {
  // Form encoding uses "+" for spaces; decodeURIComponent does not convert it.
  const decode = (s) => decodeURIComponent(s.replace(/\+/g, ' '));
  return qs
    .replace(/^\?/, '')   // Remove leading ?
    .split('&')           // Split on &
    .filter(Boolean)      // Remove empty strings
    .reduce((params, pair) => {
      const [key, ...rest] = pair.split('=');
      const value = rest.join('=');        // Handle values containing =
      params[decode(key)] = decode(value); // Later duplicates overwrite earlier ones
      return params;
    }, {});
}
parseQueryString('?q=hello%20world&page=2');
// { q: "hello world", page: "2" }
Parsing in Python
Python's standard library provides urllib.parse with several functions for query string handling:
from urllib.parse import parse_qs, parse_qsl, urlencode
# parse_qs returns a dict with lists (handles duplicate keys)
parse_qs('q=hello+world&page=2&color=red&color=blue')
# {'q': ['hello world'], 'page': ['2'], 'color': ['red', 'blue']}
# parse_qsl returns a list of tuples (preserves order and duplicates)
parse_qsl('q=hello+world&page=2&color=red&color=blue')
# [('q', 'hello world'), ('page', '2'), ('color', 'red'), ('color', 'blue')]
# Build a query string from a dict
urlencode({'q': 'hello world', 'page': 2})
# 'q=hello+world&page=2'
# Build with multiple values for the same key
urlencode([('color', 'red'), ('color', 'blue')])
# 'color=red&color=blue'
Notice that Python's parse_qs returns lists for all values, even when a key appears only once. This design prevents bugs when a parameter unexpectedly appears multiple times -- you always get a list, so your code handles both cases consistently.
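The same always-a-list shape can be reproduced in JavaScript, which may help when a frontend and a Python backend need to agree on how duplicates look. The helper name groupParams is an assumption made for this sketch, not a built-in.

```javascript
// Group every key into an array of values, mirroring the shape that
// Python's parse_qs returns. groupParams is a hypothetical helper.
function groupParams(queryString) {
  const grouped = new Map();
  for (const [key, value] of new URLSearchParams(queryString)) {
    if (!grouped.has(key)) grouped.set(key, []);
    grouped.get(key).push(value);
  }
  return grouped;
}

const grouped = groupParams('q=hello+world&page=2&color=red&color=blue');
console.log(grouped.get('q'));     // [ 'hello world' ]
console.log(grouped.get('color')); // [ 'red', 'blue' ]
```

A Map is used instead of a plain object so that parameter names supplied by the client cannot collide with built-in object properties.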
Parsing in Go
Go's net/url package provides robust query string parsing:
package main

import (
	"fmt"
	"log"
	"net/url"
)

func main() {
	// Parse a query string; ParseQuery reports malformed percent-escapes as an error
	values, err := url.ParseQuery("q=hello+world&page=2&color=red&color=blue")
	if err != nil {
		log.Fatal(err)
	}
	// Get a single value (first occurrence)
	fmt.Println(values.Get("q"))     // hello world
	fmt.Println(values.Get("page"))  // 2
	fmt.Println(values.Get("color")) // red (first value only)
	// Get all values for a key
	fmt.Println(values["color"]) // [red blue]
	// Check existence
	_, exists := values["sort"]
	fmt.Println(exists) // false
	// Build a query string (Encode sorts the output by key)
	v := url.Values{}
	v.Set("q", "hello world")
	v.Add("color", "red")
	v.Add("color", "blue")
	fmt.Println(v.Encode()) // color=red&color=blue&q=hello+world
}
Parsing in PHP
PHP has built-in support for query string parsing via parse_str():
<?php
// Parse into an associative array
parse_str('q=hello+world&page=2&sort=relevance', $params);
// $params = ['q' => 'hello world', 'page' => '2', 'sort' => 'relevance']
// PHP automatically handles array syntax
parse_str('color[]=red&color[]=blue', $params);
// $params = ['color' => ['red', 'blue']]
// PHP also handles nested objects
parse_str('user[name]=John&user[age]=30', $params);
// $params = ['user' => ['name' => 'John', 'age' => '30']]
// Build a query string
echo http_build_query(['q' => 'hello world', 'page' => 2]);
// "q=hello+world&page=2"
?>
PHP's handling of bracket syntax (color[]) for arrays is a convention that originated in PHP and is not part of any URL standard. While widely adopted, other languages do not automatically parse this syntax, which can cause interoperability issues.
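The interoperability problem is easy to reproduce. The following sketch feeds a PHP-style query string to JavaScript's URLSearchParams, then shows one possible way a receiver could strip the [] suffix itself; the grouping loop is an illustration, not a standard algorithm.

```javascript
const params = new URLSearchParams('color[]=red&color[]=blue');

// To URLSearchParams the brackets are simply part of the key name.
console.log(params.get('color'));      // null
console.log(params.getAll('color[]')); // [ 'red', 'blue' ]

// One way to recover PHP-style lists: strip a trailing "[]" and group.
const lists = new Map();
for (const [key, value] of params) {
  if (!key.endsWith('[]')) continue;
  const name = key.slice(0, -2);
  if (!lists.has(name)) lists.set(name, []);
  lists.get(name).push(value);
}
console.log(lists.get('color')); // [ 'red', 'blue' ]
```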
Parsing in Ruby
Ruby provides query string parsing through CGI and URI modules, and Rails extends this with additional features:
require 'cgi'
require 'uri'
# Parse a query string
params = CGI.parse('q=hello+world&page=2&color=red&color=blue')
# {"q"=>["hello world"], "page"=>["2"], "color"=>["red", "blue"]}
# Access values
params['q'].first # "hello world"
params['color'] # ["red", "blue"]
# Using URI
uri = URI.parse('https://example.com/search?q=javascript&page=2')
CGI.parse(uri.query)
# {"q"=>["javascript"], "page"=>["2"]}
# Build a query string
URI.encode_www_form([['q', 'hello world'], ['page', 2]])
# "q=hello+world&page=2"
Handling Arrays in Query Strings
There is no single standard for representing arrays in query strings. Different frameworks and APIs use different conventions, and understanding these differences is critical when working across systems:
Repeated Keys (Most Common)
// Convention: repeat the key
?color=red&color=blue&color=green
// JavaScript
const params = new URLSearchParams('color=red&color=blue&color=green');
params.getAll('color'); // ["red", "blue", "green"]
This is the most widely supported approach and works with URLSearchParams, Python's parse_qs, Go's url.Values, and most web frameworks.
Bracket Syntax (PHP Convention)
// Convention: append [] to the key
?color[]=red&color[]=blue&color[]=green
// Supported natively by: PHP, Rails, Express (with qs library)
// NOT supported by: URLSearchParams, Python urllib, Go net/url
Comma-Separated Values
// Convention: comma-separate within one value
?color=red,blue,green
// Must be split manually after parsing:
const params = new URLSearchParams('color=red,blue,green');
const colors = params.get('color').split(',');
// ["red", "blue", "green"]
Indexed Bracket Syntax
// Convention: explicit indices
?color[0]=red&color[1]=blue&color[2]=green
// Supported by: PHP, Rails, some Express configurations
Nested Objects in Query Strings
Some frameworks support encoding nested objects in query strings using bracket notation:
// Bracket notation for nested objects
?user[name]=John&user[email]=john@example.com&user[address][city]=NYC
// Decoded structure:
{
  user: {
    name: "John",
    email: "john@example.com",
    address: {
      city: "NYC"
    }
  }
}
This syntax is supported by PHP, Ruby on Rails, and JavaScript libraries like qs (used by Express.js and often paired with Axios). However, the standard URLSearchParams API does not support this convention. If you need nested object support in JavaScript, use a library like qs:
import qs from 'qs';
// Parse nested objects
qs.parse('user[name]=John&user[address][city]=NYC');
// { user: { name: "John", address: { city: "NYC" } } }
// Stringify nested objects
qs.stringify({ user: { name: "John", address: { city: "NYC" } } });
// "user%5Bname%5D=John&user%5Baddress%5D%5Bcity%5D=NYC"
Query Strings vs Fragment Identifiers
Query strings and fragment identifiers (the part after #) serve different purposes and behave differently:
| Aspect | Query String (?key=val) | Fragment (#section) |
|---|---|---|
| Sent to server | Yes, included in HTTP request | No, client-side only |
| Visible in logs | Yes, appears in server logs | No, never logged by servers |
| Triggers page load | Yes, when changed | No, handled by browser |
| Use case | Server-processed parameters | Client-side state, page anchors |
The privacy implications are significant. Query strings are visible to the server, to any intermediary proxy that can read the request, and often appear in server access logs. Fragment identifiers are not included in the HTTP request the browser sends, so they never reach the server. This is why some flows pass sensitive values such as authentication tokens in the fragment (as the OAuth 2.0 implicit flow does) rather than in the query string. Our JWT Decoder tool uses fragment identifiers for exactly this reason -- tokens are placed in the hash so they never leave the browser.
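The split between the two parts is visible in the URL API. The callback URL and parameter names below are hypothetical, chosen to resemble a token passed in the fragment; the sketch only shows how client-side code can read each part.

```javascript
// Hypothetical callback URL carrying a token in the fragment.
const url = new URL(
  'https://app.example.com/callback?state=xyz#access_token=abc123&token_type=bearer'
);

// url.search is the part a server receives; url.hash stays in the browser.
console.log(url.search); // ?state=xyz
console.log(url.hash);   // #access_token=abc123&token_type=bearer

// Drop the leading "#" and reuse URLSearchParams to read the fragment.
const fragmentParams = new URLSearchParams(url.hash.slice(1));
console.log(fragmentParams.get('access_token'));   // abc123
console.log(url.searchParams.get('access_token')); // null
```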
Security Considerations
Query string parsing introduces several security risks that developers must be aware of:
Injection Attacks
If query parameter values are inserted into HTML without proper escaping, an attacker can inject malicious scripts (Cross-Site Scripting, or XSS):
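// DANGEROUS: Never do this!
document.getElementById('greeting').innerHTML =
  'Hello, ' + params.get('name');
// If the URL is: ?name=<img src=x onerror="alert('XSS')">
// the onerror handler will execute in the user's browser.
// (A literal <script> tag assigned via innerHTML is not executed,
// but event-handler attributes like this one are.)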
// SAFE: Use textContent instead of innerHTML
document.getElementById('greeting').textContent =
'Hello, ' + params.get('name');
// Or use a framework's built-in escaping (React, Vue, etc.)
Parameter Pollution
HTTP Parameter Pollution (HPP) occurs when an attacker adds extra parameters to a URL to override or confuse server-side processing. Different servers handle duplicate parameters differently:
// URL: ?role=user&role=admin
// PHP ($_GET): Uses last value -> role = "admin"
// Flask (Python): Uses first value -> role = "user" (with request.args.get())
// Express: Returns array -> role = ["user", "admin"]
// ASP.NET: Concatenates -> role = "user,admin"
This inconsistency can be exploited if a frontend and backend parse the same query string differently. Always validate and sanitize parameters on the server side, and be explicit about how you handle duplicates.
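One way to be explicit about duplicates is to reject them outright for parameters that must be single-valued. The helper getSingle below is hypothetical, and throwing is only one possible policy; an application might prefer to return an error response instead.

```javascript
// Return the only value for a key, or throw if it is missing or repeated.
// getSingle is a hypothetical helper, not part of URLSearchParams.
function getSingle(params, key) {
  const values = params.getAll(key);
  if (values.length !== 1) {
    throw new Error(`expected exactly one "${key}" parameter, got ${values.length}`);
  }
  return values[0];
}

const clean = new URLSearchParams('role=user');
console.log(getSingle(clean, 'role')); // user

const polluted = new URLSearchParams('role=user&role=admin');
try {
  getSingle(polluted, 'role');
} catch (err) {
  console.log(err.message); // expected exactly one "role" parameter, got 2
}
```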
Length Limits
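While the URI and HTTP specifications set no formal limit on URL length, practical limits exist. URLs under roughly 2,000 characters work in virtually every browser, proxy, and server; beyond that, support varies. Some servers cap the request line at around 8,192 bytes for the entire URL by default. Query strings that carry large payloads (such as serialized JSON or long lists) may be rejected, for example with a 414 URI Too Long response, or truncated along the way.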
For large data, prefer POST requests with a request body rather than stuffing data into query strings. If you must use query parameters for large data, consider compression (like lz-string) and stay well under the 2,000 character practical limit.
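A client can apply this rule mechanically by measuring the URL before sending it. The sketch below is one possible approach: buildRequest, the endpoint, and the parameter names are all hypothetical, and the 2,000-character cutoff is simply the conservative figure used in this section.

```javascript
const MAX_URL_LENGTH = 2000; // conservative cutoff taken from the text above

// Describe a GET request when the URL is short enough, otherwise a POST
// that carries the same form-encoded data in the body.
// buildRequest is a hypothetical helper; it does not send anything.
function buildRequest(endpoint, params) {
  const query = new URLSearchParams(params).toString();
  const url = `${endpoint}?${query}`;
  if (url.length <= MAX_URL_LENGTH) {
    return { method: 'GET', url };
  }
  return {
    method: 'POST',
    url: endpoint,
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body: query,
  };
}

const small = buildRequest('https://example.com/search', { q: 'javascript' });
console.log(small.method); // GET

const large = buildRequest('https://example.com/search', {
  ids: '1234567890,'.repeat(300),
});
console.log(large.method); // POST
```

The server has to accept both request shapes for this to work, which is an assumption of the sketch.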
Sensitive Data Exposure
Query strings should never contain sensitive information like passwords, API keys, or personal identifiers. Query strings are stored in browser history, appear in server logs, may be cached by proxies, and are visible in the Referer header sent to third-party sites. Use POST request bodies, HTTP headers, or cookies for sensitive data instead.
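When sensitive values do end up in a query string, for example in URLs received from third parties, an application can at least avoid copying them into its own logs. The sketch below assumes a hypothetical list of sensitive parameter names and a hypothetical redactUrl helper; it does nothing about browser history or logs kept by other systems.

```javascript
// Hypothetical list of parameter names to mask; adjust for your application.
const SENSITIVE_KEYS = new Set(['password', 'token', 'api_key']);

// Return a copy of the URL that is safer to write to application logs.
function redactUrl(rawUrl) {
  const url = new URL(rawUrl);
  // Snapshot the keys first so the parameters are not mutated mid-iteration.
  for (const key of [...url.searchParams.keys()]) {
    if (SENSITIVE_KEYS.has(key.toLowerCase())) {
      url.searchParams.set(key, 'REDACTED');
    }
  }
  return url.toString();
}

console.log(redactUrl('https://example.com/login?user=ann&password=hunter2'));
// https://example.com/login?user=ann&password=REDACTED
```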
Best Practices
- Use URLSearchParams (or your language's equivalent) instead of manual string parsing.
- Always encode parameter values with encodeURIComponent() when building URLs manually.
- Handle duplicate keys explicitly -- do not assume a key appears only once.
- Validate and sanitize all query parameters on the server side.
- Never insert query parameter values into HTML without escaping.
- Keep query strings under 2,000 characters total for maximum compatibility.
- Use POST bodies for sensitive or large data instead of query strings.
- Document your API's array and nesting conventions explicitly.
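The first two practices can be illustrated with a value that contains reserved characters. The search term and URL below are made up for the example; the point is only to compare an unencoded URL with the two safe ways of building it.

```javascript
const term = 'tom&jerry=1'; // example value containing "&" and "="

// Unencoded: the "&" and "=" inside the value are read as syntax.
const broken = new URL(`https://example.com/search?q=${term}`);
console.log(broken.searchParams.get('q'));     // tom
console.log(broken.searchParams.get('jerry')); // 1

// Manual building with encodeURIComponent keeps the value intact.
const manual = `https://example.com/search?q=${encodeURIComponent(term)}`;
console.log(manual); // https://example.com/search?q=tom%26jerry%3D1

// URLSearchParams encodes automatically and round-trips the value.
const built = new URLSearchParams({ q: term });
console.log(built.toString()); // q=tom%26jerry%3D1
console.log(new URL(manual).searchParams.get('q') === term); // true
```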
Try It Yourself
Experiment with parsing and constructing query strings using our URL Encoder/Decoder tool. Paste any URL into the Parse tab to see its query parameters broken down into individual key-value pairs, or use the Encode/Decode tab to safely encode values for inclusion in query strings.
Further Reading
- WHATWG URLSearchParams
The living standard specification for URLSearchParams interface.
- MDN URLSearchParams
Web API reference for parsing and manipulating URL query strings.
- RFC 3986 — Section 3.4 (Query)
IETF definition of the query component in URI syntax.