
What are Request Headers?

Last updated: Apr 5, 2025

Definition

Request headers are key-value pairs sent alongside every HTTP request that provide metadata about the client and the desired response. They tell the server what software is making the request, what content formats are acceptable, what language the user prefers, and whether the request carries authentication credentials. Properly configured headers are critical for successful web scraping.
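
As a minimal sketch of what "key-value pairs" means in practice, the snippet below attaches two headers to a request object using Python's standard library. The URL is a placeholder and nothing is sent over the network; the point is only to show headers as named string values carried with the request.

```python
from urllib.request import Request

# Placeholder URL (an assumption for illustration); the request is built but never sent.
request = Request(
    "https://example.com/products",
    headers={
        "Accept": "text/html",
        "Accept-Language": "en-US,en;q=0.9",
    },
)

# urllib stores header names via str.capitalize(), so "Accept-Language"
# is kept internally as "Accept-language".
for name, value in request.header_items():
    print(f"{name}: {value}")
```

HTTP header names are case-insensitive, so the capitalization change made by the library does not alter the meaning of the request.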

Essential Headers for Web Scraping

User-Agent

Identifies the client software. Using a realistic browser User-Agent instead of a library default is the most basic step in avoiding bot detection.
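
To illustrate what a "library default" looks like, the sketch below inspects the User-Agent that Python's built-in urllib would send and then swaps in a browser-style value. The browser string is only an illustrative example and may be out of date; in practice it should be copied from a current browser.

```python
import urllib.request

# A freshly built opener carries only the library's own identification.
opener = urllib.request.build_opener()
default_headers = dict(opener.addheaders)
print(default_headers)  # the version suffix depends on the Python release

# Illustrative browser-style value (an assumption, not a recommendation of
# this exact string).
BROWSER_USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/124.0.0.0 Safari/537.36"
)

# Replace the default so requests made through this opener use the new value.
opener.addheaders = [("User-Agent", BROWSER_USER_AGENT)]
```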

Accept

Specifies which content types the client can handle. Browsers send complex Accept headers like text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8, where each q value ranks how strongly a type is preferred. A missing or oversimplified Accept header, such as a bare */*, can mark a request as non-browser traffic.
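
The structure of an Accept value is easier to see once it is split into media types and their q weights. The helper below is a simplified sketch (the function name `parse_accept` is made up for this example, and it ignores parameters other than q); a type with no q value defaults to 1.0.

```python
def parse_accept(header: str) -> list[tuple[str, float]]:
    """Split an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        pieces = [piece.strip() for piece in part.split(";")]
        media_type = pieces[0]
        if not media_type:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 1.0  # fall back to the default on a malformed q
        entries.append((media_type, quality))
    # sorted() is stable, so types with equal q keep their original order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


browser_accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
for media_type, quality in parse_accept(browser_accept):
    print(media_type, quality)
```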

Accept-Language

Indicates preferred languages. A request that claims to come from a US-based Chrome browser but does not list en-US in Accept-Language is inconsistent, and that mismatch can serve as a detection signal.

Accept-Encoding

Tells the server which compression formats are supported. Browsers typically include gzip, deflate, br. Omitting this header means responses may arrive uncompressed, which wastes bandwidth, and it can look suspicious because browsers send it on every request.
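
The bandwidth point can be illustrated with a small, self-contained sketch that simulates a gzip-compressed response body and decodes it. The helper name `decode_body` and the sample markup are assumptions made for this example. One practical caveat: a client should only advertise encodings it can actually decode, and Python's standard library does not ship a Brotli (br) decoder, so br needs a third-party package.

```python
import gzip


def decode_body(body: bytes, content_encoding: str) -> bytes:
    """Undo the compression named in a response's Content-Encoding header."""
    encoding = content_encoding.strip().lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding in ("", "identity"):
        return body  # the server sent the body uncompressed
    raise ValueError(f"unsupported Content-Encoding: {content_encoding!r}")


# Simulated server side: repetitive HTML compresses very well.
page = ("<li class='item'>Example product row</li>\n" * 200).encode("utf-8")
compressed = gzip.compress(page)

print(f"{len(page)} bytes uncompressed, {len(compressed)} bytes gzipped")
```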

Referer

Indicates the page that linked to the current request. Many sites check this header: a direct request to a deep page that carries no Referer from the same domain can be flagged.
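
A same-domain check of this kind might look like the sketch below. The function name `same_host` and the URLs are hypothetical, and real sites may apply stricter or looser rules than a plain hostname comparison.

```python
from urllib.parse import urlsplit


def same_host(target_url: str, referer: str) -> bool:
    """Return True when the Referer points at the same hostname as the target."""
    target_host = urlsplit(target_url).hostname
    referer_host = urlsplit(referer).hostname
    return target_host is not None and target_host == referer_host


deep_page = "https://example.com/category/shoes/item-42"

print(same_host(deep_page, "https://example.com/category/shoes"))
print(same_host(deep_page, "https://other.example.org/links"))
print(same_host(deep_page, ""))  # no Referer at all
```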

Cookie

Carries session cookies for authenticated access. See session management for details.

Why Headers Matter

Anti-bot systems build a profile from the complete set of request headers. It is not enough to set a valid User-Agent if the rest of the headers are missing or inconsistent. The full header set must present a coherent browser identity.
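
The idea of a coherent header set can be sketched as a simple checker. The rules below are purely illustrative assumptions chosen for this example; they are not how any particular anti-bot product works, and real systems weigh many more signals.

```python
# Headers that browsers send on ordinary page loads (per the list above).
EXPECTED = ("User-Agent", "Accept", "Accept-Language", "Accept-Encoding")


def find_inconsistencies(headers: dict[str, str]) -> list[str]:
    """Return human-readable problems found in a header set (illustrative rules)."""
    # Header names are case-insensitive, so compare on lowercase keys.
    normalized = {name.lower(): value for name, value in headers.items()}

    problems = [
        f"missing {name}" for name in EXPECTED if name.lower() not in normalized
    ]

    # A browser-style User-Agent paired with a bare */* Accept is a mismatch.
    user_agent = normalized.get("user-agent", "")
    if "Mozilla/5.0" in user_agent and normalized.get("accept", "") == "*/*":
        problems.append("browser User-Agent with a library-style Accept value")

    return problems


partial = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)", "Accept": "*/*"}
for problem in find_inconsistencies(partial):
    print(problem)
```

Setting only the User-Agent, as in `partial`, leaves several gaps that a profile-based check could pick up.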

Header Ordering

Some sophisticated detection systems even check header ordering. Different browsers send headers in different orders, and HTTP libraries often use a non-browser default order.
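
One way to control ordering on the client side is to arrange headers against a template before handing them to the HTTP library. The sketch below relies on Python dictionaries preserving insertion order; the `reorder` helper and the template order are assumptions for illustration and do not reproduce any specific browser's order. Whether that order survives onto the wire depends on the HTTP library in use.

```python
def reorder(headers: dict[str, str], template: list[str]) -> dict[str, str]:
    """Return headers arranged in template order; unlisted headers go last."""
    # Index by lowercase name so matching is case-insensitive.
    lookup = {name.lower(): (name, value) for name, value in headers.items()}

    ordered: dict[str, str] = {}
    for name in template:
        if name.lower() in lookup:
            original_name, value = lookup.pop(name.lower())
            ordered[original_name] = value

    # Anything not mentioned in the template keeps its relative order at the end.
    for original_name, value in lookup.values():
        ordered[original_name] = value
    return ordered


# Hypothetical template order, not taken from a real browser capture.
TEMPLATE = ["User-Agent", "Accept", "Accept-Encoding", "Accept-Language"]

unordered = {
    "Accept-Language": "en-US,en;q=0.9",
    "X-Extra": "1",
    "Accept": "text/html",
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64)",
}
print(list(reorder(unordered, TEMPLATE)))
```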

Headers in ScrapeGraphAI

ScrapeGraphAI automatically constructs complete, consistent header sets that match real browser profiles. Each request includes properly ordered headers with coherent values, removing one of the more tedious and error-prone aspects of building reliable scrapers.