Claude Web Fetch Tool: Complete Guide for AI Agents and Web Data

Marco Vinciguerra

Anthropic's web fetch tool lets Claude retrieve and analyze full content from web pages and PDFs directly through the API. It's a powerful addition to Claude's tool ecosystem—but it's not a replacement for production-grade web scraping. If you're building AI agents that need reliable, structured data extraction at scale, understanding the boundaries of this tool matters.

Here's everything you need to know about the Claude web fetch tool: how it works, how to use it, and when to reach for ScrapeGraphAI instead.

What Is the Claude Web Fetch Tool?

The web fetch tool is a server-side capability that allows Claude to retrieve content from URLs you provide. When you add it to your API request, Claude can:

  1. Decide when to fetch based on your prompt and available URLs
  2. Retrieve full text content from the specified URL
  3. Extract text from PDFs automatically
  4. Analyze the content and respond with optional citations

The tool is currently in beta. You enable it by adding the web-fetch-2025-09-10 beta header to your API requests. It's available on Claude Opus 4.6, Opus 4.5, Opus 4.1, Opus 4, Sonnet 4.5, Sonnet 4, Haiku 4.5, and deprecated models like Sonnet 3.7 and Haiku 3.5.

Important limitation: The web fetch tool does not support JavaScript-rendered pages. If your target site loads content dynamically, you'll get the initial HTML—not the rendered DOM. For dynamic sites, you need a real browser or a scraping engine like ScrapeGraphAI.

How to Use the Web Fetch Tool

Basic API Setup

curl https://api.anthropic.com/v1/messages \
    --header "x-api-key: $ANTHROPIC_API_KEY" \
    --header "anthropic-version: 2023-06-01" \
    --header "anthropic-beta: web-fetch-2025-09-10" \
    --header "content-type: application/json" \
    --data '{
        "model": "claude-opus-4-6",
        "max_tokens": 1024,
        "messages": [
            {
                "role": "user",
                "content": "Please analyze the content at https://example.com/article"
            }
        ],
        "tools": [{
            "type": "web_fetch_20250910",
            "name": "web_fetch",
            "max_uses": 5
        }]
    }'

Tool Parameters

The web fetch tool supports several optional parameters for control and security:

  • max_uses: Limit the number of fetches per request (e.g., 5, 10)
  • allowed_domains: Restrict fetches to specific domains (e.g., ["example.com", "docs.example.com"])
  • blocked_domains: Block specific domains from being fetched
  • citations: Enable {"enabled": true} for cited passages in responses
  • max_content_tokens: Cap content length to control token usage (e.g., 100000)

Domain rules: Use example.com without the scheme. Subdomains are included automatically. You can use either allowed_domains or blocked_domains, but not both in the same request.
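Putting these parameters together, a request that restricts fetching to a single domain, enables citations, and caps fetched content might look like the sketch below (the anthropic Python SDK is assumed; the URL, domain list, and token cap are placeholders):

import anthropic

client = anthropic.Anthropic()

# Sketch: fetches are limited to example.com (and its subdomains), capped at
# 3 uses, with fetched content truncated to roughly 50,000 tokens.
response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Summarize https://example.com/article"
        }
    ],
    tools=[{
        "type": "web_fetch_20250910",
        "name": "web_fetch",
        "max_uses": 3,
        "allowed_domains": ["example.com"],
        "citations": {"enabled": True},
        "max_content_tokens": 50000
    }],
    extra_headers={"anthropic-beta": "web-fetch-2025-09-10"}
)
print(response.content)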

Combining Web Fetch with Web Search

Web fetch works well alongside web search for research workflows. Claude can search first, then fetch and analyze the most relevant results:

import anthropic
 
client = anthropic.Anthropic()
 
response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=4096,
    messages=[
        {
            "role": "user",
            "content": "Find recent articles about quantum computing and analyze the most relevant one in detail"
        }
    ],
    tools=[
        {"type": "web_search_20250305", "name": "web_search", "max_uses": 3},
        {
            "type": "web_fetch_20250910",
            "name": "web_fetch",
            "max_uses": 5,
            "citations": {"enabled": True}
        }
    ],
    extra_headers={"anthropic-beta": "web-fetch-2025-09-10"}
)

URL Validation and Security

For security, Claude can only fetch URLs that have already appeared in the conversation:

  • URLs in user messages
  • URLs in client-side tool results
  • URLs from previous web search or web fetch results

Claude cannot fetch arbitrary URLs it invents or URLs from container-based tools (Code Execution, Bash, etc.). This reduces data exfiltration risk but also limits flexibility.

Data exfiltration considerations: If you process untrusted input alongside sensitive data, Anthropic recommends disabling the web fetch tool, using max_uses, or restricting allowed_domains. Be aware of homograph attacks—Unicode characters in domain names can bypass filters (e.g., аmazon.com with Cyrillic 'а').
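The homograph risk can be screened on the client side before a URL ever enters the conversation. The helper below is purely illustrative (it is not part of the Anthropic API); it simply rejects hostnames containing non-ASCII characters:

from urllib.parse import urlparse

def hostname_is_ascii(url: str) -> bool:
    # Reject hostnames with non-ASCII characters, which may be homograph
    # lookalikes (e.g., the Cyrillic 'а' in аmazon.com).
    host = urlparse(url).hostname or ""
    return host.isascii()

# The second URL uses a Cyrillic 'а' and is rejected.
for url in ["https://amazon.com/deals", "https://аmazon.com/deals"]:
    print(url, "->", "ok" if hostname_is_ascii(url) else "rejected")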

When the Web Fetch Tool Shines

The web fetch tool is well-suited for:

  • Single-page analysis — Summarizing articles, extracting key points, answering questions about a document
  • PDF text extraction — Research papers, reports, documentation
  • Research workflows — Combined with web search: discover → fetch → analyze
  • Citation-backed answers — When you need traceable sources
  • Static content — Server-rendered HTML, plain text, PDFs

It's free beyond standard token costs. There are no extra charges for web fetch requests—you only pay for the tokens consumed by the fetched content in context.

When ScrapeGraphAI Is the Better Choice

We've tested Claude's built-in fetch capabilities extensively. As documented in "How we turned Claude into a beast machine for web scraping," raw LLM fetch tools break down on real-world automation:

  • JavaScript-rendered pages — Web fetch gets initial HTML, not the rendered DOM
  • Structured extraction — Web fetch returns raw content; you need schema validation and consistent JSON
  • Pagination and crawling — No built-in support for multi-page workflows
  • Production scale — Token-based costs are unpredictable; credit-based pricing scales better
  • Dynamic sites — E-commerce, SPAs, and modern web apps often require browser-level fetching

ScrapeGraphAI provides:

  • Schema-validated output — Pydantic models, JSON schemas, guaranteed structure
  • JavaScript rendering — Real browser execution for dynamic content
  • Crawling and pagination — SmartCrawler, multi-page extraction, recursive workflows
  • Predictable pricing — Credits per request, regardless of page size
  • MCP integration — Use ScrapeGraphAI as a tool inside Claude or other agents

If you're building AI agents that need reliable, scalable web data extraction, ScrapeGraphAI works better as a tool. For one-off analysis and research, the web fetch tool is often enough.
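For comparison, schema-validated extraction with the scrapegraph-py SDK looks roughly like the sketch below. The Article model is made up for illustration, and parameter names such as website_url, user_prompt, and output_schema should be checked against the current SDK docs:

# Sketch of schema-validated extraction with scrapegraph-py (parameter
# names assumed from the SDK docs; verify before relying on them).
from pydantic import BaseModel
from scrapegraph_py import Client

class Article(BaseModel):
    title: str
    author: str
    summary: str

client = Client(api_key="sgai-your-api-key")  # placeholder key

result = client.smartscraper(
    website_url="https://example.com/article",
    user_prompt="Extract the title, author, and a short summary",
    output_schema=Article,
)
print(result)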

Error Handling

When web fetch fails, the API returns a 200 response with an error in the body:

{
  "type": "web_fetch_tool_result",
  "tool_use_id": "srvtoolu_a93jad",
  "content": {
    "type": "web_fetch_tool_error",
    "error_code": "url_not_accessible"
  }
}

Common error codes:

  • invalid_input: Invalid URL format
  • url_too_long: URL exceeds 250 characters
  • url_not_allowed: Blocked by domain rules or model restrictions
  • url_not_accessible: HTTP error, fetch failed
  • too_many_requests: Rate limit exceeded
  • unsupported_content_type: Only text and PDF supported
  • max_uses_exceeded: Exceeded max_uses limit
  • unavailable: Internal error
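In practice you'll want to scan the response for these error blocks and decide whether to retry. Below is a minimal sketch that assumes the block shapes mirror the JSON above; exact attribute names may differ between SDK versions:

import anthropic

# Error codes worth retrying versus fixing the request itself.
RETRYABLE = {"too_many_requests", "unavailable"}

def collect_fetch_errors(response) -> list[str]:
    # Walk the content blocks looking for web_fetch_tool_result blocks
    # whose content is a web_fetch_tool_error.
    errors = []
    for block in response.content:
        if getattr(block, "type", None) == "web_fetch_tool_result":
            result = getattr(block, "content", None)
            if getattr(result, "type", None) == "web_fetch_tool_error":
                errors.append(result.error_code)
    return errors

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize https://example.com/article"}],
    tools=[{"type": "web_fetch_20250910", "name": "web_fetch", "max_uses": 2}],
    extra_headers={"anthropic-beta": "web-fetch-2025-09-10"},
)

for code in collect_fetch_errors(response):
    action = "retry later" if code in RETRYABLE else "fix the URL or domain rules"
    print(f"web fetch error: {code} ({action})")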

Usage and Pricing

Web fetch has no additional charges beyond standard token costs. Example token usage:

  • Average web page (10KB): ~2,500 tokens
  • Large documentation page (100KB): ~25,000 tokens
  • Research paper PDF (500KB): ~125,000 tokens

Use max_content_tokens to cap how much content is included and avoid unexpectedly large token bills.
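A quick back-of-envelope budget helps when deciding where to set that cap. The sketch below uses the roughly 4-bytes-per-token ratio implied by the figures above (a heuristic, not an official number) and shows where max_content_tokens sits in the tool definition:

# Rough token estimate: ~4 bytes of fetched content per token.
def estimated_tokens(page_size_bytes: int) -> int:
    return page_size_bytes // 4

for label, size in [("average page", 10_000), ("docs page", 100_000), ("research PDF", 500_000)]:
    print(f"{label}: ~{estimated_tokens(size):,} tokens")

# Cap fetched content at 100,000 tokens in the tool definition to bound cost.
web_fetch_tool = {
    "type": "web_fetch_20250910",
    "name": "web_fetch",
    "max_uses": 5,
    "max_content_tokens": 100_000,
}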

Conclusion

The Claude web fetch tool is a useful addition for research, analysis, and single-page content retrieval. It works well with web search, supports PDFs, and costs nothing extra beyond tokens. For static content and one-off tasks, it's often sufficient.

For production AI agents that need structured data extraction, JavaScript rendering, crawling, or predictable costs, ScrapeGraphAI remains the better tool. Use web fetch for exploration and analysis; use ScrapeGraphAI when you need reliable, scalable extraction.

Try ScrapeGraphAI free with 50 credits and see how it compares for your use case.
