Why raw LLM "fetch" tools fail and how ScrapeGraphAI turns Claude into an autonomous scraper.

The Truth: LLM fetchers are still pretty bad at real-world data acquisition
Claude, OpenAI, Gemini: all of them still suffer from the same problem.
Their built-in "fetch URL → extract data" tools break the moment real automation tasks begin.
Dynamic websites, pagination, JavaScript rendering, structured extraction: LLMs can describe scraping but fail at actually doing it. They hallucinate, return empty pages, misunderstand structure, or simply refuse to fetch.
We tested this live.
The experiment: Scraping IBM's partner directory
Target:
https://www.ibm.com/partnerplus/directory/companies?p=1
The request was simple: extract all the company URLs, open each profile, gather the overview, address, telephone, website and proficiencies, and finally assemble everything into an Excel file.
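To make the shape of that request concrete, here is a minimal Python sketch of the pipeline: list the profile URLs on a directory page, fetch each profile, and write one row per company. The `PartnerProfile` field names and the two callables `list_profile_urls` and `fetch_profile` are hypothetical stand-ins for whatever does the real fetching. The sketch writes CSV rather than XLSX only to stay inside the standard library.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from urllib.parse import urlencode

DIRECTORY_URL = "https://www.ibm.com/partnerplus/directory/companies"


@dataclass
class PartnerProfile:
    """One row of the requested output (field names are illustrative)."""
    profile_url: str
    overview: str
    address: str
    telephone: str
    website: str
    proficiencies: str


def page_url(page):
    """Build a directory page URL, mirroring the `?p=1` pattern of the target."""
    return f"{DIRECTORY_URL}?{urlencode({'p': page})}"


def run_pipeline(list_profile_urls, fetch_profile, pages=(1,)):
    """Collect every profile on the given pages and return the rows as CSV text.

    list_profile_urls(page_url) -> iterable of profile URLs   (hypothetical)
    fetch_profile(profile_url)  -> PartnerProfile             (hypothetical)
    """
    rows = []
    for page in pages:
        for url in list_profile_urls(page_url(page)):
            rows.append(fetch_profile(url))

    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=[f.name for f in fields(PartnerProfile)]
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()
```

Keeping the fetchers as injected callables means the same skeleton works whether the data comes from a raw fetch tool or from a scraping engine; only the quality of what comes back changes.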
What happened without a scraping engine
This resulted in a classic LLM meltdown:
- Empty fetches
- Domain restrictions
- Wrong URLs
- Invented data
- 47 companies reported instead of the 30 actually listed on the page
- An Excel file full of hallucinations
LLM fetcher = great talker, terrible scraper.
Original attempt: https://claude.ai/share/bcf1349b-0c87-416c-bb9f-2d1ced848b76
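Several of these failures (wrong count, wrong URLs) can be caught with a cheap sanity check before any profile is opened. The sketch below is an illustration, not something that was part of the experiment. `check_profile_urls` is a hypothetical helper, and the expected count and host are passed in by the caller.

```python
from collections import Counter
from urllib.parse import urlparse


def check_profile_urls(urls, expected_count, allowed_host):
    """Return a list of human-readable problems; an empty list means no red flags."""
    problems = []

    # A count mismatch (e.g. 47 instead of 30) hints at invented entries.
    if len(urls) != expected_count:
        problems.append(f"expected {expected_count} URLs, got {len(urls)}")

    # The same URL appearing twice usually means a parsing or paging mistake.
    duplicates = [url for url, seen in Counter(urls).items() if seen > 1]
    if duplicates:
        problems.append(f"{len(duplicates)} duplicated URL(s)")

    # Profile links should stay on the directory's own host.
    off_host = [url for url in urls if urlparse(url).hostname != allowed_host]
    if off_host:
        problems.append(f"{len(off_host)} URL(s) outside {allowed_host}")

    return problems
```

A check like this does not fix a bad fetcher, but it turns a silent hallucination into a visible error.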
Then We Equipped Claude With a Real Scraping Engine
Same request. Same page.
But this time Claude had access to ScrapeGraphAI.
Result: https://claude.ai/share/b16acfb0-ba07-4116-9e46-3b781724a5b4
The difference was immediate. Claude correctly detected JavaScript-heavy content, extracted all 30 companies from page one, followed each link, pulled accurate structured data, and built a clean Excel file, all without a single hallucination.
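What "JavaScript-heavy" looks like from the outside: the raw HTML contains script tags but almost no visible text, which is why a plain fetch comes back "empty". The following is a rough heuristic sketch using only the standard library. It is not how Claude or ScrapeGraphAI makes this decision, and the `min_visible_chars` threshold is an arbitrary assumption.

```python
from html.parser import HTMLParser


class _PageStats(HTMLParser):
    """Count script tags and visible text characters in an HTML document."""

    def __init__(self):
        super().__init__()
        self.scripts = 0
        self.visible_chars = 0
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
            if tag == "script":
                self.scripts += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.visible_chars += len(data.strip())


def looks_js_rendered(html, min_visible_chars=200):
    """Guess whether a page needs JavaScript to show its real content."""
    stats = _PageStats()
    stats.feed(html)
    stats.close()
    return stats.scripts > 0 and stats.visible_chars < min_visible_chars
```

When a check like this returns true, a plain HTTP fetch is unlikely to contain the data, and the page needs a rendering step.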
Why ScrapeGraphAI Works (While LLM Fetchers Fail)
Because LLMs don't have the infrastructure for scraping at scale. For ScrapeGraphAI, scraping is the main mission.
LLM fetch tools struggle with:
- JavaScript-rendered pages
- Pagination
- Anti-bot logic
- Multistep workflows
- Large-scale crawling
- Consistent structured data extraction
ScrapeGraphAI solves this by performing:
- Real browser-level fetching
- DOM parsing
- Schema validation
- Recursive crawling
- Deduplication logic
- Robust retry mechanisms
The LLM is the brain, ScrapeGraphAI is the arm.
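Two items from that list, schema validation and deduplication, are easy to picture in code. This is a generic sketch of the ideas, not ScrapeGraphAI's implementation. The field names in `REQUIRED_FIELDS` and both helper functions are assumptions made for illustration.

```python
# Field names are illustrative; they mirror the columns requested in the experiment.
REQUIRED_FIELDS = (
    "company", "overview", "address", "telephone", "website", "proficiencies",
)


def validate_record(record):
    """Return the names of required fields that are missing or blank."""
    return [
        name for name in REQUIRED_FIELDS
        if not str(record.get(name, "")).strip()
    ]


def dedupe(records, key="company"):
    """Keep the first record for each key value, ignoring case and outer spaces."""
    seen = set()
    unique = []
    for record in records:
        normalized = str(record.get(key, "")).strip().casefold()
        if normalized in seen:
            continue
        seen.add(normalized)
        unique.append(record)
    return unique
```

Running every extracted record through checks like these is what separates "the model said so" from data you can load into a spreadsheet.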
LLM + ScrapeGraphAI = Agentic Scraping
With ScrapeGraphAI behind it, Claude suddenly acts like a real agent. It navigates pages naturally, following links from one profile to the next without getting lost or confused. When it encounters a page, it extracts clean structured fields exactly as requested, and if something fails, it retries intelligently instead of crashing or hallucinating.
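"Retries intelligently" typically means something like exponential backoff: wait a little after the first failure, longer after the second, and give up loudly after a fixed number of attempts. Here is a minimal sketch of that pattern. The `fetch` callable, the attempt count, and the delays are all assumptions, and this is not a description of ScrapeGraphAI's internals.

```python
import time


def fetch_with_retry(fetch, url, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying on any exception with exponential backoff.

    Waits base_delay, then 2 * base_delay, and so on between attempts.
    `sleep` is injectable so the function can be tested without real waiting.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch(url)
        except Exception as error:  # broad on purpose: this is only a sketch
            last_error = error
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(
        f"giving up on {url} after {attempts} attempts"
    ) from last_error
```

The important part is the final `raise`: a failed page surfaces as an error instead of being papered over with invented data.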
The results speak for themselves: Claude generates Excel files with perfectly organized data, summarizes entire datasets on demand, and continues working seamlessly across multi-page flows. All of this happens without Claude having to drive a slow, heavy browser itself, because ScrapeGraphAI handles the infrastructure while Claude focuses on understanding and organizing the data.
This is true agentic data acquisition.
All 30 companies extracted perfectly with ScrapeGraphAI as the scraping engine
| # | Company | Overview | Address | Telephone | Website | Key Proficiencies |
|---|---|---|---|---|---|---|
| 1 | Crayon Poland Sp. Z o.o. | Global technology player with 47 offices worldwide, HQs in Oslo. One of IBM's larger Business Partners with strong competencies across IBM's software stack. | Zlota 59, Warsaw, Poland | +48 48222782777 | crayon.com/pl-pl | watsonx.ai, watsonx.data, Maximo, Guardium, Instana |
| 2 | Arrow ECS | Mayorista de Soluciones Globales (Global Solutions Distributor) | Avenida de Europa 21, Alcobendas, Madrid, Spain | +34 0685914729 | ibm.com | MQ, Event Automation, QRadar, Guardium, watsonx suite |
| 3 | CAPGEMINI Technology Services | Global leader in partnering with companies to transform business. 55-year heritage with deep industry expertise in cloud, data, AI, connectivity and platforms. | 145-151 Quai du President Roosevelt, Paris, France | +33 1 49673000 | capgemini.com | watsonx.ai, watsonx.data, Cloudability, Turbonomics, Maximo |
| 4 | YCOS Yves Colliard Software GmbH | Since 1989 offering training, consultancy and products on MVS, OS/390 and z/OS platform. | Bienenstr. 2, Euskirchen, Germany | +49 2251 6250090 | ycos.de | z/OS platform, MVS, OS/390, ISV |
| 5 | Prolifics, Inc. | Digital engineering and consulting firm helping navigate digital transformation. Expertise in Data & AI, Integration, Business Automation, DevXOps, Cybersecurity. | Rödingsmarkt 20, Hamburg, Germany | +49 40 89066770 | prolifics.de | Data & AI, Business Automation, DevXOps, Cybersecurity |
| 6 | iSky Development | Founded 2013 in Cairo by Egyptian entrepreneurs. Solutions company serving Europe and Middle East for businesses, governments and non-profits. | Unit 14, Tower 1, Silver Mall, Cairo, Egypt | +20 20238326694 | iskydev.com | Event Automation, MQ, Turbonomics, Instana, API Connect |
| 7 | Deloitte Australia | Industry-leading audit, tax, consulting, financial advisory and risk advisory services. Part of Deloitte Touche Tohmatsu Limited network. | Quay Quarter Tower Level 46, Sydney, Australia | 0293227000 | deloitte.com.au | Cloudability, Turbonomics, Verify, watsonx.ai, Terraform |
| 8 | TECH-HUB | Professional IT Services Provider providing excellent business solutions to increase client revenue and provide competitive edge. | 3 Road 262, New Maadi, Cairo, Egypt | +20 101 6000789 | tech-hub.tech | Turbonomics, Instana, Guardium, watsonx Assistant |
| 9 | Cohesive | Leading Maximo provider with 700+ successful implementations over 25 years. A Bentley brand for asset lifecycle management. | Glenwood Office Park, Pretoria, South Africa | Not available | cohesivesolutions.com | Maximo, TRIRIGA, ELM Suite, QRadar Suite |
| 10 | Jones Lang Lasalle Holding AB | JLL Technologies delivers market-leading technology to power the future of real estate with purpose-built solutions. | Birger Jarlsgatan 25, Stockholm, Sweden | +46 84535000 | jllsweden.se | Maximo, TRIRIGA, Envizi Sustainability |
| 11 | Deloitte Poland (Consulting) | Professional advisory services in audit, tax, economic, risk management, financial and legal advisory. | al. Jana Pawła II 22, Warsaw, Poland | (+)48 (22) 511 08 11 | deloitte.com/pl | Cloudability, Guardium, Verify, watsonx.ai, Terraform |
| 12 | ITALWARE SrL | System integrator supporting Digital Transformation through ICT infrastructure solutions in partnership with major vendors. | Via della Maglianella 65E, Roma, Italy | +39 39 0666411156 | italware.it | Turbonomics, Power hardware, watsonx.ai, Guardium |
| 13 | GBM Dominicana, S.A. | Leading IT services company, solutions integrator and IT expert. Exclusive IBM distributor in Central America, Dominican Republic and Haiti. | John F Kennedy Ave, No. 14, Santo Domingo | +809 566 5161 | gbm.net | Power systems, Turbonomics, Instana, Maximo, watsonx suite |
| 14 | CrushBank Technology, Inc. | Award-winning Data and AI platform using IBM watsonx for faster IT support information access and problem resolution. | 5 Aerial Way, Syosset, New York, USA | 5163776585 | crushbank.com | Data and AI platform, IT support, ISV |
| 15 | Arrow ECS Baltic OÜ | Global value add distributor with strong local teams around IBM technologies supporting customers at every journey stage. | Sõpruse pst 145, Tallinn, Estonia | Not available | arrow.com/globalecs/ee | Full IBM portfolio, watsonx suite, Maximo, Guardium |
| 16 | Cubewise China | Full-service IBM Planning Analytics (TM1) supplier with hundreds of happy customers across six continents. | 虹口区天潼路328号 WeWork, Shanghai, China | +86 4000188803 | cubewise.com | Planning Analytics, watsonx.ai, Cognos Analytics |
| 17 | Cubewise Canada | Largest, most enduring team of IBM Planning Analytics (TM1) specialists in the world dedicated to quality craftsmanship. | 100 King Street W Suite 5700, Toronto, Canada | 857 208 7267 | cubewise.com | Planning Analytics, watsonx.ai, Cognos Analytics |
| 18 | Phoenix Technologies AG | Pioneers sovereign Cloud & AI solutions for large enterprises, governments and public entities in Switzerland. | Alpenstrasse 9, Zug, Switzerland | Not available | phoenix-technologies.ch | AI solutions, Cloud solutions, Sovereign Cloud & AI |
| 19 | CRAYON LITHUANIA, UAB | Global technology player with 45 offices worldwide. One of IBM's larger Business Partners specializing in IBM Software optimization. | 16F G Jasinskio G, Vilnius, Lithuania | Not available | crayon.com | Cloudability, Instana, Turbonomics, watsonx.ai |
| 20 | Intercomputer Bulgaria | Leading system integrator specializing in IT solutions including infrastructure, data analytics, middleware, security, AI/automation. | 593 street, Sofia, Bulgaria | Not available | intercomputer.bg | Event Automation, Power hardware, Maximo, watsonx suite |
| 21 | Crayon Australia | Global technology player with 45 offices worldwide with strong competencies across IBM's software stack. | 44 Lakeview Drive Scoresby, Melbourne, Australia | +61 22891085 | crayon.com/en-au | Event Automation, Cloudability, Turbonomics, watsonx.ai |
| 22 | SHI International Corp. | Passionate about delivering exceptional value helping customers select, deploy, and manage technology at global scale. | 290 Davidson Avenue, Somerset, New Jersey, USA | +1 888 7648888 | shi.com | Event Automation, Instana, Maximo, watsonx suite |
| 23 | Pedab Norway | Dedicated IBM Value Add Distributor and Techbroker with European presence and high focus on IBM offerings for 30+ years. | Stortingsgata 4, Oslo, Norway | +47 476 57 700 | pedab.no | Cloudability, Power hardware, FlashSystems, watsonx.ai |
| 24 | Dun & Bradstreet | D&B Ask Procurement is a GenAI assistant for procurement teams. World's leading source of commercial information with 550M+ business records. | 5335 Gate Pkwy Ste 100, Jacksonville, Florida, USA | Not available | dnb.com | ISV, GenAI assistant for procurement |
| 25 | PERSISTENT SYSTEMS LIMITED | Global services company delivering AI-led Digital Engineering and Enterprise Modernization. 26,000+ employees in 18 countries. | Bhageerath, 402, Senapati Bapat Road, Pune, India | +91 20 30234000 | persistent.com | SevOne, Instana, watsonx Assistant, API Connect |
| 26 | Crayon Deutschland | Global IBM Platinum Business Partner authorized in 30+ countries. Specialized in Software Licensing, SAM, Training and Consulting. | Crayon Deutschland GmbH, Oberhaching, Germany | +49 89 67804650 | crayon.de | Event Automation, Cloudability, Turbonomics, watsonx.ai |
| 27 | Dedagroup SPA | Present with 35 locations in Italy, operating in UK, USA, Mexico and Tunisia. Partners in France, Germany and China. | Via di Spini 50, Trento, Italy | +39 461 997111 | dedagroup.it | Turbonomics, Maximo, Power hardware, watsonx.ai |
| 28 | MACS BV | Software services provider for maintenance, service and facility management. 25 years of solutions across Europe, UK and worldwide. | Museumstraat 8, Antwerpen, Belgium | +32 3 2371755 | macs.eu | Maximo, Envizi, TRIRIGA |
| 29 | Kenac Computer Systems | Zimbabwean company specializing in Enterprise ICT Solutions for Hardware and Software including Sales and support. | 109 Enterprise Road, Highlands, Harare, Zimbabwe | +263 4 0773836664 | kenac.co.zw | Event Automation, MQ, SevOne, Guardium, Power systems |
| 30 | InTTrust SA | Provides high-quality IT services and solutions enhancing collaboration and productivity with current and emerging technologies. | 2 Ipeirou Str, Ag. Paraskevi, Athens, Greece | +30 210 6513040 | inttrust.gr | Event Automation, Maximo, Power systems, watsonx suite |
The final XLSX file contained fully structured data, summary statistics and every corresponding profile URL.
Exactly what scraping is supposed to produce.
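As an illustration of the kind of summary statistics such a file can carry, here is a small sketch computed over three rows copied from the table above. Which statistics the actual XLSX contained is not stated, so the ones chosen here (row count, missing phone numbers, proficiency frequency) are assumptions.

```python
from collections import Counter

# Three rows copied from the results table, trimmed to the columns used here.
ROWS = [
    {"company": "Cohesive", "telephone": "Not available",
     "proficiencies": "Maximo, TRIRIGA, ELM Suite, QRadar Suite"},
    {"company": "Jones Lang Lasalle Holding AB", "telephone": "+46 84535000",
     "proficiencies": "Maximo, TRIRIGA, Envizi Sustainability"},
    {"company": "MACS BV", "telephone": "+32 3 2371755",
     "proficiencies": "Maximo, Envizi, TRIRIGA"},
]


def summarize(rows):
    """Count rows, missing phone numbers, and how often each proficiency appears."""
    proficiency_counts = Counter()
    for row in rows:
        for name in row["proficiencies"].split(","):
            proficiency_counts[name.strip()] += 1
    return {
        "companies": len(rows),
        "missing_telephone": sum(
            row["telephone"] == "Not available" for row in rows
        ),
        "proficiency_counts": proficiency_counts,
    }
```

Because every row follows the same structure, statistics like these are a few lines of code rather than another round of prompting.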
The big lesson: LLMs don't need better fetchers, they need real scraping engines
A fetch function isn't a scraper.
ScrapeGraphAI provides the missing infrastructure.
Give Claude a real scraping engine.
How to Set Up Claude With Scraping Capabilities
If you want Claude to behave exactly like the Scraping Beast described above, here is the exact setup process.
Step 1: Install the MCP Server
Go to the ScrapeGraphAI MCP server page:
https://smithery.ai/server/@ScrapeGraphAI/scrapegraph-mcp
Once you are on the page, scroll to the Auto section. Under the list of supported clients, you will find Claude Desktop.
Step 2: Run the Installation Command
Copy the npx command shown there and execute it in your terminal. This installs the ScrapeGraphAI MCP server locally and makes it visible to Claude Desktop.
Step 3: Restart Claude Desktop
Restart Claude Desktop so it can automatically detect the new MCP server.
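If Claude does not pick up the server after the restart, it helps to check what is actually registered. Claude Desktop keeps its MCP servers in a JSON file named `claude_desktop_config.json`, under a top-level `mcpServers` key; where that file lives depends on your operating system. The sketch below parses such a file and lists the registered server names. The entry name `scrapegraph-mcp` and the elided `args` in the sample are assumptions, since the installer decides what it actually writes.

```python
import json

# Hypothetical sample of what the config might look like after installation.
SAMPLE_CONFIG = """
{
  "mcpServers": {
    "scrapegraph-mcp": {
      "command": "npx",
      "args": ["..."]
    }
  }
}
"""


def registered_mcp_servers(config_text):
    """Return the sorted server names found under the "mcpServers" key."""
    config = json.loads(config_text)
    return sorted(config.get("mcpServers", {}))


def is_registered(config_text, name_fragment="scrapegraph"):
    """True if any registered server name contains the given fragment."""
    return any(
        name_fragment in name.lower()
        for name in registered_mcp_servers(config_text)
    )
```

To use it on a real setup, read your own config file into a string and pass it to `is_registered`. A `False` result means the install step did not write an entry, and restarting again will not help.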
Step 4: Get Your API Key
Head over to ScrapeGraphAI and retrieve your API key.
Step 5: Configure Claude
Open Claude Desktop and ask it to set up your ScrapeGraphAI API key. You can follow the same flow shown in this example chat:
https://claude.ai/share/990c3025-40ec-49c8-b0bb-7c556ac033b1
(The API key shown in that example is no longer active, sorry guys)
Step 6: Start Scraping!
Once the key is configured, Claude instantly becomes a true Scraping Beast, fully equipped with real browserless scraping power and agentic extraction capabilities.
Enjoy your Scraping Beast!
