TL;DR
LiteLLM is an LLM gateway: one OpenAI-compatible endpoint in front of 100+ models, with fallbacks and spend tracking. ScrapeGraphAI plugs into it as an MCP server, so any model routed through the proxy can scrape, search, crawl, and extract live web data.
- Install: pip install 'litellm[proxy]'.
- Keys: SGAI_API_KEY for scraping, plus a model provider key like OPENAI_API_KEY.
- Wire it up: add ScrapeGraphAI under mcp_servers in your LiteLLM config.
- Run: litellm --config config.yaml and call the proxy at http://localhost:4000.
Why route scraping through LiteLLM
LiteLLM sits between your app and the models. You point your code at one endpoint, and the proxy handles routing, fallbacks, retries, and cost tracking across providers. It's the layer a lot of teams standardize on so they don't rewrite client code every time they switch models.
That same gateway can expose tools over the Model Context Protocol. By registering ScrapeGraphAI as an MCP server on the proxy, every model behind LiteLLM gains web access without any per-model wiring. Swap GPT-4o for Claude or a local model and the scraping tools come along unchanged.
Installation
Install LiteLLM with proxy support:
pip install 'litellm[proxy]'
Set your keys, one for the model provider and one for ScrapeGraphAI:
export OPENAI_API_KEY="your-openai-key"
export SGAI_API_KEY="your-scrapegraph-key"
Configuration
Register ScrapeGraphAI as an MCP server in your LiteLLM proxy config:
# config.yaml
mcp_servers:
  scrapegraph:
    url: "https://smithery.ai/api/mcp/scrapegraph-mcp"
    description: "Smart scraping, web crawling, search scraping, and agentic scraping workflows."
The url points at the hosted ScrapeGraphAI MCP server. The proxy advertises its tools to every model that connects through it.
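The config above only registers the MCP server. What follows is a minimal sketch of a fuller config.yaml generated from Python, assuming the proxy also needs a model_list entry for whichever model name the client requests (here gpt-4o, backed by openai/gpt-4o and the OPENAI_API_KEY variable set earlier). The helper names render_config and write_config are illustrative and not part of LiteLLM; check the model_list and mcp_servers fields against the LiteLLM documentation for your version.

```python
from pathlib import Path

# Hosted ScrapeGraphAI MCP endpoint, copied from the config in this article.
SCRAPEGRAPH_MCP_URL = "https://smithery.ai/api/mcp/scrapegraph-mcp"


def render_config(model_name: str = "gpt-4o",
                  provider_model: str = "openai/gpt-4o",
                  key_env: str = "OPENAI_API_KEY",
                  mcp_url: str = SCRAPEGRAPH_MCP_URL) -> str:
    """Return config.yaml text with one model and the ScrapeGraphAI MCP server.

    Hypothetical helper: the model_list entry is an assumption added for
    illustration; the mcp_servers entry mirrors the article's config.
    """
    lines = [
        "model_list:",
        f"  - model_name: {model_name}",
        "    litellm_params:",
        f"      model: {provider_model}",
        # LiteLLM reads the key from the environment at startup.
        f"      api_key: os.environ/{key_env}",
        "mcp_servers:",
        "  scrapegraph:",
        f'    url: "{mcp_url}"',
        '    description: "Smart scraping, web crawling, search scraping, '
        'and agentic scraping workflows."',
    ]
    return "\n".join(lines) + "\n"


def write_config(path: str = "config.yaml") -> Path:
    """Write the rendered config to disk and return its path."""
    target = Path(path)
    target.write_text(render_config(), encoding="utf-8")
    return target
```

Calling write_config() drops a config.yaml in the working directory. Generating the file this way keeps the model name in the config and the model name in client code in one place, which matters once you start swapping models.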
Running the proxy
Start LiteLLM with the config:
litellm --config config.yaml
The proxy comes up on http://localhost:4000, OpenAI-compatible. Point any OpenAI client at that base URL and the scraping tools are available to the model.
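Before sending requests, it can help to confirm that both keys are set and that the proxy is answering. The sketch below uses only the standard library. It assumes the proxy serves a liveness check at /health/liveliness; treat that path, and the helper names missing_keys, health_url, and proxy_is_up, as assumptions to verify against your LiteLLM version.

```python
import os
import urllib.request
from typing import Mapping, Optional, Sequence

# The two variables exported in the Installation step.
REQUIRED_KEYS = ("OPENAI_API_KEY", "SGAI_API_KEY")


def missing_keys(env: Optional[Mapping[str, str]] = None,
                 required: Sequence[str] = REQUIRED_KEYS) -> list:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


def health_url(base_url: str = "http://localhost:4000") -> str:
    """Build the liveness URL (assumed path: /health/liveliness)."""
    return base_url.rstrip("/") + "/health/liveliness"


def proxy_is_up(base_url: str = "http://localhost:4000",
                timeout: float = 3.0) -> bool:
    """Return True if the assumed health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, refused connections, and timeouts are all
        # OSError subclasses, so any of them counts as "not up".
        return False
```

A typical use is to print missing_keys() and proxy_is_up() once after starting the proxy, and only then point a client at it.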
Using it
Once the proxy is running, models can call the scraping tools on their own. Send a request through the gateway and let the model decide when to reach for smartscraper, searchscraper, crawl, or extract:
from openai import OpenAI
# Point the standard OpenAI client at the LiteLLM proxy instead of api.openai.com.
# The api_key is a placeholder; use a real proxy key if yours enforces auth.
client = OpenAI(base_url="http://localhost:4000", api_key="anything")
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": "Scrape https://example.com and return the page title and main heading as JSON.",
        }
    ],
)
print(response.choices[0].message.content)
The model sees the ScrapeGraphAI tools the proxy exposes, picks the right one, runs it against the URL, and folds the result back into its answer. The same call works against any model LiteLLM can route to.
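To make the "same call, any model" point concrete, here is a small sketch that sends one prompt to several model names through the same client. The helpers ask and ask_each are hypothetical names introduced for illustration; they take the client as an argument so the logic does not depend on how it was constructed.

```python
from typing import Any, Dict, Iterable


def ask(client: Any, model: str, prompt: str) -> str:
    """Send one user message through the proxy and return the text reply."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    # content can be None (for example when a reply holds only tool calls),
    # so fall back to an empty string.
    return response.choices[0].message.content or ""


def ask_each(client: Any, models: Iterable[str], prompt: str) -> Dict[str, str]:
    """Run the same prompt against several model names, keyed by model."""
    return {model: ask(client, model, prompt) for model in models}
```

With the proxy running, ask_each(OpenAI(base_url="http://localhost:4000", api_key="anything"), ["gpt-4o", "claude-sonnet"], prompt) would return one answer per model. Here claude-sonnet stands in for whatever second model name your proxy config defines.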
Wrapping up
LiteLLM gives you one stable endpoint in front of every model; ScrapeGraphAI gives that endpoint the web. Register it once as an MCP server, set two keys, and start the proxy. From then on, switching models is a config change, and web access follows automatically.