ScrapeGraphAI Docker MCP Server: Complete Setup Guide

Marco Vinciguerra

The Model Context Protocol (MCP) has revolutionized how AI assistants interact with external tools and data sources. If you're using Claude Desktop or other MCP-compatible tools, you can now leverage ScrapeGraphAI's powerful web scraping capabilities directly through a Docker container.

In this comprehensive tutorial, I'll walk you through setting up the ScrapeGraphAI MCP Server using Docker, configuring it with Claude Desktop, and using all its powerful tools for web scraping, data extraction, and content conversion.

What is the ScrapeGraphAI MCP Server?

The ScrapeGraphAI MCP Server is a Dockerized Model Context Protocol server that provides AI assistants like Claude with direct access to ScrapeGraphAI's web scraping tools. Instead of writing code or making API calls manually, you can simply ask Claude to scrape websites, extract data, or convert pages to markdown—all through natural language conversations.

The Docker container (mcp/scrapegraph) runs as an MCP server that exposes five powerful tools:

  • smartscraper: Extract structured data from webpages using AI
  • searchscraper: Perform AI-powered web searches with structured results
  • markdownify: Convert webpages to clean, formatted markdown
  • smartcrawler_initiate: Start intelligent multi-page web crawling operations
  • smartcrawler_fetch_results: Retrieve results from crawling operations

Prerequisites

Before we begin, make sure you have:

  1. Docker installed: Download Docker Desktop from docker.com
  2. Claude Desktop (optional but recommended): Get it from Anthropic's website
  3. ScrapeGraphAI API Key: Sign up at dashboard.scrapegraphai.com and get your API key

Step 1: Pull the Docker Image

First, let's pull the ScrapeGraphAI MCP Server image from Docker Hub:

docker pull mcp/scrapegraph

This will download the latest version of the image (approximately 106 MB). Once complete, you can verify it's available:

docker images | grep scrapegraph

You should see output like:

mcp/scrapegraph    latest    9005c47bd...    4 months ago    106MB

Step 2: Configure Claude Desktop

To use the MCP server with Claude Desktop, you need to configure it in Claude's settings. The configuration file location varies by operating system:

macOS:

~/Library/Application Support/Claude/claude_desktop_config.json

Windows:

%APPDATA%\Claude\claude_desktop_config.json

Linux:

~/.config/Claude/claude_desktop_config.json
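
If the file doesn't exist yet, you can create it from the terminal (macOS path shown; adjust for your OS):

mkdir -p "$HOME/Library/Application Support/Claude"
touch "$HOME/Library/Application Support/Claude/claude_desktop_config.json"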

Create or edit this file with the following configuration:

{
  "mcpServers": {
    "scrapegraph": {
      "command": "docker",
      "args": [
        "run",
        "-i",
        "--rm",
        "-e",
        "SGAI_API_KEY",
        "mcp/scrapegraph"
      ],
      "env": {
        "SGAI_API_KEY": "YOUR_SGAI_API_KEY_HERE"
      }
    }
  }
}

Important Security Note: Replace YOUR_SGAI_API_KEY_HERE with your actual ScrapeGraphAI API key. For better security, you can also set it as an environment variable:

{
  "mcpServers": {
    "scrapegraph": {
      "command": "docker",
      "args": [
        "run",
        "-i",
        "--rm",
        "-e",
        "SGAI_API_KEY",
        "mcp/scrapegraph"
      ],
      "env": {
        "SGAI_API_KEY": "${SGAI_API_KEY}"
      }
    }
  }
}

Then set the environment variable in your system before launching Claude Desktop.
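
For example, on macOS or Linux:

export SGAI_API_KEY="your_api_key_here"
open -a "Claude"   # macOS; on Linux, launch Claude Desktop from this same shell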

After saving the configuration file, restart Claude Desktop for the changes to take effect.

Step 3: Verify the Connection

Once Claude Desktop restarts, you can verify that the MCP server is connected. In a new conversation, try asking Claude:

What MCP tools do you have available?

Claude should list the five ScrapeGraphAI tools. If you don't see them, check:

  1. The Docker daemon is running
  2. The configuration file is in the correct location
  3. The JSON syntax is valid (you can use JSONLint or the command below to verify)
  4. You've restarted Claude Desktop after making changes
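
A quick way to validate the JSON from the terminal (macOS path shown):

python3 -m json.tool "$HOME/Library/Application Support/Claude/claude_desktop_config.json"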

Using the Tools

Now that everything is set up, let's explore how to use each tool through natural language conversations with Claude.

SmartScraper: Extract Structured Data

SmartScraper is perfect when you need to extract specific information from a webpage. Just describe what you want, and Claude will use the smartscraper tool to get it for you.

Example conversation:

You: Can you extract the product name, price, and description from 
https://example.com/product?

Claude: I'll extract that information for you using SmartScraper.

[Claude uses the smartscraper tool]

Here's the extracted data:
- Product Name: Example Widget
- Price: $29.99
- Description: A high-quality widget perfect for everyday use...

Advanced usage: You can also ask Claude to extract data from multiple pages or handle dynamic content:

You: Extract all job listings from https://company.com/careers, including 
title, location, and salary for each position.
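
Behind the scenes, Claude translates your request into the tool's arguments. A plausible smartscraper call looks roughly like this (parameter names follow the ScrapeGraphAI API, but treat them as an illustration; Claude sees the exact schema from the server):

{
  "user_prompt": "Extract all job listings with title, location, and salary",
  "website_url": "https://company.com/careers"
}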

SearchScraper: AI-Powered Web Search

SearchScraper lets Claude perform web searches and extract structured information from multiple sources automatically.

Example conversation:

You: Search for the latest information about renewable energy trends in 2024 
and extract the key findings.

Claude: I'll search for that information and extract the key findings.

[Claude uses the searchscraper tool]

Here are the key findings from my search:
1. Solar energy costs have decreased by 40%...
2. Wind power capacity has increased significantly...

By default, SearchScraper searches 3 websites, but you can ask Claude to adjust this:

You: Search 5 websites for information about Python web scraping best practices.

Markdownify: Convert Webpages to Markdown

Markdownify converts any webpage into clean, readable markdown format—perfect for documentation, content migration, or reading purposes.

Example conversation:

You: Convert https://docs.example.com/getting-started to markdown format.

Claude: I'll convert that page to markdown for you.

[Claude uses the markdownify tool]

# Getting Started

Welcome to our documentation...

[Markdown content continues]

This is especially useful for:

  • Converting documentation to markdown for your own docs
  • Creating backups of important web content
  • Migrating content between systems
  • Improving readability of web articles

SmartCrawler: Multi-Page Web Crawling

SmartCrawler is powerful for extracting data from multiple pages on a website. The process involves two steps: initiating the crawl, then fetching results.

Example conversation:

You: Crawl https://blog.example.com and extract all article titles, 
authors, and publication dates. Limit it to the first 10 articles.

Claude: I'll initiate a SmartCrawler operation to extract article information.

[Claude uses smartcrawler_initiate]
[Claude waits for completion]
[Claude uses smartcrawler_fetch_results]

Here are the articles I found:
1. Title: "Introduction to Web Scraping"
   Author: Jane Doe
   Published: 2024-11-15

2. Title: "Advanced Data Extraction"
   Author: John Smith
   Published: 2024-11-10

[... continues with remaining articles]

SmartCrawler supports two modes:

  1. AI Extraction Mode (10 credits per page): Extracts structured data based on your prompt
  2. Markdown Mode (2 credits per page): Converts pages to markdown format

You can control the crawling behavior:

  • depth: Maximum link traversal depth
  • max_pages: Maximum number of pages to crawl
  • same_domain_only: Whether to stay within the same domain
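
Putting these together, a smartcrawler_initiate call might look roughly like this (the url and prompt argument names are assumptions for illustration; the three options above come straight from the tool):

{
  "url": "https://blog.example.com",
  "prompt": "Extract the title, author, and publication date of each article",
  "depth": 2,
  "max_pages": 10,
  "same_domain_only": true
}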

Running the Container Manually

While using it with Claude Desktop is convenient, you can also run the MCP server manually for testing or integration with other tools.

Basic Usage

docker run -i --rm -e SGAI_API_KEY=your_api_key_here mcp/scrapegraph

Using Environment Variables

For better security, use environment variables:

export SGAI_API_KEY="your_api_key_here"
docker run -i --rm -e SGAI_API_KEY mcp/scrapegraph

Persistent Container (Development)

For development purposes, you might want to keep the container running. Keep stdin open with -i, since the server communicates over stdio and exits when stdin closes:

docker run -d -i --name scrapegraph-mcp \
  -e SGAI_API_KEY=your_api_key_here \
  mcp/scrapegraph

Then attach to it:

docker attach scrapegraph-mcp
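
When you're finished, stop and remove the container:

docker stop scrapegraph-mcp
docker rm scrapegraph-mcp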

Advanced Configuration

Using Docker Compose

For easier management, you can create a docker-compose.yml file:

version: '3.8'
 
services:
  scrapegraph-mcp:
    image: mcp/scrapegraph:latest
    container_name: scrapegraph-mcp
    environment:
      - SGAI_API_KEY=${SGAI_API_KEY}
    stdin_open: true
    tty: true
    restart: unless-stopped

Then run (on older Docker installs, use docker-compose instead of docker compose):

docker compose up -d

Environment Variable Files

For better security, use a .env file:

# .env file
SGAI_API_KEY=your_api_key_here

And reference it in docker-compose:

env_file:
  - .env

Or when running docker directly:

docker run -i --rm --env-file .env mcp/scrapegraph

Troubleshooting Common Issues

Issue: Claude Desktop Can't Connect to the MCP Server

Solutions:

  1. Verify Docker is running: docker ps should work without errors
  2. Test the image manually: docker run -i --rm -e SGAI_API_KEY=test mcp/scrapegraph
  3. Check the configuration file JSON syntax is valid
  4. Ensure the configuration file is in the correct location for your OS
  5. Restart Claude Desktop completely

Issue: "API Key Not Found" Error

Solutions:

  1. Verify your API key is correctly set in the configuration
  2. Check that the environment variable is accessible
  3. Test your API key at the ScrapeGraphAI dashboard
  4. Ensure there are no extra spaces or quotes around the API key

Issue: Docker Container Fails to Start

Solutions:

  1. Check Docker logs: docker logs <container_id>
  2. Verify you have the latest image: docker pull mcp/scrapegraph
  3. Ensure you have sufficient disk space: docker system df
  4. Try removing old containers: docker system prune

Issue: Tools Not Appearing in Claude

Solutions:

  1. Restart Claude Desktop completely (quit and reopen)
  2. Check that the MCP server appears in Claude's connection status
  3. Verify the Docker command works manually
  4. Check Claude Desktop's logs for errors (see the command below)
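
On macOS, for example, Claude Desktop writes MCP logs under ~/Library/Logs/Claude; you can follow them with:

tail -f ~/Library/Logs/Claude/mcp*.log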

Best Practices

Security

  1. Never commit API keys: Use environment variables or secure secret management
  2. Use Docker secrets: Prefer Docker secrets over plain environment variables for production deployments
  3. Limit container access: Run containers with minimal required permissions
  4. Regular updates: Keep the Docker image updated: docker pull mcp/scrapegraph

Performance

  1. Resource limits: Set appropriate CPU and memory limits for Docker containers (see the example after this list)
  2. Network configuration: Use Docker networks for secure communication
  3. Credits management: Monitor your ScrapeGraphAI usage to stay within budget
  4. Caching: Consider caching results for repeated requests
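
For example, CPU and memory limits can be applied directly on the docker run command (the 512 MB / half-CPU values here are arbitrary starting points):

docker run -i --rm \
  --memory=512m \
  --cpus=0.5 \
  -e SGAI_API_KEY \
  mcp/scrapegraph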

Usage Tips

  1. Be specific in prompts: Clearer prompts yield better extraction results
  2. Batch similar requests: Group related scraping tasks together
  3. Monitor credit usage: Track your API usage in the dashboard
  4. Test with small datasets first: Verify your approach before large crawls

Real-World Use Cases

Content Aggregation

You: Crawl https://tech-news.com and extract all article headlines and 
summaries from the front page, then format them as a daily digest.

Market Research

You: Search for information about competitor pricing in the SaaS space 
and extract pricing tiers, features, and target markets.

Documentation Migration

You: Convert all pages from https://old-docs.example.com/api to markdown 
format so I can migrate them to our new documentation system.

Lead Generation

You: Extract company information including name, contact email, and 
industry from https://directory.example.com/companies.

SEO Monitoring

You: Extract meta titles, descriptions, and H1 tags from 
https://competitor.com/blog to analyze their SEO strategy.

Integration with Other Tools

While Claude Desktop is the primary use case, the MCP server can integrate with other MCP-compatible tools:

  • Cline: VS Code extension with MCP support
  • Continue: IDE extension for code completion
  • Custom MCP clients: Build your own integrations

Check the Model Context Protocol documentation for more information about MCP client development.

Understanding Costs

ScrapeGraphAI uses a credit-based pricing system:

  • SmartScraper: 10 credits per request
  • SearchScraper: 10 credits per website searched (default 3 = 30 credits)
  • Markdownify: 2 credits per page
  • SmartCrawler (AI mode): 10 credits per page crawled
  • SmartCrawler (Markdown mode): 2 credits per page crawled
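
For example, a SmartCrawler run over 10 pages costs 10 × 10 = 100 credits in AI mode but only 10 × 2 = 20 credits in Markdown mode, and a default SearchScraper query (3 websites) costs 30 credits.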

Monitor your usage in the dashboard and adjust your scraping strategies accordingly.

Frequently Asked Questions

Can I use this without Claude Desktop?

Yes! The Docker container implements the MCP protocol, so any MCP-compatible client can use it. You can also use it programmatically by implementing an MCP client.
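
As a minimal sketch of a programmatic client using the official MCP Python SDK (pip install mcp; the website_url argument name is an assumption, so list the tools first to see the real schema):

import asyncio
import os

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Launch the Docker container as a stdio MCP server, passing the API key through.
server = StdioServerParameters(
    command="docker",
    args=["run", "-i", "--rm", "-e", "SGAI_API_KEY", "mcp/scrapegraph"],
    env={
        "SGAI_API_KEY": os.environ["SGAI_API_KEY"],
        "PATH": os.environ["PATH"],  # so the subprocess can find `docker`
    },
)

async def main():
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print("Available tools:", [t.name for t in tools.tools])
            # Assumed argument name; check the schema printed above.
            result = await session.call_tool(
                "markdownify", {"website_url": "https://example.com"}
            )
            print(result.content)

asyncio.run(main())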

How do I update to the latest version?

Simply pull the latest image:

docker pull mcp/scrapegraph

Then restart Claude Desktop or your MCP client.

Is my data secure?

The Docker container runs locally on your machine. Requests flow from your machine → Docker container → ScrapeGraphAI API, and results flow back through the same path.

Your API key is stored in your local configuration and never leaves your machine (except to authenticate with ScrapeGraphAI).

Can I use multiple API keys?

Yes, you can run multiple instances with different API keys by giving them different names in the configuration:

{
  "mcpServers": {
    "scrapegraph": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "-e", "SGAI_API_KEY", "mcp/scrapegraph"],
      "env": {
        "SGAI_API_KEY": "key1"
      }
    },
    "scrapegraph-alt": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "-e", "SGAI_API_KEY", "mcp/scrapegraph"],
      "env": {
        "SGAI_API_KEY": "key2"
      }
    }
  }
}

What happens if I exceed my credit limit?

The API will return an error. Monitor your usage in the dashboard and upgrade your plan if needed.

Can I customize the Docker image?

Yes! The source code is available at github.com/ScrapeGraphAI/scrapegraph-mcp. You can build a custom image with your modifications.

Is there a way to cache results?

The MCP server itself doesn't cache, but you can implement caching at the Claude Desktop level or build a custom MCP client with caching capabilities.

Conclusion

The ScrapeGraphAI Docker MCP Server brings powerful web scraping capabilities directly to AI assistants like Claude. By following this tutorial, you've learned how to:

  • Set up the Docker container
  • Configure Claude Desktop to use the MCP server
  • Use all five available tools through natural language
  • Troubleshoot common issues
  • Apply best practices for security and performance

The combination of Docker's containerization and MCP's protocol makes this a secure, portable, and powerful solution for AI-powered web scraping. Whether you're aggregating content, conducting market research, or migrating documentation, this setup gives Claude the tools it needs to help you work with web data efficiently.

Start experimenting with different scraping tasks and discover how much more productive you can be when Claude can access the web directly through ScrapeGraphAI!
