ScrapeGraphAI

Multi-Agent Systems: LangGraph, LlamaIndex & CrewAI

Written by Marco Vinciguerra

If you're building autonomous agents for data extraction, reasoning, or task automation, and you're looking to scale them intelligently, this tutorial is for you.


🧠 Why Multi-Agent Systems?

Multi-agent systems are AI architectures composed of independent agents that:

  • Specialize in different tasks (e.g., reasoning, data extraction, summarization)
  • Communicate with each other in a structured flow
  • Act autonomously or cooperatively to solve complex problems

When combined with LangGraph's graph-based orchestration, LlamaIndex's semantic memory and retrieval, and CrewAI's task delegation and personas, you get a production-ready AI system that can:

✅ Ingest knowledge
✅ Route intelligently between agents
✅ Act on structured/unstructured data
✅ Generate human-like outputs

Learn more about AI Agent Web Scraping and building intelligent agents with ScrapeGraph for practical applications.


🧰 Stack Overview

  • LangGraph: Graph-based control flow for agents
  • LlamaIndex: Long-term memory & retrieval-augmented generation
  • CrewAI: Define agent roles, tasks, tools, and collaboration
  • LangChain: LLM wrappers, tools, and memory integration
  • ScrapeGraphAI (optional): Agent-powered web data pipelines

🔧 Prerequisites

pip install langgraph crewai llama-index langchain openai scrapegraph-py

Make sure you have an LLM set up (like OpenAI or Ollama).
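Before running any of the snippets below, it helps to fail fast if no credentials are configured. A minimal sketch, assuming you use OpenAI with the standard OPENAI_API_KEY environment variable (the helper name check_llm_config is illustrative; if you run a local Ollama model instead, a server URL replaces the key):

```python
import os

def check_llm_config():
    # Fail early with a clear message rather than deep inside an agent run.
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError(
            "Set OPENAI_API_KEY before running the agent examples, e.g.\n"
            "  export OPENAI_API_KEY=sk-..."
        )
    return True
```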


1. 🔗 Create CrewAI Agents

from crewai import Agent, Task, Crew

researcher = Agent(
    role="Researcher",
    goal="Collect and summarize the latest market trends",
    backstory="An expert at scraping and summarizing complex data",
)

analyst = Agent(
    role="Analyst",
    goal="Draw insights from summarized content",
    backstory="A strategic thinker who uses structured data to generate recommendations",
)
 

2. 🧠 Load LlamaIndex for Context Memory

# llama-index >= 0.10 moved the core imports under llama_index.core
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data/").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

You can plug this query engine into your agents to give them RAG capabilities.
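The "plug the query engine into an agent" pattern boils down to retrieving context first and prepending it to the agent's prompt. Here is a sketch of that pattern using a stub in place of a real LlamaIndex engine; StubQueryEngine and build_agent_prompt are illustrative names, not LlamaIndex or CrewAI APIs:

```python
class StubQueryEngine:
    """Stands in for index.as_query_engine(); a real engine does semantic
    retrieval, while this stub just keyword-matches for illustration."""

    def __init__(self, corpus):
        self.corpus = corpus

    def query(self, question):
        hits = [doc for doc in self.corpus
                if any(word in doc.lower() for word in question.lower().split())]
        return " ".join(hits) or "No relevant context found."

def build_agent_prompt(question, engine):
    # Prepend retrieved context so the agent answers with grounding.
    context = engine.query(question)
    return f"Context:\n{context}\n\nQuestion: {question}"

engine = StubQueryEngine(["AI adoption grew in 2024.", "Cloud spend is flat."])
prompt = build_agent_prompt("What happened with AI adoption?", engine)
```

With a real index, you would pass the result of index.as_query_engine() in place of the stub.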


3. 🕸️ Design the LangGraph Flow

from langgraph.graph import StateGraph

def collect_data(state):
    # The researcher agent gathers raw content from the web
    content = researcher.run("Scrape AI trends from the web")
    return {**state, "content": content}

def analyze_data(state):
    # Pull relevant context from the index, then let the analyst reason over it
    summary = query_engine.query(state["content"])
    insights = analyst.run(f"Analyze the summary: {summary}")
    return {"output": insights}

workflow = StateGraph(dict)
workflow.add_node("Scrape", collect_data)
workflow.add_node("Analyze", analyze_data)
workflow.set_entry_point("Scrape")
workflow.add_edge("Scrape", "Analyze")
workflow.set_finish_point("Analyze")

graph = workflow.compile()
result = graph.invoke({})
print(result["output"])
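To see what compile/invoke does conceptually, here is a plain-Python simulation of the same two-node flow: each node is a function from state to state, executed along the edges. This mimics LangGraph's behavior with the standard library only; it is not LangGraph's internal implementation.

```python
# Stub node functions standing in for the agent-backed ones above.
def scrape(state):
    return {**state, "content": "raw AI trend articles"}

def analyze(state):
    return {**state, "output": f"insights from: {state['content']}"}

nodes = {"Scrape": scrape, "Analyze": analyze}
edges = {"Scrape": "Analyze", "Analyze": None}  # None marks the finish point

def invoke(entry, state):
    # Walk the graph from the entry point, threading state through each node.
    current = entry
    while current is not None:
        state = nodes[current](state)
        current = edges[current]
    return state

result = invoke("Scrape", {})
```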

⚡ Bonus: Add Dynamic Agent Routing

def router(state):
    # Route based on the incoming request text
    if "financial" in state.get("input", "").lower():
        return "FinanceAgent"
    return "GeneralAgent"

# The Router node just passes state through; the routing decision
# lives on its conditional edges
workflow.add_node("Router", lambda state: state)

workflow.add_node("FinanceAgent", analyze_data)
workflow.add_node("GeneralAgent", collect_data)

workflow.add_conditional_edges("Router", router, {
    "FinanceAgent": "FinanceAgent",
    "GeneralAgent": "GeneralAgent"
})
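Because the router is a pure function of the state, you can sanity-check it without compiling a graph at all. A small sketch (the route function mirrors the router logic above; the sample inputs are made up):

```python
def route(state):
    # Same keyword check as the graph's router.
    if "financial" in state.get("input", "").lower():
        return "FinanceAgent"
    return "GeneralAgent"

examples = [
    {"input": "Summarize the financial outlook for Q3"},
    {"input": "Scrape the latest AI news"},
    {},  # no input at all falls back to the general agent
]
routes = [route(s) for s in examples]
```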

🔄 Real-Time Use Case: Web Data Extraction with ScrapeGraphAI


πŸ› οΈ How to Add ScrapeGraphAI as a Tool in Your Python Agents

You can also use the official ScrapeGraphAI Python client to interact with the ScrapeGraph API directly:

from scrapegraph_py import Client
from scrapegraph_py.logger import sgai_logger
 
# Set logging level to INFO
sgai_logger.set_logging(level="INFO")
 
# Initialize the client with your API key (keep your key secret!)
sgai_client = Client(api_key="sgai-XXXX-XXXX-XXXX-XXXXXXXXXXXX")
 
try:
    # Make a SmartScraper request
    response = sgai_client.smartscraper(
        website_url="https://example.com",
        user_prompt="Extract webpage information"
    )
 
    # Print the response data
    print(f"Request ID: {response['request_id']}")
    print(f"Result: {response['result']}")
 
    # Optional: print reference URLs if available
    if response.get('reference_urls'):
        print("Reference URLs:")
        for url in response['reference_urls']:
            print(f" - {url}")
 
finally:
    # Close the client session
    sgai_client.close()

Note: Replace "sgai-XXXX-XXXX-XXXX-XXXXXXXXXXXX" with your actual ScrapeGraphAI API key, and never expose your API key publicly.
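When exposing a scrape call as an agent tool, it is worth catching failures so one bad URL does not crash the whole crew. A hedged sketch of that wrapper, with a stub standing in for scrapegraph_py's Client (scrape_tool and StubScrapeClient are illustrative names, not library APIs):

```python
class StubScrapeClient:
    """Mimics the shape of a smartscraper response for this sketch."""

    def smartscraper(self, website_url, user_prompt):
        if not website_url.startswith("http"):
            raise ValueError(f"invalid URL: {website_url}")
        return {"request_id": "req-123", "result": {"url": website_url}}

def scrape_tool(client, url, prompt):
    try:
        response = client.smartscraper(website_url=url, user_prompt=prompt)
        return response["result"]
    except Exception as exc:
        # Surface a structured error instead of raising into the agent loop.
        return {"error": str(exc)}

client = StubScrapeClient()
ok = scrape_tool(client, "https://example.com", "Extract webpage information")
bad = scrape_tool(client, "not-a-url", "Extract webpage information")
```

With the real client, you would pass the sgai_client instance from the example above in place of the stub.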


📊 Applications

  • Market and trend research pipelines (Researcher + Analyst crews)
  • RAG-powered Q&A and document search over private corpora
  • Autonomous web data extraction and monitoring with ScrapeGraphAI


🎯 Key Takeaways

  • Use LangGraph for orchestration
  • Use CrewAI for multi-agent collaboration
  • Use LlamaIndex for contextual memory and document search
  • Use ScrapeGraphAI for autonomous data extraction

🔚 Conclusion

Combining LangGraph's orchestration, CrewAI's role-based collaboration, LlamaIndex's retrieval, and ScrapeGraphAI's data pipelines gives you a modular foundation for autonomous AI systems. This is where AI infrastructure is headed.


🔗 Ready to build?

Start your agent system today at scrapegraph.ai
→ or ask me below to generate a GitHub-ready template 🚀


❓ Frequently Asked Questions (FAQs)

What is LangGraph used for?

LangGraph is a framework for defining stateful AI workflows as graphs. It enables complex reasoning paths, branching logic, and persistent memory flows between agents.


What is the difference between CrewAI and LangGraph?

  • CrewAI manages agent definitions, tasks, and interactions.
  • LangGraph handles the orchestration and flow between those agents, allowing dynamic routing and state transitions.

Using both together provides a flexible and powerful system for building AI agents that collaborate and reason over multiple steps.


How does LlamaIndex improve my agents?

LlamaIndex provides long-term memory and context-aware retrieval, allowing agents to access large corpora of documents and use relevant context in real-time. Perfect for RAG-based systems.


Can I use this for web scraping?

Yes! You can integrate ScrapeGraphAI as an agent tool or standalone service to autonomously scrape, structure, and query data from the web.


Is this production ready?

Yes. With proper error handling and LLM token/latency optimization, this stack can be used in production to power intelligent assistants, dashboards, search agents, and more.
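On the error-handling point: LLM and scraper calls fail transiently, so a retry wrapper with exponential backoff is a common hardening step. A minimal stdlib-only sketch (with_retries is an illustrative helper, not part of any of the libraries above):

```python
import time

def with_retries(fn, attempts=3, base_delay=0.1, sleep=time.sleep):
    # Retry a flaky zero-argument callable, doubling the delay each time.
    last_error = None
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as exc:
            last_error = exc
            sleep(base_delay * (2 ** attempt))  # 0.1s, 0.2s, 0.4s, ...
    raise last_error
```

You would wrap the graph invocation or a tool call, e.g. with_retries(lambda: graph.invoke({})), tuning attempts and delays to your provider's rate limits.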

