In this tutorial, we'll demonstrate how to rapidly develop a full-stack AI web application by integrating several powerful tools: Cursor, OpenAI's o1 model, Vercel's v0, ScrapeGraphAI, and Patched. This combination allows for efficient prototyping and deployment of AI-driven applications.
Tools Overview
- Cursor: An AI-enhanced code editor that assists in writing and understanding code.
- OpenAI o1 Model: OpenAI's reasoning-focused model, well suited to multi-step analysis of the data you collect.
- Vercel v0: Vercel's AI-powered UI generator that turns natural-language prompts into React components you can deploy on Vercel.
- ScrapeGraphAI: An AI-powered web scraping tool that simplifies data extraction from websites.
- Patched: A tool for managing and deploying AI agents in production environments.
Step 1: Set Up Your Development Environment
Begin by installing the necessary tools and setting up your development environment.
# Install ScrapeGraphAI and its Python API client
pip install scrapegraphai scrapegraph-py
# Install Playwright for browser automation
pip install playwright
playwright install
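You will also need API keys for ScrapeGraphAI and OpenAI. One convenient pattern is to keep them in environment variables; the names below match the ones used in the deployment steps later in this tutorial:
# Export keys in your shell (or place them in a local .env file)
export SCRAPEGRAPH_API_KEY="your-scrapegraph-api-key"
export OPENAI_API_KEY="your-openai-api-key"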
Step 2: Create a ScrapeGraphAI Pipeline
Use ScrapeGraphAI to extract data from a target website. Here's an example of how to set up a simple scraping pipeline:
from scrapegraph_py import Client
from scrapegraph_py.logger import sgai_logger
sgai_logger.set_logging(level="INFO")
# Initialize the client
sgai_client = Client(api_key="your-api-key-here")
# SmartScraper request
response = sgai_client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract webpage information"
)

# Print the response
print(f"Request ID: {response['request_id']}")
print(f"Result: {response['result']}")
if response.get('reference_urls'):
    print(f"Reference URLs: {response['reference_urls']}")
Step 3: Design Your Frontend with Vercel v0
Vercel v0 lets you quickly generate UI components and layouts from natural language. Describe the interface you want in v0 (v0.dev), for example:
"Create a dashboard with data visualization cards"
v0 responds with React components built on Tailwind CSS and shadcn/ui that you can copy into your project (or add with the install command v0 displays for each generation) and then customize for your application.
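The dashboard component built in Step 6 imports shadcn/ui primitives (Card, Button, Input, Textarea). If your v0 output does not already include them, you can add them with the shadcn CLI (a quick sketch, assuming a Next.js project with shadcn/ui initialized; the exact command depends on your CLI version):
npx shadcn@latest add card button input textarea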
Step 4: Integrate OpenAI o1 Model
Create an AI service that processes the scraped data:
import openai
from typing import Dict, Any

class AIAnalyzer:
    def __init__(self, api_key: str):
        self.client = openai.OpenAI(api_key=api_key)

    def analyze_scraped_data(self, data: Dict[str, Any]) -> str:
        """Analyze scraped data using OpenAI o1 model"""
        prompt = f"""
        Analyze the following scraped data and provide insights:

        Data: {data}

        Please provide:
        1. Key findings
        2. Patterns or trends
        3. Actionable recommendations
        """

        response = self.client.chat.completions.create(
            model="o1-preview",
            messages=[
                {"role": "user", "content": prompt}
            ]
        )

        return response.choices[0].message.content

# Usage: scraped_data is the 'result' field from the ScrapeGraphAI response in Step 2
analyzer = AIAnalyzer(api_key="your-openai-key")
insights = analyzer.analyze_scraped_data(scraped_data)
Step 5: Build the Backend API
Create a FastAPI backend that orchestrates all components:
import os

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import Optional
from scrapegraph_py import Client

# AIAnalyzer is the class defined in Step 4; the module name here is just an example
from ai_analyzer import AIAnalyzer

app = FastAPI(title="AI Web Scraping App")

# Configure clients from environment variables (set in Step 10)
sgai_client = Client(api_key=os.environ["SCRAPEGRAPH_API_KEY"])
analyzer = AIAnalyzer(api_key=os.environ["OPENAI_API_KEY"])

class ScrapeRequest(BaseModel):
    url: str
    prompt: str
    analyze: Optional[bool] = True

class ScrapeResponse(BaseModel):
    data: dict
    analysis: Optional[str] = None
    request_id: str

@app.post("/scrape", response_model=ScrapeResponse)
async def scrape_and_analyze(request: ScrapeRequest):
    """Scrape data and optionally analyze it"""
    try:
        # Step 1: Scrape data
        scrape_response = sgai_client.smartscraper(
            website_url=request.url,
            user_prompt=request.prompt
        )

        analysis = None
        if request.analyze:
            # Step 2: Analyze data with AI
            analysis = analyzer.analyze_scraped_data(scrape_response['result'])

        return ScrapeResponse(
            data=scrape_response['result'],
            analysis=analysis,
            request_id=scrape_response['request_id']
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

@app.get("/health")
async def health_check():
    return {"status": "healthy"}
Step 6: Frontend Implementation with React and Next.js
Create a React frontend that interacts with your API (the component posts to /api/scrape; the proxy configuration that forwards this path to the FastAPI backend follows the component):
// components/ScrapingDashboard.tsx
import React, { useState } from 'react';
import { Card, CardContent, CardHeader, CardTitle } from '@/components/ui/card';
import { Button } from '@/components/ui/button';
import { Input } from '@/components/ui/input';
import { Textarea } from '@/components/ui/textarea';
interface ScrapeData {
data: any;
analysis?: string;
request_id: string;
}
export default function ScrapingDashboard() {
const [url, setUrl] = useState('');
const [prompt, setPrompt] = useState('');
const [loading, setLoading] = useState(false);
const [result, setResult] = useState<ScrapeData | null>(null);
const handleScrape = async () => {
setLoading(true);
try {
const response = await fetch('/api/scrape', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({
url,
prompt,
analyze: true
}),
});
const data = await response.json();
setResult(data);
} catch (error) {
console.error('Scraping failed:', error);
} finally {
setLoading(false);
}
};
return (
<div className="max-w-4xl mx-auto p-6 space-y-6">
<Card>
<CardHeader>
<CardTitle>AI Web Scraper</CardTitle>
</CardHeader>
<CardContent className="space-y-4">
<Input
placeholder="Enter URL to scrape"
value={url}
onChange={(e) => setUrl(e.target.value)}
/>
<Textarea
placeholder="Describe what data you want to extract"
value={prompt}
onChange={(e) => setPrompt(e.target.value)}
rows={3}
/>
<Button
onClick={handleScrape}
disabled={loading || !url || !prompt}
>
{loading ? 'Scraping...' : 'Scrape & Analyze'}
</Button>
</CardContent>
</Card>
{result && (
<div className="grid grid-cols-1 md:grid-cols-2 gap-6">
<Card>
<CardHeader>
<CardTitle>Scraped Data</CardTitle>
</CardHeader>
<CardContent>
<pre className="text-sm overflow-auto max-h-96">
{JSON.stringify(result.data, null, 2)}
</pre>
</CardContent>
</Card>
{result.analysis && (
<Card>
<CardHeader>
<CardTitle>AI Analysis</CardTitle>
</CardHeader>
<CardContent>
<div className="whitespace-pre-wrap text-sm">
{result.analysis}
</div>
</CardContent>
</Card>
)}
</div>
)}
</div>
);
}
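Since the FastAPI backend serves /scrape rather than /api/scrape, a Next.js rewrite can bridge the two paths (a minimal sketch; the BACKEND_URL variable and the localhost default are assumptions you should adjust for your deployment):
// next.config.mjs
/** @type {import('next').NextConfig} */
const nextConfig = {
  async rewrites() {
    return [
      {
        source: '/api/scrape',
        // Forward to the FastAPI backend
        destination: `${process.env.BACKEND_URL ?? 'http://localhost:8000'}/scrape`,
      },
    ];
  },
};

export default nextConfig;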
Step 7: Enhance with Cursor AI
Use Cursor's AI capabilities to improve your code:
- Code Generation: Use Cursor to generate boilerplate code and components
- Error Fixing: Let Cursor help debug and fix issues
- Code Optimization: Get suggestions for improving performance
- Documentation: Generate comments and documentation
// Use Cursor AI to generate utility functions
// Prompt: "Create a utility function to format scraped data for display"
export const formatScrapedData = (data: any): string => {
if (typeof data === 'string') {
return data;
}
if (Array.isArray(data)) {
return data.map(item => formatScrapedData(item)).join('\n');
}
if (typeof data === 'object' && data !== null) {
return Object.entries(data)
.map(([key, value]) => `${key}: ${formatScrapedData(value)}`)
.join('\n');
}
return String(data);
};
Step 8: Deploy with Patched
Use Patched to manage and deploy your AI agents in production:
# patched_config.py
from patched import PatchedAgent
# Create an agent configuration
agent_config = {
"name": "web-scraper-agent",
"description": "AI-powered web scraping agent",
"endpoints": [
{
"path": "/scrape",
"method": "POST",
"handler": "scrape_and_analyze"
}
],
"dependencies": [
"scrapegraphai",
"openai",
"fastapi"
],
"environment": {
"SCRAPEGRAPH_API_KEY": "${SCRAPEGRAPH_API_KEY}",
"OPENAI_API_KEY": "${OPENAI_API_KEY}"
}
}
# Deploy the agent
agent = PatchedAgent(config=agent_config)
agent.deploy()
Step 9: Add Advanced Features
Real-time Updates with WebSockets
from fastapi import WebSocket, WebSocketDisconnect
import json

@app.websocket("/ws")
async def websocket_endpoint(websocket: WebSocket):
    await websocket.accept()
    while True:
        try:
            # Receive scraping request via WebSocket
            data = await websocket.receive_text()
            request_data = json.loads(data)

            # Process scraping request (process_scrape_request wraps the
            # scrape-and-analyze logic from Step 5)
            result = await process_scrape_request(request_data)

            # Send results back
            await websocket.send_text(json.dumps(result))
        except WebSocketDisconnect:
            break
        except Exception as e:
            await websocket.send_text(json.dumps({"error": str(e)}))
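On the client side, the same request shape can be sent over the socket (a minimal sketch; the host, port, and message fields mirror the endpoint above and should be adjusted for your deployment):
// lib/scrapeSocket.ts
export function openScrapeSocket(onResult: (result: unknown) => void) {
  const socket = new WebSocket('ws://localhost:8000/ws');

  // Forward every message from the server to the caller
  socket.onmessage = (event) => {
    onResult(JSON.parse(event.data));
  };

  return {
    scrape: (url: string, prompt: string) =>
      socket.send(JSON.stringify({ url, prompt, analyze: true })),
    close: () => socket.close(),
  };
}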
Data Visualization
// components/DataVisualization.tsx
import { BarChart, Bar, XAxis, YAxis, CartesianGrid, Tooltip, Legend } from 'recharts';
interface DataVisualizationProps {
data: any[];
}
export function DataVisualization({ data }: DataVisualizationProps) {
const processedData = data.map((item, index) => ({
name: item.name || `Item ${index + 1}`,
value: parseFloat(item.value) || 0
}));
return (
<BarChart width={600} height={300} data={processedData}>
<CartesianGrid strokeDasharray="3 3" />
<XAxis dataKey="name" />
<YAxis />
<Tooltip />
<Legend />
<Bar dataKey="value" fill="#8884d8" />
</BarChart>
);
}
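You can then render the chart inside the results grid of ScrapingDashboard, where result is already known to be non-null (this assumes the scraped result is an array of objects with name and value fields; adapt it to your data shape):
{Array.isArray(result.data) && (
  <DataVisualization data={result.data} />
)}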
Step 10: Production Deployment
Deploy your application using Vercel:
# Install Vercel CLI
npm install -g vercel
# Deploy frontend
vercel --prod
# Set environment variables
vercel env add SCRAPEGRAPH_API_KEY
vercel env add OPENAI_API_KEY
For the backend, use a service like Railway, Heroku, or AWS:
# Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
Best Practices and Tips
Error Handling
import logging
from functools import wraps

from fastapi import HTTPException

def handle_errors(func):
    @wraps(func)
    async def wrapper(*args, **kwargs):
        try:
            return await func(*args, **kwargs)
        except Exception as e:
            logging.error(f"Error in {func.__name__}: {str(e)}")
            raise HTTPException(
                status_code=500,
                detail=f"Internal server error: {str(e)}"
            )
    return wrapper

@app.post("/scrape")
@handle_errors
async def scrape_endpoint(request: ScrapeRequest):
    # Your scraping logic here
    pass
Rate Limiting
from fastapi import Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.post("/scrape")
@limiter.limit("10/minute")
async def scrape_endpoint(request: Request, scrape_request: ScrapeRequest):
    # Rate-limited scraping logic
    pass
Caching
import hashlib
import time

# Simple in-memory TTL cache; swap in Redis or similar for production
_cache = {}  # cache_key -> (timestamp, result)
CACHE_TTL_SECONDS = 3600

def get_cached_scrape_result(url: str, prompt: str):
    """Cache scraping results to avoid duplicate requests"""
    cache_key = hashlib.md5(f"{url}:{prompt}".encode()).hexdigest()

    # Check the cache first and honour the TTL (1 hour)
    cached = _cache.get(cache_key)
    if cached and time.time() - cached[0] < CACHE_TTL_SECONDS:
        return cached[1]

    # If not cached (or expired), perform scraping
    # (perform_scraping wraps the SmartScraper call from Step 2)
    result = perform_scraping(url, prompt)

    # Cache the result
    _cache[cache_key] = (time.time(), result)
    return result
Monitoring and Analytics
import time

from fastapi import Request
from prometheus_client import Counter, Histogram, start_http_server

# Metrics
scrape_requests = Counter('scrape_requests_total', 'Total scrape requests')
scrape_duration = Histogram('scrape_duration_seconds', 'Scrape request duration')

@app.middleware("http")
async def add_metrics(request: Request, call_next):
    start_time = time.time()
    response = await call_next(request)

    # Record metrics
    scrape_requests.inc()
    scrape_duration.observe(time.time() - start_time)

    return response

# Expose metrics on a separate port
start_http_server(8001)
Conclusion
This tutorial demonstrates how to build a comprehensive full-stack AI web application using modern tools and services. The combination of ScrapeGraphAI for data extraction, OpenAI for analysis, and various deployment tools creates a powerful platform for AI-driven web applications.
Key takeaways:
- Rapid prototyping with AI-assisted development tools
- Modular architecture allowing for easy scaling and maintenance
- Production-ready deployment with proper error handling and monitoring
- AI integration throughout the stack for intelligent data processing
Related Resources
Want to learn more about building AI applications and web scraping? Check out these guides:
- Web Scraping 101 - Master the basics of web scraping
- AI Agent Web Scraping - Advanced AI-powered scraping techniques
- Mastering ScrapeGraphAI - Complete guide to ScrapeGraphAI
- Building Intelligent Agents - AI agent development
- Scraping with Python - Python web scraping techniques
- Web Scraping Legality - Legal considerations
These resources will help you build sophisticated AI applications and become proficient in modern web scraping techniques.