Scraping Zillow Real Estate Data with ScrapeGraphAI: A Complete Guide

Scraping Zillow Real Estate Data with ScrapeGraphAI

In today's competitive real estate market, having access to accurate, real-time data is essential for making informed decisions. In this article, we'll show you how to extract property data from Zillow using ScrapeGraphAI. This approach allows you to monitor market trends, analyze property pricing, and enrich your website with unique, data-driven content.

Why Scrape Zillow?

Zillow is one of the most popular real estate platforms, offering comprehensive property details including:

  • Sale Price: Track market trends and price fluctuations.
  • Property Details: Get information on bedrooms, bathrooms, square footage, and property type.
  • Real Estate Agents: Identify which agencies are handling listings.
  • Direct Listing Links: Access direct URLs for more detailed property views.

These insights are valuable for real estate professionals, market analysts, and content creators aiming to boost SEO with fresh, relevant content.

Getting Started

Before you start, ensure you have:

  1. Python 3.8 or later installed.
  2. The ScrapeGraphAI SDK installed via: pip install scrapegraph-py
  3. An API key from the ScrapeGraphAI Dashboard.
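
The examples in this guide show a masked key written directly in the code. A minimal sketch of reading the key from the environment instead is shown here; the variable name `SGAI_API_KEY` and the `load_api_key` helper are assumptions made for illustration, not something this guide prescribes.

```python
import os


def load_api_key(env_var: str = "SGAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The default variable name is an assumption made for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return key
```

The returned value can then be passed as `Client(api_key=load_api_key())` in place of the literal string, which keeps the key out of version control.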

Code Examples: Scraping Zillow Data

The following examples show how to scrape property listings from Zillow, specifically houses in Miami, FL, using Python, JavaScript, and cURL:

Python Example

python
from scrapegraph_py import Client
from scrapegraph_py.logger import sgai_logger

sgai_logger.set_logging(level="INFO")

# Initialize the client
sgai_client = Client(api_key="sgai-********************")
# SmartScraper request
response = sgai_client.smartscraper(
    website_url="https://www.zillow.com/miami-fl/?searchQueryState=%7B%22pagination%22%3A%7B%7D%2C%22isMapVisible%22%3Atrue%2C%22mapBounds%22%3A%7B%22west%22%3A-80.38463926220705%2C%22east%22%3A-79.98295164013673%2C%22south%22%3A25.667985313542392%2C%22north%22%3A25.935352623241528%7D%2C%22regionSelection%22%3A%5B%7B%22regionId%22%3A12700%2C%22regionType%22%3A6%7D%5D%2C%22filterState%22%3A%7B%22sort%22%3A%7B%22value%22%3A%22globalrelevanceex%22%7D%7D%2C%22isListVisible%22%3Atrue%2C%22mapZoom%22%3A12%7D",
    user_prompt="extract me all the houses"
)

# Print the response
print(f"Request ID: {response['request_id']}")
print(f"Result: {response['result']}")

sgai_client.close()

JavaScript Example

javascript
import { Client } from 'scrapegraph-js';
import { z } from 'zod';

// Define the schema
const houseSchema = z.object({
  address: z.string(),
  price: z.string(),
  bedrooms: z.union([z.string(), z.number()]),
  bathrooms: z.union([z.string(), z.number()]),
  square_feet: z.union([z.string(), z.number()]),
  type: z.string(),
  agent: z.string(),
  link: z.string().optional()
});

// Initialize the client
const sgai_client = new Client("sgai-********************");

try {
  const response = await sgai_client.smartscraper({
    websiteUrl: "https://www.zillow.com/miami-fl/",
    userPrompt: "extract me all the houses",
    outputSchema: houseSchema
  });

  console.log('Request ID:', response.requestId);
  console.log('Result:', response.result);
} catch (error) {
  console.error(error);
} finally {
  sgai_client.close();
}

cURL Example

bash
curl -X 'POST' \
  'https://api.scrapegraphai.com/v1/smartscraper' \
  -H 'accept: application/json' \
  -H 'SGAI-APIKEY: sgai-********************' \
  -H 'Content-Type: application/json' \
  -d '{
  "website_url": "https://www.zillow.com/miami-fl/",
  "user_prompt": "extract me all the houses",
  "output_schema": {
    "type": "object",
    "properties": {
      "address": { "type": "string" },
      "price": { "type": "string" },
      "bedrooms": { "type": ["string", "number"] },
      "bathrooms": { "type": ["string", "number"] },
      "square_feet": { "type": ["string", "number"] },
      "type": { "type": "string" },
      "agent": { "type": "string" },
      "link": { "type": "string" }
    },
    "required": ["address", "price", "bedrooms", "bathrooms", "square_feet", "type", "agent"]
  }
}'

Expected Output

Running the script returns a JSON object with a list of houses. For example:

json
{
  "houses": [
    {
      "address": "481 NE 29th St #606, Miami, FL 33137",
      "price": "$465,000",
      "bedrooms": 2,
      "bathrooms": 2,
      "square_feet": 836,
      "type": "Condo for sale",
      "agent": "SOUTH BEACH ESTATES, LLC"
    },
    {
      "address": "677 NE 24th St APT 703, Miami, FL 33137",
      "price": "$310,000",
      "bedrooms": 1,
      "bathrooms": 2,
      "square_feet": 696,
      "type": "Condo for sale",
      "agent": "INMO BROKERS GROUP, LLC."
    },
    {
      "address": "9001 SW 77th Ave APT C305, Miami, FL 33156",
      "price": "$250,000",
      "bedrooms": 1,
      "bathrooms": 1,
      "square_feet": 699,
      "type": "Condo for sale",
      "agent": "WILLIS WILSON & ASSOCIATES"
    },
    {
      "address": "300 NW 42nd Ave APT 802, Miami, FL 33126",
      "price": "$328,900",
      "bedrooms": 2,
      "bathrooms": 2,
      "square_feet": 972,
      "type": "Condo for sale",
      "agent": "MIAMI NEW REALTY"
    },
    {
      "address": "1770 NW 51st Ter, Miami, FL 33142",
      "price": "$469,000",
      "bedrooms": 3,
      "bathrooms": 2,
      "square_feet": 1152,
      "type": "Coming soon",
      "agent": "LONDON FOSTER REALTY"
    },
    {
      "address": "4242 NW 2nd St APT 604, Miami, FL 33126",
      "price": "$530,000",
      "bedrooms": 3,
      "bathrooms": 2,
      "square_feet": 1270,
      "type": "Condo for sale",
      "agent": "MOURIZ PROPERTIES"
    },
    {
      "address": "1451 Brickell Ave #PENTHOUSE 54, Miami, FL 33131",
      "price": "$17,750,000",
      "bedrooms": 4,
      "bathrooms": 5,
      "square_feet": 4184,
      "type": "Condo for sale",
      "agent": "DOUGLAS ELLIMAN"
    },
    {
      "address": "1210 SW 91st Ave, Miami, FL 33174",
      "price": "$680,000",
      "bedrooms": 3,
      "bathrooms": 2,
      "square_feet": 1751,
      "type": "House for sale",
      "agent": "FLORIDIAN FIRST REALTY CORP",
      "link": "https://www.zillow.com/homedetails/1210-SW-91st-Ave-Miami-FL-33174/44181078_zpid/"
    },
    {
      "address": "2341 NW 24th Ave, Miami, FL 33142",
      "price": "$890,000",
      "bedrooms": "4",
      "bathrooms": "2",
      "square_feet": "2,342",
      "listing_type": "For sale by owner",
      "link": "https://www.zillow.com/homedetails/2341-NW-24th-Ave-Miami-FL-33142/43817911_zpid/"
    }
  ]
}
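
The sample output is not fully uniform: most entries use numbers for `bedrooms` and `square_feet`, while the last one uses strings such as `"2,342"`, has no `agent`, and uses `listing_type` instead of `type`. A minimal sketch of normalizing entries before analysis follows; the helper names `to_number` and `normalize_listing` are hypothetical, and the fallback from `type` to `listing_type` is an assumption based only on this sample.

```python
import re
from typing import Any, Dict, Optional


def to_number(value: Any) -> Optional[float]:
    """Convert values such as 836, "2,342" or "$465,000" to a float."""
    if isinstance(value, bool):
        return None
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        # Drop currency symbols, thousands separators and other non-numeric text.
        cleaned = re.sub(r"[^\d.]", "", value)
        try:
            return float(cleaned)
        except ValueError:
            return None
    return None


def normalize_listing(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Return a listing with consistent keys and numeric fields."""
    return {
        "address": raw.get("address"),
        "price": to_number(raw.get("price")),
        "bedrooms": to_number(raw.get("bedrooms")),
        "bathrooms": to_number(raw.get("bathrooms")),
        "square_feet": to_number(raw.get("square_feet")),
        # Most sample entries use "type"; one uses "listing_type".
        "type": raw.get("type") or raw.get("listing_type"),
        "agent": raw.get("agent"),  # None when the field is missing
        "link": raw.get("link"),
    }


sample = {
    "address": "2341 NW 24th Ave, Miami, FL 33142",
    "price": "$890,000",
    "bedrooms": "4",
    "bathrooms": "2",
    "square_feet": "2,342",
    "listing_type": "For sale by owner",
}
print(normalize_listing(sample))
```

Missing fields come back as `None` rather than raising, so incomplete listings can be filtered or flagged in a later step.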

Best Practices for Real Estate Data Scraping

When scraping real estate websites like Zillow:

  • Respect Rate Limits: Include delays between requests to avoid overwhelming the server.
  • Data Validation: Ensure the extracted data is accurate and clean.
  • Error Handling: Implement robust error handling to manage any issues during scraping.
  • Compliance: Always review and adhere to the website's terms of service and robots.txt guidelines.
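
The rate-limit and error-handling points above can be sketched as a small retry wrapper with exponential backoff. This is an illustrative helper, not part of the ScrapeGraphAI SDK: the name `call_with_retries` and its parameters are hypothetical, and because this guide does not cover the SDK's specific exception classes, the sketch catches `Exception` broadly.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    fetch: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), retrying with exponential backoff when it raises."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts:
                # Out of attempts: surface the last error to the caller.
                raise
            # Wait base_delay, 2x, 4x, ... plus a small random jitter.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1)
            sleep(delay)
    raise ValueError("max_attempts must be at least 1")
```

With the client from the Python example, a call might look like `call_with_retries(lambda: sgai_client.smartscraper(website_url=url, user_prompt=prompt))`, where `url` and `prompt` are placeholders for your own values. Passing `sleep` as a parameter makes the delays easy to stub out in tests.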

Conclusion

Using ScrapeGraphAI to extract real estate data from Zillow is a powerful way to gain insights into property listings and market trends. Whether you're a real estate professional or a data-driven content creator, this method provides you with the timely, accurate information needed to make informed decisions.

Ready to start your real estate data extraction journey? Get your API key and dive in!
