How to Scrape eBay with Python in 2026: Complete Guide

Learn how to scrape eBay with Python in 2026, extract listings and sold prices, handle anti-bot checks, and scale with rotating proxies.

Live Proxies Editorial Team

9 April 2026

eBay's 2+ billion listings make it a valuable source for pricing research, competitor monitoring, and market analysis. But scraping it isn't straightforward: anti-bot detection blocks most standard approaches.

This guide provides four Python scraper examples designed to reduce blocks and handle common anti-bot challenges on eBay, though reliability ultimately depends on request patterns, IP reputation, and ongoing site changes.

What you'll learn:

  • Scrape search results, item details, sold prices, and seller profiles

  • Bypass anti-bot detection with SeleniumBase UC Mode

  • Handle both old and new eBay page layouts

  • Scale to hundreds of pages using rotating proxies

Scrape eBay with Python: quick start

The best way to understand eBay scraping is to see working code first.

pip install seleniumbase beautifulsoup4

The code below searches eBay and extracts basic listing data. It uses SeleniumBase with uc=True (Undetected Chrome mode), which patches the ChromeDriver to avoid detection:

from seleniumbase import SB
from bs4 import BeautifulSoup
import re

def scrape_ebay_search(query, max_price=None):
    url = f"https://www.ebay.com/sch/i.html?_nkw={query}" + (
        f"&_udhi={max_price}" if max_price else ""
    )
   with SB(uc=True, incognito=True) as sb:
       sb.uc_open_with_reconnect(url, reconnect_time=3)
       sb.sleep(2)
       soup = BeautifulSoup(sb.get_page_source(), "html.parser")
       items = []
       skip_badges = ["New Listing", "Opens in a new window or tab"]
       for card in soup.select("li.s-card[data-listingid]"):
           if len(card.get("data-listingid") or "") > 15:
               continue
           if not (title_el := card.select_one(".s-card__title")):
               continue
           texts = [
               s.get_text(strip=True)
               for s in title_el.select("span")
               if s.get_text(strip=True) not in skip_badges
           ]
           title = max(texts, key=len) if texts else ""
           if not title or "Shop on eBay" in title:
               continue
           if not (price_el := card.select_one(".s-card__price")):
               continue
           match = re.search(
               r"[\d,]+\.?\d*", price_el.get_text(strip=True).replace(",", "")
           )
           if not match:
               continue
           items.append({"title": title, "price": float(match.group())})
       return items

if __name__ == "__main__":
    for item in scrape_ebay_search("wireless keyboard", max_price=150)[:5]:
        print(f"${item['price']:.2f} - {item['title'][:60]}")

    # Sample output:
    # $49.99 - Logitech K400 Plus Wireless Touch Keyboard with Built-in
    # $34.99 - Microsoft Wireless Keyboard 850 - Black
    # $89.00 - Logitech MX Keys Advanced Wireless Illuminated Keyboard

The code filters out eBay's promotional cards by skipping listings with unusually long IDs (over 15 characters) or "Shop on eBay" in the title, so you only get real product listings.

The uc_open_with_reconnect method includes a disconnect-reconnect cycle that helps bypass anti-bot protections. The reconnect_time=3 parameter specifies how long the driver stays disconnected from Chrome (3 seconds) while the page loads. The reconnect cycle may help reduce automation signals during page load, but detection outcomes still depend on multiple factors such as IP reputation, request frequency, and browser fingerprint consistency. Setting incognito=True starts each session with a clean browser state: no cookies, cached data, or history that could identify you as a returning visitor.

Why use BeautifulSoup after SeleniumBase? SeleniumBase handles browser automation and anti-bot bypass. Once we have the rendered HTML, BeautifulSoup is faster and more convenient for parsing the data. You could use Selenium's built-in selectors, but BeautifulSoup's API is cleaner for extracting multiple elements.

If it works, you have the foundation. The rest of this guide builds on these same patterns. If you encounter a CAPTCHA or block, that's expected; we'll cover how to handle this below.

When eBay detects suspicious activity, you may first encounter a browser verification check:

[Image: eBay's browser verification interstitial]

This is eBay's browser verification check, an interstitial page that runs fingerprinting checks before redirecting you to the content. It typically resolves on its own within a few seconds. The uc_open_with_reconnect method in SeleniumBase accounts for this by staying disconnected during page load, giving the check time to complete before the driver reconnects. If this check fails to resolve and you get stuck or blocked, your IP may be flagged. Rotating residential proxies helps by distributing requests across clean IPs.

If eBay still flags your session, you'll see an hCaptcha challenge:

[Image: eBay's hCaptcha challenge]

This hCaptcha challenge is eBay's way of filtering out automated traffic. The scrapers attempt to click the checkbox automatically when running in visible mode, but this isn't foolproof, and image puzzles require manual intervention. The scaling section covers how to minimize the likelihood of encountering CAPTCHAs in the first place.

Download the complete scrapers

The code snippets throughout this guide show the key patterns, but the full scrapers have additional features: CLI interfaces, proxy support, multi-page pagination, and comprehensive error handling. Download the complete files here:

Each scraper runs standalone from the command line:

# Search active listings
python ebay_search_scraper.py "mechanical keyboard"
python ebay_search_scraper.py "iphone 15" --condition used --max-price 800 --pages 3

# Scrape sold prices
python ebay_sold_scraper.py "ps5 console"
python ebay_sold_scraper.py "rolex submariner" --min-price 5000 --sort date_desc

# Scrape full item details
python ebay_item_scraper.py 315166443569
python ebay_item_scraper.py 315166443569 234567890123  # multiple items

# Scrape seller info
python ebay_seller_scraper.py 315166443569
python ebay_seller_scraper.py 315166443569 --feedback-pages 3 --type negative

Run python <scraper>.py --help for all available options.

What can you scrape from eBay?

eBay has several different data sources, and they vary considerably in how easy they are to access. If you're new to web scraping, eBay is actually one of the more complex sites to work with, but also one of the most valuable for eCommerce data.

Search results are your starting point. You get titles, prices, conditions, seller info, shipping costs, and thumbnails. Search results are often easier to extract compared to deeper pages because they contain structured listing cards, though they are still protected by anti-automation measures.

Individual listings contain the complete picture. Full descriptions, high-resolution image galleries, item specifics (brand, model, color, size), return policies, shipping options, quantity available, watcher counts, and variation options (sizes, colors). This is where you get data that search results simply don't include.

Sold listings are often the most useful data on eBay for pricing research. While active listings show asking prices, sold listings show what people actually paid. eBay keeps approximately 90 days of this data visible, accessed through the same search endpoint with different filters.
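In practice, sold-listing searches add the LH_Sold=1 and LH_Complete=1 flags to the same /sch/i.html endpoint. A minimal URL builder showing those filters:

```python
from urllib.parse import urlencode

def build_sold_url(query: str, page: int = 1) -> str:
    """Build a sold/completed-listings search URL on the standard endpoint."""
    params = {
        "_nkw": query,      # search keywords
        "_pgn": page,       # page number
        "LH_Sold": 1,       # only sold listings
        "LH_Complete": 1,   # only completed listings
    }
    return f"https://www.ebay.com/sch/i.html?{urlencode(params)}"

print(build_sold_url("ps5 console"))
# https://www.ebay.com/sch/i.html?_nkw=ps5+console&_pgn=1&LH_Sold=1&LH_Complete=1
```

The resulting page renders the same listing cards as a regular search, so the parsers later in this guide apply unchanged.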

Seller profiles and reviews add context. Feedback scores, member tenure, detailed ratings, and individual buyer reviews. These pages have noticeably stronger anti-bot measures than search results. The dedicated feedback pages (/fdbk/) are the most challenging to scrape reliably.

One thing you won't find is aggregated inventory. You can see how many listings a seller has and the quantity available on each listing, but there's no single endpoint showing total stock across all their listings. You'd have to scrape each listing individually and sum the quantities.
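Summing per-listing quantities is simple once you have a per-listing fetcher. A sketch, where get_item stands in for whatever fetcher you use and is assumed to return a dict with a {"quantity": {"available": int}} shape:

```python
def total_inventory(item_ids, get_item):
    """Sum available quantity across listings.

    `get_item` is a per-listing fetcher (e.g. the item scraper later in
    this guide); here it only needs to return a dict shaped like
    {"quantity": {"available": int}}.
    """
    total = 0
    for item_id in item_ids:
        item = get_item(item_id)
        available = (item.get("quantity") or {}).get("available") or 0
        total += available
    return total

# Example with a stubbed fetcher:
stub = lambda _id: {"quantity": {"available": 3}}
print(total_inventory(["1", "2", "3"], stub))  # 9
```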

Does eBay allow web scraping?

Not officially. eBay's Terms of Service prohibit automated data collection without permission. eBay does offer official APIs: the Browse API covers many common use cases, and if it meets your needs, that's the recommended path.

For data the APIs don't expose (like historical sold prices or detailed seller analytics), some choose to scrape public listing data. The scrapers in this guide only access publicly visible information. No login required, no personal user data. They include delays and retry logic to minimize server load.

If you're scraping at scale or for commercial purposes, consider consulting legal counsel about your specific use case.

Python setup for eBay scraping

eBay's anti-bot protections require a real browser to bypass. SeleniumBase in UC (Undetected Chrome) mode makes this straightforward. It runs an actual Chrome browser with anti-detection patches applied to the ChromeDriver, so you don't have to configure anything manually.

Requirements: Python 3.10+ and Chrome browser installed.

Installation

Install the dependencies in a virtual environment:

python -m venv ebay_scraper_env
source ebay_scraper_env/bin/activate  # Windows: ebay_scraper_env\Scripts\activate

pip install seleniumbase beautifulsoup4

A note on headless mode: if you're running on a server without a display, you might be tempted to use headless=True. Unfortunately, some headless configurations can increase detection risk, though results vary depending on browser fingerprinting setup and request behavior.

A better approach is Xvfb (X Virtual Framebuffer), which runs a headed browser in a virtual display:

# Ubuntu/Debian
sudo apt-get install xvfb

# Then in your code
with SB(uc=True, xvfb=True) as sb:
    ...  # runs like a headed browser, but without needing a real display

eBay URL parameter reference

eBay's search URLs use query parameters to control filtering and sorting. Understanding these lets you construct any search programmatically:

CONDITION_MAP = {
    "new": "1000",
    "open_box": "1500",
    "refurbished": "2000",
    "used": "3000",
    "for_parts": "7000",
}

SORT_MAP = {
    "best_match": "12",
    "price_asc": "15",
    "price_desc": "16",
    "ending_soonest": "1",
    "newly_listed": "10",
}

SOLD_SORT_MAP = {
    "best_match": "12",
    "price_asc": "15",
    "price_desc": "16",
    "date_desc": "13",
    "date_asc": "1",
}

Example: ebay.com/sch/i.html?_nkw=laptop&LH_ItemCondition=3000 filters to used items only.
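Combining the maps above with urlencode gives a reusable URL builder. A sketch; invalid filter names are simply ignored, matching the validation pattern used later in this guide:

```python
from urllib.parse import urlencode

CONDITION_MAP = {"new": "1000", "open_box": "1500", "refurbished": "2000",
                 "used": "3000", "for_parts": "7000"}
SORT_MAP = {"best_match": "12", "price_asc": "15", "price_desc": "16",
            "ending_soonest": "1", "newly_listed": "10"}

def build_search_url(query, condition=None, sort=None, page=1):
    """Construct a search URL; unknown condition/sort values are ignored."""
    params = {"_nkw": query, "_pgn": page}
    if condition and condition.lower() in CONDITION_MAP:
        params["LH_ItemCondition"] = CONDITION_MAP[condition.lower()]
    if sort and sort.lower() in SORT_MAP:
        params["_sop"] = SORT_MAP[sort.lower()]
    return f"https://www.ebay.com/sch/i.html?{urlencode(params)}"

print(build_search_url("laptop", condition="used", sort="price_asc"))
# https://www.ebay.com/sch/i.html?_nkw=laptop&_pgn=1&LH_ItemCondition=3000&_sop=15
```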

Scrape eBay search results with Python

Search results are typically where eBay data collection begins. eBay serves different HTML layouts to different users. You might get the modern s-card elements or the older s-item layout. The scraper handles both automatically.

Using browser DevTools, you can see that each listing is an li element with the class s-card and a data-listingid attribute containing the item ID. The title, price, and seller info are nested inside with their own class selectors.

[Image: an eBay search listing card inspected in browser DevTools]

Setting up the scraper class

The scraper class stores configuration that affects how it interacts with eBay:

class EbayScraper:
    BASE_URL = "https://www.ebay.com/sch/i.html"
    RECONNECT_TIME = 3

    def __init__(
        self, headless: bool = False, xvfb: bool = False, proxy: str | None = None
    ):
        self.headless = headless
        self.xvfb = xvfb and platform.system() == "Linux"
        self.proxy = proxy

Why these defaults? Setting headless=False is intentional. UC Mode is more easily detected in headless mode, as eBay's anti-bot systems look for headless browser signatures. The xvfb option is Linux-only (hence the platform check) and provides a better alternative for servers without displays. The proxy parameter accepts a string like "user:pass@host:port" for rotating residential proxies, which becomes important when scraping at scale.

Building the search URL

The search method constructs an eBay URL from your parameters and validates inputs against known parameter maps to avoid broken URLs:

def search(
    self,
    query: str,
    category: int = 0,
    page: int = 1,
    min_price: float | None = None,
    max_price: float | None = None,
    condition: str | None = None,
    sort: str | None = None,
) -> dict:
    params = {"_nkw": query, "_sacat": category, "_pgn": page}

    if min_price is not None:
        params["_udlo"] = min_price
    if max_price is not None:
        params["_udhi"] = max_price
    if condition and condition.lower() in CONDITION_MAP:
        params["LH_ItemCondition"] = CONDITION_MAP[condition.lower()]
    if sort and sort.lower() in SORT_MAP:
        params["_sop"] = SORT_MAP[sort.lower()]

    url = f"{self.BASE_URL}?{urlencode(params)}"
    return self._fetch_and_parse(url)

Parameter breakdown:

| Parameter | Purpose | Example |
| --- | --- | --- |
| _nkw | Search keywords (name keywords) | laptop |
| _sacat | Category ID (0 = all categories) | 175672 (laptops) |
| _pgn | Page number for pagination | 1, 2, 3... |
| _udlo / _udhi | Price range (user-defined low/high) | 100, 500 |
| LH_ItemCondition | Condition filter | 3000 (used) |
| _sop | Sort order preference | 15 (price low to high) |

The check condition.lower() in CONDITION_MAP prevents invalid parameters from reaching the URL. If someone passes condition="mint" (not a valid eBay filter), it's silently ignored rather than breaking the request.

Fetching and handling anti-bot detection

The fetch method handles browser interaction and CAPTCHA detection. The code below is simplified for clarity.

def _fetch_and_parse(self, url: str) -> dict:
    results = {"url": url, "items": [], "total_results": 0, "error": None}

    try:
        with SB(
            uc=True,
            headless=self.headless,
            incognito=True,
            xvfb=self.xvfb,
            proxy=self.proxy,
        ) as sb:

            sb.uc_open_with_reconnect(url, reconnect_time=self.RECONNECT_TIME)
            random_delay()

            page_source = sb.get_page_source()

            if (
                "Pardon Our Interruption" in page_source
                or "captcha" in page_source.lower()
            ):
                logger.warning("CAPTCHA detected, attempting to solve...")
                try:
                    sb.uc_gui_click_captcha()
                    random_delay()
                    page_source = sb.get_page_source()
                except Exception as e:
                    logger.error(f"CAPTCHA handling failed: {e}")

            if "Pardon Our Interruption" in page_source:
                results["error"] = "Blocked by anti-bot (CAPTCHA not solved)"
                return results

            return self._parse_html(page_source, results)

    except Exception as e:
        results["error"] = str(e)
        return results

Setting incognito=True starts each browser session without cookies or cached data, preventing eBay from building a profile across multiple scraping sessions.

About uc_gui_click_captcha(): This SeleniumBase method attempts to click CAPTCHA checkboxes automatically. It works for simple checkbox CAPTCHAs but not complex image challenges. When it fails, the scraper returns an error rather than hanging indefinitely.

Random delays between requests help avoid detection patterns. Fixed delays can create identifiable patterns that anti-bot systems may flag.
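The random_delay() helper used in these snippets isn't defined in the excerpts. A minimal sketch; the 2 to 5 second default range is an assumption to tune for your request volume:

```python
import random
import time

def random_delay(min_s: float = 2.0, max_s: float = 5.0) -> float:
    """Sleep for a random duration so requests don't arrive on a fixed cadence."""
    delay = random.uniform(min_s, max_s)
    time.sleep(delay)
    return delay
```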

Parsing the HTML: handling dual layouts

eBay uses two different layouts, so the scraper checks for both. The modern s-card layout appears for most users, but some still receive the legacy s-item layout:

def _parse_html(self, html: str, results: dict) -> dict:
    soup = BeautifulSoup(html, "html.parser")

    items = soup.select("li.s-card[data-listingid]")

    if items:
        for item in items:
            listing = self._parse_card_item(item)
            if listing:
                results["items"].append(listing)
    else:
        for item in soup.select("li.s-item"):
            listing = self._parse_legacy_item(item)
            if listing:
                results["items"].append(listing)

    results["total_results"] = len(results["items"])

    total_elem = soup.select_one(".srp-controls__count-heading span")
    if total_elem:
        match = re.search(r"([\d,]+)", total_elem.get_text(strip=True))
        if match:
            results["total_on_ebay"] = int(match.group(1).replace(",", ""))

    return results

eBay A/B tests different layouts and may serve different HTML based on geography, device, or random assignment. Supporting both ensures the scraper works regardless of which version you receive.

The [data-listingid] attribute selector targets listing cards specifically. The code also filters out promotional content by checking for "shop on ebay" in titles, since some promotional cards can have listing IDs.

Extracting item details from the modern layout

The parsing method uses defensive coding. Every extraction is wrapped in checks because eBay's HTML structure varies between listings:

def _parse_card_item(self, item) -> dict | None:
    try:
        listing_id = item.get("data-listingid")

        title_elem = item.select_one(".s-card__title span.su-styled-text")
        title = title_elem.get_text(strip=True) if title_elem else None
        if title and "shop on ebay" in title.lower():
            return None

        img_elem = item.select_one(".s-card__image")
        image_url = img_elem.get("src") if img_elem else None

        link_elem = item.select_one("a.s-card__link[href]")
        item_url = None
        if link_elem:
            item_url = link_elem.get("href")
            if item_url and "?" in item_url:
                item_url = item_url.split("?")[0]

        condition_elem = item.select_one(".s-card__subtitle span.su-styled-text")
        condition = condition_elem.get_text(strip=True) if condition_elem else None
Stripping URL tracking parameters (?_trkparms=...) keeps your stored URLs clean. These parameters don't affect the page content you'll see.

Price extraction handles both single prices and ranges:

        price_elems = item.select(".s-card__price")
        prices = [
            p.get_text(strip=True)
            for p in price_elems
            if p.get_text(strip=True) not in ["to", ""]
        ]
        
        price = None
        price_range = None
        if len(prices) == 1:
            price = self._clean_price(prices[0])
        elif len(prices) >= 2:
            price = self._clean_price(prices[0])
            price_range = {
                "min": self._clean_price(prices[0]),
                "max": self._clean_price(prices[-1]),
            }

Some listings show a range like "$25.00 to $35.00" for items with variants. The scraper captures both the minimum price (useful for sorting and filtering) and the full range for accurate display.

Seller info, shipping, and engagement metrics are extracted from attribute rows:

        shipping = None
        free_shipping = False
        seller_name = None
        seller_feedback_pct = None
        sold_count = None
        watchers = None

        for row in item.select(".s-card__attribute-row"):
            text = row.get_text(strip=True)
            text_lower = text.lower()

            if "shipping" in text_lower:
                shipping = text
                free_shipping = "free" in text_lower
            elif "positive" in text_lower and "%" in text:
                seller_match = re.match(
                    r"(.+?)([\d.]+)%\s*positive", text, re.IGNORECASE
                )
                if seller_match:
                    seller_name = seller_match.group(1).strip()
                    seller_feedback_pct = float(seller_match.group(2))
            elif "sold" in text_lower:
                sold_match = re.search(r"(\d+)\s*sold", text_lower)
                if sold_match:
                    sold_count = int(sold_match.group(1))
            elif "watch" in text_lower:
                watch_match = re.search(r"(\d+)\s*watch", text_lower)
                if watch_match:
                    watchers = int(watch_match.group(1))

        return {
            "listing_id": listing_id,
            "title": title,
            "price": price,
            "price_range": price_range,
            "currency": "USD",
            "condition": condition,
            "shipping": shipping,
            "free_shipping": free_shipping,
            "seller": {
                "name": seller_name,
                "feedback_pct": seller_feedback_pct,
            } if seller_name else None,
            "image_url": image_url,
            "item_url": item_url,
            "sold_count": sold_count,
            "watchers": watchers,
        }
    except Exception as e:
        logger.debug(f"Error parsing card: {e}")
        return None

Returning None on exception lets the scraper skip problematic items rather than crashing the entire operation.

Multi-page scraping in a single session

For scraping multiple pages efficiently, the browser session stays open and navigation methods change after the first page:

def search_multiple_pages(self, query: str, max_pages: int = 3, **kwargs) -> dict:
    all_items = []

    try:
        with SB(
            uc=True,
            headless=self.headless,
            incognito=True,
            xvfb=self.xvfb,
            proxy=self.proxy,
        ) as sb:

            for page_num in range(1, max_pages + 1):
                params = {"_nkw": query, "_sacat": 0, "_pgn": page_num}
                url = f"{self.BASE_URL}?{urlencode(params)}"

                if page_num == 1:
                    sb.uc_open_with_reconnect(url, reconnect_time=self.RECONNECT_TIME)
                else:
                    sb.open(url)

                random_delay()
                page_source = sb.get_page_source()

                if "Pardon Our Interruption" in page_source:
                    break

                results = self._parse_html(page_source, {"items": []})

                if not results["items"]:
                    break

                all_items.extend(results["items"])

                if page_num < max_pages:
                    random_delay()

            return {
                "items": all_items,
                "total_results": len(all_items),
                "pages_scraped": page_num,
                "error": None,
            }

    except Exception as e:
        return {"items": all_items, "error": str(e)}

The anti-bot bypass (uc_open_with_reconnect) is slower and more resource-intensive. Once a session is established on the first page, subsequent pages typically load without challenge using regular sb.open(). This approach is faster and mimics how users actually browse, clicking through results rather than starting fresh each time.

Usage example and output

Here's how to use the scraper:

scraper = EbayScraper()

# Basic search
results = scraper.search("mechanical keyboard")

# With filters
results = scraper.search(
    query="mechanical keyboard",
    min_price=50,
    max_price=150,
    condition="used",
    sort="price_asc",
)

# Multiple pages in one session
results = scraper.search_multiple_pages(
    query="mechanical keyboard", max_pages=5, condition="new"
)

for item in results["items"]:
    print(f"${item['price']:.2f} - {item['title']}")

Example output structure:

{
    "url": "https://www.ebay.com/sch/i.html?_nkw=mechanical+keyboard&_sacat=0&_pgn=1",
    "items": [
        {
            "listing_id": "116434118798",
            "title": "Cherry MX Mechanical Keyboard Tactile Brown Switches",
            "price": 199.99,
            "price_range": null,
            "currency": "USD",
            "condition": "Brand New",
            "buy_format": "Buy It Now",
            "shipping": "Free International Shipping",
            "free_shipping": true,
            "free_returns": false,
            "location": "Located in China",
            "seller": {
                "name": "furyauction",
                "feedback_pct": 100.0,
                "feedback_count": 1700,
            },
            "image_url": "https://i.ebayimg.com/images/g/.../s-l500.webp",
            "item_url": "https://www.ebay.com/itm/116434118798",
            "is_new_listing": false,
            "is_sponsored": true,
            "sold_count": 24,
            "watchers": 15,
            "bids": null,
        }
    ],
    "total_results": 60,
    "total_on_ebay": 23000,
    "error": null,
}

Fields like sold_count, watchers, and bids appear when eBay displays them on the search result card. Not all listings show this information.
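For pricing research, the items list feeds directly into summary statistics. A quick sketch over the fields shown above:

```python
from statistics import median

def price_summary(items: list[dict]) -> dict:
    """Min / median / max over listings that carry a numeric price."""
    prices = [i["price"] for i in items if isinstance(i.get("price"), (int, float))]
    if not prices:
        return {"count": 0}
    return {
        "count": len(prices),
        "min": min(prices),
        "median": median(prices),
        "max": max(prices),
    }

print(price_summary([{"price": 50.0}, {"price": 90.0}, {"price": None}]))
# {'count': 2, 'min': 50.0, 'median': 70.0, 'max': 90.0}
```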

Further reading: 8 Best Private Proxies in 2026 (Tested & Ranked) and 8 Best Rotating Proxies in 2026 (Tested and Ranked).

Scrape individual eBay product pages

Sometimes search results aren't enough. You need the full description, high-res images, or complete item specifics. Individual listing pages contain much more data, though they're also more heavily protected.

The key insight: eBay embeds structured JSON data (JSON-LD) directly in the page source. Extracting from this JSON is more reliable than parsing HTML, which changes frequently. The scraper uses JSON extraction as the primary method with HTML parsing as a fallback.

[Image: an eBay product page]

What you get

Individual listing pages contain data you can't get from search results:

  • Full item specifics: brand, model, MPN, UPC/EAN, dimensions, material, etc.

  • High-resolution images: up to 1600px (search results only give thumbnails)

  • Complete seller profile: feedback score, items sold, member tenure, top-rated status

  • Shipping & returns details: costs, delivery estimates, return policy

  • Variations: all available options (colors, sizes) with stock status

  • Full quantity data: exact available count and total sold history

Setting up the item scraper

Product pages are more heavily protected than search results, so RECONNECT_TIME is set to 4 seconds to allow eBay's anti-bot checks to complete before retrieving the page content.

class EbayItemScraper:
    RECONNECT_TIME = 4

    def __init__(
        self, headless: bool = False, xvfb: bool = False, proxy: str | None = None
    ):
        self.headless = headless
        self.xvfb = xvfb and platform.system() == "Linux"
        self.proxy = proxy

Fetching and extracting data

The get_item method fetches the page and extracts data from embedded JSON first, then fills gaps with HTML parsing. The code below is simplified for clarity. The full source includes CAPTCHA handling and proxy support.

def get_item(self, item_id: str) -> dict:
    url = f"https://www.ebay.com/itm/{item_id}"
    result = {
        "item_id": item_id,
        "url": url,
        "title": None,
        "price": {},
        "condition": {},
        "images": [],
        "item_specifics": {},
        "seller": {},
        "shipping": {},
        "returns": {},
        "quantity": {},
        "watchers": None,
        "sold_count": None,
        "location": None,
        "ships_to": None,
        "variations": [],
        "category": {},
        "error": None,
    }

    try:
        with SB(
            uc=True,
            headless=self.headless,
            incognito=True,
            xvfb=self.xvfb,
            proxy=self.proxy,
        ) as sb:

            sb.uc_open_with_reconnect(url, reconnect_time=self.RECONNECT_TIME)
            random_delay()

            page_source = sb.get_page_source()

            # Primary: extract from embedded JSON
            item_data = self._extract_json_data(page_source)
            if item_data:
                result = self._parse_item_data(item_data, result)

            # Fallback: fill missing fields from HTML
            result = self._parse_html_fallback(page_source, result)

            return result

    except Exception as e:
        result["error"] = str(e)
        return result

Extracting embedded JSON

eBay embeds JSON-LD (structured data for SEO) in script tags. This data is more stable than HTML selectors:

def _extract_json_data(self, html: str) -> dict | None:
    combined_data = {}

    # Extract JSON-LD product data
    ld_json_pattern = r'<script[^>]*type="application/ld\+json"[^>]*>(.+?)</script>'
    ld_matches = re.findall(ld_json_pattern, html, re.DOTALL)

    for match in ld_matches:
        try:
            data = json.loads(match.strip())
            if isinstance(data, dict) and data.get("@type") == "Product":
                combined_data["ld_product"] = data
            elif isinstance(data, dict) and data.get("@type") == "BreadcrumbList":
                combined_data["ld_breadcrumb"] = data
        except json.JSONDecodeError:
            continue

    # Extract high-res image URLs directly
    image_pattern = r'"https://i\.ebayimg\.com/images/g/[^"]+/s-l\d+\.(?:jpg|png|webp)"'
    image_matches = re.findall(image_pattern, html)
    if image_matches:
        combined_data["extracted_images"] = list(
            set(m.strip('"') for m in image_matches)
        )

    return combined_data if combined_data else None

Why JSON-LD? eBay maintains this structured data for search engines, so it tends to stay consistent even as the visible HTML changes. It contains title, price, condition, seller info, and product identifiers (SKU, MPN, UPC/EAN).
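As a self-contained illustration of the extraction pattern, here is the same regex applied to a made-up page fragment (stdlib only; the sample HTML is invented for the demo):

```python
import json
import re

# A made-up page fragment with an embedded JSON-LD Product block
SAMPLE_HTML = """
<script type="application/ld+json">
{"@type": "Product", "name": "Logitech MX Keys",
 "offers": {"price": "89.00", "priceCurrency": "USD"}}
</script>
"""

pattern = r'<script[^>]*type="application/ld\+json"[^>]*>(.+?)</script>'
product = None
for block in re.findall(pattern, SAMPLE_HTML, re.DOTALL):
    data = json.loads(block.strip())
    if isinstance(data, dict) and data.get("@type") == "Product":
        product = data

print(product["name"], product["offers"]["price"])  # Logitech MX Keys 89.00
```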

Parsing the JSON data

The JSON-LD Product object contains most of what you need:

def _parse_item_data(self, data: dict, result: dict) -> dict:
    if "ld_product" in data:
        product = data["ld_product"]
        result["title"] = product.get("name")
        result["description"] = product.get("description")

        # Images
        if product.get("image"):
            images = product["image"]
            result["images"] = [images] if isinstance(images, str) else images

        # Price from offers
        offers = product.get("offers", {})
        if isinstance(offers, list) and offers:
            offers = offers[0]
        if isinstance(offers, dict):
            result["price"] = {
                "amount": self._parse_price(offers.get("price")),
                "currency": offers.get("priceCurrency", "USD"),
            }

        # Condition
        if product.get("itemCondition"):
            result["condition"]["name"] = product["itemCondition"].replace(
                "https://schema.org/", ""
            )

        # Product identifiers
        if product.get("brand"):
            brand = product["brand"]
            result["item_specifics"]["Brand"] = (
                brand.get("name") if isinstance(brand, dict) else brand
            )
        if product.get("mpn"):
            result["item_specifics"]["MPN"] = product["mpn"]
        if product.get("gtin13"):
            result["item_specifics"]["EAN"] = product["gtin13"]

    return result

Getting high-resolution images

eBay image URLs contain a size code like s-l500 (500px). Replacing this with s-l1600 returns the highest resolution available:

# Convert extracted images to the highest resolution
if "extracted_images" in data:
    hi_res_images = []
    for img in data["extracted_images"]:
        hi_res = re.sub(r"/s-l\d+\.", "/s-l1600.", img)
        if hi_res not in hi_res_images:
            hi_res_images.append(hi_res)
    result["images"] = hi_res_images

This works because eBay stores images at multiple resolutions. The size is just a URL parameter.

HTML fallback for missing data

Some fields aren't in JSON-LD (watchers, sold count, variations). The scraper extracts these from HTML:

def _parse_html_fallback(self, html: str, result: dict) -> dict:
    soup = BeautifulSoup(html, "html.parser")

    # Watchers count
    if not result.get("watchers"):
        watchers_match = re.search(
            r"(\d+)\s*(?:people are watching|watchers)", html, re.IGNORECASE
        )
        if watchers_match:
            result["watchers"] = int(watchers_match.group(1))

    # Sold count
    if not result.get("sold_count"):
        sold_match = re.search(r"(\d+)\s*sold", html, re.IGNORECASE)
        if sold_match:
            result["sold_count"] = int(sold_match.group(1))

    # Variations (colors, sizes)
    if not result.get("variations"):
        result["variations"] = self._parse_variations(html)

    return result

Description handling

eBay loads item descriptions in an iframe for security (to isolate seller HTML from the main page). The scraper captures the description iframe URL:

# Description is in an iframe
desc_elem = soup.select_one("#desc_ifr")
if desc_elem and desc_elem.get("src"):
    result["description_url"] = desc_elem.get("src")

You can fetch the description URL separately to retrieve the full seller-provided description HTML.
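The iframe URL is an ordinary HTTP endpoint, so it can be fetched outside the browser. Here's a minimal sketch using the standard library (the helper names are ours, and at scale the iframe endpoint can be rate-limited like any other eBay page):

```python
from urllib.request import Request, urlopen

from bs4 import BeautifulSoup


def parse_description_html(html: str) -> str:
    """Strip the seller's description HTML down to readable text."""
    return BeautifulSoup(html, "html.parser").get_text(" ", strip=True)


def fetch_description(description_url: str) -> str:
    """Fetch the iframe captured as result["description_url"] and parse it."""
    req = Request(description_url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(req, timeout=15) as resp:
        return parse_description_html(resp.read().decode("utf-8", "replace"))
```

If eBay starts challenging these requests, fetch the iframe URL inside the existing SeleniumBase session instead.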

Usage example

Here's how to use the scraper:

scraper = EbayItemScraper()

# Works with both item IDs and full URLs
item = scraper.get_item("315166443569")
item = scraper.get_item("https://www.ebay.com/itm/315166443569")

# Multiple items in one session
items = scraper.get_multiple_items(["315166443569", "234567890123"])

print(f"Title: {item['title']}")
print(f"Price: ${item['price']['amount']}")
print(f"Images: {len(item['images'])} high-res photos")
print(f"Specifics: {item['item_specifics']}")

Sample output

Here's what the extracted data looks like:

{
    "item_id": "315166443569",
    "url": "https://www.ebay.com/itm/315166443569",
    "title": "Apple AirPods Pro (2nd Generation) with MagSafe Charging Case",
    "price": {"amount": 189.99, "currency": "USD"},
    "condition": {"name": "NewCondition"},
    "images": [
        "https://i.ebayimg.com/images/g/xxxxx/s-l1600.jpg",
        "https://i.ebayimg.com/images/g/yyyyy/s-l1600.jpg",
    ],
    "item_specifics": {
        "Brand": "Apple",
        "MPN": "MQD83AM/A",
        "Model": "AirPods Pro 2nd Generation",
        "Connectivity": "Bluetooth",
        "Color": "White",
    },
    "seller": {
        "name": "techdeals_store",
        "feedback_pct": 99.2,
        "feedback_score": 15420,
        "top_rated": true,
    },
    "shipping": {
        "free": true,
        "cost": 0,
        "estimated_delivery": "Thu, Feb 6 - Mon, Feb 10",
    },
    "returns": {
        "accepted": true,
        "period": 30,
        "policy": "30 days returns. Buyer pays for return shipping.",
    },
    "quantity": {"available": 48, "sold": 312},
    "watchers": 89,
    "sold_count": 312,
    "location": "Los Angeles, California, United States",
    "ships_to": "Worldwide",
    "variations": [
        {"type": "Color", "name": "White", "available": true},
        {"type": "Color", "name": "Black", "available": false},
    ],
    "category": {
        "path": ["Electronics", "Portable Audio & Headphones", "Headphones"],
        "leaf": "Headphones",
    },
    "error": null,
}

Fields like variations, watchers, and sold_count are included when available on the listing. Not all listings display this information.

Scrape eBay sold listings and price history

Sold listings show what people actually paid, not asking prices. For pricing research, this is often the most useful data source.

When viewing sold listings, you'll see "Sold [date]" labels on each item showing exactly when it sold and for how much: data you won't find on active listings.


What you can do with this data

Sold price data opens up several practical use cases:

  • Price your items to sell: see what similar items actually sold for, not just what sellers are asking

  • Find deals on active listings: compare asking prices to recent sold prices

  • Track market trends: monitor how prices change over time for specific items

  • Validate product sourcing: check if items are worth reselling before buying inventory

Limitation: eBay typically displays sold listings from recent months, though the exact timeframe may vary depending on category and listing conditions. For longer-term price history, you'd need to collect and store data over time.

The key difference: LH_Sold and LH_Complete parameters

The sold scraper is structurally similar to the search scraper, with two critical URL parameters that filter to completed sales:

def search_sold(self, query: str, **kwargs) -> dict:
    params = {
        "_nkw": query,
        "_sacat": kwargs.get("category", 0),
        "_pgn": kwargs.get("page", 1),
        "LH_Sold": "1",  # Only items that sold
        "LH_Complete": "1",  # Only completed listings
    }

    # Add optional filters...
    url = f"{self.BASE_URL}?{urlencode(params)}"
    return self._fetch_and_parse(url)

Why both parameters? LH_Complete=1 shows all completed listings, including unsold items that ended without a buyer. Adding LH_Sold=1 filters to only items that actually sold. Together, they give you actual transaction data: what buyers were willing to pay.

Sorting sold listings

Sold listings support date-based sorting, which isn't available in active search:

SOLD_SORT_MAP = {
    "best_match": "12",
    "price_asc": "15",
    "price_desc": "16",
    "date_desc": "13",  # Most recent sales first (default)
    "date_asc": "1",    # Oldest sales first
}

Use date_desc (default) to see the most recent sales, which better reflect current market prices.

Parsing sold-specific fields

Sold listings have additional fields to extract: the sale date and whether it sold via auction, “Buy It Now”, or “Best Offer”. The code below is simplified for clarity. The full source handles both modern and legacy eBay layouts:

def _parse_sold_item(self, item) -> dict | None:
    try:
        # ... standard fields (title, image, etc.) ...

        # Sold price
        price_elem = item.select_one(".s-card__price")
        sold_price = (
            self._clean_price(price_elem.get_text(strip=True)) if price_elem else None
        )

        # Sale date (e.g., "Sold Jan 24, 2026") - extracted from caption
        sold_date = None
        caption_elem = item.select_one(".s-card__caption")
        if caption_elem:
            caption_text = caption_elem.get_text(strip=True)
            if caption_text.lower().startswith("sold"):
                sold_date = caption_text.replace("Sold", "").strip()

        # Determine buy format from attribute rows
        bids = None
        buy_format = None
        for row in item.select(".s-card__attribute-row"):
            text = row.get_text(strip=True).lower()
            if "bid" in text:
                bid_match = re.search(r"(\d+)\s*bid", text)
                if bid_match:
                    bids = int(bid_match.group(1))
                buy_format = "Auction"
            elif "buy it now" in text:
                buy_format = "Buy It Now"
            elif "best offer" in text:
                buy_format = "Best Offer Accepted"

        return {
            "listing_id": listing_id,
            "title": title,
            "sold_price": sold_price,
            "sold_date": sold_date,
            "buy_format": buy_format,
            "bids": bids,
            # ...
        }
    except Exception:
        return None

Tracking the buy format matters for price analysis. Auction prices often differ significantly from fixed prices for the same item. Auctions might go below market value with little competition, or above market value in bidding wars. Best offer sales indicate the seller accepted less than the listed price.
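To quantify that gap, you can group sold prices by format and compare medians. A small sketch over the sold scraper's output (the function name is ours):

```python
from collections import defaultdict
from statistics import median


def median_price_by_format(items: list[dict]) -> dict[str, float]:
    """Group sold prices by buy_format and return the median for each format."""
    buckets: dict[str, list[float]] = defaultdict(list)
    for it in items:
        if it.get("sold_price") is not None and it.get("buy_format"):
            buckets[it["buy_format"]].append(it["sold_price"])
    return {fmt: round(median(prices), 2) for fmt, prices in buckets.items()}
```

A large spread between the "Auction" and "Buy It Now" medians suggests the formats are effectively separate markets for that item.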

Calculating price statistics

The sold scraper includes automatic price analysis:

def _calculate_price_stats(self, items: list) -> dict:
    prices = [item["sold_price"] for item in items if item.get("sold_price")]

    if not prices:
        return {}

    stats = {
        "count": len(prices),
        "min": min(prices),
        "max": max(prices),
        "avg": round(statistics.mean(prices), 2),
        "median": round(statistics.median(prices), 2),
    }

    # Standard deviation requires at least 2 values
    if len(prices) >= 2:
        stats["std_dev"] = round(statistics.stdev(prices), 2)

    # Distribution buckets for quick analysis
    stats["price_ranges"] = {
        "under_25": len([p for p in prices if p < 25]),
        "25_to_50": len([p for p in prices if 25 <= p < 50]),
        "50_to_100": len([p for p in prices if 50 <= p < 100]),
        "100_to_250": len([p for p in prices if 100 <= p < 250]),
        "250_to_500": len([p for p in prices if 250 <= p < 500]),
        "over_500": len([p for p in prices if p >= 500]),
    }

    return stats

Why median over average? For pricing data, the median is usually more useful. A few unusually high or low sales (outliers) can skew the average significantly, but the median gives you the true "middle" price that most buyers paid.

Standard deviation indicates price consistency. A high value means prices vary widely (useful for identifying markets where condition or seller reputation significantly affects price), while a low value indicates stable pricing.
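A quick numeric illustration of the outlier effect:

```python
from statistics import mean, median

# Nine typical sales near $100 plus one $900 outlier, such as a bundle
# or mislabeled listing: a realistic hazard in sold-price data
prices = [95, 98, 100, 100, 102, 103, 105, 107, 110, 900]

print(round(mean(prices), 2))  # 182.0, pulled far above the typical sale
print(median(prices))          # 102.5, still reflects what most buyers paid
```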

Usage and output

Here's how to search sold listings and get automatic price statistics:

scraper = EbaySoldScraper()

results = scraper.search_sold_multiple_pages(
    query="sony wh-1000xm4",
    max_pages=3,
    condition="used",
    sort="date_desc",  # Most recent sales first
)

stats = results["price_statistics"]
print(f"Analyzed {stats['count']} recent sales")
print(f"Price range: ${stats['min']:.2f} - ${stats['max']:.2f}")
print(f"Median sold price: ${stats['median']:.2f}")

Here's what the output looks like:

{
    "items": [
        {
            "listing_id": "127632981214",
            "title": "Sony WH-1000XM4 Wireless Noise Canceling Headphones",
            "sold_price": 127.50,
            "currency": "USD",
            "sold_date": "Jan 24, 2026",
            "condition": "Pre-Owned",
            "buy_format": "Auction",
            "bids": 12,
            "shipping": "Free shipping",
            "seller": {
                "name": "audio_deals",
                "feedback_pct": 99.8,
                "feedback_count": 4520,
            },
            "image_url": "https://i.ebayimg.com/images/g/xxxxx/s-l500.jpg",
            "item_url": "https://www.ebay.com/itm/127632981214",
        }
    ],
    "price_statistics": {
        "count": 59,
        "min": 45.00,
        "max": 189.99,
        "avg": 112.34,
        "median": 108.50,
        "std_dev": 32.15,
        "price_ranges": {
            "under_25": 0,
            "25_to_50": 3,
            "50_to_100": 22,
            "100_to_250": 34,
            "250_to_500": 0,
            "over_500": 0,
        },
    },
    "total_results": 59,
    "pages_scraped": 3,
    "error": null,
}

The buy_format field indicates how the item sold: "Auction", "Buy It Now", or "Best Offer Accepted". The bids field is only populated for auction sales.

Scrape eBay seller profiles and reviews

Seller data adds important context to pricing research. A product at $50 from a seller with 99.8% positive feedback over 10,000 transactions is different from the same product at $45 from someone with 12 reviews.

A note on seller pages: These are more protected than search results. SeleniumBase's CAPTCHA handling helps, and for larger volumes, rotating residential proxies can make a noticeable difference.


What you can do with this data

Seller data enables several practical applications:

  • Assess seller reliability: check feedback percentage and detailed ratings before buying

  • Identify trusted sellers: find top-rated sellers with high volume and consistent ratings

  • Analyze competitor sellers: understand their feedback patterns and common complaints

  • Monitor seller reputation: track rating changes over time for specific sellers

Getting seller info from a listing

The most reliable approach is to extract seller data from a product listing page rather than the seller's profile page directly. Listing pages have lighter anti-bot protection and contain seller info, detailed ratings, and recent reviews. The code below is simplified for clarity. The full source includes CAPTCHA handling and multiple fallback selectors.

def get_seller_from_listing(self, item_id: str) -> dict:
    url = f"https://www.ebay.com/itm/{item_id}"
    result = {
        "item_id": item_id,
        "url": url,
        "seller": {},
        "detailed_ratings": {},
        "reviews": [],
        "total_feedback_count": None,
        "error": None,
    }

    try:
        with SB(
            uc=True,
            headless=self.headless,
            incognito=True,
            xvfb=self.xvfb,
            proxy=self.proxy,
        ) as sb:

            sb.uc_open_with_reconnect(url, reconnect_time=4)
            random_delay()

            page_source = sb.get_page_source()
            seller_data = self._parse_seller_from_listing(page_source)

            result["seller"] = seller_data
            result["detailed_ratings"] = seller_data.pop("detailed_ratings", {})
            result["reviews"] = seller_data.pop("reviews", [])

            return result

    except Exception as e:
        result["error"] = str(e)
        return result

Parsing seller details

The listing page contains seller info in several sections. The scraper extracts from the store information section and falls back to alternative selectors:

def _parse_seller_from_listing(self, html: str) -> dict:
    soup = BeautifulSoup(html, "html.parser")
    seller = {}

    # Primary: store information section
    store_info = soup.select_one(".x-store-information")
    if store_info:
        store_name = store_info.select_one(".x-store-information__store-name a")
        if store_name:
            seller["username"] = store_name.get_text(strip=True)
            seller["store_url"] = store_name.get("href")

        # Parse "99.3% positive feedback • 804K items sold"
        highlights = store_info.select_one(".x-store-information__highlights")
        if highlights:
            text = highlights.get_text(strip=True)
            pct_match = re.search(r"([\d.]+)%\s*positive", text, re.IGNORECASE)
            if pct_match:
                seller["positive_feedback_pct"] = float(pct_match.group(1))
            items_match = re.search(
                r"([\d.]+[KMB]?)\s*items?\s*sold", text, re.IGNORECASE
            )
            if items_match:
                seller["items_sold"] = self._parse_abbreviated_number(
                    items_match.group(1)
                )

    return seller

The abbreviated number problem: eBay displays "4.4K items sold" rather than "4,400 items sold" for readability. A helper function converts these:

def _parse_abbreviated_number(self, text: str) -> int:
    """Convert '4.4K' to 4400, '1.2M' to 1200000, etc."""
    text = text.strip().upper()
    multipliers = {"K": 1000, "M": 1000000, "B": 1000000000}
    for suffix, mult in multipliers.items():
        if suffix in text:
            return int(float(text.replace(suffix, "")) * mult)
    return int(float(text.replace(",", "")))

Extracting detailed seller ratings

eBay tracks four specific metrics with 1 to 5 star ratings: accurate description, shipping cost, shipping speed, and communication. These provide more nuance than the overall feedback percentage. A seller might have 99% positive feedback but a 4.2 star shipping speed rating, indicating a pattern of slow deliveries.

# Detailed ratings section
ratings_section = soup.select_one(
    ".fdbk-detail-seller-rating, [data-testid='seller-rating']"
)
if ratings_section:
    rating_items = ratings_section.select(".fdbk-detail-seller-rating__item")
    for item in rating_items:
        label_elem = item.select_one(".fdbk-detail-seller-rating__label")
        value_elem = item.select_one(".fdbk-detail-seller-rating__value")
        if label_elem and value_elem:
            label = label_elem.get_text(strip=True).lower()
            value = float(value_elem.get_text(strip=True))

            if "description" in label:
                seller["detailed_ratings"]["accurate_description"] = value
            elif "shipping cost" in label:
                seller["detailed_ratings"]["shipping_cost"] = value
            elif "shipping speed" in label:
                seller["detailed_ratings"]["shipping_speed"] = value
            elif "communication" in label:
                seller["detailed_ratings"]["communication"] = value

Filtering feedback

The get_feedback_page() method supports several filters for targeted analysis:

# Get only negative reviews
result = scraper.get_feedback_page(
    username="seller_name",
    pages=3,
    rating_type="negative",  # "all", "positive", "negative", "neutral"
)

# Filter by topic
result = scraper.get_feedback_page(
    username="seller_name",
    topic="shipping",  # quality, value, shipping, description, etc.
)

# Only reviews with photos
result = scraper.get_feedback_page(username="seller_name", photos_only=True)

Filtering for negative reviews is useful for identifying recurring issues before purchasing from a seller.

Usage and output

Here's how to get seller data from a listing:

scraper = EbaySellerScraper()

# Works with both item IDs and full URLs
result = scraper.get_seller_from_listing("315166443569")
result = scraper.get_seller_from_listing("https://www.ebay.com/itm/315166443569")

print(f"Seller: {result['seller']['username']}")
print(f"Feedback: {result['seller']['positive_feedback_pct']}%")
print(f"Items sold: {result['seller']['items_sold']}")

Here's what the output looks like:

{
    "item_id": "315166443569",
    "url": "https://www.ebay.com/itm/315166443569",
    "seller": {
        "username": "Topmate Official",
        "store_url": "https://www.ebay.com/str/topmateofficial",
        "positive_feedback_pct": 98.8,
        "items_sold": 4400,
        "member_since": "Feb 2022",
        "feedback_score": 992,
        "location": "Baldwin Park, CA, United States",
        "is_top_rated": false,
    },
    "detailed_ratings": {
        "accurate_description": 4.9,
        "shipping_cost": 5.0,
        "shipping_speed": 5.0,
        "communication": 5.0,
    },
    "reviews": [
        {
            "comment": "The wireless keyboard and mouse work great. Both connected fast and feel solid.",
            "rating": "positive",
            "time_period": "Past 6 months",
            "verified_purchase": true,
        }
    ],
    "total_feedback_count": 992,
    "error": null,
}

Note: The scraper extracts recent reviews shown on the listing page (~5 to 10 reviews). For more reviews, use get_feedback_page() with the seller's username.

The seller scraper also provides:

  • get_seller_profile(username): scrape the seller's profile page directly

  • get_feedback_page(username, pages=3): scrape paginated feedback with filters

  • search_seller_items(username): get the seller's current listings

Further reading: "What Is a Forward Proxy? Forward vs Reverse Proxy, and What It's Used For" and "8 Best Proxies for AI Tools and Scalable Data Collection in 2026".

Scraping eBay reliably at scale

The scrapers above handle typical use cases without modification. If you're planning to scrape more than a few dozen pages, these techniques will help you do it reliably.

Rate limiting and delays

Anti-bot systems track request frequency. Too fast, and you'll trigger CAPTCHA or blocks; too slow, and you waste time. The optimal range for eBay is typically 2 to 5 seconds between requests.

The scrapers in this guide already include randomized delays:

import random
import time

def random_delay(min_sec: float = 2.0, max_sec: float = 5.0):
    """Pause for a random interval to appear more human-like."""
    delay = random.uniform(min_sec, max_sec)
    time.sleep(delay)
    return delay

Random delays work better than fixed ones. Predictable intervals are a bot fingerprint. Varying your timing mimics natural browsing patterns.

Scaling up: when to add proxies

Your IP has a reputation score. As you make more requests, that score degrades until you start seeing CAPTCHAs or blocks. Block thresholds vary significantly depending on request frequency, IP reputation, and browsing behavior, so there is no consistent page limit per IP.

Residential proxies solve this by distributing requests across many IPs, so no single IP accumulates enough requests to trigger limits. They're the standard choice for eCommerce sites like eBay because they use real ISP addresses that blend in with normal traffic.

When to add proxies:

  • Under 50 pages: Your IP is usually fine

  • 50 to 500 pages: Residential proxies recommended

  • 500+ pages/day: Residential proxies + rotation

Live Proxies provides proxies from your dashboard in this format: IP:PORT:USERNAME-ACCESS_CODE-SID:PASSWORD

To use it with SeleniumBase, restructure it as username:password@server:port (without the http:// prefix; the SeleniumBase docs explicitly say not to include it):

# Dashboard format: 45.127.248.131:7383:LV71125532-mDmfksl3onyoy-1:bW2VN4Zc5YSyK5nF82tK
# Restructure for SeleniumBase:
PROXY_URL = "LV71125532-mDmfksl3onyoy-1:bW2VN4Zc5YSyK5nF82tK@45.127.248.131:7383"

with SB(uc=True, incognito=True, proxy=PROXY_URL) as sb:
    sb.uc_open_with_reconnect(url, reconnect_time=3)

The session ID (-1, -2, etc.) in the username gives you sticky sessions with the same IP for up to 24 hours, which helps maintain consistent behavior during a scraping session. Remove the session ID suffix for rotating sessions (new IP each request).

Rotating proxies automatically cycle through IPs, so each request appears to come from a different user. Live Proxies also offers private IP allocation, which can reduce overlap with other users, but IP reputation may still be affected by prior activity or target-site detection mechanisms.
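A small helper can do the dashboard-to-SeleniumBase conversion and optionally strip the session suffix. This is a sketch under the assumption that the four-field dashboard format shown above holds; the function name is ours:

```python
def to_seleniumbase_proxy(dashboard_line: str, sticky: bool = True) -> str:
    """Convert 'IP:PORT:USERNAME-ACCESS_CODE-SID:PASSWORD' into the
    'username:password@server:port' string SeleniumBase expects.

    Pass sticky=False to drop the trailing session ID so each request
    rotates to a new IP.
    """
    server, port, username, password = dashboard_line.strip().split(":")
    if not sticky:
        # Drop the trailing session suffix, e.g. "...-1" -> "..."
        username = username.rsplit("-", 1)[0]
    return f"{username}:{password}@{server}:{port}"
```

Then `SB(uc=True, proxy=to_seleniumbase_proxy(line))` works for either mode.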

Scaling beyond 100 pages? Check out our residential proxy plans: activation takes ~10 minutes.

Retry logic with exponential backoff

When requests fail, retrying immediately usually fails too. Your IP is still flagged. Exponential backoff gives the rate limiter time to reset:

import logging
import time
from functools import wraps

logger = logging.getLogger(__name__)

def retry_on_failure(max_retries: int = 3, initial_delay: float = 2.0):
    """Decorator that retries failed functions with exponential backoff."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            delay = initial_delay
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    if attempt < max_retries:
                        logger.warning(
                            f"Attempt {attempt + 1} failed, retrying in {delay}s..."
                        )
                        time.sleep(delay)
                        delay *= 2  # Double the delay each time
                    else:
                        raise

        return wrapper

    return decorator

How it works: First retry waits 2 seconds, second waits 4, third waits 8. This pattern is standard in production systems. It's how well-behaved clients handle temporary failures without hammering servers.

@retry_on_failure(max_retries=2, initial_delay=3.0)
def _fetch_and_parse(self, url: str) -> dict:
    # ... scraping logic ...

Conclusion

You now have 4 working scrapers for eBay: search results, item details, sold prices, and seller profiles.

The patterns here (SeleniumBase UC mode, random delays, fallback selectors) work beyond eBay too. Most eCommerce sites use similar protections.

For small projects, the default settings are sufficient. When you're ready to scale, residential proxies help maintain consistent success rates.

Get started with Live Proxies →

FAQs

How do I find the eBay item ID and why does it matter?

It's the 12-digit number in the listing URL: for example, ebay.com/itm/315166443569. On the listing page, you'll find it in the "About this item" section. The scrapers accept either the full URL or just the ID.

Why does my scraper return different prices than what I see in the browser?

Most commonly, location-based display: eBay shows different prices, currencies, and VAT based on your IP location. If your scraper runs from a different IP than your browser, prices will differ. Other causes: JavaScript-loaded content not rendering fully, or shipping displayed differently. To diagnose, run the scraper and browser from the same IP at the same time.

How do I scrape eBay safely on a small scale for a personal project?

Use the scrapers with default settings: they include 2 to 5 second random delays. Avoid headless mode (it's detectable). For very light use, you may not need proxies immediately, but there's no guaranteed "safe" threshold: eBay's detection depends on your IP reputation, request patterns, and timing. If you start seeing CAPTCHAs, add proxies or reduce frequency.

How do I keep my eBay scraper stable when page layouts change?

The item scraper prioritizes JSON-LD data (eBay's structured data), which is separate from HTML and doesn't change when visual layouts update. The search scraper includes fallback selectors for both new and legacy layouts. Monitor your output regularly: empty results or missing fields could signal layout changes, blocks, or other issues.