eBay's 2+ billion listings make it a valuable source for pricing research, competitor monitoring, and market analysis. But scraping it isn't straightforward: anti-bot detection blocks most standard approaches.
This guide provides four Python scraper examples designed to reduce blocks and handle common anti-bot challenges on eBay, though reliability depends on request patterns, IP reputation, and ongoing site changes.
What you'll learn:
- Scrape search results, item details, sold prices, and seller profiles
- Bypass anti-bot detection with SeleniumBase UC Mode
- Handle both old and new eBay page layouts
- Scale to hundreds of pages using rotating proxies
Scrape eBay with Python: quick start
The best way to understand eBay scraping is to see working code first.
pip install seleniumbase beautifulsoup4
The code below searches eBay and extracts basic listing data. It uses SeleniumBase with uc=True (Undetected Chrome mode), which patches the ChromeDriver to avoid detection:
from seleniumbase import SB
from bs4 import BeautifulSoup
import re

def scrape_ebay_search(query, max_price=None):
    url = f"https://www.ebay.com/sch/i.html?_nkw={query}" + (
        f"&_udhi={max_price}" if max_price else ""
    )
    with SB(uc=True, incognito=True) as sb:
        sb.uc_open_with_reconnect(url, reconnect_time=3)
        sb.sleep(2)
        soup = BeautifulSoup(sb.get_page_source(), "html.parser")
    items = []
    skip_badges = ["New Listing", "Opens in a new window or tab"]
    for card in soup.select("li.s-card[data-listingid]"):
        if len(card.get("data-listingid") or "") > 15:
            continue
        if not (title_el := card.select_one(".s-card__title")):
            continue
        texts = [
            s.get_text(strip=True)
            for s in title_el.select("span")
            if s.get_text(strip=True) not in skip_badges
        ]
        title = max(texts, key=len) if texts else ""
        if not title or "Shop on eBay" in title:
            continue
        if not (price_el := card.select_one(".s-card__price")):
            continue
        match = re.search(
            r"[\d,]+\.?\d*", price_el.get_text(strip=True).replace(",", "")
        )
        if not match:
            continue
        items.append({"title": title, "price": float(match.group())})
    return items

if __name__ == "__main__":
    for item in scrape_ebay_search("wireless keyboard", max_price=150)[:5]:
        print(f"${item['price']:.2f} - {item['title'][:60]}")
# Sample output:
# $49.99 - Logitech K400 Plus Wireless Touch Keyboard with Built-in
# $34.99 - Microsoft Wireless Keyboard 850 - Black
# $89.00 - Logitech MX Keys Advanced Wireless Illuminated Keyboard
The code filters out eBay's promotional cards by skipping listings with unusually long IDs (over 15 characters) or "Shop on eBay" in the title, so you only get real product listings.
The uc_open_with_reconnect method includes a disconnect-reconnect cycle that helps bypass anti-bot protections. The reconnect_time=3 parameter specifies how long the driver stays disconnected from Chrome (3 seconds) while the page loads. The reconnect cycle may help reduce automation signals during page load, but detection outcomes still depend on multiple factors such as IP reputation, request frequency, and browser fingerprint consistency. Setting incognito=True starts each session with a clean browser state: no cookies, cached data, or history that could identify you as a returning visitor.
Why use BeautifulSoup after SeleniumBase? SeleniumBase handles browser automation and anti-bot bypass. Once we have the rendered HTML, BeautifulSoup is faster and more convenient for parsing the data. You could use Selenium's built-in selectors, but BeautifulSoup's API is cleaner for extracting multiple elements.
If it works, you have the foundation. The rest of this guide builds on these same patterns. If you encounter a CAPTCHA or block, that's expected; we'll cover how to handle this below.
When eBay detects suspicious activity, you may first encounter a browser verification check:

This is eBay's browser verification check, an interstitial page that runs fingerprinting checks before redirecting you to the content. It typically resolves on its own within a few seconds. The uc_open_with_reconnect method in SeleniumBase accounts for this by staying disconnected during page load, giving the check time to complete before the driver reconnects. If this check fails to resolve and you get stuck or blocked, your IP may be flagged. Rotating residential proxies helps by distributing requests across clean IPs.
If eBay still flags your session, you'll see an hCaptcha challenge:

This hCaptcha challenge is eBay's way of filtering out automated traffic. The scrapers attempt to click the checkbox automatically when running in visible mode, but this isn't foolproof. Image puzzles require manual intervention. The scaling section covers how to minimize the likelihood of encountering CAPTCHAs in the first place.
Download the complete scrapers
The code snippets throughout this guide show the key patterns, but the full scrapers have additional features: CLI interfaces, proxy support, multi-page pagination, and comprehensive error handling. Download the complete files here:
- ebay_search_scraper.py: Search results with filtering and sorting
- ebay_item_scraper.py: Complete item details from individual listings
- ebay_sold_scraper.py: Sold listings with price statistics
- ebay_seller_scraper.py: Seller profiles and reviews
Each scraper runs standalone from the command line:
# Search active listings
python ebay_search_scraper.py "mechanical keyboard"
python ebay_search_scraper.py "iphone 15" --condition used --max-price 800 --pages 3
# Scrape sold prices
python ebay_sold_scraper.py "ps5 console"
python ebay_sold_scraper.py "rolex submariner" --min-price 5000 --sort date_desc
# Scrape full item details
python ebay_item_scraper.py 315166443569
python ebay_item_scraper.py 315166443569 234567890123 # multiple items
# Scrape seller info
python ebay_seller_scraper.py 315166443569
python ebay_seller_scraper.py 315166443569 --feedback-pages 3 --type negative
Run python <scraper>.py --help for all available options.
What can you scrape from eBay?
eBay has several different data sources, and they vary considerably in how easy they are to access. If you're new to web scraping, eBay is actually one of the more complex sites to work with, but also one of the most valuable for eCommerce data.
Search results are your starting point. You get titles, prices, conditions, seller info, shipping costs, and thumbnails. Search results are often easier to extract compared to deeper pages because they contain structured listing cards, though they are still protected by anti-automation measures.
Individual listings contain the complete picture. Full descriptions, high-resolution image galleries, item specifics (brand, model, color, size), return policies, shipping options, quantity available, watcher counts, and variation options (sizes, colors). This is where you get data that search results simply don't include.
Sold listings are often the most useful data on eBay for pricing research. While active listings show asking prices, sold listings show what people actually paid. eBay keeps approximately 90 days of this data visible, accessed through the same search endpoint with different filters.
Seller profiles and reviews add context. Feedback scores, member tenure, detailed ratings, and individual buyer reviews. These pages have noticeably stronger anti-bot measures than search results. The dedicated feedback pages (/fdbk/) are the most challenging to scrape reliably.
One thing you won't find is aggregated inventory. You can see how many listings a seller has and the quantity available on each listing, but there's no single endpoint showing total stock across all their listings. You'd have to scrape each listing individually and sum the quantities.
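If you do need a seller-wide stock estimate, you can sum quantities across listings you have already scraped. A minimal sketch (the `total_inventory` helper is ours, and the results are mocked; real data would come from the item scraper covered later in this guide, whose `quantity` output shape is assumed here):

```python
def total_inventory(items: list[dict]) -> int:
    """Sum the available quantity across scraped listings,
    skipping listings where quantity data was missing."""
    return sum(
        item.get("quantity", {}).get("available") or 0
        for item in items
    )

# Mocked get_item() results for illustration
scraped = [
    {"item_id": "111", "quantity": {"available": 48, "sold": 312}},
    {"item_id": "222", "quantity": {"available": 3, "sold": 17}},
    {"item_id": "333", "quantity": {}},  # listing didn't expose quantity
]
print(total_inventory(scraped))  # 51
```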
Does eBay allow web scraping?
Not officially. eBay's Terms of Service prohibit automated data collection without permission. eBay offers official APIs. The Browse API covers many common use cases, and if it works for your needs, that's the recommended path.
For data the APIs don't expose (like historical sold prices or detailed seller analytics), some choose to scrape public listing data. The scrapers in this guide only access publicly visible information. No login required, no personal user data. They include delays and retry logic to minimize server load.
If you're scraping at scale or for commercial purposes, consider consulting legal counsel about your specific use case.
Python setup for eBay scraping
eBay's anti-bot protections require a real browser to bypass. SeleniumBase in UC (Undetected Chrome) mode makes this straightforward. It runs an actual Chrome browser with anti-detection patches applied to the ChromeDriver, so you don't have to configure anything manually.
Requirements: Python 3.10+ and Chrome browser installed.
Installation
Install the dependencies in a virtual environment:
python -m venv ebay_scraper_env
source ebay_scraper_env/bin/activate # Windows: ebay_scraper_env\Scripts\activate
pip install seleniumbase beautifulsoup4
A note on headless mode: if you're running on a server without a display, you might be tempted to use headless=True. Unfortunately, some headless configurations can increase detection risk, though results vary depending on browser fingerprinting setup and request behavior.
A better approach is Xvfb (X Virtual Framebuffer), which runs a headed browser in a virtual display:
# Ubuntu/Debian
sudo apt-get install xvfb
# Then in your code
with SB(uc=True, xvfb=True) as sb:
    ...  # runs like a headed browser, but without needing a real display
eBay URL parameter reference
eBay's search URLs use query parameters to control filtering and sorting. Understanding these lets you construct any search programmatically:
CONDITION_MAP = {
    "new": "1000",
    "open_box": "1500",
    "refurbished": "2000",
    "used": "3000",
    "for_parts": "7000",
}

SORT_MAP = {
    "best_match": "12",
    "price_asc": "15",
    "price_desc": "16",
    "ending_soonest": "1",
    "newly_listed": "10",
}

SOLD_SORT_MAP = {
    "best_match": "12",
    "price_asc": "15",
    "price_desc": "16",
    "date_desc": "13",
    "date_asc": "1",
}
Example: ebay.com/sch/i.html?_nkw=laptop&LH_ItemCondition=3000 filters to used items only.
Scrape eBay search results with Python
Search results are typically where eBay data collection begins. eBay serves different HTML layouts to different users. You might get the modern s-card elements or the older s-item layout. The scraper handles both automatically.
Using browser DevTools, you can see that each listing is an li element with the class s-card and a data-listingid attribute containing the item ID. The title, price, and seller info are nested inside with their own class selectors.

Setting up the scraper class
The scraper class stores configuration that affects how it interacts with eBay:
class EbayScraper:
    BASE_URL = "https://www.ebay.com/sch/i.html"
    RECONNECT_TIME = 3

    def __init__(
        self, headless: bool = False, xvfb: bool = False, proxy: str | None = None
    ):
        self.headless = headless
        self.xvfb = xvfb and platform.system() == "Linux"
        self.proxy = proxy
Why these defaults? Setting headless=False is intentional. UC Mode is more easily detected in headless mode, as eBay's anti-bot systems look for headless browser signatures. The xvfb option is Linux-only (hence the platform check) and provides a better alternative for servers without displays. The proxy parameter accepts a string like "user:pass@host:port" for rotating residential proxies, which becomes important when scraping at scale.
Building the search URL
The search method constructs an eBay URL from your parameters and validates inputs against known parameter maps to avoid broken URLs:
def search(
    self,
    query: str,
    category: int = 0,
    page: int = 1,
    min_price: float | None = None,
    max_price: float | None = None,
    condition: str | None = None,
    sort: str | None = None,
) -> dict:
    params = {"_nkw": query, "_sacat": category, "_pgn": page}
    if min_price is not None:
        params["_udlo"] = min_price
    if max_price is not None:
        params["_udhi"] = max_price
    if condition and condition.lower() in CONDITION_MAP:
        params["LH_ItemCondition"] = CONDITION_MAP[condition.lower()]
    if sort and sort.lower() in SORT_MAP:
        params["_sop"] = SORT_MAP[sort.lower()]
    url = f"{self.BASE_URL}?{urlencode(params)}"
    return self._fetch_and_parse(url)
Parameter breakdown:
| Parameter | Purpose | Example |
|---|---|---|
| _nkw | Search keywords (name keywords) | laptop |
| _sacat | Category ID (0 = all categories) | 175672 (laptops) |
| _pgn | Page number for pagination | 1, 2, 3... |
| _udlo / _udhi | Price range (user-defined low/high) | 100, 500 |
| LH_ItemCondition | Condition filter | 3000 (used) |
| _sop | Sort order preference | 15 (price low to high) |
The check condition.lower() in CONDITION_MAP prevents invalid parameters from reaching the URL. If someone passes condition="mint" (not a valid eBay filter), it's silently ignored rather than breaking the request.
Fetching and handling anti-bot detection
The fetch method handles browser interaction and CAPTCHA detection. The code below is simplified for clarity.
def _fetch_and_parse(self, url: str) -> dict:
    results = {"url": url, "items": [], "total_results": 0, "error": None}
    try:
        with SB(
            uc=True,
            headless=self.headless,
            incognito=True,
            xvfb=self.xvfb,
            proxy=self.proxy,
        ) as sb:
            sb.uc_open_with_reconnect(url, reconnect_time=self.RECONNECT_TIME)
            random_delay()
            page_source = sb.get_page_source()
            if (
                "Pardon Our Interruption" in page_source
                or "captcha" in page_source.lower()
            ):
                logger.warning("CAPTCHA detected, attempting to solve...")
                try:
                    sb.uc_gui_click_captcha()
                    random_delay()
                    page_source = sb.get_page_source()
                except Exception as e:
                    logger.error(f"CAPTCHA handling failed: {e}")
                if "Pardon Our Interruption" in page_source:
                    results["error"] = "Blocked by anti-bot (CAPTCHA not solved)"
                    return results
            return self._parse_html(page_source, results)
    except Exception as e:
        results["error"] = str(e)
        return results
Setting incognito=True starts each browser session without cookies or cached data, preventing eBay from building a profile across multiple scraping sessions.
About uc_gui_click_captcha(): This SeleniumBase method attempts to click CAPTCHA checkboxes automatically. It works for simple checkbox CAPTCHAs but not complex image challenges. When it fails, the scraper returns an error rather than hanging indefinitely.
Random delays between requests also help. Fixed delays create a regular, machine-like cadence that anti-bot systems can fingerprint; randomized intervals look closer to human browsing.
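The random_delay() helper used in these snippets is not part of SeleniumBase; a minimal version might look like this (the 2-5 second default bounds are an assumption — tune them to your request volume):

```python
import random
import time

def random_delay(min_s: float = 2.0, max_s: float = 5.0) -> None:
    """Sleep for a random interval to avoid a fixed, fingerprintable cadence."""
    time.sleep(random.uniform(min_s, max_s))
```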
Parsing the HTML: handling dual layouts
eBay uses two different layouts, so the scraper checks for both. The modern s-card layout appears for most users, but some still receive the legacy s-item layout:
def _parse_html(self, html: str, results: dict) -> dict:
    soup = BeautifulSoup(html, "html.parser")
    items = soup.select("li.s-card[data-listingid]")
    if items:
        for item in items:
            listing = self._parse_card_item(item)
            if listing:
                results["items"].append(listing)
    else:
        for item in soup.select("li.s-item"):
            listing = self._parse_legacy_item(item)
            if listing:
                results["items"].append(listing)
    results["total_results"] = len(results["items"])
    total_elem = soup.select_one(".srp-controls__count-heading span")
    if total_elem:
        match = re.search(r"([\d,]+)", total_elem.get_text(strip=True))
        if match:
            results["total_on_ebay"] = int(match.group(1).replace(",", ""))
    return results
eBay A/B tests different layouts and may serve different HTML based on geography, device, or random assignment. Supporting both ensures the scraper works regardless of which version you receive.
The [data-listingid] attribute selector targets listing cards specifically. The code also filters out promotional content by checking for "shop on ebay" in titles, since some promotional cards can have listing IDs.
Extracting item details from the modern layout
The parsing method uses defensive coding. Every extraction is wrapped in checks because eBay's HTML structure varies between listings:
def _parse_card_item(self, item) -> dict | None:
    try:
        listing_id = item.get("data-listingid")
        title_elem = item.select_one(".s-card__title span.su-styled-text")
        title = title_elem.get_text(strip=True) if title_elem else None
        if title and "shop on ebay" in title.lower():
            return None
        img_elem = item.select_one(".s-card__image")
        image_url = img_elem.get("src") if img_elem else None
        link_elem = item.select_one("a.s-card__link[href]")
        item_url = None
        if link_elem:
            item_url = link_elem.get("href")
            if item_url and "?" in item_url:
                item_url = item_url.split("?")[0]
        condition_elem = item.select_one(".s-card__subtitle span.su-styled-text")
        condition = condition_elem.get_text(strip=True) if condition_elem else None
Stripping URL tracking parameters (?_trkparms=...) keeps your stored URLs clean. These parameters don't affect the page content you'll see.
Price extraction handles both single prices and ranges:
        price_elems = item.select(".s-card__price")
        prices = [
            p.get_text(strip=True)
            for p in price_elems
            if p.get_text(strip=True) not in ["to", ""]
        ]
        price = None
        price_range = None
        if len(prices) == 1:
            price = self._clean_price(prices[0])
        elif len(prices) >= 2:
            price = self._clean_price(prices[0])
            price_range = {
                "min": self._clean_price(prices[0]),
                "max": self._clean_price(prices[-1]),
            }
Some listings show a range like "$25.00 to $35.00" for items with variants. The scraper captures both the minimum price (useful for sorting and filtering) and the full range for accurate display.
Seller info, shipping, and engagement metrics are extracted from attribute rows:
        shipping = None
        free_shipping = False
        seller_name = None
        seller_feedback_pct = None
        sold_count = None
        watchers = None
        for row in item.select(".s-card__attribute-row"):
            text = row.get_text(strip=True)
            text_lower = text.lower()
            if "shipping" in text_lower:
                shipping = text
                free_shipping = "free" in text_lower
            elif "positive" in text_lower and "%" in text:
                seller_match = re.match(
                    r"(.+?)([\d.]+)%\s*positive", text, re.IGNORECASE
                )
                if seller_match:
                    seller_name = seller_match.group(1).strip()
                    seller_feedback_pct = float(seller_match.group(2))
            elif "sold" in text_lower:
                sold_match = re.search(r"(\d+)\s*sold", text_lower)
                if sold_match:
                    sold_count = int(sold_match.group(1))
            elif "watch" in text_lower:
                watch_match = re.search(r"(\d+)\s*watch", text_lower)
                if watch_match:
                    watchers = int(watch_match.group(1))
        return {
            "listing_id": listing_id,
            "title": title,
            "price": price,
            "price_range": price_range,
            "currency": "USD",
            "condition": condition,
            "shipping": shipping,
            "free_shipping": free_shipping,
            "seller": {
                "name": seller_name,
                "feedback_pct": seller_feedback_pct,
            } if seller_name else None,
            "image_url": image_url,
            "item_url": item_url,
            "sold_count": sold_count,
            "watchers": watchers,
        }
    except Exception as e:
        logger.debug(f"Error parsing card: {e}")
        return None
Returning None on exception lets the scraper skip problematic items rather than crashing the entire operation.
Multi-page scraping in a single session
For scraping multiple pages efficiently, the browser session stays open and navigation methods change after the first page:
def search_multiple_pages(self, query: str, max_pages: int = 3, **kwargs) -> dict:
    all_items = []
    try:
        with SB(
            uc=True,
            headless=self.headless,
            incognito=True,
            xvfb=self.xvfb,
            proxy=self.proxy,
        ) as sb:
            for page_num in range(1, max_pages + 1):
                params = {"_nkw": query, "_sacat": 0, "_pgn": page_num}
                url = f"{self.BASE_URL}?{urlencode(params)}"
                if page_num == 1:
                    sb.uc_open_with_reconnect(url, reconnect_time=self.RECONNECT_TIME)
                else:
                    sb.open(url)
                random_delay()
                page_source = sb.get_page_source()
                if "Pardon Our Interruption" in page_source:
                    break
                results = self._parse_html(page_source, {"items": []})
                if not results["items"]:
                    break
                all_items.extend(results["items"])
                if page_num < max_pages:
                    random_delay()
        return {
            "items": all_items,
            "total_results": len(all_items),
            "pages_scraped": page_num,
            "error": None,
        }
    except Exception as e:
        return {"items": all_items, "error": str(e)}
The anti-bot bypass (uc_open_with_reconnect) is slower and more resource-intensive. Once a session is established on the first page, subsequent pages typically load without challenge using regular sb.open(). This approach is faster and mimics how users actually browse, clicking through results rather than starting fresh each time.
Usage example and output
Here's how to use the scraper:
scraper = EbayScraper()

# Basic search
results = scraper.search("mechanical keyboard")

# With filters
results = scraper.search(
    query="mechanical keyboard",
    min_price=50,
    max_price=150,
    condition="used",
    sort="price_asc",
)

# Multiple pages in one session
results = scraper.search_multiple_pages(
    query="mechanical keyboard", max_pages=5, condition="new"
)

for item in results["items"]:
    print(f"${item['price']:.2f} - {item['title']}")
Example output structure:
{
  "url": "https://www.ebay.com/sch/i.html?_nkw=mechanical+keyboard&_sacat=0&_pgn=1",
  "items": [
    {
      "listing_id": "116434118798",
      "title": "Cherry MX Mechanical Keyboard Tactile Brown Switches",
      "price": 199.99,
      "price_range": null,
      "currency": "USD",
      "condition": "Brand New",
      "buy_format": "Buy It Now",
      "shipping": "Free International Shipping",
      "free_shipping": true,
      "free_returns": false,
      "location": "Located in China",
      "seller": {
        "name": "furyauction",
        "feedback_pct": 100.0,
        "feedback_count": 1700
      },
      "image_url": "https://i.ebayimg.com/images/g/.../s-l500.webp",
      "item_url": "https://www.ebay.com/itm/116434118798",
      "is_new_listing": false,
      "is_sponsored": true,
      "sold_count": 24,
      "watchers": 15,
      "bids": null
    }
  ],
  "total_results": 60,
  "total_on_ebay": 23000,
  "error": null
}
Fields like sold_count, watchers, and bids appear when eBay displays them on the search result card. Not all listings show this information.
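Once you have this structure, downstream filtering is plain Python. An illustrative sketch (cheapest_free_shipping is a hypothetical helper, shown here against mocked items in the shape above):

```python
def cheapest_free_shipping(results: dict, n: int = 3) -> list[dict]:
    """Return the n lowest-priced listings that ship free."""
    candidates = [
        item for item in results["items"]
        if item.get("free_shipping") and item.get("price") is not None
    ]
    return sorted(candidates, key=lambda i: i["price"])[:n]

# Mocked scraper output for illustration
results = {"items": [
    {"title": "Keyboard A", "price": 199.99, "free_shipping": True},
    {"title": "Keyboard B", "price": 89.00, "free_shipping": False},
    {"title": "Keyboard C", "price": 149.50, "free_shipping": True},
]}
for item in cheapest_free_shipping(results):
    print(f"${item['price']:.2f} - {item['title']}")
```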
Further reading: 8 Best Private Proxies in 2026 (Tested & Ranked) and 8 Best Rotating Proxies in 2026 (Tested and Ranked).
Scrape individual eBay product pages
Sometimes search results aren't enough. You need the full description, high-res images, or complete item specifics. Individual listing pages contain much more data, though they're also more heavily protected.
The key insight: eBay embeds structured JSON data (JSON-LD) directly in the page source. Extracting from this JSON is more reliable than parsing HTML, which changes frequently. The scraper uses JSON extraction as the primary method with HTML parsing as a fallback.

What you get
Individual listing pages contain data you can't get from search results:
-
Full item specifics: brand, model, MPN, UPC/EAN, dimensions, material, etc.
-
High-resolution images: up to 1600px (search results only give thumbnails)
-
Complete seller profile: feedback score, items sold, member tenure, top-rated status
-
Shipping & returns details: costs, delivery estimates, return policy
-
Variations: all available options (colors, sizes) with stock status
-
Full quantity data: exact available count and total sold history
Setting up the item scraper
Product pages are more heavily protected than search results, so RECONNECT_TIME is set to 4 seconds to allow eBay's anti-bot checks to complete before retrieving the page content.
class EbayItemScraper:
    RECONNECT_TIME = 4

    def __init__(
        self, headless: bool = False, xvfb: bool = False, proxy: str | None = None
    ):
        self.headless = headless
        self.xvfb = xvfb and platform.system() == "Linux"
        self.proxy = proxy
Fetching and extracting data
The get_item method fetches the page and extracts data from embedded JSON first, then fills gaps with HTML parsing. The code below is simplified for clarity. The full source includes CAPTCHA handling and proxy support.
def get_item(self, item_id: str) -> dict:
    url = f"https://www.ebay.com/itm/{item_id}"
    result = {
        "item_id": item_id,
        "url": url,
        "title": None,
        "price": {},
        "condition": {},
        "images": [],
        "item_specifics": {},
        "seller": {},
        "shipping": {},
        "returns": {},
        "quantity": {},
        "watchers": None,
        "sold_count": None,
        "location": None,
        "ships_to": None,
        "variations": [],
        "category": {},
        "error": None,
    }
    try:
        with SB(
            uc=True,
            headless=self.headless,
            incognito=True,
            xvfb=self.xvfb,
            proxy=self.proxy,
        ) as sb:
            sb.uc_open_with_reconnect(url, reconnect_time=self.RECONNECT_TIME)
            random_delay()
            page_source = sb.get_page_source()

            # Primary: extract from embedded JSON
            item_data = self._extract_json_data(page_source)
            if item_data:
                result = self._parse_item_data(item_data, result)

            # Fallback: fill missing fields from HTML
            result = self._parse_html_fallback(page_source, result)
        return result
    except Exception as e:
        result["error"] = str(e)
        return result
Extracting embedded JSON
eBay embeds JSON-LD (structured data for SEO) in script tags. This data is more stable than HTML selectors:
def _extract_json_data(self, html: str) -> dict | None:
    combined_data = {}

    # Extract JSON-LD product data
    ld_json_pattern = r'<script[^>]*type="application/ld\+json"[^>]*>(.+?)</script>'
    ld_matches = re.findall(ld_json_pattern, html, re.DOTALL)
    for match in ld_matches:
        try:
            data = json.loads(match.strip())
            if isinstance(data, dict) and data.get("@type") == "Product":
                combined_data["ld_product"] = data
            elif isinstance(data, dict) and data.get("@type") == "BreadcrumbList":
                combined_data["ld_breadcrumb"] = data
        except json.JSONDecodeError:
            continue

    # Extract high-res image URLs directly
    image_pattern = r'"https://i\.ebayimg\.com/images/g/[^"]+/s-l\d+\.(?:jpg|png|webp)"'
    image_matches = re.findall(image_pattern, html)
    if image_matches:
        combined_data["extracted_images"] = list(
            set(m.strip('"') for m in image_matches)
        )

    return combined_data if combined_data else None
Why JSON-LD? This structured data is required for search engines, so eBay maintains it consistently. It contains title, price, condition, seller info, and product identifiers (SKU, MPN, UPC/EAN).
Parsing the JSON data
The JSON-LD Product object contains most of what you need:
def _parse_item_data(self, data: dict, result: dict) -> dict:
    if "ld_product" in data:
        product = data["ld_product"]
        result["title"] = product.get("name")
        result["description"] = product.get("description")

        # Images
        if product.get("image"):
            images = product["image"]
            result["images"] = [images] if isinstance(images, str) else images

        # Price from offers
        offers = product.get("offers", {})
        if isinstance(offers, list) and offers:
            offers = offers[0]
        if isinstance(offers, dict):
            result["price"] = {
                "amount": self._parse_price(offers.get("price")),
                "currency": offers.get("priceCurrency", "USD"),
            }

        # Condition
        if product.get("itemCondition"):
            result["condition"]["name"] = product["itemCondition"].replace(
                "https://schema.org/", ""
            )

        # Product identifiers
        if product.get("brand"):
            brand = product["brand"]
            result["item_specifics"]["Brand"] = (
                brand.get("name") if isinstance(brand, dict) else brand
            )
        if product.get("mpn"):
            result["item_specifics"]["MPN"] = product["mpn"]
        if product.get("gtin13"):
            result["item_specifics"]["EAN"] = product["gtin13"]
    return result
Getting high-resolution images
eBay image URLs contain a size code like s-l500 (500px). Replacing this with s-l1600 returns the highest resolution available:
# Convert extracted images to the highest resolution
if "extracted_images" in data:
    hi_res_images = []
    for img in data["extracted_images"]:
        hi_res = re.sub(r"/s-l\d+\.", "/s-l1600.", img)
        if hi_res not in hi_res_images:
            hi_res_images.append(hi_res)
    result["images"] = hi_res_images
This works because eBay stores images at multiple resolutions. The size is just a URL parameter.
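A tiny standalone version of the rewrite (the to_high_res name is ours; it mirrors the re.sub call above):

```python
import re

def to_high_res(url: str, size: int = 1600) -> str:
    """Rewrite an eBay image URL to request a larger rendition."""
    return re.sub(r"/s-l\d+\.", f"/s-l{size}.", url)

print(to_high_res("https://i.ebayimg.com/images/g/abc/s-l500.webp"))
# https://i.ebayimg.com/images/g/abc/s-l1600.webp
```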
HTML fallback for missing data
Some fields aren't in JSON-LD (watchers, sold count, variations). The scraper extracts these from HTML:
def _parse_html_fallback(self, html: str, result: dict) -> dict:
    soup = BeautifulSoup(html, "html.parser")

    # Watchers count
    if not result.get("watchers"):
        watchers_match = re.search(
            r"(\d+)\s*(?:people are watching|watchers)", html, re.IGNORECASE
        )
        if watchers_match:
            result["watchers"] = int(watchers_match.group(1))

    # Sold count
    if not result.get("sold_count"):
        sold_match = re.search(r"(\d+)\s*sold", html, re.IGNORECASE)
        if sold_match:
            result["sold_count"] = int(sold_match.group(1))

    # Variations (colors, sizes)
    if not result.get("variations"):
        result["variations"] = self._parse_variations(html)

    return result
Description handling
eBay loads item descriptions in an iframe for security (to isolate seller HTML from the main page). The scraper captures the description iframe URL:
# Description is in an iframe
desc_elem = soup.select_one("#desc_ifr")
if desc_elem and desc_elem.get("src"):
    result["description_url"] = desc_elem.get("src")
You can fetch the description URL separately to retrieve the full seller-provided description HTML.
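Once you have that HTML, stripping it down to readable text is straightforward. A sketch (description_text is a hypothetical helper; in practice you would first load result["description_url"], for example with sb.open() in the same session, and pass the page source in):

```python
from bs4 import BeautifulSoup

def description_text(desc_html: str) -> str:
    """Strip seller-provided HTML down to normalized plain text."""
    soup = BeautifulSoup(desc_html, "html.parser")
    return " ".join(soup.get_text(separator=" ").split())

print(description_text("<div><p>Brand new,</p><p>sealed box.</p></div>"))
# Brand new, sealed box.
```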
Usage example
Here's how to use the scraper:
scraper = EbayItemScraper()

# Works with both item IDs and full URLs
item = scraper.get_item("315166443569")
item = scraper.get_item("https://www.ebay.com/itm/315166443569")

# Multiple items in one session
items = scraper.get_multiple_items(["315166443569", "234567890123"])

print(f"Title: {item['title']}")
print(f"Price: ${item['price']['amount']}")
print(f"Images: {len(item['images'])} high-res photos")
print(f"Specifics: {item['item_specifics']}")
Sample output
Here's what the extracted data looks like:
{
  "item_id": "315166443569",
  "url": "https://www.ebay.com/itm/315166443569",
  "title": "Apple AirPods Pro (2nd Generation) with MagSafe Charging Case",
  "price": {"amount": 189.99, "currency": "USD"},
  "condition": {"name": "NewCondition"},
  "images": [
    "https://i.ebayimg.com/images/g/xxxxx/s-l1600.jpg",
    "https://i.ebayimg.com/images/g/yyyyy/s-l1600.jpg"
  ],
  "item_specifics": {
    "Brand": "Apple",
    "MPN": "MQD83AM/A",
    "Model": "AirPods Pro 2nd Generation",
    "Connectivity": "Bluetooth",
    "Color": "White"
  },
  "seller": {
    "name": "techdeals_store",
    "feedback_pct": 99.2,
    "feedback_score": 15420,
    "top_rated": true
  },
  "shipping": {
    "free": true,
    "cost": 0,
    "estimated_delivery": "Thu, Feb 6 - Mon, Feb 10"
  },
  "returns": {
    "accepted": true,
    "period": 30,
    "policy": "30 days returns. Buyer pays for return shipping."
  },
  "quantity": {"available": 48, "sold": 312},
  "watchers": 89,
  "sold_count": 312,
  "location": "Los Angeles, California, United States",
  "ships_to": "Worldwide",
  "variations": [
    {"type": "Color", "name": "White", "available": true},
    {"type": "Color", "name": "Black", "available": false}
  ],
  "category": {
    "path": ["Electronics", "Portable Audio & Headphones", "Headphones"],
    "leaf": "Headphones"
  },
  "error": null
}
Fields like variations, watchers, and sold_count are included when available on the listing. Not all listings display this information.
Scrape eBay sold listings and price history
Sold listings show what people actually paid, not asking prices. For pricing research, this is often the most useful data source.
When viewing sold listings, you'll see "Sold [date]" labels on each item showing exactly when it sold and for how much. Data you won't find on active listings.

What you can do with this data
Sold price data opens up several practical use cases:
- Price your items to sell: see what similar items actually sold for, not just what sellers are asking
- Find deals on active listings: compare asking prices to recent sold prices
- Track market trends: monitor how prices change over time for specific items
- Validate product sourcing: check if items are worth reselling before buying inventory
Limitation: eBay typically displays sold listings from recent months, though the exact timeframe may vary depending on category and listing conditions. For longer-term price history, you'd need to collect and store data over time.
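One way to build that longer history is to append each scrape to a dated CSV. A sketch (append_sold_prices is a hypothetical helper; the column set is an assumption based on the fields the sold scraper returns):

```python
import csv
from datetime import date
from pathlib import Path

def append_sold_prices(path: str, query: str, items: list[dict]) -> None:
    """Append one row per sold listing, stamped with today's date,
    so price history accumulates beyond eBay's visibility window."""
    file = Path(path)
    write_header = not file.exists()
    with file.open("a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if write_header:
            writer.writerow(
                ["scraped_on", "query", "listing_id", "sold_price", "sold_date"]
            )
        for item in items:
            writer.writerow([
                date.today().isoformat(),
                query,
                item.get("listing_id"),
                item.get("sold_price"),
                item.get("sold_date"),
            ])

# e.g. run daily after a scrape:
# append_sold_prices("sold_history.csv", "ps5 console", results["items"])
```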
The key difference: LH_Sold and LH_Complete parameters
The sold scraper is structurally similar to the search scraper, with two critical URL parameters that filter to completed sales:
def search_sold(self, query: str, **kwargs) -> dict:
    params = {
        "_nkw": query,
        "_sacat": kwargs.get("category", 0),
        "_pgn": kwargs.get("page", 1),
        "LH_Sold": "1",      # Only items that sold
        "LH_Complete": "1",  # Only completed listings
    }
    # Add optional filters...
    url = f"{self.BASE_URL}?{urlencode(params)}"
    return self._fetch_and_parse(url)
Why both parameters? LH_Complete=1 shows all completed listings, including unsold items that ended without a buyer. Adding LH_Sold=1 filters to only items that actually sold. Together, they give you actual transaction data: what buyers were willing to pay.
Sorting sold listings
Sold listings support date-based sorting, which isn't available in active search:
SOLD_SORT_MAP = {
"best_match": "12",
"price_asc": "15",
"price_desc": "16",
"date_desc": "13", # Most recent sales first (default)
"date_asc": "1", # Oldest sales first
}
Use date_desc (default) to see the most recent sales, which better reflect current market prices.
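Under the hood, the sort key simply becomes eBay's `_sop` query parameter. A small sketch using the map above (the helper function name is illustrative):

```python
SOLD_SORT_MAP = {
    "best_match": "12",
    "price_asc": "15",
    "price_desc": "16",
    "date_desc": "13",  # Most recent sales first (default)
    "date_asc": "1",    # Oldest sales first
}

def sort_param(sort: str) -> dict:
    """Map a friendly sort name to eBay's _sop parameter, defaulting to date_desc."""
    return {"_sop": SOLD_SORT_MAP.get(sort, SOLD_SORT_MAP["date_desc"])}
```

Merge the returned dict into the `params` dict before calling `urlencode`.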
Parsing sold-specific fields
Sold listings have additional fields to extract: the sale date and whether it sold via auction, "Buy It Now", or "Best Offer". The code below is simplified for clarity. The full source handles both modern and legacy eBay layouts:
def _parse_sold_item(self, item) -> dict | None:
try:
# ... standard fields (title, image, etc.) ...
# Sold price
price_elem = item.select_one(".s-card__price")
sold_price = (
self._clean_price(price_elem.get_text(strip=True)) if price_elem else None
)
# Sale date (e.g., "Sold Jan 24, 2026") - extracted from caption
sold_date = None
caption_elem = item.select_one(".s-card__caption")
if caption_elem:
caption_text = caption_elem.get_text(strip=True)
if caption_text.lower().startswith("sold"):
sold_date = caption_text.replace("Sold", "").strip()
# Determine buy format from attribute rows
bids = None
buy_format = None
for row in item.select(".s-card__attribute-row"):
text = row.get_text(strip=True).lower()
if "bid" in text:
bid_match = re.search(r"(\d+)\s*bid", text)
if bid_match:
bids = int(bid_match.group(1))
buy_format = "Auction"
elif "buy it now" in text:
buy_format = "Buy It Now"
elif "best offer" in text:
buy_format = "Best Offer Accepted"
return {
"listing_id": listing_id,
"title": title,
"sold_price": sold_price,
"sold_date": sold_date,
"buy_format": buy_format,
"bids": bids,
# ...
}
except Exception:
return None
Tracking the buy format matters for price analysis. Auction prices often differ significantly from fixed prices for the same item. Auctions might go below market value with little competition, or above market value in bidding wars. Best offer sales indicate the seller accepted less than the listed price.
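To act on that difference, you can split the scraped items by format and compare medians. A minimal sketch, assuming item dicts shaped like the ones `_parse_sold_item` returns above:

```python
import statistics
from collections import defaultdict

def median_by_format(items: list) -> dict:
    """Median sold price per buy format (Auction, Buy It Now, ...)."""
    buckets = defaultdict(list)
    for item in items:
        # Skip items with a missing price or unknown format
        if item.get("sold_price") is not None and item.get("buy_format"):
            buckets[item["buy_format"]].append(item["sold_price"])
    return {
        fmt: round(statistics.median(prices), 2)
        for fmt, prices in buckets.items()
    }
```

Comparing the auction median against the fixed-price median gives a rough sense of how much buyers pay for the convenience of "Buy It Now" on a given item.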
Calculating price statistics
The sold scraper includes automatic price analysis:
def _calculate_price_stats(self, items: list) -> dict:
prices = [item["sold_price"] for item in items if item.get("sold_price")]
if not prices:
return {}
stats = {
"count": len(prices),
"min": min(prices),
"max": max(prices),
"avg": round(statistics.mean(prices), 2),
"median": round(statistics.median(prices), 2),
}
# Standard deviation requires at least 2 values
if len(prices) >= 2:
stats["std_dev"] = round(statistics.stdev(prices), 2)
# Distribution buckets for quick analysis
stats["price_ranges"] = {
"under_25": len([p for p in prices if p < 25]),
"25_to_50": len([p for p in prices if 25 <= p < 50]),
"50_to_100": len([p for p in prices if 50 <= p < 100]),
"100_to_250": len([p for p in prices if 100 <= p < 250]),
"250_to_500": len([p for p in prices if 250 <= p < 500]),
"over_500": len([p for p in prices if p >= 500]),
}
return stats
Why median over average? For pricing data, the median is usually more useful. A few unusually high or low sales (outliers) can skew the average significantly, but the median gives you the true "middle" price that most buyers paid.
Standard deviation indicates price consistency. A high value means prices vary widely (useful for identifying markets where condition or seller reputation significantly affects price), while a low value indicates stable pricing.
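You can see the outlier effect with a quick example: a single unusual sale (say, a bundle listing) shifts the mean far more than the median:

```python
import statistics

# Nine typical sales around $100, plus one outlier bundle sale at $450
prices = [100, 102, 98, 105, 99, 101, 103, 97, 100, 450]

mean_price = round(statistics.mean(prices), 2)
median_price = round(statistics.median(prices), 2)

print(mean_price)    # 135.5 -- dragged up by the single $450 sale
print(median_price)  # 100.5 -- still reflects the typical transaction
```

If you priced an item at the mean here, you'd be roughly 35% above what most buyers actually paid.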
Usage and output
Here's how to search sold listings and get automatic price statistics:
scraper = EbaySoldScraper()
results = scraper.search_sold_multiple_pages(
query="sony wh-1000xm4",
max_pages=3,
condition="used",
sort="date_desc", # Most recent sales first
)
stats = results["price_statistics"]
print(f"Analyzed {stats['count']} recent sales")
print(f"Price range: ${stats['min']:.2f} - ${stats['max']:.2f}")
print(f"Median sold price: ${stats['median']:.2f}")
Here's what the output looks like:
{
"items": [
{
"listing_id": "127632981214",
"title": "Sony WH-1000XM4 Wireless Noise Canceling Headphones",
"sold_price": 127.50,
"currency": "USD",
"sold_date": "Jan 24, 2026",
"condition": "Pre-Owned",
"buy_format": "Auction",
"bids": 12,
"shipping": "Free shipping",
"seller": {
"name": "audio_deals",
"feedback_pct": 99.8,
"feedback_count": 4520,
},
"image_url": "https://i.ebayimg.com/images/g/xxxxx/s-l500.jpg",
"item_url": "https://www.ebay.com/itm/127632981214",
}
],
"price_statistics": {
"count": 59,
"min": 45.00,
"max": 189.99,
"avg": 112.34,
"median": 108.50,
"std_dev": 32.15,
"price_ranges": {
"under_25": 0,
"25_to_50": 3,
"50_to_100": 22,
"100_to_250": 34,
"250_to_500": 0,
"over_500": 0,
},
},
"total_results": 59,
"pages_scraped": 3,
"error": null,
}
The buy_format field indicates how the item sold: "Auction", "Buy It Now", or "Best Offer Accepted". The bids field is only populated for auction sales.
Scrape eBay seller profiles and reviews
Seller data adds important context to pricing research. A product at $50 from a seller with 99.8% positive feedback over 10,000 transactions is different from the same product at $45 from someone with 12 reviews.
A note on seller pages: These are more protected than search results. SeleniumBase's CAPTCHA handling helps, and for larger volumes, rotating residential proxies can make a noticeable difference.

What you can do with this data
Seller data enables several practical applications:
-
Assess seller reliability: check feedback percentage and detailed ratings before buying
-
Identify trusted sellers: find top-rated sellers with high volume and consistent ratings
-
Analyze competitor sellers: understand their feedback patterns and common complaints
-
Monitor seller reputation: track rating changes over time for specific sellers
Getting seller info from a listing
The most reliable approach is to extract seller data from a product listing page rather than the seller's profile page directly. Listing pages have lighter anti-bot protection and contain seller info, detailed ratings, and recent reviews. The code below is simplified for clarity. The full source includes CAPTCHA handling and multiple fallback selectors.
def get_seller_from_listing(self, item_id: str) -> dict:
url = f"https://www.ebay.com/itm/{item_id}"
result = {
"item_id": item_id,
"url": url,
"seller": {},
"detailed_ratings": {},
"reviews": [],
"total_feedback_count": None,
"error": None,
}
try:
with SB(
uc=True,
headless=self.headless,
incognito=True,
xvfb=self.xvfb,
proxy=self.proxy,
) as sb:
sb.uc_open_with_reconnect(url, reconnect_time=4)
random_delay()
page_source = sb.get_page_source()
seller_data = self._parse_seller_from_listing(page_source)
result["seller"] = seller_data
result["detailed_ratings"] = seller_data.pop("detailed_ratings", {})
result["reviews"] = seller_data.pop("reviews", [])
return result
except Exception as e:
result["error"] = str(e)
return result
Parsing seller details
The listing page contains seller info in several sections. The scraper extracts from the store information section and falls back to alternative selectors:
def _parse_seller_from_listing(self, html: str) -> dict:
soup = BeautifulSoup(html, "html.parser")
seller = {}
# Primary: store information section
store_info = soup.select_one(".x-store-information")
if store_info:
store_name = store_info.select_one(".x-store-information__store-name a")
if store_name:
seller["username"] = store_name.get_text(strip=True)
seller["store_url"] = store_name.get("href")
# Parse "99.3% positive feedback • 804K items sold"
highlights = store_info.select_one(".x-store-information__highlights")
if highlights:
text = highlights.get_text(strip=True)
pct_match = re.search(r"([\d.]+)%\s*positive", text, re.IGNORECASE)
if pct_match:
seller["positive_feedback_pct"] = float(pct_match.group(1))
items_match = re.search(
r"([\d.]+[KMB]?)\s*items?\s*sold", text, re.IGNORECASE
)
if items_match:
seller["items_sold"] = self._parse_abbreviated_number(
items_match.group(1)
)
return seller
The abbreviated number problem: eBay displays "4.4K items sold" rather than "4,400 items sold" for readability. A helper function converts these:
def _parse_abbreviated_number(self, text: str) -> int:
"""Convert '4.4K' to 4400, '1.2M' to 1200000, etc."""
text = text.strip().upper()
multipliers = {"K": 1000, "M": 1000000, "B": 1000000000}
for suffix, mult in multipliers.items():
if suffix in text:
return int(float(text.replace(suffix, "")) * mult)
return int(float(text.replace(",", "")))
Extracting detailed seller ratings
eBay tracks four specific metrics with 1 to 5 star ratings: accurate description, shipping cost, shipping speed, and communication. These provide more nuance than the overall feedback percentage. A seller might have 99% positive feedback but a 4.2 star shipping speed rating, indicating a pattern of slow deliveries.
# Detailed ratings section
ratings_section = soup.select_one(
".fdbk-detail-seller-rating, [data-testid='seller-rating']"
)
if ratings_section:
rating_items = ratings_section.select(".fdbk-detail-seller-rating__item")
for item in rating_items:
label_elem = item.select_one(".fdbk-detail-seller-rating__label")
value_elem = item.select_one(".fdbk-detail-seller-rating__value")
if label_elem and value_elem:
label = label_elem.get_text(strip=True).lower()
value = float(value_elem.get_text(strip=True))
if "description" in label:
seller["detailed_ratings"]["accurate_description"] = value
elif "shipping cost" in label:
seller["detailed_ratings"]["shipping_cost"] = value
elif "shipping speed" in label:
seller["detailed_ratings"]["shipping_speed"] = value
elif "communication" in label:
seller["detailed_ratings"]["communication"] = value
Filtering feedback
The get_feedback_page() method supports several filters for targeted analysis:
# Get only negative reviews
result = scraper.get_feedback_page(
username="seller_name",
pages=3,
rating_type="negative", # "all", "positive", "negative", "neutral"
)
# Filter by topic
result = scraper.get_feedback_page(
username="seller_name",
topic="shipping", # quality, value, shipping, description, etc.
)
# Only reviews with photos
result = scraper.get_feedback_page(username="seller_name", photos_only=True)
Filtering for negative reviews is useful for identifying recurring issues before purchasing from a seller.
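One way to surface those recurring issues automatically is a keyword tally over negative review comments. A sketch, assuming review dicts with a `comment` field as in the output shown earlier; the keyword list is illustrative and worth tuning per category:

```python
from collections import Counter

# Illustrative complaint keywords; adjust for the product category
COMPLAINT_KEYWORDS = ["shipping", "broken", "fake", "late", "refund", "damaged"]

def tally_complaints(reviews: list) -> Counter:
    """Count complaint keywords across review comments to spot recurring issues."""
    counts = Counter()
    for review in reviews:
        comment = (review.get("comment") or "").lower()
        for keyword in COMPLAINT_KEYWORDS:
            if keyword in comment:
                counts[keyword] += 1
    return counts
```

A seller whose negative reviews cluster on "shipping" and "late" is a very different risk from one whose cluster is "fake" and "broken".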
Usage and output
Here's how to get seller data from a listing:
scraper = EbaySellerScraper()
# Works with both item IDs and full URLs
result = scraper.get_seller_from_listing("315166443569")
result = scraper.get_seller_from_listing("https://www.ebay.com/itm/315166443569")
print(f"Seller: {result['seller']['username']}")
print(f"Feedback: {result['seller']['positive_feedback_pct']}%")
print(f"Items sold: {result['seller']['items_sold']}")
Here's what the output looks like:
{
"item_id": "315166443569",
"url": "https://www.ebay.com/itm/315166443569",
"seller": {
"username": "Topmate Official",
"store_url": "https://www.ebay.com/str/topmateofficial",
"positive_feedback_pct": 98.8,
"items_sold": 4400,
"member_since": "Feb 2022",
"feedback_score": 992,
"location": "Baldwin Park, CA, United States",
"is_top_rated": false,
},
"detailed_ratings": {
"accurate_description": 4.9,
"shipping_cost": 5.0,
"shipping_speed": 5.0,
"communication": 5.0,
},
"reviews": [
{
"comment": "The wireless keyboard and mouse work great. Both connected fast and feel solid.",
"rating": "positive",
"time_period": "Past 6 months",
"verified_purchase": true,
}
],
"total_feedback_count": 992,
"error": null,
}
Note: The scraper extracts recent reviews shown on the listing page (~5 to 10 reviews). For more reviews, use get_feedback_page() with the seller's username.
The seller scraper also provides:
-
get_seller_profile(username): scrape the seller's profile page directly
-
get_feedback_page(username, pages=3): scrape paginated feedback with filters
-
search_seller_items(username): get the seller's current listings
Further reading: What Is a Forward Proxy? Forward vs Reverse Proxy, and What It’s Used For and 8 Best Proxies for AI Tools and Scalable Data Collection in 2026.
Scraping eBay reliably at scale
The scrapers above handle typical use cases without modification. If you're planning to scrape more than a few dozen pages, these techniques will help you do it reliably.
Rate limiting and delays
Anti-bot systems track request frequency. Too fast, and you'll trigger CAPTCHA or blocks; too slow, and you waste time. The optimal range for eBay is typically 2 to 5 seconds between requests.
The scrapers in this guide already include randomized delays:
import random
import time
def random_delay(min_sec: float = 2.0, max_sec: float = 5.0):
"""Pause for a random interval to appear more human-like."""
delay = random.uniform(min_sec, max_sec)
time.sleep(delay)
return delay
Random delays work better than fixed ones. Predictable intervals are a bot fingerprint. Varying your timing mimics natural browsing patterns.
Scaling up: when to add proxies
Your IP has a reputation score. As you make more requests, that score degrades until you start seeing CAPTCHAs or blocks. Block thresholds vary significantly depending on request frequency, IP reputation, and browsing behavior, so there is no consistent page limit per IP.
Residential proxies solve this by distributing requests across many IPs, so no single IP accumulates enough requests to trigger limits. They're the standard choice for eCommerce sites like eBay because they use real ISP addresses that blend in with normal traffic.
When to add proxies:
-
Under 50 pages: Your IP is usually fine
-
50 to 500 pages: Residential proxies recommended
-
500+ pages/day: Residential proxies + rotation
Live Proxies provides proxies from your dashboard in this format: IP:PORT:USERNAME-ACCESS_CODE-SID:PASSWORD
To use it with SeleniumBase, restructure it as username:password@server:port, without an http:// prefix (the SeleniumBase docs explicitly say not to include one):
# Dashboard format: 45.127.248.131:7383:LV71125532-mDmfksl3onyoy-1:bW2VN4Zc5YSyK5nF82tK
# Restructure for SeleniumBase:
PROXY_URL = "LV71125532-mDmfksl3onyoy-1:[email protected]:7383"
with SB(uc=True, incognito=True, proxy=PROXY_URL) as sb:
sb.uc_open_with_reconnect(url, reconnect_time=3)
The session ID (-1, -2, etc.) in the username gives you sticky sessions with the same IP for up to 24 hours, which helps maintain consistent behavior during a scraping session. Remove the session ID suffix for rotating sessions (new IP each request).
Rotating proxies automatically cycle through IPs, so each request appears to come from a different user. Live Proxies also offers private IP allocation, which can reduce overlap with other users, but IP reputation may still be affected by prior activity or target-site detection mechanisms.
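Putting the two session styles together, you can pre-build a pool of sticky-session credentials and hand a different one to each scraping task. A sketch assuming the Live Proxies format above; the credentials here are placeholders, not real values:

```python
import itertools

def sticky_proxy_pool(user: str, password: str, server: str, port: int,
                      sessions: int = 5):
    """Yield SeleniumBase proxy strings, cycling through numbered sticky sessions."""
    sid_cycle = itertools.cycle(range(1, sessions + 1))
    while True:
        sid = next(sid_cycle)
        # Appending "-<sid>" to the username pins each string to its own sticky IP
        yield f"{user}-{sid}:{password}@{server}:{port}"

# Placeholder credentials: pass next(pool) as the proxy= argument per task
pool = sticky_proxy_pool("LV71125532-ACCESSCODE", "PASSWORD",
                         "45.127.248.131", 7383)
```

Each worker keeps its own IP for the life of its session, while the pool as a whole spreads load across several addresses.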
Scaling beyond 100 pages? Check out our residential proxy plans: activation takes ~10 minutes.
Retry logic with exponential backoff
When requests fail, retrying immediately usually fails too. Your IP is still flagged. Exponential backoff gives the rate limiter time to reset:
from functools import wraps
def retry_on_failure(max_retries: int = 3, initial_delay: float = 2.0):
"""Decorator that retries failed functions with exponential backoff."""
def decorator(func):
@wraps(func)
def wrapper(*args, **kwargs):
delay = initial_delay
for attempt in range(max_retries + 1):
try:
return func(*args, **kwargs)
except Exception as e:
if attempt < max_retries:
logger.warning(
f"Attempt {attempt + 1} failed, retrying in {delay}s..."
)
time.sleep(delay)
delay *= 2 # Double the delay each time
else:
raise
return wrapper
return decorator
How it works: First retry waits 2 seconds, second waits 4, third waits 8. This pattern is standard in production systems. It's how well-behaved clients handle temporary failures without hammering servers.
@retry_on_failure(max_retries=2, initial_delay=3.0)
def _fetch_and_parse(self, url: str) -> dict:
# ... scraping logic ...
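One refinement worth considering: adding random jitter to each wait spreads retries out, so multiple workers that failed at the same moment don't all retry in lockstep. A sketch of the resulting schedule (the function is illustrative, not part of the scrapers above):

```python
import random

def backoff_schedule(initial: float = 2.0, retries: int = 3,
                     jitter: float = 1.0) -> list:
    """Compute exponential backoff delays with random jitter added to each wait."""
    delays = []
    delay = initial
    for _ in range(retries):
        delays.append(round(delay + random.uniform(0, jitter), 2))
        delay *= 2  # double the base delay each attempt
    return delays
```

With the defaults, the waits land somewhere in 2-3s, then 4-5s, then 8-9s, rather than at exactly 2, 4, and 8 seconds for every client.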
Conclusion
You now have 4 working scrapers for eBay: search results, item details, sold prices, and seller profiles.
The patterns here (SeleniumBase UC mode, random delays, fallback selectors) work beyond eBay too. Most eCommerce sites use similar protections.
For small projects, the default settings are sufficient. When you're ready to scale, residential proxies help maintain consistent success rates.
Get started with Live Proxies →
FAQs
How do I find the eBay item ID and why does it matter?
It's the 12-digit number in the listing URL: for example, ebay.com/itm/315166443569. On the listing page, you'll find it in the "About this item" section. The scrapers accept either the full URL or just the ID.
Why does my scraper return different prices than what I see in the browser?
Most commonly, location-based display: eBay shows different prices, currencies, and VAT based on your IP location. If your scraper runs from a different IP than your browser, prices will differ. Other causes: JavaScript-loaded content not rendering fully, or shipping displayed differently. To diagnose, run the scraper and browser from the same IP at the same time.
How do I scrape eBay safely on a small scale for a personal project?
Use the scrapers with default settings: they include 2 to 5 second random delays. Avoid headless mode (it's detectable). For very light use, you may not need proxies immediately, but there's no guaranteed "safe" threshold: eBay's detection depends on your IP reputation, request patterns, and timing. If you start seeing CAPTCHAs, add proxies or reduce frequency.
How do I keep my eBay scraper stable when page layouts change?
The item scraper prioritizes JSON-LD data (eBay's structured data), which is separate from HTML and doesn't change when visual layouts update. The search scraper includes fallback selectors for both new and legacy layouts. Monitor your output regularly: empty results or missing fields could signal layout changes, blocks, or other issues.