Full-site crawlers, anti-bot scraping APIs, browser automation runtimes, no-code monitors, parser libraries, and LLM-ready extraction tools for AI, research, and data pipelines.
20 tools found
40K+ GitHub stars, 80K+ companies
LLM-first crawl and scrape API for turning pages or full sites into markdown, JSON, screenshots, and mapped URLs with managed rendering and agent workflows.
Scrapes millions of pages daily
Open-source crawling framework for JavaScript and Python that combines request orchestration, queueing, proxies, and browser automation for reliable scraper development.
Trusted by Microsoft, McKinsey & Accenture
Actor-based cloud platform for running scrapers, browser automation jobs, and website content crawlers with built-in datasets, scheduling, storage, proxies, and AI integrations.
Trusted by 20,000+ customers globally
Managed web-data platform combining proxy infrastructure, browser and crawl APIs, anti-bot unblocking, ready datasets, and enterprise web-data delivery at scale.
Popular scraping service
Managed scraping API that handles headless browsers, proxy rotation, JavaScript rendering, screenshots, and anti-bot bypass so teams can focus solely on extraction.
Widely adopted automation tool
Headless Chrome automation library for scripted browsing, rendering, screenshots, PDFs, and custom scraping workflows in JavaScript environments.
Used by 35K+ companies
Cross-browser automation runtime for scripted interaction, rendering, screenshots, testing, and custom extraction flows in modern engineering stacks.
Leading web scraping framework
Battle-tested Python crawling framework for building large scraping jobs, request pipelines, and repeatable extractors with full control over the crawl stack.
Trusted by 150+ enterprise customers
jQuery-style HTML parser for Node.js that extracts and transforms page markup when retrieval is handled elsewhere.
Popular open-source tool
Lightweight reader API that converts any reachable URL into LLM-friendly markdown or JSON for agent prompts, retrieval, and downstream AI workflows.
AI-powered scraping solution
Prompt-driven extraction toolkit for using LLMs to pull structured data from pages without manual selector authoring, with options for code or API workflows.
Used by 42 Fortune 500 companies
AI-powered extraction platform for turning messy web pages into normalized entities, structured records, and knowledge-graph-style web data feeds.
4.5M+ users since 2016
No-code scraping platform with desktop and cloud execution, auto-detect workflows, templates, scheduling, and exports for business data collection at scale.
Popular web scraping tool
Visual scraping tool for dynamic websites that uses browser rendering, click workflows, and scheduled runs to export structured data without custom code.
Industry-standard automation tool
Multi-language browser automation framework for scripted interaction, UI testing, and custom browsing flows across the major browser engines.
Used by 1,983+ companies worldwide
Python parsing library for turning raw HTML and XML into navigable document trees when you already control fetching or crawling upstream.
1M+ users trust it
Chrome extension for quickly extracting tabular data from pages into spreadsheets without writing code or setting up a crawler stack.
9,000+ GitHub stars
No-code scraping and monitoring platform for point-and-click robots, deep scraping workflows, scheduled alerts, and turning websites into APIs or live datasets.
Enterprise scraping solution
All-in-one web scraping platform that combines automated unblocking, headless rendering, AI extraction, proxy intelligence, and managed, compliance-minded data collection.
Modern web scraping platform
Rust-powered crawler built for LLM data pipelines, large site traversal, and extraction workflows that output clean content and structured crawl results.