WebScraper class

A web scraper that uses proxies to avoid detection and blocking

Available extensions

WebScraperExtension

Constructors

WebScraper({required ProxyManager proxyManager, ProxyHttpClient? httpClient, String? defaultUserAgent, Map<String, String>? defaultHeaders, int defaultTimeout = 30000, int maxRetries = 3, AdaptiveScrapingStrategy? adaptiveStrategy, SiteReputationTracker? reputationTracker, ScrapingLogger? logger})
Creates a new WebScraper. Only proxyManager is required; defaultTimeout defaults to 30000 and maxRetries to 3.

Properties

hashCode int
The hash code for this object.
no setter, inherited
logger ScrapingLogger
Gets the scraping logger
no setter
proxyManager ProxyManager
The proxy manager for getting proxies
final
reputationTracker SiteReputationTracker
Gets the site reputation tracker
no setter
runtimeType Type
A representation of the runtime type of the object.
no setter, inherited

Methods

close() → void
Closes the HTTP client
extractData({required String html, required String selector, String? attribute, bool asText = true}) → List<String>
Parses HTML content and extracts data using CSS selectors
extractStructuredData({required String html, required Map<String, String> selectors, Map<String, String?>? attributes}) → List<Map<String, String>>
Parses HTML content and extracts structured data using CSS selectors
fetchFromProblematicSite({required String url, Map<String, String>? headers, int? timeout = 60000, int? retries = 5}) → Future<String>
Fetches HTML content from a problematic site using specialized techniques
Available on WebScraper, provided by the WebScraperExtension extension
fetchHtml({required String url, Map<String, String>? headers, int? timeout, int? retries}) → Future<String>
Fetches HTML content from the given URL
fetchHtmlWithRetry({required String url, Map<String, String>? headers, int? timeout, int? retries, int initialBackoffMs = 500, double backoffMultiplier = 1.5, int maxBackoffMs = 10000}) → Future<String>
Fetches HTML content with enhanced error handling and retry logic
Available on WebScraper, provided by the WebScraperExtension extension
fetchJson({required String url, Map<String, String>? headers, int? timeout, int? retries}) → Future<Map<String, dynamic>>
Fetches JSON content from the given URL
noSuchMethod(Invocation invocation) → dynamic
Invoked when a nonexistent method or property is accessed.
inherited
toString() → String
A string representation of this object.
inherited
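
The backoff parameters of fetchHtmlWithRetry (initialBackoffMs = 500, backoffMultiplier = 1.5, maxBackoffMs = 10000) suggest a capped exponential delay between attempts. The sketch below shows what such a schedule could look like. It is an illustration under assumptions, not the package's implementation: backoffScheduleMs is a hypothetical helper, the retry count of 3 is borrowed from the constructor's maxRetries default, and the actual method may add jitter or compute delays differently.

```dart
import 'dart:math' as math;

/// Hypothetical helper, not part of the WebScraper API.
/// Returns the delay (in ms) to wait before each retry, assuming plain
/// exponential backoff capped at [maxBackoffMs].
List<int> backoffScheduleMs({
  int retries = 3, // assumption: mirrors the constructor's maxRetries default
  int initialBackoffMs = 500,
  double backoffMultiplier = 1.5,
  int maxBackoffMs = 10000,
}) {
  final delays = <int>[];
  var delay = initialBackoffMs.toDouble();
  for (var attempt = 0; attempt < retries; attempt++) {
    // Clamp to the ceiling so long retry chains never wait longer than the cap.
    delays.add(math.min(delay, maxBackoffMs.toDouble()).round());
    // Grow the delay geometrically for the next attempt.
    delay *= backoffMultiplier;
  }
  return delays;
}
```

With the defaults shown, this yields delays of 500, 750, and 1125 ms. Once the uncapped delay exceeds maxBackoffMs (from the ninth retry onward with these values), every further delay stays at 10000 ms.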

Operators

operator ==(Object other) → bool
The equality operator.
inherited