Web Scraper Firefox Extension

Your web browser sends what is known as a “User Agent” with every page request. This string tells the server what kind of browser and device you are accessing the page with. Here are some common User Agent strings:

Browser                               User Agent
------------------------------------  ----------------------------------------
Firefox on Windows XP                 Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6
Chrome on Linux                       Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/534.3 (KHTML, like Gecko) Chrome/6.0.472.63 Safari/534.3
Internet Explorer on Windows Vista    Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)
Opera on Windows Vista                Opera/9.00 (Windows NT 5.1; U; en)
Android                               Mozilla/5.0 (Linux; U; Android 0.5; en-us) AppleWebKit/522+ (KHTML, like Gecko) Safari/419.3
iPhone                                Mozilla/5.0 (iPhone; U; CPU like Mac OS X; en) AppleWebKit/420+ (KHTML, like Gecko) Version/3.0 Mobile/1A543a Safari/419.3
Blackberry                            Mozilla/5.0 (BlackBerry; U; BlackBerry 9800; en) AppleWebKit/534.1+ (KHTML, Like Gecko) Version/6.0.0.141 Mobile Safari/534.1+
Python urllib                         Python-urllib/2.1
Old Google Bot                        Googlebot/2.1 (+http://www.googlebot.com/bot.html)
New Google Bot                        Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
MSN Bot                               msnbot/1.1 (+http://search.msn.com/msnbot.htm)
Yahoo Bot                             Yahoo! Slurp/Site Explorer
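
To see where a string like the "Python urllib" entry comes from, the following minimal sketch reads the User Agent that Python's standard HTTP client sends when you do not set one. It assumes Python 3, where urllib2 has become urllib.request; the helper name default_user_agent is only illustrative.

```python
import urllib.request


def default_user_agent():
    """Return the User Agent string that urllib sends when none is set."""
    opener = urllib.request.build_opener()
    # A fresh opener carries one default header: ('User-agent', 'Python-urllib/X.Y')
    headers = dict(opener.addheaders)
    return headers["User-agent"]


if __name__ == "__main__":
    print(default_user_agent())
```

The version number after the slash follows the Python version in use, so the exact value will differ from the one listed above.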

Webscraper.io is a web scraping tool provider with a Chrome browser extension and a Firefox add-on. The Chrome extension has over 300,000 downloads and strong user reviews in the store. Web Scraper is a browser extension and a library built for extracting data from web pages. With it you create a plan (a sitemap) that describes how a website should be traversed and what should be extracted. Web Scraper then navigates the site according to the sitemap and extracts the data.

You can find your own current User Agent here.

Some webpages use the User Agent to display content that is customized to your particular browser. For example, if your User Agent indicates that you are using an old browser, the website may return a plain HTML version without any AJAX features, which may be easier to scrape.

Some websites automatically block certain User Agents, for example when your User Agent indicates that you are accessing their server with a script rather than a regular web browser.
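
How a server decides what to block is specific to each site, but a simple version of such a check might look like the sketch below. The marker list and the function name looks_like_script are assumptions made for illustration and are not taken from any real website.

```python
# Hypothetical substrings a site might treat as signs of a script rather than a browser.
SCRIPT_MARKERS = ("python-urllib", "curl", "wget")


def looks_like_script(user_agent):
    """Return True if the User Agent is empty or contains a known script marker."""
    if not user_agent:
        return True
    lowered = user_agent.lower()
    return any(marker in lowered for marker in SCRIPT_MARKERS)


if __name__ == "__main__":
    print(looks_like_script("Python-urllib/2.1"))
    print(looks_like_script("Opera/9.00 (Windows NT 5.1; U; en)"))
```

A check this simple only inspects the string the client chooses to send, which is why changing the User Agent is often enough to get past it.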

Fortunately it is easy to set your User Agent to whatever you like:

  • For Firefox you can use the User Agent Switcher extension.
  • For Chrome there is currently no extension, but you can set the User Agent from the command line at startup: chromium-browser --user-agent="my custom user agent"
  • For Internet Explorer you can use the UAPick extension.
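  • And for Python scripts you can set the User Agent header with:

    import urllib2

    opener = urllib2.build_opener()
    # Replace the default Python-urllib User Agent with a custom one
    opener.addheaders = [('User-agent', 'my custom user agent')]
    response = opener.open('http://www.google.com')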
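
In Python 3, where urllib2 has become urllib.request, the User Agent can also be attached to an individual request. The sketch below is one way to do that under that assumption; build_request is a hypothetical helper name, and the example only builds the request without sending it.

```python
import urllib.request


def build_request(url, user_agent):
    """Create a request object that carries a custom User Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    req = build_request("http://www.google.com", "my custom user agent")
    # urllib stores header names capitalized as 'User-agent'
    print(req.get_header("User-agent"))
    # To actually send the request: urllib.request.urlopen(req).read()
```

Setting the header per request rather than on a shared opener makes it easy to use different User Agents for different sites.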

Using the default User Agent for your scraper is a common reason to be blocked, so don’t forget to set it.
