Vincent Cadoret Revit & BIM Specialist

5 Apr 2010

Free and simple web scraping

Posted by Vincent

What is web scraping? It's the process of automatically navigating through websites to extract relevant data. Some people use it on simple sites, but it's mostly a tool for power users.

If you search Google for "web scraping software" you will find a ton of products from different companies. The best and most flexible one I have found so far is the iMacros plugin for Firefox. It lets you script pages using JavaScript and simplifies the task of extracting text from them.
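To give a feel for what "extracting the text from the pages" means, here is a minimal sketch. It is not iMacros and does not use its scripting interface; it is plain Python with only the standard library, and the sample page, the `TextExtractor` class and the `extract_text` helper are all made up for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text chunks, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the non-empty text chunks found in an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


# Hypothetical sample page, standing in for something downloaded from a site.
SAMPLE_HTML = """
<html><head><title>Price list</title><script>var x = 1;</script></head>
<body><h1>Widgets</h1><p>Blue widget: <b>4.50</b></p></body></html>
"""

if __name__ == "__main__":
    print(extract_text(SAMPLE_HTML))
```

A real scraper would loop over many pages and pick out specific elements instead of grabbing every text chunk, which is the part a tool like iMacros takes care of inside the browser.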

Because it runs in Firefox, you can also easily route it through a proxy, which is useful if you are afraid a site may block you for scanning it too often.
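With iMacros the proxy is simply whatever Firefox is configured to use. If you were scripting the requests yourself instead, the same idea might look like the sketch below. It uses Python's standard `urllib.request`; the helper names `build_proxy_opener` and `fetch` are my own, and the proxy address in the usage note is a placeholder, not a real server.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that sends HTTP and HTTPS requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch(opener, url, timeout=10):
    """Download a page through the given opener and return it as text."""
    with opener.open(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

You would call it along the lines of `fetch(build_proxy_opener("http://127.0.0.1:8080"), "http://example.com/")`, with your own proxy address swapped in. Pausing between requests is also a good idea if you are worried about being blocked.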

The only issue I have run into is on Windows: when I run a script in Firefox there, what seems to be a memory leak makes it crash after a few hours. It runs fine on Mac OS X, though!

http://www.iopus.com/iMacros/firefox/