Scraping the web is a favorite pastime of many sourcers. In fact, a recent SourceCon Denver session about web scraping was one of the most popular sessions of the conference. If you’re new to web scraping, here are a few resources from the SourceCon archives to help you get started.
1) Aaron Lintz shows the SourceCon Live audience how to use Import.io
2) Glenn Gutmacher shows the SourceCon Live audience how to use Outwit Hub
3) Todd Davis’ presentation at the 2013 SourceCon Labs shows how to extract data from Facebook with Memonic
http://vimeo.com/76926819
Tools needed to do what Todd teaches in the video above:
Firefox – Chrome, Internet Explorer, and other browsers won’t allow you to select multiple lines of text from a web page. This can only be done with Firefox.
Memonic – A great web clipping tool. You will also need to install the Memonic Firefox add-on.
Firefox Add-ons – LinkedIn | Social Friend Finder (SFF) | Duck Duck Go | Google Maps (get company numbers and call)
Great information. I have used import.io. Nowadays I am writing PHP and Python scripts to scrape the web. I have created custom web scrapers for sites like Yell, Yelp, and many more.