Archive by Author

How to scrape and parse Wikipedia

Today’s exercise is to create a list of the longest and deepest caves in the UK from Wikipedia. Wikipedia pages for geographical structures often contain Infoboxes (the panel on the right-hand side of the page). My first job was to design a Template:Infobox_ukcave that was fit for purpose. Why ukcave? Well, if […]

How to get along with an ASP webpage

Fingal County Council of Ireland recently published a number of sets of Open Data in nice clean CSV, XML and KML formats. Unfortunately, the one set of Open Data that was difficult to obtain was the list of sets of Open Data itself. That’s because the list was split across four separate pages. The important thing […]

Tweeting the drilling

A very long time ago I discovered the easiest webscraping target: the locations of all the North Sea oil wells. Once you had crawled through the index pages, the entries were pretty straightforward. There were dates, water depths (in feet or metres), GPS locations and so on. The code, if you want to look at it, […]