Hacks and Hackers Hack Day Report

(Posted by Richard)

Hack Day

Last Friday we organised The Hacks and Hackers Hack Day to see what happens when you put journalists and developers in the same room and ask them to come up with a data-driven story in one day. We also wanted to see how ScraperWiki would fare being used in anger in one of the areas we designed it for – data-driven journalism*.

We were a bit worried beforehand about whether the whole event would work, whether journalists and developers – who generally have different working practices – would work well together, and whether the hack day format would translate well to journalism or not. But it turned out to be a fantastic day and a great learning experience, with 30 top developers and journalists from organisations like the BBC, Financial Times, Times Labs and mySociety, as well as freelancers and members of the ScraperWiki team.

Groups formed quickly as people started discussing their ideas, and by the end of the day we all sat down to watch presentations from nine projects, which used everything from screen-scraping to old-fashioned cold-calling to get at data and turn it into something meaningful.

Here’s a list of what the teams built – or scroll down to the bottom to see a video of who won.

The Lazy Commuter

Shiva Kumar-Naspuri from the BBC, Francis Irving of mySociety, Simon Briscoe from the Financial Times and Simon Willison (who, as you will see below, managed to clone himself for the day) of the Guardian set out to find “where the laziest people in Britain live”. Using data from data.gov.uk they mapped and profiled the places where people make the most really short journeys in the UK.

Conservative Safe Seats

Developer Edmund van der Burg, freelance journalist Anne Marie Cumiskey, Charlie Duff from HRzone.co.uk, Ian McDonald of the BBC and Dafydd Vaughn munged a whole host of datasets together to produce an analysis of the new Conservative candidates in the 12 safest Tory seats in Britain.

Their conclusions: British white and male, average age 53, Oxford-educated, rarely on Facebook or Twitter.


Data.gov.uk Format Verifier

After a few initial conversations about how journalists could work out what is and is not useful on the new data.gov.uk website, Tom Morris came up with the idea for a tool to crowdsource information on the format of the datasets available – and built the data.gov.uk format verifier.

At the time of writing, of the 2905 datasets listed on data.gov.uk, all but 405 had already been classified using the tool, and there is talk of importing the data created back into data.gov.uk!

This wasn’t remotely the kind of project we thought would come out of the hack day, but is a lovely example of what can happen when people from one discipline explain a problem to someone from another.

They Write For You

Jonathan Richards from Times Labs and developers Anna Powell-Smith, Premasagar Rose, Simon Willison and Julian Burgess took a look at which MPs write for which newspapers.

Using ScraperWiki to get at the data held in the online news archive Journalisted, together with the list of MPs from the TheyWorkForYou.com API, they came up with this visualisation:

One surprising finding was that the Guardian has twice as many articles by MPs as any other newspaper.

Lords Interests

Rob McKinnon from Who’s Lobbying compared which members claim what from the House of Lords.

City Hall Goes to Lunch

For my own little hack, I started thinking what I could do with a dataset I’d recently added to ScraperWiki that listed gifts and freebies that the Mayor of London and London Assembly members have received.

With a bit of data jiggling, I worked out that you can map where they have been taken out to dinner, as well as calculate the porkiest AM. (More on this in a future blog post.)

Who Pays Who (Enterprise Ireland)

Gavin Sheridan from TheStory.ie and Duncan Parkes of mySociety used ScraperWiki to combine a list of grants made by Enterprise Ireland (which Gavin had acquired via an FOI request) with the profile data listed on the Enterprise Ireland website. This will no doubt be a source for stories in the near future.


Writer and designer David McCandless and Simon Willison from the Guardian set about turning a year’s worth of horoscopes, screen-scraped from the Yahoo astrology website, into a beautiful, tongue-firmly-in-cheek visualisation that shows the bunkum of star signs in their full glory.

Bank Holidays

As the presentations were taking place, there was a last-minute hack from Julian Burgess and Chris Taggart in the form of a reusable screen-scraper for grabbing the official list of UK bank holidays (the government provides these on a web page hidden away on direct.gov.uk).

The winners…

A huge thanks to Tom Loosemore from 4iP and Tom Steinberg from mySociety for judging the projects. Here’s the video of them announcing the winners, along with the winning presentations.

All in all it was a great day, and the quality of the projects was first-rate given the limited time. We’d love to organise another Hacks and Hackers Hack Day, so please get in touch with us if you are interested!

* ScraperWiki wasn’t compulsory, but it was lovely to see it used in some of the hacks and we got some great feedback.
