Conquering Copyright and Scaling Open Data Projects – How Chris Taggart is Counting Culture

Chris Taggart is a founder of OpenlyLocal and OpenCorporates. He says “When people ask what I do I say I open up data, sometimes whether people like it or not.” In the beginning he didn’t really expect much to come of his first scrapers “other than maybe being told off by the councils, because all the councils at that time had got things on their website saying this is copyright”.

He did it anyway with a very profound outcome:

I expected them to send me a take down notice … actually that didn’t happen. What did happen is that a couple of councils contacted us and said we like what you’re doing, will you start scraping us.

His first success spurred him on to an even more ambitious project: corporate data. He knew he’d be looking at a vast array of sources scattered across the web, in different languages and formats, so he put a call out on ScraperWiki for OpenCorporates. It currently has information on 22 million companies across 28 jurisdictions. And it’s an alpha! I caught up with him on Skype to find out what he’s learnt about conquering copyright and scaling open data projects.

Tags: chris taggart, opencorporates, OpenlyLocal, video

One Response to “Conquering Copyright and Scaling Open Data Projects – How Chris Taggart is Counting Culture”

Henare Degan September 16, 2011 at 10:20 am #

Just so you know you’re not alone Chris, I can echo your experiences with what we’ve found here in Australia.

As far as I know the government doesn’t have a list of all planning authorities or even councils in Australia – we had to crowd source it.

When we set up OpenAustralia.org the volunteers put in a lot of time ensuring we got permission to republish the Crown Copyrighted Hansard.

With PlanningAlerts we took the “ask for forgiveness rather than permission” approach and where did it get us? We regularly get councils emailing us asking if they could please be added to PlanningAlerts (i.e. scraped).

Not only that but a number of councils around Australia now embed a Google Map from PlanningAlerts on their site. We love that they’re doing this but it does mean we’re in the perplexing situation of scraping copyrighted data from a council website, republishing it using an open API, which the councils then use to put on their website!
