data-driven london week

[Image: view of the Shard from Shoreditch]

Most mornings this week, I awoke in the mystical land of Hackney, and battled hordes of hipster-cyclists to make my way to the Google Campus – a refuge of data-folk. At least, that’s how I like to remember it.

As I blogged last week, several ScraperWikians attended and spoke at a range of events, all put on to the tune of “Big Data.” I spent Monday evening with a friendly meetup group talking about the importance of data in marketing. And on Wednesday, I watched a very smart presentation by Thomas Stone (hopefully, soon-to-be Dr. Stone) from prediction.io, which looks to be an interesting, open-source project for developers to call upon machine learning without the need for proprietary lock-in.

Alongside Stone, I also learned about Games Analytics from their COO, Mark Robinson. The gist of the talk was that games – particularly online games – give their producers the chance to deeply understand how players actually use their product. Through continuous contact with the players, they can learn what stops them from playing, where they find it difficult to continue, how many times they log in before purchasing… What I liked about this was the lack of hand-wavy discussion about “data leading to insights.” Instead, Robinson’s talk focused on how this data can lead to quite practical decisions, such as making levels of a game quicker at the start, reducing the cost in places, and increasing it in others.

Between those two events, I had the tremendous privilege of joining around 120 others for the W3C’s Open Data on the Web. The remarkable brain-power per square inch at the workshop was mentioned quite a few times, and – although I tend to feel disinclined to perpetuate that kind of talk – I must agree. The Campus hosted architects, businesspeople, developers, hackers and scientists, from government bodies, universities, NGOs and foundations mixed with large companies (including IBM, Adobe, Tesco and Google).

I was particularly drawn to discussions about building and growing businesses on data. I’m intrigued by the use of open data to augment private data, and I think ScraperWiki is well-placed to work on it – for example, taking aggregated customer data and matching it with government stats, open geographic data, public social media, etc. I’ve got a few ideas for some tooling for the new ScraperWiki platform, which I’d like to explore in a few weeks.
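To make the "augmenting private data with open data" idea a little more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration only: the district codes, the order counts, the population figures and the `augment` helper are all made up, and this is not how ScraperWiki's tooling works.

```python
# Hypothetical private data: aggregated customer orders per postcode district.
customer_orders = [
    {"district": "E8", "orders": 120},
    {"district": "EC2A", "orders": 340},
    {"district": "N1", "orders": 95},
]

# Hypothetical open data: population per district (made-up numbers,
# standing in for a government statistics release).
open_population = {"E8": 40000, "EC2A": 5000}


def augment(private_rows, open_stats):
    """Match each private row to the open dataset on the district key.

    Rows with no match in the open data are kept, with None for the
    derived fields, so that no private data is silently dropped.
    """
    enriched_rows = []
    for row in private_rows:
        population = open_stats.get(row["district"])
        enriched = dict(row)  # copy, so the input rows are not modified
        enriched["population"] = population
        enriched["orders_per_1000"] = (
            round(row["orders"] / population * 1000, 2) if population else None
        )
        enriched_rows.append(enriched)
    return enriched_rows


for row in augment(customer_orders, open_population):
    print(row)
```

The interesting design question is what to do with rows that do not match. Here they are kept rather than dropped, which is roughly what a left join does.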

I don’t feel there is enough space here to do proper justice to the topics covered, but suffice it to say I’m glad I had a chance to go, and was able to take part in the afternoon’s Barcamp (our team discussed applying the recent revolution in distributed coding workflows to data handling – in other words, GitHub for data).

I would also like to point out a few of the sessions, and recommend the papers to read:

I don’t have a link yet to Tesco’s talk (just the abstract) about their huge sets of data (product, customers, locations, journeys…), but if anyone has one, or as soon as I find it, I’ll put it here!
