What’s occurin’? Loads, in fact, at our first Welsh Hacks and Hackers Hack Day! From schools from space to catering colleges with a Food Safety Standard of 2, we had an amazing day. Check out the video by Gavin Owen. We had five teams: Co-Ordnance – This project aimed to be a local business tracker. […]
600 Lines of Code, 748 Revisions = A Load of Bubbles
When Channel 4’s Dispatches came across 1,100 pages of PDFs, known as the National Asset Register, they knew they had a problem on their hands. All that data, caged in a pixelated prison. So ScraperWiki let loose ‘The Julian’. What ‘The Stig’ is to Top Gear, ‘The Julian’ is to ScraperWiki. That and our CTO. […]
Spot and Normalize Inconsistent Measures
Here’s an example of why you have to be very careful when scraping, and why your normal run-of-the-mill technology that makes assumptions won’t cut it: One of our super-users, Julian Todd, decided to scrape the Vehicle Certification Agency (VCA) website on new car fuel consumption and exhaust emissions figures. And he spotted this: And another […]
US visa lottery winners statistics – not just the numbers
The ScraperWiki community has a mish-mash of user interests, so we have a mish-mash of data, scrapers and views. It’s actually quite fun to spend time looking around, to see what people have done and how they have approached a scrape. Samuel Chinweoke Nwaobia found some data on US visa lottery winners (or Green Card […]
Hacks/Hackers London
First of all, the Iraq War Logs. Round One – The Cleaning: documents, records and words, all hugely intimidating in their vastness. Some tools that help are MySQL, UltraEdit and Google Refine, but this stage is incredibly frustrating. Round Two – The Problem: how do you tackle the types of documents? There was even […]
The Web Data Revolution – a new future for journalism
This event is hosted by The Guardian. They say: “The web not only gives easy access to billions of statistics on every matter – from MP’s expenses to the location of every public convenience in the UK – but also provides the tools to visualise said information, giving a clarity of voice and an equality of access […]