I went to Datakind’s New York Datadive last November and met the Microfinance Information Exchange (MIX), a group that ‘delivers data services, analysis, research and business information on the institutions that provide financial services to the world’s poor’. They wanted to see whether web-scraping could save them from manually gathering data. So fellow divers and I showed MIX the utility […]
5 yr old goes ‘potty’ at Devon and Somerset Fire Service (Emergencies and Data Driven Stories)
It’s 9:54am in Torquay on a Wednesday morning: One appliance from Torquay’s fire station was mobilised to reports of a child with a potty seat stuck on its head. On arrival an undistressed two year old female was discovered with a toilet seat stuck on her head. Crews used vaseline and the finger kit to remove the […]
Handling exceptions in scrapers
When requesting and parsing data from a source with unknown properties and random behavior (in other words, scraping), I expect all kinds of bizarrities to occur. Managing exceptions is particularly helpful in such cases. Here are some ways that an exception might be raised. [][0] #The list has no zeroth element, so this raises an […]
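As a rough illustration of the idea in this excerpt, here is a minimal sketch in Python, the language of the `[][0]` example. The helper names `first_or_none` and `parse_int` are hypothetical and do not come from the original post. They show one way a scraper might catch a specific exception and carry on with a default value instead of crashing.

```python
def first_or_none(items):
    """Return the first element of items, or None if the list is empty.

    Hypothetical helper for illustration; not from the original post.
    """
    try:
        return items[0]
    except IndexError:
        # An empty list has no zeroth element, so indexing raises IndexError.
        return None


def parse_int(text):
    """Convert scraped text to an int, or return None if that is not possible.

    Hypothetical helper for illustration; not from the original post.
    """
    try:
        return int(text.strip())
    except (ValueError, AttributeError):
        # ValueError: the text is not a number, such as "n/a".
        # AttributeError: the text is None because the element was missing.
        return None
```

Catching only the exceptions you expect, rather than using a bare `except`, keeps genuinely unexpected failures visible while the scraper runs.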
Parsing panic
This is a guest post by Martha Rotter, co-founder of Woop.ie and of the recently launched Irish technology magazine Idea. Hey, remember the Wikipedia blackout? I do, because I was highly amused by the number of students panicking due to papers or homework they seemingly could not complete without this one website. One of my favourite things to […]
Is scraping legal?
Lots of people, when they hear about ScraperWiki, ask “is scraping legal? how can you build a business off that?”. They usually follow up by saying “we do it in our company, but we would never tell anyone”. This is strange to us, as we have come from a world of good scraping. Taking Government […]
International Data Journalism Awards: deadline fast approaching (10th April 2012)
Everybody is talking about and trying to do ‘data journalism’, and the first ever International Data Journalism Awards have been established to recognise the huge effort that people are making in this field. It’s a great opportunity to showcase your work. Backed by Google, the prizes are generous at €45,000 (over $55,000) to six winners and […]
Fine set of graphs at the Office for National Statistics
It’s difficult to keep up. I’ve just noticed a set of interesting interactive graphs over at the Office for National Statistics (UK). If the world is about people, then the most fundamental dataset of all must be: Where are the people? And: What stage of life are they living through? A Population Pyramid is a […]
Telling Stories with Data: Life at a Hispanic Serving University in Texas!
Guest post by Cindy Royal. I’m an associate professor in the School of Journalism and Mass Communication at Texas State University in San Marcos. We’re just a short distance from Austin, with a large (>34,000 students) and diverse campus. Since I joined the faculty at Texas State, I have been focusing on advancing students’ technology […]
From CMS to DMS: C is for Content, D is for Data
This is a joint blog post by Francis Irving, CEO of ScraperWiki, and Rufus Pollock, Founder of the Open Knowledge Foundation. It’s being cross-posted to both blogs. Content Management Systems, remember those? It’s 1994. You haven’t heard of the World Wide Web yet. Your brother goes to a top university. He once overheard some geeks […]
Welcome Jane, from Ubuntu, to ScraperWiki’s board
What’s a company’s board of directors for? Ultimately it’s to hire or fire the CEO. But that doesn’t happen very often. What happens more often is board meetings. But what are they for? They give directors a status update, and are a place to do legal administrivia. But even that is most efficiently done in […]