What happened in New York

At our New York datacamp, we set out to liberate data, teach people to liberate data, and find stories in data.

About 100 people showed up for the event, and about 40 of them attended the Learn to Scrape sessions.

The hacking was punctuated by talks from Tom Lee of the Sunlight Foundation and Jake Porway of Data Without Borders.

JDCNY journalists and developers at work

Projects

Dan Nguyen scraped Florida mugshots and used face.com’s API to analyse each photo and tell you the arrestee’s mood.

Mike Caprio and team cleaned a spreadsheet of 80,000 records from the New York lobbyist website to power a site on New York lobbyists based on the Chicago Lobbyists site. It appears that $120 million was spent in New York on lobbyists in 2011.

JDCNY: the Stop & Frisk group hard at work

Michael Keller, Marc Georges et al. related the NYPD stop, question and frisk data to nine mosques referenced in an NYPD report on surveillance, in order to see whether there had been unusual changes in stopping activity around these mosques.

The dataset is insanely messy, but they fortunately had access to a relatively clean version that Data Without Borders had developed in November.

They were still going strong after the data camp. Refusing to leave, they moved to a different room after getting kicked out of the data camp space.

I helped one team relate contracts from Open Book New York to data that they had scraped by hand (!) from hand-written forms in order to identify potential conflicts of interest.

I helped another team identify potential stories (outliers) in the NYC Open Data graffiti locations dataset.

Susan McGregor was “clearly hooked” because she liberated lobbyist contract details the next evening instead of watching the Super Bowl.

JDCNY attendees wrangling lobbyist data

Technical Awards

Mike Caprio won Best Data Liberator for liberating the Iowa accident reports database.

Michelle Koeth won Best Creation of an API for scraping New York, NY hospitals from Medicare Hospital Compare.

Jeremy Baron, from the UN peacekeeping team, won Best Use of ScraperWiki for scraping United Nations PDFs. This team also scraped peacekeeping statistics and contributions.

Honorary ScraperWikian

Susan McGregor was awarded Honorary ScraperWikian. We haven’t decided what that means yet. 🙂

Learning

JDCNY: the Stop & Frisk team present their work

Teaching the Learn to Scrape sessions and working with many of the project teams, I got the impression that we had opened participants to thinking more about how data can be scraped, transformed and analyzed to identify unusual subsets and potential stories.

Our Learn to Scrape sessions seemed to work as well: several participants who had claimed no knowledge of web scraping before the sessions were creating reasonably complex scrapers by the next afternoon.

What Next?

More data camps are coming up, and several groups plan on continuing to work on their projects. But in the meantime, we now have lots of data for you to analyze!
