“the impact on our industry only begins this weekend” says Susan E McGregor, Professor at the world’s foremost school of journalism

ScraperWiki blog, Wed, 01 Feb 2012 18:15:23 +0000
https://blog.scraperwiki.com/2012/02/the-impact-on-our-industry-only-begins-this-weekend-says-susan-mcgregor-professor-at-the-worlds-foremost-school-of-journalism/

This is a guest blog post by Susan E. McGregor, Assistant Professor at the Tow Center for Digital Journalism, Columbia University.

The Tow Center for Digital Journalism at Columbia University Graduate School of Journalism is proud to be partnering with Knight News Challenge winner ScraperWiki this Friday and Saturday for their first Journalism Data Camp in the U.S. This event gives us an opportunity to host a wide range of programmers, journalists and educators interested in expanding access to essential data sets, while connecting those communities to one another. We also look forward to extending the impact of this weekend’s activities by working with our colleagues at the Stabile Center for Investigative Journalism and The New York World to pursue stories on New York accountability issues that surface during this weekend’s data “liberation” activities.

As an online tool, ScraperWiki is an innovative technical platform that allows users to build, test, and execute programmatic “scrapers” that transform web pages and PDFs into more accessible, usable data formats. As an online archive and repository, ScraperWiki helps improve access to scraped data sets by making them collectively available on its website. Finally, as a web-based collaboration space, ScraperWiki helps convene journalists and programmers around projects of shared interest, in addition to fostering peer-to-peer training and support.

Each of the above features of the ScraperWiki platform resonates closely with the Tow Center’s own priorities for data journalism. Making data available in formats that can be easily parsed, analyzed, and distributed is an essential part of data transparency and of the accountability journalism it serves. Providing a public access point for that data allows both journalists and their audiences to fact-check and elaborate upon the work that their peers have done, building on it in future projects and creating more comprehensive resources. And of course, the knowledge sharing and collaboration that take place between programmers and journalists through ScraperWiki echo the Tow Center’s mandate to educate and innovate at the intersection of computer science and journalism, both through its own dual-degree program in computer science and journalism and through public events such as this one.

While we are certain that ScraperWiki will find ready adoption in cities and newsrooms throughout the country in the months to come, we look forward to building an ongoing relationship with ScraperWiki and its contributors here in the New York area. By hosting this event we hope to introduce many of our students and colleagues to a truly remarkable tool, one whose impact on our industry only begins this weekend.
