jtownend – ScraperWiki
https://blog.scraperwiki.com
Extract tables from PDFs and scrape the web
Tue, 09 Aug 2016 06:10:13 +0000

A reluctant goodbye to Guardian Local
https://blog.scraperwiki.com/2011/04/a-reluctant-goodbye-to-guardian-local/
Thu, 28 Apr 2011 13:30:25 +0000

ScraperWiki is sad to hear that Guardian Local is being wound down, just over a year after its public launch. We’ve had the good fortune to work with the talented Guardian Local journalists at three of our Hacks & Hackers events: in Cardiff, Leeds and Glasgow.

We would like to say a particular thank you to the project’s editor, Sarah Hartley, for her generous help. We wish Sarah, Hannah, John and Michael the very best in their new ventures, whatever they may be.

As you can see from the comments under the Guardian post announcing the sites’ closure, the beatbloggers, led by Sarah, have done amazing work for their respective communities. It’s testament to their hard work and energy that they’ve built up such a loyal following in a short space of time.

Michael MacLeod from Guardian Edinburgh at our Glasgow event (right):

Hacks & Hackers Glasgow: the BBC College of Journalism video
https://blog.scraperwiki.com/2011/04/hacks-hackers-glasgow-the-bbc-college-of-journalism-video/
Tue, 12 Apr 2011 08:07:25 +0000

Last month we celebrated the final leg of our UK & Ireland Hacks & Hackers tour in Glasgow, at an event hosted by BBC Scotland and supported by BBC College of Journalism and Guardian Open Platform. You can read more about it here. Other coverage includes:

The BBC College of Journalism kindly filmed the whole thing and the videos are now available to watch. The whole playlist can be viewed here, or watch each segment in the clips below:

Hacks & Hackers Cardiff: the video
https://blog.scraperwiki.com/2011/04/hacks-hackers-cardiff-the-video/
Tue, 05 Apr 2011 07:35:01 +0000

Gavin Owen, a recent postgraduate student at the Skillset Media Academy Wales, has produced this excellent video from the Hacks & Hackers Hack Day in Cardiff last month. Read about the projects here, and watch what the journalists and programmers got up to, below:

New event! Hacks & Hackers Glasgow (#hhhglas)
https://blog.scraperwiki.com/2011/02/new-event-hacks-hackers-glasgow-hhhglas/
Tue, 22 Feb 2011 07:01:43 +0000

Calling journalists, bloggers, programmers and designers in Scotland!

Scraperwiki is pleased to announce another Hacks & Hackers hack day: in Glasgow. BBC Scotland is hosting and sponsoring the one-day event, with support from BBC College of Journalism. As with our other UK hack days, Guardian Open Platform is providing the prizes.

Web developers and designers will pair up with journalists and bloggers to produce a number of projects and stories based on public data. It’s completely free (food provided) and open to both BBC and non-BBC staff. It will take place at the Viewing Theatre, Pacific Quay, Glasgow on Friday 25 March 2011.

Any questions? Please email judith@scraperwiki.com.

New event! Hacks and Hackers Hack Day Cardiff (#hhhCar)
https://blog.scraperwiki.com/2011/02/new-event-hacks-and-hackers-hack-day-cardiff-hhhcar/
Wed, 09 Feb 2011 08:11:50 +0000

The UK Hacks & Hackers tour carries on – into 2011. Our first stop: Wales.

Scraperwiki, which provides award-winning tools for screen scraping, data mining and visualisation, will hold a practical one-day hack day* at the Atrium in Cardiff on Friday 11 March 2011.

Web developers and designers will pair up with journalists and bloggers to produce a number of projects and stories based on public data.

We would like to thank our main sponsor Skillset Cymru, our hosts the Atrium and our prize sponsors Guardian Local, Guardian Open Platform and Cardiff School of Journalism, Media and Cultural Studies for making the event possible.

“Skillset Cymru is very pleased to be supporting the Cardiff Scraperwiki Hacks and Hackers Hack Day this March,” says Gwawr Hughes, director, Skillset Cymru.

“This exciting event will bring journalists and computer programmers and designers together to explore the scraping, storage, aggregation, and distribution of public data in more useful, structured formats.

“It is at the forefront of data journalism and should be of great interest to the media industry across the board here in Wales.”

More details

Who’s it for? We hope to attract ‘hacks’ and ‘hackers’ from all different types of backgrounds: people from big media organisations, as well as individual online publishers and freelancers.

What will I get out of it?
The aim is to show journalists how to use programming and design techniques to create online news stories and features; and vice versa, to show programmers how to find, develop, and polish stories and features. To see what happened at our past events in Liverpool and Birmingham visit the ScraperWiki blog. Here’s a video showing what happened in Belfast.

How much? NOTHING! It’s absolutely free, thanks to our sponsors. Food and refreshments will be provided throughout the day. If you have special dietary requirements please email judith [at] scraperwiki.com.

What should I bring? We would encourage people to come along with ideas for local ‘datasets’ that are of interest. In addition we will create a list of suggested data sets at the introduction on the morning of the event but flexibility is key for this event. If you have a laptop, please bring this too.

So what exactly will happen on the day? Armed with their laptops and WIFI, journalists and developers will be put into teams of around four to develop their ideas, with the aim of finishing final projects that can be published and shared publicly. Each team will then present their project to the whole group. Winners will receive prizes at the end of the day.

*Not sure what a hack day is? Let’s go with the Wikipedia definition: it’s “an event where developers, designers and people with ideas gather to build ‘cool stuff’”…

With thanks to our sponsors:

Keep an eye on the ScraperWiki blog for details about Scraperwiki events. Hacks & Hackers Hack Day Glasgow is scheduled for March 25 2011. For additional information please contact judith [at] scraperwiki.com.

Belfast Hacks & Hackers – the video
https://blog.scraperwiki.com/2010/12/belfast-hacks-hackers-the-video/
Mon, 13 Dec 2010 12:09:20 +0000

As we’ve previously reported, the Belfast Hacks and Hackers Hack Day in November was a great success with some brilliant projects emerging, and we’re thrilled to post this video, courtesy of the School of Media, Film and Journalism at the University of Ulster. Enjoy!

Hacks and Hackers Hack Day Belfast short film, by Eleaner Mulholland (Editor) and Sarah Gallagher (Interviewer)

Scraperwiki launches first student event in Liverpool
https://blog.scraperwiki.com/2010/11/scraperwiki-launches-first-student-event-in-liverpool/
Mon, 01 Nov 2010 13:14:05 +0000

Following on from the success of the professional Liverpool Hacks and Hackers event in July, Scraperwiki, in partnership with Open Labs, is running a “Student Edition” of the Hacks and Hackers Hack Day for students from both LJMU’s School of Journalism and the School of Computing & Mathematical Sciences.

It will take place in Liverpool on Wednesday December 8, 2010 from 9.30am to 5pm at Liverpool John Moores University’s Art and Design Academy.

So what’s this hack day all about? It is a practical day in which student coders, programmers, web developers and designers pair up with student journalists, using programming and design techniques to produce online news stories and features based on publicly available datasets. The day culminates in a prize-giving session for the most interesting and well-presented projects.

Who’s it for?
We want to attract students with skills in journalism, data visualisation, design, coding, programming, statistics and games development etc.

What will you get out of it?
The aim is to show student journalists how to use programming and design techniques to create online news stories and features; and vice versa, to show student coders/developers/programmers how to find, develop, and polish stories and features.

What should participants bring? We would encourage students to come along with ideas for local ‘datasets’ that they are interested in. Flexibility is key for this event. If you have a laptop, please bring this too.

But what exactly will happen on the day itself? Armed with laptops and WIFI, students will be put into teams of around four to develop ideas, with the aim of finishing final projects that can be published and shared publicly. Each team will then present their project to the whole group.

Overall winners will receive a prize at the end of the day. Food and drink will be provided during the day! Any more questions? Please get in touch via aine[at]scraperwiki.com.

Video: Leeds Hacks & Hackers Hack Day
https://blog.scraperwiki.com/2010/10/video-leeds-hacks-hackers-hack-day/
Fri, 29 Oct 2010 11:07:03 +0000

Here are a few video interviews from yesterday’s Hacks and Hackers Hack Day Leeds, with various participants and Sarah Hartley from Guardian Local. Follow this link for a write-up of all the projects.

Leeds Hacks and Hackers Hack Day: Planning maps; cutting up Leeds; researching brownfield; and finding the city’s blogging pulse
https://blog.scraperwiki.com/2010/10/leeds-hacks-and-hackers-hack-day-planning-maps-cutting-up-leeds-researching-brownfield-and-finding-the-citys-blogging-pulse/
Fri, 29 Oct 2010 11:03:56 +0000

It was to West Yorkshire for the fifth stop on Scraperwiki’s UK & Ireland Hacks & Hackers tour, at Old Broadcasting House in the excellent city of Leeds.


A varied crowd turned out for yesterday’s hack day, hosted by nti Leeds and also sponsored by Guardian Open Platform, Guardian Local and Leeds Trinity Centre for Journalism. It included participants from the city council and regional newspapers, independent bloggers, designers and computer programmers – with all different kinds of experience.

With the introduction over, the competition began, fuelled by the usual Scraperwiki promise of pizza and beer; and Amazon vouchers for the winners – who would be decided by our three judges, Sarah Hartley, editor of Guardian Local, Linda Broughton, head of nti Leeds, and Richard Horsman, associate principal lecturer at Leeds Trinity Centre for Journalism.

Five groups formed around different areas of interest, but all with a Leeds focus. Brownfield Research, by Greg Brant, Rebecca Whittington, Jon Eland and Tom Mortimer-Jones, was about discovering the past, present and planned future of brownfield sites using scrapes of planning applications and change of use applications combined with web-chat and related documents. It also aimed to include the history of industrial disease, accidents and contamination on site.

Leeds Planning Map, by Catherine O’Connor (@journochat), Elizabeth Sanderson (@Lizziesanderson), James Rothschild (@jrpmedia), John Baron (@GdnLeeds), Karl Schneider (@karlschneider) and Matt Jones (@matt_jones86), allowed users to view all planning decisions in Leeds, colour-coded by accepted or refused applications (some of the team pictured below).


Software developer Marcus Houlden (@mhoulden) built Find Me, a geolocation web application that displays the user’s current location, address and postcode, and links to the nearest bus stops. He also started adding Yorkshire Water roadworks data.

The Leeds Pulse team scraped Live Journal data to produce a web application, built on Django, demonstrating negative and positive blogging attitudes across Leeds – drawing from 8,500 blog posts. It categorised “love, like or good” as positive, and “hate, bad or meh” as negative. The judges certainly weren’t ‘meh’ about it, and chose it as the runner-up.
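The keyword rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Leeds Pulse team's code: the function names, the whole-word matching, the tie-breaking rule and the "neutral" category are all assumptions made for the example, and it does not use Django or LiveJournal data.

```python
import re
from collections import Counter

# Keyword lists as quoted in the write-up.
POSITIVE = {"love", "like", "good"}
NEGATIVE = {"hate", "bad", "meh"}


def classify_post(text):
    """Label a post by counting positive and negative keywords.

    Assumption: whole-word, case-insensitive matching, with ties
    (including posts containing no keywords) labelled "neutral".
    """
    words = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(word in POSITIVE for word in words)
    negative_hits = sum(word in NEGATIVE for word in words)
    if positive_hits > negative_hits:
        return "positive"
    if negative_hits > positive_hits:
        return "negative"
    return "neutral"


def summarise(posts):
    """Count how many posts fall into each category."""
    return Counter(classify_post(post) for post in posts)
```

Calling `summarise` on a list of post texts returns a `Counter` keyed by "positive", "negative" and "neutral", which is the kind of tally a map or chart of blogging mood could be drawn from.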


Leeds Uncut, however, scooped the overall prize (team pictured above). Suzanne McTaggart, Amna Kaleem (@amnakaleem), Nick Crossland (@ncrossland) and Michael Brunton-Spall (@bruntonspall), with some help from developer Martin Dunschen, created a map showing the eight constituencies in Leeds to highlight how they are being affected by spending cuts and redundancies.

They also looked at job vacancies in each of the constituencies, to identify whether the creation of new jobs is offsetting the doom and gloom caused by spending cuts and job losses. Different shades of colour in the form of an “economic health thermometer” gave a visually effective overview of which constituencies are suffering the most and least.

The data for the project was gathered from job websites, news websites, the Guardian’s Cutswatch page and the Office for National Statistics, which provides figures on how many people are claiming unemployment benefit/Jobseeker’s Allowance each month, giving an indication of the number of new redundancies.

The three judges … were unanimous in deciding that the worthy winners had successfully collated trusted data and compiled an easy to use map visualisation.

… commented judge Sarah Hartley, who has written this account of the beginning, middle and end of the day.

£250 worth of Amazon.co.uk vouchers will be split among the winners and runners-up. An extra prize for the best scraper work, chosen by Scraperwiki’s Julian Todd, went to Matt Jones, who will continue to maintain the planning data scraper.

With thanks to all our sponsors and helpers mentioned above, and additionally Leeds Trinity’s Catherine O’Connor and Imran Ali.

Twitter conversation was via the #hhhleeds tag, and see below for a visualisation of some of the geotagged tweets (courtesy of remote onlooker Tony Hirst, @psychemedia):

You can find a Twitter list of delegates at this link here:

More links to be added as we spot them and photographs are coming… Please email judith at scraperwiki.com with more material, or leave links in the comment section below. I’d especially like to add in links to scrapers and data sets, so people can see how the projects were built.

Want to get involved? We’re still on tour! If you’d like to sponsor an event please get in touch with aine@scraperwiki.com.


Hacks/Hackers London meetup to discuss Iraq War logs
https://blog.scraperwiki.com/2010/10/hackshackerslondon/
Tue, 26 Oct 2010 10:02:50 +0000

Scraperwiki will be supporting the November Hacks/Hackers London meetup at 7pm on Wednesday 24th November 2010 at The Irish Club, 2-4 Tudor Street, EC4Y 0AA, London. A few tickets are still available, but places are filling fast.


  • 7.00pm: The data journalism behind the Iraq War Logs (James Ball, Bureau of Investigative Journalism)

James, Development Producer for the Bureau of Investigative Journalism and Chief Data Analyst on the TBIJ/Channel 4 Dispatches investigation into the Iraq War Logs, will explain how data journalism powered the process.

  • 7.30pm: TBC
  • 8pm: Social!