#hhhglas – ScraperWiki https://blog.scraperwiki.com

A Bonny Wee Hack Day at #hhhglas https://blog.scraperwiki.com/2011/03/a-bonny-wee-hack-day-at-hhhglas/ Mon, 28 Mar 2011 16:42:47 +0000

For our first venture to Scotland, where better to be than BBC Scotland! We had 8 teams of hacks and hackers digging around the Scottish data beat. For this very special occasion the ScraperWiki digger has donned tartan! With this special digger, fire incidents, planning applications, publicly owned property and gifts received by councillors have been mined. Here’s a word from our Francis Irving:

 

Now check out the projects:

Fire Bugs – This project scrapes the data from the Central Scotland Fire Service’s Recorded Incidents log, creating an alert when new incidents are logged. It also retrieves historic data.

The team consisted of 1 hack (Chris Sleight, from BBC Scotland) and 2 hackers (Ben Lyons and Paul Miller, from IRISS).

Central Scotland Fire Service put a lot of data on their website but, as is usual, it was not in a very useful form. Only 60 incidents are shown on the site, but if you dig down you find over 15,000 buried records. In one day’s work, Fire Bugs scraped the records and decided to look at malicious false alarms. Luckily for them, the language and structure of the records were consistent. They found that 3.5% of all calls were malicious false alarms. They even made a tree map on ScraperWiki using Protovis. Fire Bugs have clearly opened up a huge amount of potential with this data.
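
For the curious, here is roughly how a figure like that 3.5% could be worked out once the records are scraped. This is only a minimal sketch in Python, not the team's actual scraper: the `type` field, the `malicious_share` function and the sample records are all made up for illustration, and it leans on the incident wording being consistent, as the team found.

```python
def malicious_share(incidents):
    """Return the fraction of incidents whose type mentions 'malicious'."""
    if not incidents:
        return 0.0
    malicious = sum(
        1 for incident in incidents
        if "malicious" in incident["type"].lower()
    )
    return malicious / len(incidents)


# Hypothetical sample records, not real fire service data.
sample = [
    {"type": "False Alarm - Malicious"},
    {"type": "Dwelling Fire"},
    {"type": "False Alarm - Good Intent"},
    {"type": "Vehicle Fire"},
]

print(f"{malicious_share(sample):.1%}")  # prints 25.0% for this made-up sample
```

A plain substring match only works because the log uses the same phrasing every time; messier records would need something more forgiving.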

Edinburgh Planning App Map – This is Edinburgh’s first automated map of local planning applications! This is a popular theme for our hack days and on ScraperWiki in general. Open Australia are using ScraperWiki for their planning alerts.

This team consists of 1 hack (Michael MacLeod, beatblogger for Guardian Edinburgh on the right) and one hacker (Robert McWilliam, from Blueflow on the left).

As Michael MacLeod pointed out, people don’t know how to use the local council website. You can’t just type in your postcode to find applications near you. There’s a map online but it’s truly awful! The team scraped the site and made a map which updates every day rather than just every week like on the council site. Michael used this new tool to take a closer look at his beat and found a planning application for urban paintball. What he duly noted was that the Facebook page was trying to be secretive about the location! Using the map, he found it was going to be right behind a block of flats. He will be talking to residents!

Hide by the Clyde – This project creates a map to allow the user to compare exam results in different areas and correlates this against measures of social deprivation.

The team consists of 1 hack (Bruce Munro from BBC Scotland) and 3 hackers (Nicola Osborne from Edina, Sean Carroll from BBC Scotland and Bob Kerr from Open Street Map).

They looked at data from Learning and Teaching Scotland and scraped the search for schools form. Here is the map. From this freed data they were able to make a heat map of free school meals registration in Scotland and compare education statistics between, for example, Glasgow and Argyll & Bute. A major project would be to put all this information on one site in a user-friendly format.

Public Buildings for Sale – This is a tool to show all publicly owned property that is for sale or rent. This will be a Scottish sister to ScraperWiki’s brownfield sites map. The project aims to answer the question: How much public land is being sold without our knowledge?

This team consisted of 1 hack (Peter Mackay from BBC Scotland on the right) and 1 hacker (Martyn Inglis from The Guardian on the left).

The data they wanted is on this horrible website on property sales and lettings from Scotland’s public sector. The nested HTML tables are very difficult to scrape. They managed to scrape this far and plan to remodel the data to make it searchable by postcode. From this, they want to glean more information about councils’ buying and selling strategies.
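
To give a flavour of why nested tables are fiddly, here is a minimal sketch of one way to cope with them, using only Python's standard `html.parser`. It is not the team's scraper: the `LeafTableParser` class and the sample markup are invented for illustration, and it assumes the real data sits in the innermost tables while the outer ones are only there for layout.

```python
from html.parser import HTMLParser


class LeafTableParser(HTMLParser):
    """Collect rows of cell text from tables that contain no nested table."""

    def __init__(self):
        super().__init__()
        self.stack = []      # one entry per currently open <table>
        self.leaf_rows = []  # rows gathered from innermost tables only

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self.stack:
                # The enclosing table is a layout wrapper, not a leaf.
                self.stack[-1]["has_child"] = True
            self.stack.append({"rows": [], "has_child": False})
        elif tag == "tr" and self.stack:
            self.stack[-1]["rows"].append([])
        elif tag in ("td", "th") and self.stack and self.stack[-1]["rows"]:
            self.stack[-1]["rows"][-1].append("")

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack and self.stack[-1]["rows"]:
            row = self.stack[-1]["rows"][-1]
            if row:
                # Append text to the most recently opened cell.
                row[-1] = (row[-1] + " " + text).strip()

    def handle_endtag(self, tag):
        if tag == "table" and self.stack:
            table = self.stack.pop()
            if not table["has_child"]:
                self.leaf_rows.extend(row for row in table["rows"] if row)


# Hypothetical markup: a layout table wrapping the table that holds the data.
html = """
<table><tr><td>
  <table>
    <tr><th>Property</th><th>Postcode</th></tr>
    <tr><td>Old School</td><td>G1 1AA</td></tr>
  </table>
</td></tr></table>
"""

parser = LeafTableParser()
parser.feed(html)
print(parser.leaf_rows)
```

Keeping a stack of open tables means the wrapper rows are thrown away and only the innermost rows survive, which is the shape you would want before making anything searchable by postcode.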

Crash Test Dummies – This project takes three separate approaches to looking at Scottish road accident data.

  1. What accidents get reported?
  2. Are you safe on the roads?
  3. What effect do road safety measures have on your journey?

The team consisted of 2 hacks (David Eyre and Brendon Crowther from BBC Scotland) and 2 hackers (Ali Craigmile and Mo McRoberts from BBC Scotland).

In just one day they managed to build a prototype for a “Mind how you go!” BBC Scotland site. They used road traffic accident reports based on 2005-2009 data to create a form that showed how likely you were to survive your journey depending on your age, sex and where you’re going! They built a spreadsheet from even more data so that the site had even more potential to go beyond the records. They also scraped Google searches of reported road traffic accidents and mapped the reports from BBC Scotland from 2010.

BME Scotland – This project aims to find out what the effects of the recession on education are! Is education a route to the ghetto? It aims to compare BME educational achievement with unemployment statistics to find out which areas of Scotland are economic no-gos.

The team consisted of 1 hack (Fin Wycherley) and 1 hacker (Paul McNally).

The lesson learnt here was that sometimes there’s not enough data to go around. What they know is that the African population is doing exceedingly well in education in Scotland. However, they also have a relatively high level of unemployment. The result was a call out for better data collection, as none of the information fitted together in a way that would help answer the question: Why?

Edinburgh Council – This project searches Edinburgh councillors’ gifts and expenses!

The team consists of 2 hacks (Paola Di Maoi and Anand Ramkissoon) and 1 hacker (James Baster).

As it turned out, the Council website is easy to scrape. The structure of the site is consistent and clean. ScraperWiki likes this! And so here is the scraper. As James pointed out, the data needs to be double checked for misspelled entries, etc. But the preliminary data shows that Lothian Buses gave the most gifts and Phil Wheeler received the most gifts.

Magners Cider – This project aims to scrape the Magners League Rugby scores. The team consisted of a hack/hacker pair of Paul McNally (again, we love eager hackers!) and Tony Sinclair of BBC Scotland (who had to keep up the day job and so was not around for a picture). Apparently, a graphics operator had to input the information from the site by hand into the graphics system to produce the league tables you see on screen. Seeing as the graphics software can access spreadsheets, Tony thought “Why not automate the process by scraping?” And this is what they did. So the scores have gone from ScraperWiki to TV!
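
The last hop, from scraped scores to something a spreadsheet-reading graphics package can open, might look like the sketch below. It is an illustration only, not the pair's actual code: the `scores_to_csv` function, the column names, and the teams and scores are all invented, and CSV is assumed as the spreadsheet format.

```python
import csv
import io


def scores_to_csv(results):
    """Serialise a list of match results as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["home", "home_score", "away", "away_score"])
    for result in results:
        writer.writerow([
            result["home"],
            result["home_score"],
            result["away"],
            result["away_score"],
        ])
    return buffer.getvalue()


# Made-up teams and scores, purely for illustration.
results = [
    {"home": "Team A", "home_score": 21, "away": "Team B", "away_score": 14},
]

print(scores_to_csv(results))
```

Writing to a string first keeps the example self-contained; in practice the same text would be saved to a file wherever the graphics system expects to find it.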

And the winners are… (drum roll please)

  • 1st Prize: Edinburgh Planning App Map
  • 2nd Prize: Fire Bugs
  • 3rd Prize: Magners Cider
  • Best Scraper: Fire Bugs

A big shout out

Our judges, Jon Jacob from BBC College of Journalism, Allan Donald from STV and Huw Owen, Editor of Good Morning Scotland.

Our sponsors BBC Scotland, BBC College of Journalism and The Guardian Open Platform.

Edinburgh planning applications, fire incidents and Rugby scores – you’ve been ScraperWikied!

The winners and the judges


New event! Hacks & Hackers Glasgow (#hhhglas) https://blog.scraperwiki.com/2011/02/new-event-hacks-hackers-glasgow-hhhglas/ Tue, 22 Feb 2011 07:01:43 +0000

Calling journalists, bloggers, programmers and designers in Scotland!

ScraperWiki is pleased to announce another hacks & hackers hack day, this time in Glasgow. BBC Scotland is hosting and sponsoring the one-day event, with support from BBC College of Journalism. As with our other UK hack days, Guardian Open Platform is providing the prizes.

Web developers and designers will pair up with journalists and bloggers to produce a number of projects and stories based on public data. It’s completely free (food provided) and open to both BBC and non-BBC staff. It will take place at the Viewing Theatre, Pacific Quay, Glasgow on Friday 25 March 2011.

Any questions? Please email judith@scraperwiki.com.
