The Royal Statistical Society Conference–Exeter 2015

ScraperWiki blog, Fri, 11 Sep 2015
https://blog.scraperwiki.com/2015/09/the-royal-statistical-society-conferenceexeter-2015/

ScraperWiki have been off to the Royal Statistical Society Conference in Exeter to discuss our wares with the delegates. The conference was very friendly, with senior RSS staff coming to see how we were doing through the week.

We shared the exhibitor space in the fine atrium of The Forum at Exeter University with Wiley, the Oxford University Press, the Cambridge University Press, the ONS, ATASS sports, Phastar, Taylor and Francis and DataKind, alongside the Royal Statistical Society’s own stand.

We talked to a wide range of people. With some we have already done business, such as the Q-step programme and the people from the Lancaster University Data Science MSc; we had interns from these two programmes over the summer. We’ve also done business with the ONS, who were there both as delegates and to try out the new ONS website on an expert audience. Other people we had met before on Twitter, such as Scott Keir, the Head of education and statistical literacy at the Royal Statistical Society, and Kathryn Torney, who won a data journalism award for her work on suicide in Northern Ireland.

Other people just dropped by for a chat; our ScraperWiki stickers are very popular with the parents of young children!

Our story is that we productionise the analysis and presentation of data. PDFtables.com is where the story starts, with a tool which accurately extracts the tabular content of PDFs to Excel spreadsheets as an online service (and an API). DataBaker can then be used to convert an Excel spreadsheet in human-readable form, with boilerplate, pretty headings and so forth, into a format more amenable to onward machine processing. DataBaker is a tool we developed for the ONS to help it transform the output of some of its dataflows where re-engineering the original software required more money or will than was available. The process is driven by short recipes which we trained the staff at the ONS to write; we can, of course, write them for clients if preferred. The final stage is data presentation, and here we use an example from our contract software development: the Civil Service People Survey website. ORC International are contracted to run the actual survey, asking the questions and collating the results. For the 2014 survey the Civil Service took us on to provide the data exploration tool to be used by survey managers and, ultimately, all Civil Servants. The website uses the new GOV.UK styling and, through in-memory processing, is able to provide very flexible querying, very fast.
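
To give a flavour of what the middle step means in practice, here is a minimal sketch in plain Python of turning a human-readable table into one record per observation. It is not DataBaker and does not use its recipe syntax; the sheet contents, the `tidy` function and the `header_label` parameter are all made up for illustration.

```python
# Hypothetical "human-readable" sheet, as rows of cells, standing in for
# an Excel export: a title line, a blank spacer, then a header row of
# years above rows of regional values. All values are made up.
SHEET = [
    ["Table 1: Widgets produced (illustrative data)", "", ""],
    ["", "", ""],
    ["Region", "2013", "2014"],
    ["North", "10", "12"],
    ["South", "7", "9"],
]


def tidy(sheet, header_label="Region"):
    """Flatten a cross-tab sheet into one record per observation."""
    # Skip the boilerplate by locating the row whose first cell is the
    # header label; everything above it is treated as decoration.
    header_index = next(
        i for i, row in enumerate(sheet) if row and row[0] == header_label
    )
    columns = sheet[header_index][1:]
    records = []
    for row in sheet[header_index + 1:]:
        label, values = row[0], row[1:]
        for column, value in zip(columns, values):
            if value != "":  # ignore empty cells
                records.append(
                    {"region": label, "year": column, "value": int(value)}
                )
    return records


if __name__ == "__main__":
    for record in tidy(SHEET):
        print(record)
```

Each output record carries its own row and column labels, which is what makes the data easy to load into a database or a charting tool afterwards.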

I’ve frequently been an academic delegate to conferences, but this was my first time as an exhibitor. I have to say I commend the experience to my former academic colleagues! As an exhibitor I had a chair, a table for my lunch, and people came and talked to us about what we did with little prompting. Furthermore, I did not experience the dread pressure of trying to work out which of multiple parallel sessions I should attend!

As it was, Aine and I went to a number of presentations, including Dame Julia Slingo’s on uncertainty in climate and weather prediction, Andrew Hudson-Smith’s talk on urban informatics and Scott Zeger’s talk on Statistical Problems in Individualized Health.

We liked Exeter, and the conference venue at the University. It was a short walk, up an admittedly steep hill, from the railway station and another short walk into town. The opening reception was held in the Royal Albert Memorial Museum, which is a very fine venue.

I joined the Royal Statistical Society having decided that they were the closest thing data scientists had to a professional body in the UK, and in general they seemed like My Sort of People!

All in all, a very interesting and worthwhile trip; we hope to continue and strengthen our relationship with the Royal Statistical Society and its members.
