In previous blog posts I introduced the definitions of the elements used in my research to find the most prescribed items for each BNF chapter. Now that those are understood, I will reveal my findings. Using Tableau Public, I found the top 10 most prescribed items for several months of 2014, and the results were […]
GP Prescribing Datasets
In a previous blog post I described the terms used in the GP Prescribing data. Here I will introduce you to the various datasets which are published in this series. They can all be found on the Health and Social Care Information Centre data catalogue page. Prescriptions Dispensed in the Community, Statistics for England Every […]
GP Prescribing data for the UK
Over the past few weeks I have been looking at GP Prescribing data from the Health & Social Care Information Centre, which presents the number of items and the cost of all the different medications prescribed and dispensed by GP practices across the UK. The dataset amounts to millions of rows of data each month. I am […]
The four kinds of data PDF
At ScraperWiki, we talk to lots of customers who need to convert PDFs to Excel. Why are they doing it? The industries are diverse – banking, insurance, retail, logistics, political campaigning, energy… What separates them in data terms, though, is that each has one of four different kinds of workflow. A. Large tables These are PDFs […]
Burn the digital paper! A call to arms
This is a blog post version of a lunchtime talk I gave at the Open Data Institute. You may prefer to listen to it or use the slides. Stafford Beer Stafford Beer was a British cybernetician. He described four stages that happen when you get a computer. Each stage ends in disappointment. 1. Amazement It’s […]
….and suddenly I could convert my bank statement from PDF to Excel…
Do you ever: Need an old bank statement only to find out that the bank has archived it, and want to charge you to get it back? Spot check to make sure there are no fraudulent transactions on your account? Like to summarise all your big ticket items for a period? Need to summarise business expenses? […]
PDFTables.com: PHP, C# and VBA API examples
Invoices, bank statements, feeds of public data… Painful though it can be, many business workflows need to be able to take data in from PDFs. PDFTables.com has had a web API for a while. We’ve just added a few more language examples for C#, PHP and Visual Basic for Applications coders. You can find them […]
Summary – Big Data Value Association June Summit (Madrid)
In late June, 375 Europeans + 1 attended the Big Data Value Association (BDVA) Summit in Madrid. The BDVA is the private part of the Big Data Public Private Partnership. The public part is the European Commission. The delivery mechanism is Horizon 2020 and €500m of funding. The PPP commenced in 2015 and runs to […]
Book review: Docker Up & Running by Karl Matthias and Sean P. Kane
This last week I have been reading Docker Up & Running by Karl Matthias and Sean P. Kane, a newly published book on Docker – a container technology which is designed to simplify the process of application testing and deployment. Docker is a very new product, first announced in March 2013, although it is based […]
Spreadsheets are code: EuSpRIG conference.
I’m back from presenting a talk on DataBaker at the EuSpRIG conference. It’s amazing to see a completely different world of how people use Excel – I’ve been busy tearing the data out of spreadsheets for the Office for National Statistics and using macros to open PDF files in Excel directly using PDFTables. So whilst I’ve […]