Archive | Data Science

Which car should I (not) buy? Find out, with the ScraperWiki MOT website…

I am finishing up my MSc Data Science placement at ScraperWiki and, by extension, my MSc Data Science (Computing Specialism) programme at Lancaster University. My project was to build a website to enable users to investigate the MOT data. This week the result of that work, the ScraperWiki MOT website, went live. The aim of […]

The most prescribed medication for each BNF Chapter

In previous blog posts I introduced the definitions of the elements used in my research to find the most prescribed items for each BNF Chapter. Now that they have been explained, I will reveal my findings. Using Tableau Public I found out the top 10 most prescribed items in 2014 for several months and results were […]

GP Prescribing Datasets

In a previous blog post I described the terms used in the GP Prescribing data. Here I will introduce you to the various datasets which are published in this series. They can all be found on the Health and Social Care Information Centre data catalogue page. Prescriptions Dispensed in the Community, Statistics for England Every […]

GP Prescribing data for the UK

Over the past few weeks I have been looking at GP Prescribing data from the Health & Social Care Information Centre, which presents the number of items and cost of all the different medication prescribed and dispensed by GP practices across the UK. The dataset amounts to millions of rows of data each month. I am […]

Book Review: Learning Spark by Holden Karau, Andy Konwinski, Patrick Wendell and Matei Zaharia

Apache Spark is a system for doing data analysis which can be run on a single machine or across a cluster. It is pretty new technology: initial work was in 2009 and Apache adopted it in 2013. There’s a lot of buzz around it, and I have a problem for which it might be […]

Which plane had the most accidents?

Searching by facets. Last year, ScraperWiki helped migrate lots of specialist datasets to GOV.UK. This afternoon, we happened to notice that the Air Accidents Investigation Branch reports, which we scraped from their old site, are live. The user interface is called Finder Frontend, and is used by GOV.UK wherever the user needs to search for […]

Scientists and Engineers… of What?

“All scientists are the same, no matter their field.” OK that sounds like a good ‘quotable’ quote, and since I didn’t see it said by anyone else, I can claim it as my own saying. The closest quote to this I saw was “No matter what engineering field you’re in, you learn the same basic […]

Elasticsearch and elasticity: building a search for government documents

Based in Paris, the OECD is the Organisation for Economic Co-operation and Development. As the name suggests, the OECD’s job is to develop and promote new social and economic policies. One part of their work is researching how open countries trade. Their view is that fewer trade barriers benefit consumers, through lower prices, and companies, […]

Book review: Mastering Gephi Network Visualisation by Ken Cherven

A little while ago I reviewed Ken Cherven’s book Network Graph Analysis and Visualisation with Gephi; it’s fair to say I was not very complimentary about it. It was rather short, and had quite a lot of screenshots. Its strength was in introducing every single element of the Gephi interface. This book, Mastering Gephi Network […]
