Hi! We've renamed ScraperWiki.
The product is now QuickCode and the company is The Sensible Code Company.

Blog

Pius Okoh

Hi, I’m Pius…

…and I’m the new thing at ScraperWiki. Yes, you heard right: thing, not person or guy or anything human. Since I learnt that real-world entities could be modeled using programming language objects in order to answer questions or make inferences, one weird thing in my brain just interpreted it the other way – that real-world […]

Four specific things “agile” saved us from doing at ONS

There’s lots of both hype and cynicism around “agile”. Instead, look at this part of the original agile declaration: “We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value: … Responding to change over following a plan.” That is, while there […]

Book review: Cryptocurrency by Paul Vigna and Michael J. Casey

Amongst hipster start-ups in the tech industry, Bitcoin has been a thing for a while. As one of the more elderly members of this community I wanted to understand a bit more about it. Cryptocurrency: How Bitcoin and Digital Money are Challenging the Global Economic Order by Paul Vigna and Michael Casey fits this […]

Announcing PDFTables.com

PDFs were invented at the same time as the web. As “digital paper”, they’re trustworthy and don’t change behind your back. This has a downside – often the definitive source of published data is a PDF. It’s hard to get tens of thousands of numbers out and into a spreadsheet or database. Copying and pasting is […]

A tool to help with your next job move

A guest post from Jyl Djumalieva. During February and March this year I had a wonderful opportunity to share the workspace with the ScraperWiki team. As an aspiring data analyst, I found it very educational to learn how real-life data science happens. After observing ScraperWiki data scientists do some analytical heavy lifting I was inspired to embark […]

Adventures in Kaggle: Forest Cover Type Prediction

Regular readers of this blog will know I’ve read quite a few machine learning books; now it is time to put this learning into action. We’ve done some machine learning for clients but I thought it would be good to do something I could share. The Forest Cover Type Prediction challenge on Kaggle seemed to fit the bill. Kaggle […]

Book review: How Linux works by Brian Ward

It has been a while since my last book review because I’ve been coding, rather than reading, on the commute into the ScraperWiki offices in Liverpool. Next up is How Linux Works by Brian Ward. In some senses this book follows on from Data Science at the Command Line by Jeroen Janssens. Data Science was about doing analysis […]

DataBaker – making spreadsheets machine-readable

Spreadsheets are often the way of choice for publishing data. They look great, are understandable by people who don’t use databases, and with judicious use of formatting you can represent complicated datasets in a way people can understand. The downside is that machines can’t understand them. Sure, you can export the file as CSV, but that […]

Book review: Data Science at the Command Line by Jeroen Janssens

In the mixed environment of ScraperWiki we make use of a broad variety of tools for data analysis. Data Science at the Command Line by Jeroen Janssens covers tools available at the Linux command line for doing data analysis tasks. The book is divided thematically into chapters on Obtaining, Scrubbing, Modeling, Interpreting Data with “intermezzo” […]
