Amongst hipster start ups in the tech industry Bitcoin has been a thing for a while. As one of the more elderly members of this community I wanted to understand a bit more about it. Cryptocurrency: How Bitcoin and Digital Money are Challenging the Global Economic Order by Paul Vigna and Michael Casey fits this […]
A tool to help with your next job move
A guest post from Jyl Djumalieva. During February and March this year I had a wonderful opportunity to share the workspace with ScraperWiki team. As an aspiring data analyst, I found it very educational to learn how real-life data science happens. After observing ScraperWiki data scientist do some analytical heavy lifting I was inspired to embark […]
DataBaker – making spreadsheets machine-readable
Spreadsheets are often the way of choice for publishing data. They look great, are understandable by people who don’t use databases, and with judicious use of formatting you can represent complicated datasets in a way people can understand. The down side is that machines can’t understand them. Sure, you can export the file as CSV, but that […]
Book review: Data Science at the Command Line by Jeroen Janssens
In the mixed environment of ScraperWiki we make use of a broad variety of tools for data analysis. Data Science at the Command Line by Jeroen Janssens covers tools available at the Linux command line for doing data analysis tasks. The book is divided thematically into chapters on Obtaining, Scrubbing, Modeling, Interpreting Data with “intermezzo” […]
Book review: Remote Pairing by Joe Kutner
Pair programming is an important part of the Agile process but sometimes the programmers are not physically co-located. At ScraperWiki we have staff who do both scheduled and ad hoc remote working therefore methods for working together remotely are important to us. A result of a casual comment on Twitter, I picked up “Remote Pairing” […]
Book review: Graph Databases by Ian Robinson, Jim Webber and Emil Eifrem
Regular readers will know I am on a bit of a graph binge at the moment. In computer science and mathematics graphs are collections of nodes joined by edges, they have all sorts of applications including the study of social networks and route finding. Having covered graph theory and visualisation, I now move on to […]
NewsReader: the developers story
ScraperWiki has been a partner in NewsReader, an EU Framework 7 research project, for the last couple of years. The aim of NewsReader is to give computers the power to “understand” the news; to extract from a myriad of news articles the underlying events which gave rise to those articles; the who, the where, the […]
Book review: Linked by Albert-László Barabási
I am on a bit of a graph theory binge, it started with an attempt to learn about Gephi, the graph visualisation software, which developed into reading a proper grown up book on graph theory. I then learnt a little more about practicalities on reading Seven Databases in Seven Weeks, which included a section on […]
Book review: Seven databases in Seven Weeks by Eric Redmond and Jim R. Wilson
I came to databases a little late in life, as a physical scientist I didn’t have much call for them. Then a few years ago I discovered the wonders of relational databases and the power of SQL. The ScraperWiki platform strongly encourages you to save data to SQLite databases to integrate with its tools. There […]
Exploring the ONS
The Office for National Statistics (ONS) is the United Kingdom statistical body charged by the government with the task of collecting and publishing statistics related to the economy, population and society of England and Wales at national, regional and local levels. The data is typically published in the form of Excel spreadsheets. The ONS is […]