
Archive | Developer

Three hundred thousand tonnes of gold

On 2 July 2012, the US Government debt to the penny was quoted at $15,888,741,858,820.66. So I wrote this scraper to read the daily US government debt for every day back to 1996. Unfortunately such a large number overflows the double precision floating point notation in the database, and this same number gets expressed as […]
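The precision concern in this excerpt can be sketched in a few lines of Python. This is a minimal illustration, not the scraper's actual code: the helper names `parse_amount` and `to_cents` are hypothetical, and the 12-significant-digit formatting is only one example of how a shortened float rendering can drop the cents, not necessarily what the database displayed. A double carries roughly 15 to 17 significant decimal digits and this figure has 16, so keeping it as an exact decimal or as a whole number of cents sidesteps the question.

```python
from decimal import Decimal

# Debt figure quoted in the post, kept as text so no digits are lost on input.
DEBT_TEXT = "15,888,741,858,820.66"


def parse_amount(text: str) -> Decimal:
    """Parse a comma-grouped dollar amount into an exact Decimal (hypothetical helper)."""
    return Decimal(text.replace(",", ""))


def to_cents(amount: Decimal) -> int:
    """Convert a dollar amount to whole cents, suitable for an integer column (hypothetical helper)."""
    return int(amount * 100)


debt = parse_amount(DEBT_TEXT)
cents = to_cents(debt)

# Illustration only: formatting the float with 12 significant digits loses the cents.
short_form = "%.12g" % float(debt)

print(debt)        # exact decimal value
print(cents)       # exact integer number of cents
print(short_form)  # shortened scientific notation
```

Storing the integer number of cents, or the decimal text itself, keeps every digit of the quoted figure regardless of how the storage layer formats floating-point values.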

Software Archaeology and the ScraperWiki Data Challenge at #europython

There’s a term in technical circles called “software archaeology” – it’s when you spend time studying and reverse-engineering badly documented code, to make it work, or make it better. Scraper writing involves a lot of this stuff. ScraperWiki’s data scientists are well accustomed to a bit of archaeology here and there. But now, we want […]

Local ScraperWiki Library

It quite annoyed me that you can only use the scraperwiki library on a ScraperWiki instance; most of it could work fine elsewhere. So I’ve pulled it out (well, for Python at least) so you can use it offline. How to use: pip install scraperwiki_local. You can then import scraperwiki in scripts run on your […]

More Python libraries!

I installed some new Python libraries and restructured the Python libraries documentation page. Some highlights: Gensim is “Topic Modelling for Humans”. Read the introduction to the documentation. I’m looking for an excuse to play with it. unidecode transliterates Unicode into ASCII. It’s helpful for things like making column names. Beautiful Soup 4 beta (It’s a […]

Introducing status.scraperwiki.com

So you can find out if parts of ScraperWiki aren’t working, we’ve added a new status page. It’s called status.scraperwiki.com. The page and the status monitoring are done by the excellent Pingdom. We’ve been using it for a while to alert us to outages, so there’s quite a bit of history […]

The Data Hob

Keeping with the baking metaphor, a hob is a projection or shelf at the back or side of a fireplace used for keeping food warm. The central part of a wheel into which the spokes are inserted looks kind of like a hob, and is called the hub (etymology). Lately there has been a move […]

Big fat aspx pages for thin data

My work is more with the practice of webscraping, and less in the highfalutin business plans and product-market-fit leaning agility. At the end of the day, someone must have done some actual webscraping, and the harder it is, the better. During the final hours of the Columbia University hack day, I got to work […]
