Hi! We've renamed ScraperWiki.
The product is now QuickCode and the company is The Sensible Code Company.

Archive | Developer

JavaScript: The Good Parts

Book review: JavaScript: The Good Parts by Douglas Crockford

This week I’ve been programming in JavaScript, something of a novelty for me. Jealous of the Dear Leader’s automatic summarising tool, I wanted to make something myself; hopefully a future post will describe my timeline visualising tool. Further motivations are that web scraping requires some knowledge of JavaScript since it is a key browser technology […]

Summarise #1: Grouping automatically for you

Late at night, after a long conversation in a bar (after Social Media Cafe), Zach mentioned one feature that everyone loved about Kasabi. It had an overview page, which automatically summarised each dataset. Of course, Kasabi did it using linked data – telling you how many of your triples were geographic locations, and how many […]

From future import x.scraperwiki.com

Time flies when you’re building a platform. At the start of the year, we announced the beginnings of a new, more powerful, more flexible ScraperWiki. More powerful because it exposes industry standards like SQL, SSH, and a persistent filesystem to developers, so they can scrape and crunch and export data pretty much however they like. […]

Tools of the trade

With the experience of a whole week of ScraperWiki, I am starting to appreciate the core tools of the professional Data Scientist. In the past I’ve written scrapers in Matlab, C# and Python. However, the house language for scraping at ScraperWiki is Python. It’s a good choice: a mature but modern language with a wide […]

The next evolution of ScraperWiki

Quietly, over the last few months, we’ve been rebuilding both the backend and the frontend of ScraperWiki. The new ScraperWiki has been built from the ground up to be more powerful for data scientists, and easier to use for everyone else. At its core, it’s about empowering people to take hold of their data, […]

How to test shell scripts

Extreme hipster superheroes like me need tests for their shell. Here’s what’s available. YOLO: No automated testing. Few shell scripts have any automated testing because shell programmers live life on the edge. Inevitably, this results in tedious manual ‘testing’. Loads of projects use this approach: git flow, homeshick, ievms, rbenv, z. Here are some more. […]

DumpTruck 0.0.3

I’ve added some new features to DumpTruck. Changes: Dictionary case sensitivity: I removed the dictionaries with case-insensitive keys because that just seemed to be delaying the conversion to case sensitivity. Ordered dictionaries: DumpTruck.execute now returns a collections.OrderedDict for each row rather than a dict for each row. Also, order is respected on insert, so you […]
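To give a feel for what ordered rows mean in practice, here is a minimal sketch that uses only Python's standard sqlite3 module. It is not DumpTruck's own code or API: the `ordered_row_factory` and `insert_ordered` helpers and the `pets` table are hypothetical names made up for illustration, and the sketch simply assumes that the column order should follow the key order of the inserted row.

```python
import sqlite3
from collections import OrderedDict


def ordered_row_factory(cursor, row):
    """Return each row as an OrderedDict keyed by column name, in query order."""
    columns = [description[0] for description in cursor.description]
    return OrderedDict(zip(columns, row))


def insert_ordered(connection, table, row):
    """Hypothetical helper (not part of DumpTruck): create the table with
    columns in the row's key order if it does not exist, then insert the row."""
    columns = ", ".join('"%s"' % key for key in row)
    placeholders = ", ".join("?" for _ in row)
    connection.execute('CREATE TABLE IF NOT EXISTS "%s" (%s)' % (table, columns))
    connection.execute(
        'INSERT INTO "%s" (%s) VALUES (%s)' % (table, columns, placeholders),
        list(row.values()),
    )


# An in-memory database keeps the sketch self-contained.
connection = sqlite3.connect(":memory:")
connection.row_factory = ordered_row_factory

insert_ordered(
    connection,
    "pets",
    OrderedDict([("name", "Tom"), ("species", "cat"), ("age", 3)]),
)

# Each fetched row is an OrderedDict whose keys follow the table's column order.
rows = connection.execute('SELECT * FROM "pets"').fetchall()
```

Because the table is created with columns in the order of the row's keys, iterating over `rows[0]` yields the keys in the same order they were inserted, which is the behaviour a plain dict did not guarantee at the time.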
