As well as developing scrapers and a data platform, at ScraperWiki we also do data analysis. Some of this is just because we’re interested; other times it’s because clients don’t have the tools or the time to do the analysis they want themselves. Often the problem is with the size of the data. Excel is […]
Mastering space and time with jQuery deferreds
Recently Zarino and I were pairing on improvements to a new scraping tool on ScraperWiki. We were working on code that lets the person using the tool pick out parts of the scraped data in order to extract a date into a new database column. For processing the data on the server […]
Programmers past, present and future
As a UX designer and part-time anthropologist, I find working at ScraperWiki an awesome opportunity to meet the whole gamut of hackers, programmers and data geeks. Inside ScraperWiki itself, I’m surrounded by guys who started programming almost before they could walk. But right at the other end, there are sales and support staff who only […]
The state of Twitter: Mitt Romney and Indonesian Politics
It’s no secret that a lot of people use ScraperWiki to search the Twitter API or download their own timelines. Our “basic_twitter_scraper” is a great starting point for anyone interested in writing code that makes data do stuff across the web. Change a single line, and you instantly get hundreds of tweets that you can […]
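For the curious, the pattern behind a scraper like that is roughly the following sketch. It is not the basic_twitter_scraper itself; it assumes the unauthenticated Twitter search endpoint of the era (since retired) and its response fields, plus the usual scraperwiki save call. The query string is the kind of single line you would change.

```python
import json
import scraperwiki

QUERY = "scraperwiki"  # the single line you would change

# The old unauthenticated search endpoint scrapers of this era relied on;
# it has since been retired, so treat this purely as an illustration.
url = "http://search.twitter.com/search.json?q=%s&rpp=100" % QUERY
results = json.loads(scraperwiki.scrape(url))

for tweet in results.get("results", []):
    # Field names follow the old search API's response format.
    scraperwiki.sqlite.save(
        unique_keys=["id_str"],
        data={
            "id_str": tweet["id_str"],
            "from_user": tweet["from_user"],
            "created_at": tweet["created_at"],
            "text": tweet["text"],
        },
    )
```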
Software Archaeology and the ScraperWiki Data Challenge at #europython
There’s a term in technical circles called “software archaeology” – it’s when you spend time studying and reverse-engineering badly documented code to make it work, or to make it better. Scraper writing involves a lot of this stuff. ScraperWiki’s data scientists are well accustomed to a bit of archaeology here and there. But now, we want […]
Local ScraperWiki Library
It quite annoyed me that you can only use the scraperwiki library on a ScraperWiki instance; most of it could work fine elsewhere. So I’ve pulled it out (well, for Python at least) so you can use it offline. How to use: pip install scraperwiki_local. You can then import scraperwiki in scripts run on your […]
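To give a flavour of what that looks like in practice, here’s a minimal sketch, assuming the local package exposes the same scrape() and sqlite.save() calls as the hosted library; the URL is just a placeholder.

```python
import scraperwiki  # installed via: pip install scraperwiki_local

# Fetch a page, exactly as a hosted scraper would.
html = scraperwiki.scrape("http://example.com/")

# Save a row into a local SQLite database instead of the hosted datastore.
scraperwiki.sqlite.save(
    unique_keys=["url"],
    data={"url": "http://example.com/", "length": len(html)},
)
```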
How to stop missing the good weekends
Far too often I get so stuck into the work week that I forget to monitor the weather for the weekend when I should be going off to play on my dive kayaks, an activity which is somewhat weather-dependent. Luckily, help is at hand in the form of the ScraperWiki email alert system. […]
Job advert: Lead programmer
Oil wells, marathon results, planning applications… ScraperWiki is a Silicon Valley-style startup in the North West of England, in Liverpool. We’re changing the world of open data, and how data science is done together on the Internet. We’re looking for a programmer who’d like to: Revolutionise the tools for sharing data, and code that works with […]
Lots of new libraries
We’ve had lots of requests recently for new third-party libraries to be accessible from within ScraperWiki. For those of you who don’t know: yes, we take requests for installing libraries! Just send us word on the feedback form and we’ll be happy to install them. Also, let us know why you want them, as it’s […]
Scraped Data: Something to Tweet About
I’m a coding pleb, or, as I like to call myself, The Scraper’s Apprentice. But I realise that’s no excuse, as many of the ScraperWiki users I talk to have not had formal coding lessons themselves. Indeed, some of our founders aren’t formally trained (we have a doctorate in Chemistry here!). I’ve been attempting to […]