Scraping guides: Dates and times

Working with dates and times in scrapers can get really tricky. So we’ve added a brand new scraping guide to the ScraperWiki documentation page, giving you copy-and-paste code to parse dates and times, and save them in the datastore. To get to it, follow the “Dates and times guide” link on the documentation page. The […]
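As a rough illustration of the kind of snippet such a guide covers, here is a minimal Python sketch that parses a scraped date string and turns it into an ISO 8601 string ready to store. It is not the code from the guide itself: the function names `parse_date` and `to_iso`, the sample date, and the `"%d %B %Y"` format are assumptions made for this example, and it uses only the standard library rather than any ScraperWiki call.

```python
from datetime import datetime

def parse_date(text, fmt="%d %B %Y"):
    """Parse a date string in a known format (hypothetical helper).

    The default format matches strings like "21 June 2011"; month names
    are matched according to the current locale (English by default).
    """
    return datetime.strptime(text.strip(), fmt).date()

def to_iso(d):
    """Return the date as an ISO 8601 string, a safe form to store."""
    return d.isoformat()

# Sample value standing in for text pulled out of a scraped page.
scraped = " 21 June 2011 "
print(to_iso(parse_date(scraped)))
```

Storing dates as ISO strings keeps them sortable as plain text, which is convenient in a SQLite-style datastore.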

Scraping guides: Parsing HTML using CSS selectors

We’ve added a new copy-and-paste scraping guide, so you can quickly get the lines of code you need to parse an HTML file using CSS selectors. Get to it from the documentation page. The HTML parsing guide is available in Ruby, Python and PHP. Just as with all documentation, you can choose which at the top right […]

Scraping guides: Excel spreadsheets

Following on from the CSV scraping guide, we’ve now added one about scraping Excel spreadsheets. You can get to them from the documentation page. The Excel scraping guide is available in Ruby, Python and PHP. Just as with all documentation, you can choose which at the top right of the page. As with CSV files, at first […]

Scraping guides: Values, separated by commas

When we revamped our documentation a while ago, we promised guides to specific scraper libraries, such as lxml, Nokogiri and so on. We’re now starting to roll those out. The first one is simple, but a good one. Go to the documentation page and you’ll find a new section called “scraping guides”. The CSV scraping guide is available […]
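For a sense of what parsing comma-separated values looks like, here is a minimal Python sketch using the standard `csv` module. It is an illustration rather than the guide's own code: the helper name `parse_csv` and the sample data are made up for this example, and a real scraper would first download the CSV text from a URL.

```python
import csv
import io

def parse_csv(text):
    """Turn CSV text with a header row into a list of dicts (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]

# Sample data standing in for a downloaded CSV file.
sample = "name,amount\nAlice,10\nBob,20\n"
rows = parse_csv(sample)

for row in rows:
    # Every value comes back as a string; convert types before saving.
    print(row["name"], int(row["amount"]))
```

One dict per row, keyed by the header names, is a handy shape because each row can then be saved as a single record.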

‘Documentation is like sex: when it is good, it is very, very good; and when it is bad, it is better than nothing’

You may have noticed that the design of the ScraperWiki site has changed substantially. As part of that, we made a few improvements to the documentation. Lots of you told us we had to make our documentation easier to find, more reliable and complete. We’ve reorganised it all under one contents page, called Documentation throughout […]

All recipes 30 minutes to cook

The other week we quietly added two tutorials of a new kind to the site, snuck in behind a radical site redesign. They’re instructive recipes, which anyone with a modicum of programming knowledge should be able to easily follow. 1. Introductory tutorial For programmers new to ScraperWiki, to get an idea of what it […]

It’s SQL. In a URL.

Squirrelled away amongst the other changes to ScraperWiki’s site redesign, we made substantial improvements to the external API explorer. We’re going to concentrate on the SQLite function here as it is most important, but as you can see on the right there are other functions for getting out scraper metadata. Zarino and Julian have made […]
