1. Select the “Code in your browser” tool

After registering and logging in, click the “Create a new dataset” button on your homepage.

You’ll be shown all the tools you can use to populate your new dataset.

We’re going to use the “Code in your browser” tool. Click it.

2. Pick a language

QuickCode supports dozens of languages.

We recommend Python, because it has a clean syntax and great data science libraries.

We will use Python for this tutorial.

3. Name your dataset

We’re going to scrape the UPS corporate blog, although with small changes the same code should work for any WordPress blog.

Use the dropdown menu next to “Untitled dataset” to rename your dataset to something like “UPS blog posts”.

4. Scrape the data

Copy and paste this code into the code editor. It downloads the front page of the blog, and extracts information about each article.

#!/usr/bin/env python

import scraperwiki
import requests
import lxml.html

# Download the front page of the blog and parse it into a DOM tree
html = requests.get("http://blog.ups.com").content
dom = lxml.html.fromstring(html)

# Each article on the page sits inside an element with the class "theentry"
for entry in dom.cssselect('.theentry'):
    post = {
        'title': entry.cssselect('.entry-title')[0].text_content(),
        'author': entry.cssselect('.the-meta a')[0].text_content(),
        'url': entry.cssselect('a')[0].get('href'),
        'comments': int(entry.cssselect('.comment-number')[0].text_content())
    }
    # The parentheses make this line work in both Python 2 and Python 3
    print(post)

Press the Run button. You’ll see information about each post printed in the console window.
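The 'comments' line in the scraper passes the text of the comment counter straight to int(), which raises a ValueError if that text contains anything other than a number. The following is a minimal sketch of a more forgiving parser, assuming the counter might contain extra words, thousands separators, or no number at all. The helper name parse_comment_count is hypothetical and is not part of the scraperwiki library.

```python
import re


def parse_comment_count(text):
    """Return the first integer found in text, or 0 if there is none.

    Hypothetical helper: it assumes the counter text may look like
    "12", " 12 comments ", "1,204" or "No comments".
    """
    # Drop thousands separators so "1,204" is read as 1204
    match = re.search(r"\d+", text.replace(",", ""))
    return int(match.group()) if match else 0
```

To try it, replace the int( ... ) call in the scraper with parse_comment_count( ... ), keeping the same argument.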

5. Save to the datastore

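To save each post to the datastore, add this line to your code, just after the line that prints post. Make sure it is indented to the same level as that line, so that it runs inside the for loop. The first argument, ['url'], is the list of unique keys: running the scraper again updates the row for a URL it has already saved instead of adding a duplicate.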

    scraperwiki.sql.save(['url'], post)

You don’t have to use this special function. Any library, in any language, that creates a SQLite database file called scraperwiki.sqlite will do.
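As an illustration of that alternative, here is a minimal sketch that writes a post to scraperwiki.sqlite using only Python’s built-in sqlite3 module. It rests on some assumptions: the table name swdata (believed to be the default used by the scraperwiki library, but not confirmed by this tutorial), the column layout, and the helper name save_post are all choices made for this example.

```python
import sqlite3


def save_post(conn, post):
    """Insert one post, or overwrite the existing row with the same URL."""
    # Assumed schema: one row per post, with the URL as the unique key
    conn.execute(
        "CREATE TABLE IF NOT EXISTS swdata ("
        "url TEXT PRIMARY KEY, title TEXT, author TEXT, comments INTEGER)"
    )
    # INSERT OR REPLACE keeps a single row per URL across repeated runs
    conn.execute(
        "INSERT OR REPLACE INTO swdata (url, title, author, comments) "
        "VALUES (:url, :title, :author, :comments)",
        post,
    )
    conn.commit()


if __name__ == "__main__":
    # The platform looks for a database file with this exact name
    conn = sqlite3.connect("scraperwiki.sqlite")
    # Made-up example values; in the scraper, pass the real post dictionary
    save_post(conn, {
        "url": "http://example.com/first-post",
        "title": "First post",
        "author": "Example Author",
        "comments": 3,
    })
    conn.close()
```

Using the URL as the primary key gives roughly the same effect as passing ['url'] to scraperwiki.sql.save: a post that has already been stored is updated in place.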

6. Use your data

QuickCode is built out of lots of tools that let you do stuff with your data. The tools always appear in the grey toolbar next to your dataset’s name.

Click the orange “View in a table” icon to see your data in a flexible table view.

Or click More tools… to do other things like automatically summarising your data or publishing it to a CKAN datahub.