
ScraperWiki supports a number of third-party Ruby libraries that we recommend for screen scraping, data analysis and data visualisation.

If you would like us to add a library that isn't listed here, please get in touch.

Downloading

OpenURI
An easy-to-use wrapper for net/http, net/https and net/ftp. docs
Typhoeus
Simple HTTP interface, including parallel requests. docs
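
As a minimal sketch of downloading a page with OpenURI: the helper name `fetch_page` and the User-Agent string are made up for illustration and are not part of ScraperWiki or OpenURI. It assumes a modern Ruby, where the call is `URI.open`; on Ruby 1.9 the same thing was written as plain `open(url)`.

```ruby
require 'open-uri'

# Hypothetical helper: fetch a URL and return the response body as a string.
def fetch_page(url)
  # String keys in the options hash are sent as HTTP request headers.
  # OpenURI raises OpenURI::HTTPError for 4xx/5xx responses.
  URI.open(url, 'User-Agent' => 'example-scraper') do |response|
    response.read
  end
end
```

Calling `fetch_page('http://example.com/')` should return the page's HTML, which can then be handed to one of the parsers listed under Parsing.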

Parsing

XML (HTML, RSS, Atom...)

Nokogiri (鋸)
An HTML, XML, SAX, and Reader parser. Search by XPath or CSS3 selectors. docs
Hpricot
A fast, enjoyable HTML parser for Ruby. An older alternative to Nokogiri. docs
LibXML
Fast XML parsing library. docs
mechanize
Navigate and complete HTML forms. docs
RSS
Really Simple Syndication feed parser. docs

Spreadsheets

spreadsheet
Read and write Excel spreadsheets. docs
roo
Read and write many spreadsheet formats - Open Office, Excel (.xls and .xlsx), Google. docs
FasterCSV
FasterCSV became the standard CSV library in Ruby 1.9, so please use that instead.
Google Spreadsheet
Read and write to spreadsheets on Google Docs. docs
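
For the standard CSV library mentioned above, a short sketch of parsing may help. The sample data and the column names `item` and `count` are invented for illustration; in a real scraper the text would typically come from a downloaded file.

```ruby
require 'csv'

# Made-up sample data, for illustration only.
raw = <<~DATA
  item,count
  apples,3
  pears,12
DATA

# headers: true treats the first row as column names, so each
# row can be indexed by column name.
rows = CSV.parse(raw, headers: true).map do |row|
  { item: row['item'], count: row['count'].to_i }
end

rows.each { |r| puts "#{r[:item]}: #{r[:count]}" }
```

Converting each row to a plain hash is just one convenient option here; `CSV::Row` objects can also be used directly.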

Other

TMail
Parse emails. docs
PDF::Reader
Read PDF files. docs
Chronic
Natural language date/time parser. docs
Polylines
Easily handle Google polylines. docs

Data pipes

Google Data (GData)
Access any service using the Google Data protocol. docs
Highrise
Access sales lead data from 37signals' CRM. docs
Facebook Graph (rfgraph)
Simple wrapper to call the Facebook Graph API. docs

Visualising and analysing

AlchemyAPI
Access the AlchemyAPI cloud-based text mining platform. docs