ScraperWiki supports a number of third-party Ruby libraries that we recommend for screen scraping, data analysis and data visualisation.
If you would like us to add a library that isn't listed here, please get in touch.
Downloading
- OpenURI
- An easy-to-use wrapper for net/http, net/https and net/ftp.
- Typhoeus
- Simple HTTP interface, including parallel requests.
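As a minimal sketch of downloading a page with OpenURI, the helper below fetches a URL and returns the body as a string. The name `fetch_page` and the `example-scraper` User-Agent value are illustrative assumptions, and `URI.open` assumes Ruby 2.5 or later (older Rubies called `open(url)` after requiring `open-uri`).

```ruby
require 'open-uri'

# Download a page and return its body as a String.
# `fetch_page` is a hypothetical helper name, not part of OpenURI.
def fetch_page(url)
  # String keys in the options hash are sent as HTTP request headers.
  # The User-Agent value here is only an example.
  URI.open(url, 'User-Agent' => 'example-scraper') do |response|
    response.read
  end
end
```

Calling `fetch_page('http://example.com/')` would return the page's HTML as a string, ready to hand to one of the parsers in the Parsing section.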
Parsing
XML (HTML, RSS, Atom...)
- Nokogiri (鋸)
- An HTML, XML, SAX, and Reader parser. Search by XPath or CSS3 selectors.
- Hpricot
- A fast, enjoyable HTML parser for Ruby. An older alternative to Nokogiri.
- LibXML
- Fast XML parsing library.
- mechanize
- Navigate and complete HTML forms.
- RSS
- Really Simple Syndication feed parser.
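To illustrate the RSS library, here is a small sketch that parses a feed held in a string and collects each item's title and link. The feed content and the `feed_items` helper are made up for the example. Validation is switched off (the second argument to `RSS::Parser.parse`) on the assumption that scraped feeds are often not strictly valid. On Ruby 3.0 and later, `rss` ships as a bundled gem rather than as part of the standard library.

```ruby
require 'rss'

# Hypothetical sample feed; a real scraper would download this text first.
FEED_XML = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0">
    <channel>
      <title>Example feed</title>
      <link>http://example.com/</link>
      <description>A made-up feed for illustration</description>
      <item>
        <title>First post</title>
        <link>http://example.com/1</link>
      </item>
      <item>
        <title>Second post</title>
        <link>http://example.com/2</link>
      </item>
    </channel>
  </rss>
XML

# Return an array of { title:, link: } hashes, one per feed item.
# `feed_items` is a hypothetical helper name.
def feed_items(xml)
  # The second argument (false) disables strict validation.
  feed = RSS::Parser.parse(xml, false)
  feed.items.map { |item| { title: item.title, link: item.link } }
end

feed_items(FEED_XML).each do |item|
  puts "#{item[:title]} -> #{item[:link]}"
end
```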
Spreadsheets
- spreadsheet
- Read and write Excel spreadsheets.
- roo
- Read and write many spreadsheet formats: OpenOffice, Excel (.xls and .xlsx) and Google spreadsheets.
- FasterCSV
- As of Ruby 1.9, FasterCSV is the standard CSV library, so please use that instead.
- Google Spreadsheet
- Read and write spreadsheets on Google Docs.
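Since the standard CSV library is the recommended route on Ruby 1.9 and later, here is a minimal sketch of reading CSV text with a header row. The sample data and the `parse_rows` helper name are invented for illustration. On Ruby 3.4 and later, `csv` is a bundled gem rather than a default one.

```ruby
require 'csv'

# Hypothetical sample data; a real scraper would download this text first.
SAMPLE_CSV = <<~DATA
  name,count
  Alpha,10
  Beta,25
DATA

# Turn CSV text with a header row into an array of hashes keyed by column name.
# `parse_rows` is a hypothetical helper name.
def parse_rows(text)
  CSV.parse(text, headers: true).map(&:to_h)
end

parse_rows(SAMPLE_CSV).each do |row|
  # Values are returned as strings unless converters are requested.
  puts "#{row['name']}: #{row['count']}"
end
```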
Other
- TMail
- Parse emails.
- PDF::Reader
- Read PDF files.
- Chronic
- Natural language date/time parser.
- Polylines
- Easily handle Google polylines.
Data pipes
- Google Data (GData)
- Access any service using the Google Data protocol.
- Highrise
- Access sales lead data from 37signals' CRM.
- Facebook Graph (rfgraph)
- Simple wrapper to call the Facebook Graph API.
Visualising and analysing
- AlchemyAPI
- Access the AlchemyAPI cloud-based text mining platform.