Back to contents
PHP
Python
Ruby
Choose a language:
Microsoft Excel spreadsheet files can contain lots of data in cells, formulae, notes and so on.
For more details, read the spreadsheet gem documentation.
Opening the file
Download the Excel file, and load it into the Spreadsheet class. This example
is taken from an FOI request
for tram accidents.
require 'spreadsheet'
require 'open-uri'
url = "http://www.whatdotheyknow.com/request/82804/response/208592/attach/2/ACCIDENTS%20TRAMS%20Laurderdale.xls"
book = nil
open url do |f|
book = Spreadsheet.open f
end
Directly select the worksheet you want by number or name.
sheet = book.worksheet 0
sheet = book.worksheet 'LAUDERDALE AVE'
If you're not sure what sheets there are, this prints them out.
book.worksheets.each do |sheet|
puts "Sheet called #{sheet.name} has #{sheet.row_count} rows and #{sheet.column_count} columns"
end
Reading values
Read the values of a row in one go.
row = sheet.row(4)
p row
Read individual cells like this.
p row[0]
Saving to the datastore
Work out sensible heading names. In this case we remove the "." from "FLEET NO.".
keys = sheet.row(2)
keys[1] = keys[1].gsub('.', '')
p keys
Make a hash from the rows and save them.
sheet.each_with_index do |row, rownumber|
# create dictionary of the row values
data = {}
row.each_index do |i|
data[keys[i]] = row[i]
end
data['rownumber'] = rownumber
# remove the empty column (which has a blank heading)
data.delete(nil)
# only save if it is a full row (rather than a blank line or a note)
if data['DATE'] != 'DATE' and data['DATE'] != nil and data['FLEET NO'] != nil
ScraperWiki::save_sqlite(unique_keys=['rownumber'], data)
end
end