This is a guest post by Makoto Inoue, one of the organisers of this weekend’s Londinium MMXII hackathon. The Olympics! Only a few days to go until seemingly every news camera on the planet is pointed at the East End of London, for a month of sporting coverage. But for data diggers everywhere, this is […]
The state of Twitter: Mitt Romney and Indonesian Politics
It’s no secret that a lot of people use ScraperWiki to search the Twitter API or download their own timelines. Our “basic_twitter_scraper” is a great starting point for anyone interested in writing code that makes data do stuff across the web. Change a single line, and you instantly get hundreds of tweets that you can […]
Scraping the protests with Goldsmiths
Zarino here, writing from carriage A of the 10:07 London-to-Liverpool (the wonders of the Internet!). While our new First Engineer, drj, has been getting to grips with lots of the under-the-hood changes which’ll make ScraperWiki a lot faster and more stable in the very near future, I’ve been deploying ScraperWiki out on the frontline, with […]
How to scrape and parse Wikipedia
Today’s exercise is to create a list of the longest and deepest caves in the UK from Wikipedia. Wikipedia pages for geographical structures often contain Infoboxes (that panel on the right-hand side of the page). The first job was for me to design a Template:Infobox_ukcave which was fit for purpose. Why ukcave? Well, if […]
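The infobox is just a `{{...}}` template of `| key = value` lines in the page’s wikitext, so a scraper can pull the fields out with a simple pattern. A minimal sketch, using a made-up cave entry (the field values here are illustrative, not taken from Wikipedia):

```python
import re

# A hypothetical snippet of wikitext containing an Infobox ukcave
wikitext = """{{Infobox ukcave
| name   = Example Pot
| depth  = 274 m
| length = 50 km
}}"""

def parse_infobox(text):
    """Extract key/value pairs from a simple {{Infobox ...}} template."""
    fields = {}
    for key, value in re.findall(r"\|\s*(\w+)\s*=\s*(.+)", text):
        fields[key] = value.strip()
    return fields

cave = parse_infobox(wikitext)
print(cave["depth"])  # → 274 m
```

Real infoboxes nest templates and links inside values, so a proper scraper would need more careful parsing, but this shows the basic shape.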
Job advert: Lead programmer
Oil wells, marathon results, planning applications… ScraperWiki is a Silicon Valley style startup, in the North West of England, in Liverpool. We’re changing the world of open data, and how data science is done together on the Internet. We’re looking for a programmer who’d like to: Revolutionise the tools for sharing data, and code that works with […]
Make RSS with an SQL query
Lots of people have asked for it to be easier to get data out of ScraperWiki as RSS feeds. The Julian has made it so. The Web API now has an option to make RSS feeds as a format (i.e. instead of JSON, CSV or HTML tables). For example, Anna made a scraper that gets alcohol […]
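In practice the feed is just the usual datastore API call with the format switched to RSS. A sketch of building such a URL; the endpoint path, parameter names and scraper name here are assumptions based on the external API of the time, so check the API explorer for the current form:

```python
from urllib.parse import urlencode

def rss_feed_url(scraper, query):
    """Build a datastore API URL that returns the query results as RSS."""
    # Endpoint and parameter names are assumptions, not guaranteed current.
    base = "https://api.scraperwiki.com/api/1.0/datastore/sqlite"
    return base + "?" + urlencode({
        "format": "rss2",   # instead of json, csv or an HTML table
        "name": scraper,
        "query": query,
    })

url = rss_feed_url("alcohol_licences", "select * from swdata limit 10")
print(url)
```

Point a feed reader at a URL like this and each row of the query result becomes a feed item.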
It’s SQL. In a URL.
Squirrelled away amongst the other changes to ScraperWiki’s site redesign, we made substantial improvements to the external API explorer. We’re going to concentrate on the SQLite function here as it is the most important, but as you can see on the right there are other functions for getting out scraper metadata. Zarino and Julian have made […]
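Because each scraper’s datastore is SQLite, any query you could run locally can go straight into the URL. A minimal sketch of the kind of SQL involved, with a throwaway in-memory table standing in for a scraper’s `swdata` table (the table name and columns are illustrative):

```python
import sqlite3

# An in-memory table standing in for a scraper's datastore
con = sqlite3.connect(":memory:")
con.execute("create table swdata (region text, licences int)")
con.executemany("insert into swdata values (?, ?)",
                [("North West", 12), ("London", 30), ("North West", 5)])

# The same SQL you would paste into the API explorer's query box
rows = con.execute(
    "select region, sum(licences) from swdata group by region order by region"
).fetchall()
print(rows)  # → [('London', 30), ('North West', 17)]
```

The API explorer simply URL-encodes a query like this, runs it against the scraper’s SQLite file, and returns the rows in whichever format you asked for.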
Scrape it – Save it – Get it
I imagine I’m talking to a load of developers. Which is odd, seeing as I’m not a developer. In fact, I decided to lose my coding virginity by riding the ScraperWiki digger! I’m a journalist interested in data as a beat, so all I need to do is scrape. All my programming will be done […]
ScraperWiki Datastore – The SQL.
Recently at ScraperWiki we replaced the old datastore, which was creaking under the load, with a new, lighter and faster solution – all your data is now stored in SQLite tables as part of the move towards pluggable datastores. In addition to the increase in performance, using SQLite also provides some other benefits such […]
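One benefit of a SQLite-backed datastore is that a scraper can save records keyed on a unique column, so re-running it updates rows instead of duplicating them. A minimal sketch of that idea using Python’s stdlib `sqlite3` – the `save` helper and table layout here are ours for illustration, not the ScraperWiki library’s actual API:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table swdata (id integer primary key, name text)")

def save(record):
    """Insert or update a row keyed on `id`, scraper-library style."""
    con.execute(
        "insert or replace into swdata (id, name) values (:id, :name)",
        record,
    )

save({"id": 1, "name": "first"})
save({"id": 1, "name": "updated"})   # same key: replaces, not duplicates
row = con.execute("select name from swdata where id = 1").fetchone()
print(row[0])  # → updated
```

Re-running a scraper built this way leaves one clean row per record, which is exactly what you want when scraping the same source every day.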