Make RSS with an SQL query
https://blog.scraperwiki.com/2011/09/make-rss-with-an-sql-query/
Wed, 21 Sep 2011

Lots of people have asked for it to be easier to get data out of ScraperWiki as RSS feeds.

The Julian has made it so.

The Web API now has an option to make RSS feeds as a format (i.e. instead of JSON, CSV or HTML tables).

For example, Anna made a scraper that gets alcohol licensing applications for Islington in London. She wanted an RSS feed to keep track of new applications using Google Reader.

To make it, she went to the Web API explorer page for the scraper, chose “rss2” for the format, and entered this SQL into the query box.

select licence_for as description,
       applicant as title,
       url as link,
       date_scraped as date
from swdata order by date_scraped desc limit 10

The clever part is the SQL “as” clauses. They let you choose exactly what appears in the feed’s title, description and so on. The help that appears next to the “format” drop-down when you choose “rss2” explains which fields need setting.
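
For the curious, the resulting feed is just a URL with the format and the SQL as parameters, so you can fetch it from a script as well as subscribe to it in Google Reader. The sketch below is only an illustration: the endpoint path, parameter names and scraper short name are assumptions, so use the URL that the Web API explorer actually builds for you.

# Rough sketch of fetching such a feed directly with Python.
# The endpoint path, parameter names and scraper short name are
# assumptions; the Web API explorer shows the real URL to use.
import urllib.parse
import urllib.request

query = ("select licence_for as description, applicant as title, "
         "url as link, date_scraped as date "
         "from swdata order by date_scraped desc limit 10")

params = urllib.parse.urlencode({
    "format": "rss2",                  # RSS instead of JSON, CSV or HTML
    "name": "islington_licensing",     # hypothetical scraper short name
    "query": query,
})
url = "https://api.scraperwiki.com/api/1.0/datastore/sqlite?" + params
print(urllib.request.urlopen(url).read().decode("utf-8"))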

SQL is expressive enough that you can do complicated things like concatenating strings if you need to. For most simple cases, though, it is just a remapping of fields.
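
For instance, a hypothetical variation of Anna’s query (not from her actual scraper) could build a more descriptive title by joining two columns together, using the || operator since the datastore is SQLite:

select applicant || ' - ' || licence_for as title,
       licence_for as description,
       url as link,
       date_scraped as date
from swdata order by date_scraped desc limit 10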

This is Anna’s final RSS feed of Islington alcohol licence applications. There’s a thread on the ScraperWiki Google Group with more details, including how Anna made the date_scraped column.

Meanwhile, pezholio decided to track food safety inspections in Walsall. He used the new RSS support in the SQL Web API to turn his food safety scraper into an RSS feed.

Of course, these days RSS isn’t enough, so he used the wonderful ifttt to map that RSS feed to Twitter. Now anyone can keep track of how safe restaurants in Walsall are simply by following @EatSafeWalsall.

Let us know if you ScraperWiki anything with RSS feeds!

P.S. Islington’s licensing system is run by Northgate, as are lots of others. It is likely that Anna’s scraper can easily be made to run for some other councils…

Views part 2 – Lincoln Council committees
https://blog.scraperwiki.com/2011/01/views-part-2-lincoln-council-committees/
Tue, 04 Jan 2011

(This is the second of two posts announcing ScraperWiki “views”, a new feature that Julian, Richard and Tom quietly worked on and launched a couple of months ago. Once you’ve scraped your data, how can you get it out again in just the form you want? See also: Views part 1 – Canadian weather stations.)

Lincoln Council committee updates

Sometimes you don’t want to output a visualisation, but instead some data in a specific form for use by another piece of software. You can think of this as using the ScraperWiki code editor to write the exact API you want on the server where the data is. This saves the person providing the data from having to second-guess every way someone might want to access it.

Andrew Beekan, who works at Lincoln City Council, has used this to make an RSS feed for their committee meetings. Their CMS software doesn’t have this facility built in, so he has to use a scraper to do it.

First he wrote a scraper in ScraperWiki for a “What’s new” search results page from Lincoln Council’s website. This creates a nice dataset containing the name, date and URL of each committee meeting. Next Andrew made a ScraperWiki view and wrote some Python to output exactly the XML that he wants.
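
Andrew’s actual view code isn’t shown in the post, but the idea is simple enough to sketch. The snippet below is a minimal, generic illustration, not his code: the meeting data, field names and URLs are made up, and it turns a list of scraped rows into an RSS 2.0 document, where a real ScraperWiki view would read the rows from the scraper’s datastore and serve the result with a text/xml Content-Type.

# Generic sketch of building an RSS 2.0 feed from scraped rows (not
# Andrew's actual code). The rows, field names and URLs are made up; a
# real view would read them from the scraper's datastore instead.
from xml.sax.saxutils import escape

meetings = [
    {"name": "Planning Committee",
     "url": "http://example.org/committees/planning",
     "date": "Tue, 04 Jan 2011 18:00:00 +0000"},
]

items = "".join(
    "<item><title>{t}</title><link>{l}</link><pubDate>{d}</pubDate></item>".format(
        t=escape(m["name"]), l=escape(m["url"]), d=escape(m["date"]))
    for m in meetings
)

rss = ('<?xml version="1.0" encoding="UTF-8"?>'
       '<rss version="2.0"><channel>'
       '<title>Lincoln Council committee meetings</title>'
       '<link>http://example.org/committees</link>'
       '<description>Upcoming committee meetings</description>'
       + items + '</channel></rss>')

print(rss)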

Andrew then wraps the RSS feed in FeedBurner for people who want email updates. This is all documented in the Council’s data directory. They used to use Yahoo! Pipes to do this, but Andrew is finding ScraperWiki easier to maintain, even though some knowledge of programming is required.

Since then, Andrew has gone on to make a map for the Lincoln decent homes scheme, also using ScraperWiki views – he’s written a blog post about it.
