Code in a Browser – ScraperWiki

Yahoo!Finance to Tableau via ScraperWiki
https://blog.scraperwiki.com/2014/04/yahoofinance-to-tableau-via-scraperwiki/
Thu, 17 Apr 2014 10:24:51 +0000

Our recently announced OData connector gives Tableau users access to a world of unstructured and semi-structured data.

In this post I’d like to demonstrate the power of a Python library, Pandas, and the Code in a Browser tool to get “live” stock market data from Yahoo!Finance into Tableau. Python is a well-established programming language with a rich ecosystem of software libraries which can provide access to a wide range of data.

This isn’t a route to high-frequency trading, but it does demonstrate the principle of using ScraperWiki as an adaptor to data on the web. Although Tableau supports a wide range of data connections, it can’t handle everything. As well as ready-made tools to collect data and serve it up in different formats, ScraperWiki allows users to write their own tools. The simplest method is to use the “Code in a browser” tool.

I wrote about the Pandas library a few weeks ago. It’s designed to provide some of the statistical and data processing functionality of R to users of Python. It grew out of the work of a financial analyst, Wes McKinney, so naturally he added a little piece of functionality to pull in stock market data from Yahoo!Finance. The code required to do this is literally a single line.

To make the data we collect using the pandas library available to all of ScraperWiki’s tools, like the OData connector or the View in a Table tool, we need to write the data into a local database.
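As a rough illustration of that storage step, here is a minimal sketch using only Python’s standard sqlite3 module. It is not the code from the gist: the helper name save_rows is hypothetical, and the default file name scraperwiki.sqlite and table name swdata are assumptions about what the platform’s tools read. The rows are plain dictionaries, which is the shape you would get from a pandas DataFrame via to_dict("records").

```python
import sqlite3


def save_rows(rows, db_path="scraperwiki.sqlite", table="swdata"):
    """Write a list of dicts to a SQLite table, replacing any old contents.

    Hypothetical helper: the default file and table names are assumptions
    about what the platform's tools expect, not taken from the post.
    """
    rows = list(rows)
    if not rows:
        return 0
    # Use the keys of the first row as the column names.
    columns = list(rows[0].keys())
    col_defs = ", ".join('"{}"'.format(c) for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute('DROP TABLE IF EXISTS "{}"'.format(table))
            conn.execute('CREATE TABLE "{}" ({})'.format(table, col_defs))
            conn.executemany(
                'INSERT INTO "{}" VALUES ({})'.format(table, placeholders),
                [tuple(row[c] for c in columns) for row in rows],
            )
    finally:
        conn.close()
    return len(rows)


if __name__ == "__main__":
    # Made-up placeholder values for illustration, not real quotes.
    example = [
        {"Date": "2014-04-16", "Close": 1.5},
        {"Date": "2014-04-17", "Close": 2.5},
    ]
    print(save_rows(example))
```

Dropping and recreating the table on every run keeps the sketch short; a scheduled job that should accumulate history would append rows instead.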

You can see the code to get Yahoo!Finance data and make it available in the screenshot below, and you can get a copy directly from this GitHub gist.

[Screenshot: the code in the Code in a Browser tool]

Once you’ve entered the code, you can run it immediately or schedule it to run regularly.

In fewer than 10 lines of code we’ve added a new data source to Tableau!

The most complicated part of the process is getting the pandas library to recognise the dates properly. This is by no means a polished tool but it is fully functioning and can easily be modified to collect different stock data. Obvious extensions would be to collect a list of stocks, and to provide a user interface.
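The post does not show how the dates were handled or how a list of stocks would be collected, so the following is only one possible approach. It normalises dates to ISO 8601 strings, which sort correctly as text, and loops over several tickers. The names to_iso_date and collect_stocks are hypothetical, and fetch stands in for whatever call actually retrieves the data. A pandas Timestamp is a subclass of datetime, so it should take the first branch.

```python
from datetime import date, datetime


def to_iso_date(value):
    """Return a YYYY-MM-DD string for a datetime, a date, or an ISO-like string."""
    if isinstance(value, datetime):
        return value.date().isoformat()
    if isinstance(value, date):
        return value.isoformat()
    # Strings: keep only the date part of e.g. "2014-04-17 00:00:00".
    return datetime.strptime(str(value)[:10], "%Y-%m-%d").date().isoformat()


def collect_stocks(tickers, fetch):
    """Call fetch(ticker) for each ticker and return one flat list of rows.

    `fetch` is a hypothetical callable returning a list of dicts, each with
    a "Date" key; it is injected so the loop can be tried without a network.
    """
    rows = []
    for ticker in tickers:
        for row in fetch(ticker):
            tagged = dict(row)  # copy so the caller's data is untouched
            tagged["Ticker"] = ticker
            tagged["Date"] = to_iso_date(tagged["Date"])
            rows.append(tagged)
    return rows
```

Passing the fetching function in as an argument means the same loop works whether the rows come from pandas, a CSV file, or a stub used for testing.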

Once we have the data, we can access it over OData. I followed Andrew Watson’s instructions for making a “candlestick” plot. The resulting plot is shown below and can be found on Tableau Public.

[Image: candlestick plot in Tableau]

On a desktop installation of Tableau you can refresh the data at the click of a button.

What data can you get in fewer than 10 lines of code?
