Meet the User – Brewing up a data storm
ScraperWiki blog, Thu, 21 Apr 2011 13:15:12 +0000
https://blog.scraperwiki.com/2011/04/meet-the-user-brewing-up-a-data-storm/

By taking part in BigClean we got some very interesting users sharing space with you here on ScraperWiki. With the event hosted in Prague, we got to show off our Unicode support! So meet (takže sa môžete zoznámiť) Stefan Urbanek.

His project, Data Brewery, is a Python framework for data mining. It’s like a coder’s version of Yahoo Pipes, where you link up nodes that stream in, process and output data. Something like this could be used for general ETL (extract, transform, load) business applications, but Data Brewery specialises in discovering what is in the data, and measuring its quality.
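
To give a feel for the "link up nodes" idea, here is a minimal sketch in plain Python. It is not Brewery's actual API: the function names (`source`, `strip_whitespace`, `audit`) and the sample rows are made up for illustration, and the nodes are just generators chained together. The last node does a simple quality check by counting empty values per field.

```python
from collections import Counter


def source(rows):
    """Source node (hypothetical): stream records in one at a time."""
    for row in rows:
        yield row


def strip_whitespace(records):
    """Processing node (hypothetical): trim whitespace from string values."""
    for record in records:
        yield {
            key: value.strip() if isinstance(value, str) else value
            for key, value in record.items()
        }


def audit(records):
    """Output node (hypothetical): count records and empty values per field."""
    total = 0
    missing = Counter()
    for record in records:
        total += 1
        for field, value in record.items():
            if value in ("", None):
                missing[field] += 1
    return {"records": total, "missing": dict(missing)}


# Made-up sample data, only here to exercise the pipeline.
rows = [
    {"name": " Prague ", "population": "1300000"},
    {"name": "Brno", "population": ""},
    {"name": "", "population": None},
]

# Link the nodes: source -> processing -> quality audit.
report = audit(strip_whitespace(source(rows)))
print(report)
```

Because each node only pulls records from the one before it, nothing is loaded in bulk, and swapping in a different source (a scraper's output, say) should only mean replacing the first function.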

He’s blogged about creating a ScraperWiki backend for his data analysis framework. He even drew a pretty picture (much better than I could, so here’s his copy):

He says the whole idea of ScraperWiki has made his life easier (glad to know our hard work is not for nothing!). And he plans to create more ScraperWiki plugins for analytical processing in Brewery.
