Our recently announced OData connector gives Tableau users access to a world of unstructured and semi-structured data. In this post I’d like to demonstrate the power of a Python library, Pandas, and the Code in a Browser tool to get “live” stock market data from Yahoo! Finance into Tableau. Python is a well-established programming language with […]
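As a taste of the kind of wrangling the post describes, here is a minimal pandas sketch. The quote data is hard-coded and invented for illustration (the post itself pulls live data from Yahoo! Finance); the rolling mean is just one example of a transformation you might apply before publishing to Tableau.

```python
import pandas as pd

# Hypothetical daily closing prices -- invented values standing in
# for the live Yahoo! Finance data used in the post.
quotes = pd.DataFrame({
    "date": pd.to_datetime(["2013-11-01", "2013-11-04", "2013-11-05"]),
    "close": [515.0, 520.5, 518.2],
})

# A two-day rolling mean, a typical smoothing step before visualisation.
quotes["ma2"] = quotes["close"].rolling(2).mean()
print(quotes)
```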
Getting Twitter connections
Introducing the Get Twitter Friends tool. Our Twitter followers tool is one of our most popular: enter a Twitter username and it scrapes the followers of that account. We were often asked if it’s possible not only to get the users that follow a particular account, but also the users that are followed by that account […]
Publish your data to Tableau with OData
We know that lots of you use data from our astonishingly simple Twitter tools in visualisation tools like Tableau. While you can download your data as a spreadsheet, getting it into Tableau is a fiddly business (especially where date formatting is concerned). And when the data updates, you’d have to do the whole thing over […]
New ScraperWiki tool lets you extract data from reports with complete accuracy
It’s not always possible to automate data gathering, even with scrapers. Often we find customers want to regularly update data in ScraperWiki via spreadsheets. Either they’ve made the spreadsheets via a report from another system (typically one that isn’t on the web), or they gather the data by hand (for example, by phoning someone up […]
The Tyranny of the PDF
Got a PDF you want to get data from? Try our easy web interface over at PDFTables.com! Why is ScraperWiki so interested in PDF files? Because the world is full of PDF files. The treemap above shows the scale of their dominance. In the treemap the area a segment covers is proportional to the number […]
Time to try Table Xtract
Getting data out of websites and PDFs has been a problem for years, with the default solution being copious copy and paste. ScraperWiki has been working on a way to accurately extract tabular data from these sources and make it easy to export to Excel or CSV format. We have been internally testing and […]
Finding contact details from websites
Since August, I’ve been an intern at ScraperWiki. Unfortunately, that time’s shortly coming to an end. Over the past few months, I’ve learnt a huge amount. I’ve been surprised at just how fast-moving things are in a startup and I’ve been involved with several exciting projects. Before the internship ends, I thought it would be a […]
Table Scraping Is Hard
The Problem. NHS trusts have been required to publish data on their expenditure over £25,000 in a bid for greater transparency. A well-known B2B publisher came to us to aggregate that data and provide them with information spanning the hundreds of different trusts, such as: who are the biggest contractors across the NHS? […]
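Once the per-trust spreadsheets are scraped into one table, answering a question like “who are the biggest contractors across the NHS?” is a straightforward group-and-sum. This is only a sketch with invented trust and supplier names, not the publisher’s actual data or pipeline.

```python
import pandas as pd

# Hypothetical rows of the kind aggregated across trusts
# (trusts, suppliers, and amounts are invented for illustration).
spend = pd.DataFrame({
    "trust": ["Trust A", "Trust B", "Trust A"],
    "supplier": ["Acme Ltd", "Acme Ltd", "BuildCo"],
    "amount": [60000, 45000, 30000],
})

# Total spend per supplier across all trusts, biggest first.
totals = (spend.groupby("supplier")["amount"]
               .sum()
               .sort_values(ascending=False))
print(totals)
```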
Scrape anyone’s Twitter followers
Following our popular tool which makes it easy to scrape and download tweets, we’re pleased to announce a new one to get any Twitter account’s followers. To use it, log into ScraperWiki, choose “Create a new dataset”, then pick the tool. Then enter the name of the user you want (with or without the @). If they […]
Mastering space and time with jQuery deferreds
Recently Zarino and I were pairing on making improvements to a new scraping tool on ScraperWiki. We were working on some code that allows the person using the tool to pick out parts of some scraped data in order to extract a date into a new database column. For processing the data on the server […]