Our original browser coding product, ScraperWiki, is being reborn. We’re pleased to announce it is now called QuickCode. We’ve found that the most popular use for QuickCode is to increase coding skills in numerate staff, while solving operational data problems. What does that mean? I’ll give two examples. Department for Communities and Local Government run […]
Case study: Enrique Cocero getting political data from PDFs
Political strategy is international now. Enrique Cocero works from Madrid for his consultancy 7-50 Electoral Math, using data to understand voters and candidates in election campaigns across the world. He’s struggled with PDFs for a long time, and recently found PDF Tables via a Google search. He says: I used to have nightmares – I’m […]
The four kinds of data PDF
At ScraperWiki, we talk to lots of customers who need to convert PDFs to Excel. Why are they doing it? The industries are diverse – banking, insurance, retail, logistics, political campaigning, energy… What separates them in data terms, though, is that each has one of four different kinds of workflow. A. Large tables These are PDFs […]
…and suddenly I could convert my bank statement from PDF to Excel…
Do you ever: Need an old bank statement only to find out that the bank has archived it, and want to charge you to get it back? Spot check to make sure there are no fraudulent transactions on your account? Like to summarise all your big ticket items for a period? Need to summarise business expenses? […]
PDFTables: All the tables in one page, CSV
Lots of you have asked for it, and we’ve finally changed the Excel download format at PDFTables.com to put all the pages of your PDF into one worksheet. This is particularly useful if you have big tables that span multiple pages. You can still have the old format: just choose “Excel (multiple sheets)” from the […]
Elasticsearch and elasticity: building a search for government documents
Based in Paris, the OECD is the Organisation for Economic Co-operation and Development. As the name suggests, the OECD’s job is to develop and promote new social and economic policies. One part of their work is researching how open countries trade. Their view is that fewer trade barriers benefit consumers, through lower prices, and companies, […]
Announcing PDFTables.com
PDFs were invented at the same time as the web. As “digital paper”, they’re trustworthy and don’t change behind your back. This has a downside – often the definitive source of published data is a PDF. It’s hard to get tens of thousands of numbers out and into a spreadsheet or database. Copying and pasting is […]
GeoJSON into ScraperWiki will go!
Surely everyone likes things on maps? Driven by this thought, we’ve produced a new tool for the ScraperWiki Platform: an importer for GeoJSON. GeoJSON is a file format for encoding geographic information. It is based on JSON, which is popular for web-based APIs because it is lightweight, flexible and easy to parse by […]
The story of getting Twitter data and its “missing middle”
We’ve tried hard, but sadly we are not able to bring back our Twitter data tools. Simply put, this is because Twitter have no route to market to sell low-volume data for spreadsheet-style individual use. It’s happened to similar services in the past, and even to blog post instructions. There’s lots of confusion in the […]
Twitter tool update
Last week, our Twitter API use was suspended. We’re talking to various people at Twitter, DataSift and Gnip to try and resolve this. Unfortunately, we still can’t tell or predict when or if we’ll be able to bring the service back. To avoid making false promises, we’ve removed the tools from our website for now.