Comments on: The united lobbyists of pdf
https://blog.scraperwiki.com/2010/03/467770546/
Extract tables from PDFs and scrape the web
Thu, 14 Jul 2016 16:12:42 +0000

By: yorksranter
Sun, 08 Aug 2010 15:03:57 +0000
https://blog.scraperwiki.com/2010/03/467770546/#comment-477

The docs could do with attention – it’s not at all obvious what pdftoxml will spit out or on what basis (for example – is everything always text? text first? text with other stuff interspersed?)
