Comments on: The united lobbyists of pdf
Extract tables from PDFs and scrape the web
Thu, 14 Jul 2016 16:12:42 +0000

By: yorksranter
Sun, 08 Aug 2010 15:03:57 +0000

The docs could do with attention – it’s not at all obvious what pdftoxml will spit out or on what basis (for example: is everything always text? Text first? Text with other stuff interspersed?)