Comments on: Lots of new libraries
https://blog.scraperwiki.com/2011/10/lots-of-new-libraries/
Extract tables from PDFs and scrape the web
Thu, 14 Jul 2016 16:12:42 +0000

By: Francis Irving, Wed, 26 Oct 2011 16:04:19 +0000
https://blog.scraperwiki.com/2011/10/lots-of-new-libraries/#comment-715

There’s a small mention in this FAQ: https://scraperwiki.com/docs/python/faq/#files

But no, it isn’t very clear! I’ll update the FAQ to add a separate question saying you can spawn external commands, with a link to some examples.

Just to get a handle on your use case, are you calling PDF Miner from Ruby or PHP? I’m assuming Python people would call the API directly?

By: mazadillon, Wed, 26 Oct 2011 13:41:35 +0000
https://blog.scraperwiki.com/2011/10/lots-of-new-libraries/#comment-714

I don’t think it’s documented anywhere, but I found that I could use pdf2txt.py, which is part of PDF Miner, via the ‘bash trick’ someone discussed on the mailing list: basically, running a local command in the sandbox.

Perhaps worth mentioning somewhere?
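
For anyone wondering what running a local command looks like in practice, here is a minimal Python sketch of the general idea. It is not the exact ‘bash trick’ from the mailing list, which isn’t reproduced in this thread. The helper names `run_command` and `pdf_to_text` are made up for illustration, and the sketch assumes `pdf2txt.py` is on the PATH and executable in the sandbox, which may not hold everywhere.

```python
import subprocess
import sys


def run_command(argv):
    """Run an external command and return its standard output as text.

    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


def pdf_to_text(pdf_path, tool="pdf2txt.py"):
    """Extract text from a PDF by shelling out to PDF Miner's pdf2txt.py.

    Assumption: `tool` is on the PATH; pdf2txt.py writes the extracted
    text to standard output when no output file is given.
    """
    return run_command([tool, pdf_path])


if __name__ == "__main__":
    # Stand-in command so the sketch runs even without PDF Miner installed.
    print(run_command([sys.executable, "-c", "print('hello from a subprocess')"]))
    # With PDF Miner available, something like this (hypothetical file name):
    # text = pdf_to_text("report.pdf")
```

The argument list is passed straight to the program rather than through a shell, so a file name containing spaces or quotes needs no escaping. The same approach should carry over to Ruby or PHP using their own process-spawning functions.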
