PDFMiner is a PDF parsing library written in Python by Yusuke Shinyama. In addition to the pdf2txt.py and dumppdf.py command-line tools, it provides a way of analyzing the content tree of each page. Since that’s exactly the kind of programmatic parsing I wanted to use PDFMiner for, this is a more complete example, which continues where the default documentation stops.
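To give a rough idea of what analyzing the content tree of a page can look like, here is a minimal sketch. It assumes the module layout used by recent PDFMiner releases and the pdfminer.six fork (`pdfminer.pdfparser`, `pdfminer.pdfdocument`, `pdfminer.pdfpage`, `pdfminer.pdfinterp`, `pdfminer.converter`, `pdfminer.layout`); older releases arrange these classes differently. The helper names `layout_pages` and `walk_layout` are my own and are not part of the library.

```python
from collections.abc import Iterable


def layout_pages(path):
    """Yield one layout tree (an LTPage) per page of the PDF at `path`.

    Assumption: PDFMiner is installed with the module layout named above.
    The imports are done lazily so walk_layout() below can be used on its own.
    """
    from pdfminer.converter import PDFPageAggregator
    from pdfminer.layout import LAParams
    from pdfminer.pdfdocument import PDFDocument
    from pdfminer.pdfinterp import PDFPageInterpreter, PDFResourceManager
    from pdfminer.pdfpage import PDFPage
    from pdfminer.pdfparser import PDFParser

    with open(path, "rb") as fp:
        document = PDFDocument(PDFParser(fp))
        rsrcmgr = PDFResourceManager()
        # The aggregator collects the layout objects produced for each page.
        device = PDFPageAggregator(rsrcmgr, laparams=LAParams())
        interpreter = PDFPageInterpreter(rsrcmgr, device)
        for page in PDFPage.create_pages(document):
            interpreter.process_page(page)
            yield device.get_result()


def walk_layout(node, depth=0):
    """Yield (depth, class name, text or None) for `node` and its descendants.

    Hypothetical helper: it relies only on duck typing. Nodes that can be
    iterated are treated as containers, and nodes with a get_text() method
    contribute their text.
    """
    text = node.get_text() if hasattr(node, "get_text") else None
    yield depth, type(node).__name__, text
    if isinstance(node, Iterable):
        for child in node:
            yield from walk_layout(child, depth + 1)


# Example use (the file name is a placeholder):
#
# for page_layout in layout_pages("example.pdf"):
#     for depth, name, text in walk_layout(page_layout):
#         print("  " * depth + name, repr(text) if text else "")
```

Walking all the way down the tree is verbose for real documents, since text lines contain one object per character; in practice you would likely stop recursing at whichever level you care about.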