Metadata-Version: 2.1
Name: pdfminer.six
Version: 20200124
Summary: PDF parser and analyzer
Home-page: https://github.com/pdfminer/pdfminer.six
Author: Yusuke Shinyama + Philippe Guglielmetti
Author-email: pdfminer@goulu.net
License: MIT/X
Keywords: pdf parser,pdf converter,layout analysis,text mining
Platform: UNKNOWN
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Topic :: Text Processing
Requires-Python: >=3.4
Description-Content-Type: text/markdown
Requires-Dist: pycryptodome
Requires-Dist: sortedcontainers
Requires-Dist: chardet ; python_version > "3.0"
Provides-Extra: dev
Requires-Dist: nose ; extra == 'dev'
Requires-Dist: tox ; extra == 'dev'
Provides-Extra: docs
Requires-Dist: sphinx ; extra == 'docs'
Requires-Dist: sphinx-argparse ; extra == 'docs'

pdfminer.six
============

[![Build Status](https://travis-ci.org/pdfminer/pdfminer.six.svg?branch=master)](https://travis-ci.org/pdfminer/pdfminer.six)
[![PyPI version](https://img.shields.io/pypi/v/pdfminer.six.svg)](https://pypi.python.org/pypi/pdfminer.six/)
[![gitter](https://badges.gitter.im/pdfminer-six/Lobby.svg)](https://gitter.im/pdfminer-six/Lobby?utm_source=badge&utm_medium)

Pdfminer.six is a community-maintained fork of the original PDFMiner. It is a
tool for extracting information from PDF documents.
Unlike other PDF-related tools, it focuses entirely on getting
and analyzing text data. Pdfminer.six allows one to obtain
the exact location of text in a page, as well as
other information such as fonts or lines.
It includes a PDF converter that can transform PDF files
into other text formats (such as HTML). It has an extensible
PDF parser that can be used for purposes other than text analysis.
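
As a rough illustration of the "exact location of text" point above, the
sketch below walks the layout tree of each page and yields the bounding box of
every text box. It relies on the lower-level classes in `pdfminer.layout`,
`pdfminer.converter`, `pdfminer.pdfinterp` and `pdfminer.pdfpage`; the helper
name `iter_text_boxes` is made up for this example and is not part of the
library. The imports sit inside the function only so that the snippet can be
loaded without pdfminer.six installed.

```python
def iter_text_boxes(path):
    """Yield (page_number, bbox, text) for every text box in a PDF.

    Hypothetical helper for illustration; not part of pdfminer.six.
    bbox is (x0, y0, x1, y1) in PDF points, origin at the bottom-left.
    """
    # Imported lazily so that merely defining this helper does not
    # require pdfminer.six to be installed.
    from pdfminer.converter import PDFPageAggregator
    from pdfminer.layout import LAParams, LTTextBox
    from pdfminer.pdfinterp import PDFPageInterpreter, PDFResourceManager
    from pdfminer.pdfpage import PDFPage

    rsrcmgr = PDFResourceManager()
    # The aggregator collects the analyzed layout of each processed page.
    device = PDFPageAggregator(rsrcmgr, laparams=LAParams())
    interpreter = PDFPageInterpreter(rsrcmgr, device)

    with open(path, "rb") as fp:
        for page_number, page in enumerate(PDFPage.get_pages(fp), start=1):
            interpreter.process_page(page)
            layout = device.get_result()
            for element in layout:
                # Keep only text boxes; skip figures, lines, rectangles, etc.
                if isinstance(element, LTTextBox):
                    yield page_number, element.bbox, element.get_text()
```

Iterating over `iter_text_boxes("samples/simple1.pdf")` should then give one
tuple per text box. The default `LAParams()` is an assumption here; tuning its
margins changes how characters are grouped into lines and boxes.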

Check out the full documentation on
[Read the Docs](https://pdfminersix.readthedocs.io).


Features
--------

 * Written entirely in Python.
 * Parse, analyze, and convert PDF documents.
 * PDF-1.7 specification support (well, almost).
 * CJK languages and vertical writing scripts support.
 * Various font types (Type1, TrueType, Type3, and CID) support.
 * Support for extracting images (JPG, JBIG2 and Bitmaps).
 * Support for RC4 and AES encryption.
 * Outline (TOC) extraction.
 * Tagged contents extraction.
 * Automatic layout analysis.


How to use
----------

 * Install Python 3.4 or newer
 * Install pdfminer.six with pip:

    `pip install pdfminer.six`

 * Use the command-line interface to extract text from a PDF:

    `python pdf2txt.py samples/simple1.pdf`

 * Check out more examples and documentation on
[Read the Docs](https://pdfminersix.readthedocs.io).
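
The same extraction can also be done from Python instead of the command line.
The following is a minimal sketch, assuming the installed release provides
`pdfminer.high_level.extract_text`; the wrapper name `pdf_to_text` is
hypothetical and only serves the example. The import is kept inside the
function so the snippet loads even when pdfminer.six is not installed.

```python
def pdf_to_text(path, page_numbers=None):
    """Return the text of a PDF file as a single string.

    Hypothetical wrapper for illustration; not part of pdfminer.six.
    page_numbers is an optional collection of zero-based page indexes;
    None means all pages.
    """
    # Imported lazily so that defining this helper does not require
    # pdfminer.six to be installed.
    from pdfminer.high_level import extract_text

    return extract_text(path, page_numbers=page_numbers)
```

For instance, `print(pdf_to_text("samples/simple1.pdf"))` should print roughly
what the `pdf2txt.py` command above writes to standard output.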


Contributing
------------

Be sure to read the [contribution guidelines](https://github.com/pdfminer/pdfminer.six/blob/master/CONTRIBUTING.md). 


