loongson/pypi/: chardet-4.0.0 metadata and description


Universal encoding detector for Python 2 and 3

author Mark Pilgrim
author_email mark@diveintomark.org
classifiers
  • Development Status :: 5 - Production/Stable
  • Intended Audience :: Developers
  • License :: OSI Approved :: GNU Library or Lesser General Public License (LGPL)
  • Operating System :: OS Independent
  • Programming Language :: Python
  • Programming Language :: Python :: 2
  • Programming Language :: Python :: 2.7
  • Programming Language :: Python :: 3
  • Programming Language :: Python :: 3.5
  • Programming Language :: Python :: 3.6
  • Programming Language :: Python :: 3.7
  • Programming Language :: Python :: 3.8
  • Programming Language :: Python :: 3.9
  • Programming Language :: Python :: Implementation :: CPython
  • Programming Language :: Python :: Implementation :: PyPy
  • Topic :: Software Development :: Libraries :: Python Modules
  • Topic :: Text Processing :: Linguistic
keywords encoding,i18n,xml
license LGPL
maintainer Daniel Blanchard
maintainer_email dan.blanchard@gmail.com
platform
  • UNKNOWN
requires_python >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*
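
The requires_python specifier above reads as "2.7 or later, excluding the 3.0 through 3.4 series". As a rough illustration, the same rule can be hand-written with only the standard library. The helper name satisfies_requires_python is hypothetical and not part of chardet or pip. Real installers evaluate the specifier with full PEP 440 matching rather than a check like this.

```python
import sys

# Python 3 minor versions excluded by "!=3.0.*" through "!=3.4.*".
EXCLUDED_PY3_MINORS = {0, 1, 2, 3, 4}


def satisfies_requires_python(version_info):
    """Hand-written approximation of '>=2.7, !=3.0.*, ..., !=3.4.*'.

    Hypothetical helper for illustration only.
    """
    major, minor = version_info[0], version_info[1]
    if (major, minor) < (2, 7):
        return False  # fails the '>=2.7' clause
    if major == 3 and minor in EXCLUDED_PY3_MINORS:
        return False  # falls in one of the excluded 3.x series
    return True


# Check the interpreter running this script.
print(satisfies_requires_python(sys.version_info))
```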

Because this project isn't in the mirror_whitelist, no releases from root/pypi are included.

File chardet-4.0.0-py2.py3-none-any.whl
Size 175 KB
Type Python Wheel
Python 2.7
History
  • Replaced 1 time(s)
  • Uploaded to loongson/pypi by loongson 2022-08-08 02:35:47

Chardet: The Universal Character Encoding Detector

Detects
  • ASCII, UTF-8, UTF-16 (2 variants), UTF-32 (4 variants)
  • Big5, GB2312, EUC-TW, HZ-GB-2312, ISO-2022-CN (Traditional and Simplified Chinese)
  • EUC-JP, SHIFT_JIS, CP932, ISO-2022-JP (Japanese)
  • EUC-KR, ISO-2022-KR (Korean)
  • KOI8-R, MacCyrillic, IBM855, IBM866, ISO-8859-5, windows-1251 (Cyrillic)
  • ISO-8859-5, windows-1251 (Bulgarian)
  • ISO-8859-1, windows-1252 (Western European languages)
  • ISO-8859-7, windows-1253 (Greek)
  • ISO-8859-8, windows-1255 (Visual and Logical Hebrew)
  • TIS-620 (Thai)

Note

Our ISO-8859-2 and windows-1250 (Hungarian) probers have been temporarily disabled until we can retrain the models.

Requires Python 2.7 or 3.5+.

Installation

Install from PyPI:

pip install chardet

Documentation

For users, docs are now available at https://chardet.readthedocs.io/.
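
As a quick orientation before reading the full docs, here is a minimal sketch of calling the library from Python. It assumes chardet is installed and uses chardet.detect(), which takes bytes and returns a dictionary that includes "encoding" and "confidence" keys. The wrapper guess_encoding and the sample text are made up for this illustration.

```python
import chardet


def guess_encoding(data):
    """Return (encoding, confidence) as reported by chardet.detect().

    Hypothetical convenience wrapper. `data` must be bytes, not str.
    """
    result = chardet.detect(data)
    return result["encoding"], result["confidence"]


# Illustrative input: Russian text encoded as windows-1251.
sample = "Привет, мир! Это небольшой пример текста.".encode("windows-1251")
encoding, confidence = guess_encoding(sample)
print(encoding, confidence)
```

The result is a statistical guess, so short or ambiguous inputs may come back with low confidence or with None as the encoding.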

Command-line Tool

chardet comes with a command-line script which reports on the encodings of one or more files:

% chardetect somefile someotherfile
somefile: windows-1252 with confidence 0.5
someotherfile: ascii with confidence 1.0
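
A similar per-file report can be produced from Python. The sketch below is not the chardetect source. It assumes chardet is installed and uses the UniversalDetector class from chardet.universaldetector to feed a file in chunks. The function name describe_file and the chunk size are choices made for this example.

```python
from chardet.universaldetector import UniversalDetector


def describe_file(path, chunk_size=4096):
    """Return a 'name: encoding with confidence X' line for one file.

    Hypothetical helper that mirrors the report format shown above.
    """
    detector = UniversalDetector()
    with open(path, "rb") as handle:
        # Feed the file piece by piece and stop early once the
        # detector reports that it has reached a conclusion.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            detector.feed(chunk)
            if detector.done:
                break
    detector.close()  # finalizes detector.result
    result = detector.result
    return "{}: {} with confidence {}".format(
        path, result["encoding"], result["confidence"]
    )
```

Feeding in chunks avoids loading a large file into memory and lets detection stop early when the first chunk is conclusive, for example when a byte order mark is present.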

About

This is a continuation of Mark Pilgrim’s excellent chardet. Previously, two versions needed to be maintained: one that supported Python 2.x and one that supported Python 3.x. We’ve recently merged with Ian Cordasco’s charade fork, so now we have one coherent version that works for Python 2.7 and 3.5+.

maintainer: Dan Blanchard