loongson/pypi/: pystemmer-2.2.0.3 metadata and description

Snowball stemming algorithms, for information retrieval

author Richard Boulton
author_email richard@tartarus.org
classifiers
  • Development Status :: 5 - Production/Stable
  • Intended Audience :: Developers
  • License :: OSI Approved :: MIT License
  • License :: OSI Approved :: BSD License
  • Natural Language :: Danish
  • Natural Language :: Dutch
  • Natural Language :: English
  • Natural Language :: Finnish
  • Natural Language :: French
  • Natural Language :: German
  • Natural Language :: Italian
  • Natural Language :: Norwegian
  • Natural Language :: Portuguese
  • Natural Language :: Russian
  • Natural Language :: Spanish
  • Natural Language :: Swedish
  • Operating System :: OS Independent
  • Programming Language :: C
  • Programming Language :: Other
  • Programming Language :: Python
  • Programming Language :: Python :: 2
  • Programming Language :: Python :: 2.6
  • Programming Language :: Python :: 2.7
  • Programming Language :: Python :: 3
  • Programming Language :: Python :: 3.3
  • Programming Language :: Python :: 3.4
  • Programming Language :: Python :: 3.5
  • Programming Language :: Python :: 3.6
  • Programming Language :: Python :: 3.7
  • Programming Language :: Python :: 3.8
  • Programming Language :: Python :: 3.9
  • Programming Language :: Python :: 3.10
  • Programming Language :: Python :: 3.11
  • Programming Language :: Python :: 3.12
  • Programming Language :: Python :: 3.13
  • Topic :: Database
  • Topic :: Internet :: WWW/HTTP :: Indexing/Search
  • Topic :: Text Processing :: Indexing
  • Topic :: Text Processing :: Linguistic
keywords python,information retrieval,language processing,morphological analysis,stemming algorithms,stemmers
license MIT, BSD
maintainer Richard Boulton
maintainer_email richard@tartarus.org
PyStemmer-2.2.0.3-cp310-cp310-manylinux_2_27_loongarch64.whl
Size 644 KB
Type Python Wheel
Python 3.10

Stemming algorithms

PyStemmer provides access to efficient algorithms for calculating a “stemmed” form of a word: a form with most of the common morphological endings removed, which ideally represents a common linguistic base form. This is most useful in building search engines and information retrieval software. For example, a search with stemming enabled should be able to find a document containing “cycling” given the query “cycles”.
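
To illustrate why stemming helps retrieval, here is a minimal sketch of an inverted index that stores stemmed terms, so that the query “cycles” can reach a document containing “cycling”. The names `toy_stem`, `build_index` and `search` are hypothetical, and `toy_stem` is a deliberately crude suffix stripper used only as a stand-in; it is not a Snowball algorithm. In real use a stemmer from PyStemmer would presumably be passed in as the `stem` argument instead.

```python
from collections import defaultdict


def toy_stem(word):
    """Crude suffix stripper: a stand-in only, NOT a Snowball algorithm."""
    word = word.lower()
    for suffix in ("ing", "es", "s"):
        # Keep at least three characters so very short words are left alone.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def build_index(documents, stem=toy_stem):
    """Map each stemmed term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.split():
            index[stem(token)].add(doc_id)
    return index


def search(index, query, stem=toy_stem):
    """Return the ids of documents containing every stemmed query term."""
    terms = [stem(token) for token in query.split()]
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


documents = {1: "cycling in the rain", 2: "walking home"}
index = build_index(documents)
```

Because both the indexed text and the query pass through the same `stem` function, `search(index, "cycles")` and a search for “cycling” look up the same index entry. The choice of stemmer only changes which word forms are conflated.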

PyStemmer provides algorithms for several (mainly European) languages, by wrapping the libstemmer library from the Snowball project in a Python module.

It also provides access to the classic Porter stemming algorithm for English. Although an improved algorithm has superseded it, the original may be of interest to information retrieval researchers who wish to reproduce the results of earlier experiments.
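
As a rough usage sketch, the module installed by PyStemmer is named `Stemmer`. A stemmer object is created with `Stemmer.Stemmer(name)`, where `name` is one of the algorithm names returned by `Stemmer.algorithms()`, and words are stemmed with `stemWord` or `stemWords`. The helper names `stem_all` and `compare_english_and_porter` below are hypothetical, and the import is deferred into the function only so that the sketch still loads on a machine where the wheel is not installed.

```python
def stem_all(words, algorithm="english"):
    """Stem each word with the named algorithm via PyStemmer."""
    import Stemmer  # module installed by the PyStemmer package

    stemmer = Stemmer.Stemmer(algorithm)
    return stemmer.stemWords(words)


def compare_english_and_porter(words):
    """Pair each word with its current-English and original-Porter stems."""
    current = stem_all(words, "english")
    original = stem_all(words, "porter")
    return list(zip(words, current, original))
```

Calling `compare_english_and_porter(["cycling", "cycles"])` returns one `(word, english_stem, porter_stem)` tuple per input word, which makes it easy to spot words on which the two English algorithms disagree.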