Metadata-Version: 2.1
Name: SoMeWeTa
Version: 1.7.3
Summary: A part-of-speech tagger with support for domain adaptation and external resources.
Home-page: https://github.com/tsproisl/SoMeWeTa
Author: Thomas Proisl
Author-email: thomas.proisl@fau.de
License: GNU General Public License v3 or later (GPLv3+)
Download-URL: https://github.com/tsproisl/SoMeWeTa/archive/v1.7.3.tar.gz
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
Classifier: Natural Language :: English
Classifier: Natural Language :: French
Classifier: Natural Language :: German
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Text Processing :: Linguistic
Requires-Python: >=3.4
Requires-Dist: numpy
Requires-Dist: regex (>=2019.02.18)

SoMeWeTa
========

SoMeWeTa (short for Social Media and Web Tagger) is a part-of-speech
tagger that supports domain adaptation and can incorporate external
sources of information such as Brown clusters and lexica. It is based
on the averaged structured perceptron and uses beam search and an
early update strategy. The tagger can also be trained and evaluated on
partially annotated data.
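
To illustrate the beam search component mentioned above, here is a
minimal, self-contained sketch of beam-search decoding over tag
sequences with a linear scoring function. It is not SoMeWeTa's actual
implementation: the names ``beam_search``, ``toy_score`` and
``TOY_WEIGHTS``, the features and the weights are all hypothetical and
chosen only for illustration. Perceptron training, weight averaging
and early update are not shown.

.. code-block:: python

    from typing import Callable, List, Sequence, Tuple

    # A hypothesis is (accumulated score, tags assigned so far).
    Hypothesis = Tuple[float, Tuple[str, ...]]

    # Hypothetical hand-set feature weights; a real tagger learns these.
    TOY_WEIGHTS = {
        ("word=the", "DET"): 2.0,
        ("word=dog", "NOUN"): 2.0,
        ("word=barks", "VERB"): 1.5,
        ("word=barks", "NOUN"): 1.0,
        ("prev=DET", "NOUN"): 1.0,
        ("prev=NOUN", "VERB"): 1.0,
    }


    def toy_score(words: Sequence[str], i: int,
                  prev_tags: Tuple[str, ...], tag: str) -> float:
        """Linear score of assigning `tag` to words[i] (illustrative features)."""
        prev = prev_tags[-1] if prev_tags else "<START>"
        features = ["word=" + words[i].lower(), "prev=" + prev]
        return sum(TOY_WEIGHTS.get((f, tag), 0.0) for f in features)


    def beam_search(words: Sequence[str], tagset: Sequence[str],
                    score: Callable[[Sequence[str], int, Tuple[str, ...], str], float],
                    beam_size: int = 5) -> List[str]:
        """Return the best tag sequence found while keeping `beam_size` hypotheses."""
        beam: List[Hypothesis] = [(0.0, ())]
        for i in range(len(words)):
            candidates: List[Hypothesis] = []
            # Extend every surviving hypothesis with every possible tag.
            for total, tags in beam:
                for tag in tagset:
                    candidates.append(
                        (total + score(words, i, tags, tag), tags + (tag,))
                    )
            # Keep only the highest-scoring hypotheses for the next position.
            candidates.sort(key=lambda h: h[0], reverse=True)
            beam = candidates[:beam_size]
        return list(beam[0][1])


    tags = beam_search(["The", "dog", "barks"], ["DET", "NOUN", "VERB"],
                       toy_score, beam_size=2)
    print(tags)

A small beam keeps decoding cheap while still letting an earlier
decision be revised when a later word favours a different tag history,
which greedy left-to-right tagging cannot do.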

SoMeWeTa achieves state-of-the-art results on the German web and
social media texts from the `EmpiriST 2015 shared task
<https://sites.google.com/site/empirist2015/>`_ on automatic
linguistic annotation of computer-mediated communication / social
media. This makes SoMeWeTa particularly well suited for tagging all
kinds of written German discourse, for example chats, forums, wiki
talk pages, tweets, blog comments, social networks, SMS and WhatsApp
dialogues.

In addition, we also provide models trained on German, English and
French newspaper texts, as well as models for Bhojpuri and spoken
Italian. For all languages, SoMeWeTa achieves highly competitive
results close to the current state of the art.

More detailed documentation is available in the `GitHub repository
<https://github.com/tsproisl/SoMeWeTa>`_.


