Metadata-Version: 2.1
Name: spacy-fastlang
Version: 0.4.2
Summary: Language detection using FastText and Spacy
Home-page: https://github.com/thomasthiebaud/spacy-fastlang
License: MIT
Keywords: spacy,fasttext,language,detection
Author: Thomas Thiebaud
Author-email: thiebaud.tom@gmail.com
Requires-Python: >=3.6,<4.0
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Software Development
Requires-Dist: fasttext (>=0.9.2,<0.10.0)
Requires-Dist: spacy (>=2.3.0,<3.0.0)
Project-URL: Documentation, https://github.com/thomasthiebaud/spacy-fastlang
Project-URL: Repository, https://github.com/thomasthiebaud/spacy-fastlang
Description-Content-Type: text/markdown

# spacy_fastlang

## Install

Assuming you have a working python environment, you can simply install it using

```
pip install spacy_fastlang
```

## Usage

The library exports a pipeline component called `LanguageDetector` that will set two spacy extensions

- doc.\_.language = ISO code of the detected language or `xx` as a fallback
- doc.\_.language_score = confidence

```
from spacy_fastlang import LanguageDetector
nlp = spacy.load("...")
nlp.add_pipe(LanguageDetector())
doc = nlp(en_text)

doc._.language == "..."
doc._.language_score >= ...
```

## Options

[Check the tests](./tests/test_spacy_fastlang.py) to see more examples and available options

## License

Everythin is under `MIT` except the default model which is distributed under [Creative Commons Attribution-Share-Alike License 3.0](https://creativecommons.org/licenses/by-sa/3.0/) by facebook [here](https://fasttext.cc/docs/en/language-identification.html)

