Metadata-Version: 2.1
Name: figur
Version: 0.0.5
Summary: Figurenerkennung for German literary texts
Home-page: https://github.com/severinsimmler/figur
Author: Severin Simmler
License: MIT
Platform: UNKNOWN
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Requires-Python: >=3.6.0
Description-Content-Type: text/markdown
Requires-Dist: flair (>=0.4.1)
Requires-Dist: spacy (>=2.0.18)
Requires-Dist: pandas (>=0.24.1)
Requires-Dist: wget (>=3.2)
Requires-Dist: requests (>=2.21.0)


# Figurenerkennung for German literary texts

[![Build Status](https://travis-ci.com/severinsimmler/figur.svg?branch=master)](https://travis-ci.com/severinsimmler/figur)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.2592472.svg)](https://doi.org/10.5281/zenodo.2592472)


An important step in the quantitative analysis of narrative texts is the automatic recognition of references to characters (German: *Figuren*), a special case of the generic NLP problem of Named Entity Recognition (NER).

NER models are usually not designed for literary texts, which results in poor recall. This easy-to-use package continues the work of [Jannidis et al.](https://opus.bibliothek.uni-wuerzburg.de/opus4-wuerzburg/frontdoor/deliver/index/docId/14333/file/Jannidis_Figurenerkennung_Roman.pdf) using deep learning techniques.


## Installation

```
$ pip install figur
```


## Example

```python
>>> import figur
>>> text = "Der Gärtner entfernte sich eilig, und Eduard folgte bald."
>>> figur.tag(text)
   SentenceId      Token      Tag
0           0        Der        _
1           0    Gärtner  AppTdfW
2           0  entfernte        _
3           0       sich     Pron
4           0     eilig,        _
5           0        und        _
6           0     Eduard     Core
7           0     folgte        _
8           0      bald.        _
```
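
The tagged output is often only a starting point, since most analyses need just the tokens that refer to a character. The following is a minimal sketch of such a post-processing step, not part of the package itself. It assumes that `figur.tag` returns a pandas `DataFrame` (as the printed output suggests), whose rows could be obtained as dictionaries via `DataFrame.to_dict("records")`, and that `_` marks tokens outside any character reference. The helper `mentions_by_tag` is a hypothetical name. To stay runnable without downloading a model, the rows are copied by hand from the example above.

```python
from collections import defaultdict

# Rows copied from the example output above. With the package installed,
# one could presumably get them via figur.tag(text).to_dict("records"),
# assuming the return value is a pandas DataFrame.
rows = [
    {"SentenceId": 0, "Token": "Der", "Tag": "_"},
    {"SentenceId": 0, "Token": "Gärtner", "Tag": "AppTdfW"},
    {"SentenceId": 0, "Token": "entfernte", "Tag": "_"},
    {"SentenceId": 0, "Token": "sich", "Tag": "Pron"},
    {"SentenceId": 0, "Token": "eilig,", "Tag": "_"},
    {"SentenceId": 0, "Token": "und", "Tag": "_"},
    {"SentenceId": 0, "Token": "Eduard", "Tag": "Core"},
    {"SentenceId": 0, "Token": "folgte", "Tag": "_"},
    {"SentenceId": 0, "Token": "bald.", "Tag": "_"},
]


def mentions_by_tag(rows, outside="_"):
    """Group tokens by their tag, skipping tokens that carry the outside tag.

    `outside` is assumed to be the tag of tokens that do not refer to a character.
    """
    grouped = defaultdict(list)
    for row in rows:
        if row["Tag"] != outside:
            grouped[row["Tag"]].append(row["Token"])
    return dict(grouped)


mentions = mentions_by_tag(rows)
print(mentions)
```

Grouping by tag keeps the different kinds of references apart, so that, for example, pronouns can be dropped or counted separately from proper names.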


## Figurenerkennung statistics

![Confusion Matrix](doc/confusion-matrix.svg)


