Metadata-Version: 2.1
Name: stedsans
Version: 0.0.7
Summary: stedsans is a package capable of doing geospatial analyses from text.
Home-page: https://github.com/maltehb/stedsans
Author: Malte Højmark-Bertelsen, Jakob Grøhn Damgaard
Author-email: hjb@kmd.dk, bokajgd@gmail.com
License: UNKNOWN
Keywords: Geospatial Analysis NLP Danish English
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy (==1.20.1)
Requires-Dist: pandas (==1.2.2)
Requires-Dist: transformers (==4.4.2)
Requires-Dist: nltk (==3.5)
Requires-Dist: geopy (==2.1.0)
Requires-Dist: folium (==0.12.1)
Requires-Dist: torch (==1.7.1)
Requires-Dist: flake8 (==3.8.4)
Requires-Dist: parameterized (==0.8.1)
Requires-Dist: matplotlib (==3.4.0)
Requires-Dist: pointpats (==2.2.0)
Requires-Dist: descartes (==1.1.0)
Requires-Dist: importlib-resources (==5.1.4)

# `stedsans`
This repository contains an exam project for the Spatial Analytics course at Aarhus University, spring 2021.

It is made by Jakob Grøhn Damgaard and Malte Højmark-Bertelsen.

Its purpose is to build a PyPI package that can plot a map of any location mentioned in a Danish sentence. To do so, we employ the Natural Language Processing (NLP) technique Named Entity Recognition (NER).

**NER** is the task of finding words in a text that constitute specific entities and tagging them with labels. The most common entity types are person names (PER), locations (LOC) and organizations (ORG) (Ruder, 2019).
The named entities are tagged using a scheme called BIO-tagging, in which each word is marked as the beginning of an entity (B), inside an entity (I), or other (O), meaning that the word is not part of any of the defined entities.
The resulting tags are listed in *Table 1*.

__Table 1:__

| __NER-tag__ | __Meaning__ |
| --- | --- |
| B-PER | Beginning of person name |
| I-PER | Inside a person name |
| B-LOC | Beginning of location |
| I-LOC | Inside a location |
| B-ORG | Beginning of organization |
| I-ORG | Inside an organization |
| O | Other |

---
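
To make the BIO scheme concrete, the following is a minimal sketch of how tagged tokens can be grouped back into whole entities. It is not the `stedsans` implementation: the function name `extract_entities`, the example sentence, and its hand-written tags are assumptions made for illustration, and a real NER model would supply the tags.

```python
from typing import List


def extract_entities(tokens: List[str], tags: List[str], label: str = "LOC") -> List[str]:
    """Group BIO-tagged tokens into entity strings for one label (hypothetical helper)."""
    entities: List[str] = []
    current: List[str] = []  # tokens of the entity currently being built
    for token, tag in zip(tokens, tags):
        if tag == f"B-{label}":
            # A B- tag always starts a new entity, closing any open one.
            if current:
                entities.append(" ".join(current))
            current = [token]
        elif tag == f"I-{label}" and current:
            # An I- tag continues the open entity.
            current.append(token)
        else:
            # Any other tag (O, or a different label) closes the open entity.
            if current:
                entities.append(" ".join(current))
            current = []
    if current:
        entities.append(" ".join(current))
    return entities


# Hand-tagged example sentence (assumed tags, not model output).
tokens = ["Malte", "rejser", "fra", "Aarhus", "til", "New", "York", "."]
tags = ["B-PER", "O", "O", "B-LOC", "O", "B-LOC", "I-LOC", "O"]

locations = extract_entities(tokens, tags)        # ["Aarhus", "New York"]
persons = extract_entities(tokens, tags, "PER")   # ["Malte"]
```

The location strings recovered this way are the kind of input that could then be geocoded and placed on a map.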

### Instructions
To use the code locally, start by cloning the repository and installing [Anaconda](https://docs.anaconda.com/anaconda/install/) for your OS. Then create a conda environment and install the requirements.
```bash 
# From the directory of this repository
conda create -n [env_name] python=3.9  # Create conda environment
conda activate [env_name]  # Activate conda environment
pip install -r requirements.txt  # Install required packages
```

Afterwards, install `geopandas` using the pre-built binaries from Anaconda:
```bash
conda install geopandas
```
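
After installation, it can be useful to confirm that the environment picked up the pinned versions. The snippet below is a small sketch of such a check using the standard library's `importlib.metadata`. The helper name `check_pins` is hypothetical, and the three pins are copied from the package metadata on the assumption that `requirements.txt` lists the same versions.

```python
from importlib import metadata

# Subset of the pins from the package metadata; assumed to match requirements.txt.
EXPECTED = {"numpy": "1.20.1", "geopy": "2.1.0", "folium": "0.12.1"}


def check_pins(expected):
    """Return {package: (expected, installed)} for each mismatch.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    # Run inside the activated conda environment; prints nothing if all pins match.
    for name, (wanted, found) in check_pins(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```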

### Usage
For a usage example, see the Google Colab demo: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/MalteHB/stedsans/blob/main/notebooks/stedsans_demo.ipynb)


### References
Ruder, S. (2019). Neural transfer learning for natural language processing (Doctoral dissertation, NUI Galway).

---

#### Contact
For help or further information, feel free to reach out to Jakob Grøhn Damgaard at [bokajgd@gmail.com](mailto:bokajgd@gmail.com?subject=stedsans) or Malte Højmark-Bertelsen at [hjb@kmd.dk](mailto:hjb@kmd.dk?subject=stedsans).


