Metadata-Version: 2.0
Name: text2math
Version: 0.0.4.dev1
Summary: Simple package for generating ngrams and bag of words representation from text.
Home-page: https://github.com/steven-cutting/text2math
Author: Steven Cutting
Author-email: steven.e.cutting@linux.com
License: GNU GPL v3+
Keywords: nlp text ngram ngrams
Platform: UNKNOWN
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Text Processing :: Linguistic
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 3 - Alpha
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Requires-Dist: toolz (>=0.7.4)
Requires-Dist: chardet (>=2.3.0)
Requires-Dist: unidecode (>=0.04.19)
Requires-Dist: ftfy (>=4.0.0)
Provides-Extra: dev
Requires-Dist: cchardet (>=1.0.0); extra == 'dev'
Requires-Dist: cytoolz (>=0.7.3); extra == 'dev'
Requires-Dist: beautifulsoup4 (>=4.4.0); extra == 'dev'
Requires-Dist: lxml (>=3.4.4); extra == 'dev'
Provides-Extra: extra
Requires-Dist: beautifulsoup4 (>=4.4.0); extra == 'extra'
Requires-Dist: lxml (>=3.4.4); extra == 'extra'
Provides-Extra: faster
Requires-Dist: cchardet (>=1.0.0); extra == 'faster'
Requires-Dist: cytoolz (>=0.7.3); extra == 'faster'
Provides-Extra: test
Requires-Dist: pytest-runner (>=2.6.2); extra == 'test'
Requires-Dist: pytest (>=2.8.7); extra == 'test'

A simple package for demonstrating basic Natural Language Processing (NLP) feature engineering in Python.
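The two representations the package targets can be sketched in a few lines of plain Python. This is a conceptual illustration using only the standard library, not text2math's actual API:

```python
# Conceptual sketch of n-grams and bag-of-words; stdlib only,
# not text2math's actual interface.
from collections import Counter

def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bag_of_words(tokens):
    """Map each token to its frequency; word order is discarded."""
    return Counter(tokens)

tokens = "the cat sat on the mat".split()
print(ngrams(tokens, 2))        # bigrams, e.g. ('the', 'cat'), ('cat', 'sat'), ...
print(bag_of_words(tokens))     # 'the' appears twice, every other token once
```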

## More Info

### Practice Dataset

[**Stack Exchange Data Dump**](https://archive.org/details/stackexchange)


### Text Encoding

[**The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)**
by Joel Spolsky](http://www.joelonsoftware.com/articles/Unicode.html)

#### Packages

+ [**`chardet`**](https://pypi.python.org/pypi/chardet) - Universal encoding detector for Python 2 and 3
+ [**`cchardet`**](https://pypi.python.org/pypi/cchardet/1.0.0) - Universal encoding detector. This library is faster than chardet
+ [**`ftfy`**](http://ftfy.readthedocs.org/en/latest/#) - fixes text for you
+ [**`unidecode`**](https://pypi.python.org/pypi/Unidecode) - ASCII transliterations of Unicode text
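Together these packages form a cleanup pipeline: detect the encoding of raw bytes, repair mojibake, then transliterate to ASCII. A rough stdlib-only approximation of that pipeline (the real packages above handle far more cases) might look like:

```python
# Stdlib-only approximation of the encoding-cleanup pipeline.
# chardet guesses the encoding; unidecode transliterates to ASCII.
# Neither is used here -- these are simplified stand-ins.
import unicodedata

def guess_decode(raw, encodings=("utf-8", "latin-1")):
    """Try candidate encodings in order (crude stand-in for chardet)."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    return raw.decode("utf-8", errors="replace")

def to_ascii(text):
    """Rough stand-in for unidecode: strip diacritics, drop the rest."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")

print(guess_decode("café".encode("utf-8")))  # 'café'
print(to_ascii("café"))                      # 'cafe'
```

In practice, prefer the packages above: `chardet.detect()` returns a confidence-scored guess, and `unidecode` handles scripts that NFKD decomposition cannot.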


### Natural Language Processing

[**Care and Feeding of Topic Models: Problems, Diagnostics, and Improvements**](http://www.people.fas.harvard.edu/~airoldi/pub/books/b02.AiroldiBleiEroshevaFienberg2014HandbookMMM/Ch12_MMM2014.pdf)

### Functional Programming in Python

[**Functional programming in Python**
*Examine the functional aspects of Python: which options work well and which ones you should avoid*
By David Mertz](https://www.oreilly.com/ideas/functional-programming-in-python)

#### Packages

+ [**`toolz`**](http://toolz.readthedocs.org/en/latest/) - Toolz provides a set of utility functions for iterators, functions, and dictionaries.
+ [**`functools`**](https://docs.python.org/2/library/functools.html#module-functools) - Higher-order functions and operations on callable objects.
+ [**`itertools`**](https://docs.python.org/2/library/itertools.html#module-itertools) - Functions creating iterators for efficient looping.
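The functional style these packages enable can be sketched with `functools` and `itertools` alone. The helpers below mirror two utilities that `toolz` provides ready-made (`toolz.sliding_window` and `toolz.pipe`, with faster `cytoolz` equivalents); this is an illustrative stdlib sketch, not the package's internals:

```python
# Functional-style n-gram pipeline using only functools/itertools.
from functools import reduce
from itertools import islice, tee

def sliding_window(n, seq):
    """Yield overlapping n-tuples, like toolz.sliding_window."""
    iters = tee(seq, n)
    for i, it in enumerate(iters):
        # Advance the i-th iterator by i steps (itertools "consume" recipe).
        next(islice(it, i, i), None)
    return zip(*iters)

def pipe(data, *funcs):
    """Thread data through a sequence of functions, like toolz.pipe."""
    return reduce(lambda acc, f: f(acc), funcs, data)

bigrams = pipe("the cat sat on the mat",
               str.split,
               lambda toks: list(sliding_window(2, toks)))
print(bigrams)  # [('the', 'cat'), ('cat', 'sat'), ...]
```

Because each step is a plain function, the same pipeline composes cleanly with tokenization, normalization, or counting stages.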


