Metadata-Version: 2.1
Name: hashedml
Version: 0.0.1
Summary: Hash based machine learning
Home-page: https://github.com/mtingers/hashedml
Author: Matth Ingersoll
Author-email: matth@mtingers.com
Maintainer: Matth Ingersoll
Maintainer-email: matth@mtingers.com
License: MIT
Download-URL: https://pypi.python.org/pypi/hashedml
Description: # HashedML
        A machine learning library that uses a different approach: string hashing
        (think hash tables) for classifying sequences.
        
        # Installation
        
        PyPI (not available yet):
        ```
        pip install -U hashedml
        ```
        
        setup.py:
        ```
        python setup.py build
        python setup.py install
        ```
        
        # Classification
        HashedML takes the simple `fit(X, y)` / `predict(X)` approach.
        
        Example:
        
        ```python
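        # Assumes the HashedML class is importable from the hashedml package
        from hashedml import HashedML
        
        model = HashedML()
        # Each line holds comma-separated features followed by the class label
        iris_data = open('iris.data').read().split('\n')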
        for i in iris_data:
            i = i.split(',')
            X = i[:-1]
            y = i[-1]
            model.fit(X, y)
        
        iris_test = open('iris.test').read().split('\n')
        for i in iris_test:
            i = i.split(',')
            X = i[:-1]
            y = i[-1]
            # Use test() when the true label is known, so accuracy is tracked
            prediction = model.test(X, y)
            # Or use predict() when y is unknown, which is the usual case
            prediction = model.predict(X)
        
        print('accuracy: {}%'.format(model.accuracy()*100))
        
        ```
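        
        To make the "think hash tables" idea concrete, here is a toy sketch of an
        exact-match classifier with the same `fit(X, y)` / `predict(X)` shape. It is
        not HashedML's implementation: `ToyHashClassifier`, its key format, and the
        `None` fallback for unseen input are all assumptions made for illustration.
        
        ```python
        from collections import Counter, defaultdict
        
        class ToyHashClassifier:
            """Illustrative only: exact-match lookup keyed on the stringified X."""
        
            def __init__(self):
                # key -> Counter of labels seen with that key
                self.table = defaultdict(Counter)
        
            def fit(self, X, y):
                key = '|'.join(str(v) for v in X)
                self.table[key][str(y)] += 1
        
            def predict(self, X):
                key = '|'.join(str(v) for v in X)
                if key not in self.table:
                    return None  # unseen input; a real library needs a fallback here
                # Return the label most often stored under this key
                return self.table[key].most_common(1)[0][0]
        
        toy = ToyHashClassifier()
        toy.fit(['5.1', '3.5', '1.4', '0.2'], 'Iris-setosa')
        toy.fit(['7.0', '3.2', '4.7', '1.4'], 'Iris-versicolor')
        print(toy.predict(['5.1', '3.5', '1.4', '0.2']))
        ```
        
        An exact-match table like this only answers for inputs it has already seen,
        so how a library handles partial or unseen matches is what matters most.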
        
        # Generative
        HashedML can also generate data after learning.
        
        Example:
        
        ```python
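        from collections import deque
        from textblob import TextBlob
        # Assumes the HashedML class is importable from the hashedml package
        from hashedml import HashedML
        
        model = HashedML(nback=4, stm=True)
        token_q = deque(maxlen=model.nback)
        
        # Tokenize the training text
        tokens = TextBlob(open('training.text').read()).tokens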
        
        # Learn
        for i in tokens:
            token_q.append(i)
            if len(token_q) != model.nback:
                continue
            # X is the first nback-1 tokens in the window, y is the token that follows
            tq = list(token_q)
            X = tq[:-1]
            y = tq[-1]
            model.fit(X, y)
        
        # Generate
        output = model.generate(
            ('What', 'is'),
            nwords=500,
            seperator=' '
        )
        print(output)
        ```
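        
        The learning loop above slides a fixed-size window over the tokens. The
        standalone sketch below uses only the standard library to show which
        `(X, y)` pairs such a window produces; `nback_pairs` is a hypothetical helper
        written for this illustration, not part of HashedML.
        
        ```python
        from collections import deque
        
        def nback_pairs(tokens, nback):
            """Yield (X, y): X is the previous nback-1 tokens, y is the next token."""
            window = deque(maxlen=nback)
            for token in tokens:
                window.append(token)
                if len(window) != nback:
                    continue  # wait until the window is full
                items = list(window)
                yield items[:-1], items[-1]
        
        pairs = list(nback_pairs(['What', 'is', 'a', 'hash', 'table'], nback=4))
        for X, y in pairs:
            print(X, '->', y)
        # ['What', 'is', 'a'] -> hash
        # ['is', 'a', 'hash'] -> table
        ```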
        
        # Variable X Input & Non-numerical X or Y
        The X value can be of varying length/dimensions. For example, this is valid:
        ```python
        X = (
            (1, 2, 3),
            (1, 2),
            (1, 2, 3, 4),
        )
        # y can be of different data types
        y = (
            'y1',
            2.0,
            'foostring'
        )
        ```
        
        All data is converted to strings. This is counterintuitive and differs from
        most machine learning libraries, but it makes variable-length X and
        mixed-type y data easier to handle.
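        
        As a rough illustration of what string conversion implies, the sketch below
        applies Python's built-in `str()` to the values shown earlier. Whether
        HashedML converts values exactly this way is an assumption, and `to_strings`
        is a hypothetical helper.
        
        ```python
        def to_strings(values):
            """Convert every element to str (assumed conversion, for illustration)."""
            return tuple(str(v) for v in values)
        
        rows = ((1, 2, 3), (1, 2), (1, 2, 3, 4))
        labels = ('y1', 2.0, 'foostring')
        
        string_rows = [to_strings(r) for r in rows]
        string_labels = to_strings(labels)
        print(string_rows)    # [('1', '2', '3'), ('1', '2'), ('1', '2', '3', '4')]
        print(string_labels)  # ('y1', '2.0', 'foostring')
        ```
        
        One consequence under this assumption: `2` and `2.0` become the distinct
        strings `'2'` and `'2.0'`, so numeric inputs should be formatted consistently
        between training and prediction.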
        
        # Examples
        
        ```bash
        % for i in test-data/*.test; do
            echo; echo -en "$i: "
            # Derive the training file name: foo.test -> foo.data
            data_file=$(echo "$i" | sed 's/\.test$/.data/')
            hashedml classify "$data_file" "$i"
          done
        
        test-data/abalone.test: accuracy: 100.0%
        
        test-data/allhypo.test: accuracy: 89.61%
        
        test-data/anneal.test: accuracy: 82.0%
        
        test-data/arrhythmia.test: accuracy: 100.0%
        
        test-data/breast-cancer.test: accuracy: 100.0%
        
        test-data/bupa.test: accuracy: 100.0%
        
        test-data/glass.test: accuracy: 100.0%
        
        test-data/iris.test: accuracy: 100.0%
        
        test-data/long.test: accuracy: 100.0%
        
        test-data/parkinsons_updrs.test: accuracy: 100.0%
        
        test-data/soybean-large.test: accuracy: 97.87%
        
        test-data/tic-tac-toe.test: accuracy: 100.0%
        ```
        
Platform: UNKNOWN
Classifier: Development Status :: 1 - Planning
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Description-Content-Type: text/markdown
