Metadata-Version: 2.1
Name: seqpandas
Version: 0.1.0
Summary: Read bioinformatics sequence formats into a Pandas DataFrame
Keywords: bioinformatics,genomics,pandas,vcf,fasta,sam,bed,pileup,cython
Author-Email: Troy Sincomb <troysincomb@gmail.com>
Maintainer-Email: Troy Sincomb <troysincomb@gmail.com>
License: MIT
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Cython
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Scientific/Engineering
Project-URL: Homepage, https://github.com/tmsincomb/seqpandas
Project-URL: Documentation, https://seqpandas.readthedocs.io
Project-URL: Repository, https://github.com/tmsincomb/seqpandas
Project-URL: Issues, https://github.com/tmsincomb/seqpandas/issues
Requires-Python: >=3.10
Requires-Dist: Click>=7.0
Requires-Dist: pandas>=1.3.0
Requires-Dist: biopython>=1.79
Requires-Dist: pysam>=0.21.0
Requires-Dist: numpy<2.0.0,>=1.21.0
Requires-Dist: scipy>=1.7.0
Requires-Dist: matplotlib>=3.3.0
Requires-Dist: logomaker>=0.8
Requires-Dist: patchworklib>=0.6.3
Provides-Extra: dev
Requires-Dist: pytest>=6.2.4; extra == "dev"
Requires-Dist: flake8>=3.7.8; extra == "dev"
Requires-Dist: black>=21.7b0; extra == "dev"
Requires-Dist: coverage>=4.5.4; extra == "dev"
Requires-Dist: tox>=3.14.0; extra == "dev"
Requires-Dist: Sphinx>=1.8.5; extra == "dev"
Requires-Dist: twine>=1.14.0; extra == "dev"
Requires-Dist: bump2version>=0.5.11; extra == "dev"
Requires-Dist: watchdog>=0.9.0; extra == "dev"
Requires-Dist: build>=0.10.0; extra == "dev"
Requires-Dist: pre-commit>=4.0.0; extra == "dev"
Provides-Extra: test
Requires-Dist: pytest>=6.2.4; extra == "test"
Requires-Dist: pytest-cov>=2.12.0; extra == "test"
Description-Content-Type: text/x-rst

=========
SeqPandas
=========
SeqPandas reads genomic data into a custom DataFrame class that combines Pandas and Biopython, with shortcuts that simplify preprocessing for machine learning.

* Free software: MIT license
* Documentation: https://seqpandas.readthedocs.io


Installation
------------

.. code:: bash

    pip install seqpandas


Usage
-----

.. code:: python

    import seqpandas as spd

    # Read directly from a file path
    df = spd.read_seq('file.fasta', format='fasta')
    df = spd.read_seq('file.sam', format='sam')
    df = spd.read_vcf('file.vcf', format='vcf')
    df = spd.read_bed('file.bed', format='bed')

    # Only need Biopython SeqRecords instead of a DataFrame?
    seqrecords = spd.read('file.fasta', format='fasta')

    # Already have a Biopython parser (an iterator of SeqRecords)
    from Bio import SeqIO
    seqrecords = SeqIO.parse('file.fasta', format='fasta')
    df = spd.BioDataFrame.from_seqrecords(seqrecords)

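For comparison, the sketch below builds a plain Pandas DataFrame from Biopython records using only ``Bio.SeqIO`` and ``pandas``. It is not how SeqPandas is implemented: the helper name ``seqrecords_to_frame``, the column names, and the two inline FASTA records are assumptions made up for illustration, and the result is an ordinary ``DataFrame`` without any SeqPandas shortcuts.

.. code:: python

    from io import StringIO

    import pandas as pd
    from Bio import SeqIO

    # Two made-up FASTA records, kept inline so the example needs no files.
    FASTA_TEXT = ">seq1 first example\nACGTACGT\n>seq2 second example\nGGCC\n"


    def seqrecords_to_frame(records):
        """Flatten Biopython SeqRecords into a DataFrame, one row per record.

        Hypothetical helper for illustration; the column names are assumptions.
        """
        rows = [
            {
                "id": record.id,
                "description": record.description,
                "seq": str(record.seq),
                "length": len(record.seq),
            }
            for record in records
        ]
        return pd.DataFrame(rows, columns=["id", "description", "seq", "length"])


    # SeqIO.parse accepts any text handle, so an in-memory buffer works here.
    records = SeqIO.parse(StringIO(FASTA_TEXT), "fasta")
    df = seqrecords_to_frame(records)
    print(df)

Once the records are in tabular form, ordinary Pandas operations apply, for example ``df[df["length"] > 4]`` to keep only the longer sequences.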

Tutorial
--------
For a complete walkthrough, including how to use SeqPandas in a machine learning pipeline, see the `tutorial notebook <https://github.com/tmsincomb/SeqPandas/blob/master/tutorial.ipynb>`_.


Credits
-------

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
