Metadata-Version: 2.1
Name: wespipeline
Version: 0.9.2
Summary: An implementation of a whole exome analysis pipeline using the library Luigi for workflow management.
Home-page: UNKNOWN
Author: Alejandro Rodríguez Díaz
Author-email: jancho@usal.es
License: UNKNOWN
Project-URL: Documentation, https://wespipeline.readthedocs.io/en/latest/index.html
Project-URL: Source, https://github.com/Janchorizo/wespipeline
Project-URL: Tracker, https://github.com/Janchorizo/wespipeline/issues
Description: Wespipeline
        ===========
        An implementation of a whole exome analysis pipeline using `Luigi <https://github.com/spotify/luigi/>`_ for workflow management.
        
        .. figure:: https://raw.githubusercontent.com/janchorizo/wespipeline/master/docs/steps.png
           :alt: Steps Logo
           :align: center
        
        This package implements tasks for running a partial or complete variant calling
        analysis, with the advantages of a workflow manager: dependency resolution, execution planning,
        modularity, monitoring and execution history.
        
        Documentation for the latest version is hosted on `readthedocs <https://wespipeline.readthedocs.io/en/latest/>`_.
        
        Installation
        ^^^^^^^^^^^^
        Wespipeline is available through pip, conda and manual installation. Install it from the package repositories
        with ``pip3 install wespipeline`` or ``conda install wespipeline``, or download the project and place it in a
        directory on the Python path.
        
        Note that running the analysis requires additional dependencies. They are listed below and can be
        installed through conda:
        
        * Sequence retrieval : SRA Toolkit, FastQC
        
        * Reference genome retrieval : no additional dependencies
        
        * Sequence alignment : BWA
        
        * Alignment processing : BWA, Samtools
        
        * Variant calling : FreeBayes, VarScan, GATK, DeepVariant
        
        * Variant calling evaluation : VCFtools
        
        .. code-block:: bash
        
           wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh
           bash ~/miniconda.sh -b -p $HOME/miniconda
           export PATH="$HOME/miniconda/bin:$PATH"
           source $HOME/miniconda/bin/activate && \
               conda config --add channels bioconda && \
               conda config --add channels conda-forge && \
               conda install -y samtools && \
               conda install -y bwa && \
               conda install -y picard && \
               conda install -y platypus-variant && \
               conda install -y varscan && \
               conda install -y freebayes && \
               conda install -y fastqc && \
               conda install -y sra-tools && \
               conda install -y vcftools 
        
           rm ~/miniconda.sh
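        
        Before launching a step, it can help to confirm that the external tools are reachable. The following
        is a minimal sketch using only the Python standard library; ``REQUIRED_TOOLS`` and ``missing_tools`` are
        hypothetical names that are not part of wespipeline, and the executable names are assumptions that may
        differ in your installation.
        
        .. code-block:: python
        
           import shutil
        
           # Assumed executable names for some of the tools listed above; adjust
           # them to match the binaries that your installation actually provides.
           REQUIRED_TOOLS = ["fastq-dump", "fastqc", "bwa", "samtools", "freebayes", "vcftools"]
        
        
           def missing_tools(tools, which=shutil.which):
               """Return the tools that cannot be found on the PATH."""
               return [tool for tool in tools if which(tool) is None]
        
        
           if __name__ == "__main__":
               missing = missing_tools(REQUIRED_TOOLS)
               if missing:
                   print("Missing tools: " + ", ".join(missing))
               else:
                   print("All listed tools were found.")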
        
        Getting started
        ^^^^^^^^^^^^^^^
        
        Installing or downloading the package provides one higher-level task for each of the six steps
        of the analysis. Each of them can be scheduled in the same way as any other Luigi task:
        
        .. code-block:: bash
        
        	python3 -m luigi --module wespipeline.<module> <Taskname> --<Taskname>-param value
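        
        The command shape above can also be assembled programmatically, which may be convenient when many
        parameters are involved. This is a minimal sketch, not part of wespipeline: ``luigi_command`` is a
        hypothetical helper, and it assumes that parameter names written with underscores map to hyphens on
        the command line, as Luigi does for its own parameters.
        
        .. code-block:: python
        
           import shlex
        
        
           def luigi_command(module, task, params, python="python3"):
               """Build the argument list for running a Luigi task from the shell.
        
               ``params`` maps a task name to a dict of parameter names and values,
               producing flags of the form ``--<Taskname>-<param> <value>``.
               """
               argv = [python, "-m", "luigi", "--module", module, task]
               for owner, values in params.items():
                   for name, value in values.items():
                       argv.append("--{}-{}".format(owner, name.replace("_", "-")))
                       argv.append(str(value))
               return argv
        
        
           if __name__ == "__main__":
               argv = luigi_command(
                   "wespipeline.fastq",
                   "FastqRetrieval",
                   {"FastqRetrieval": {"paired_end": "true", "accession_number": "SRR9209557"}},
               )
               # Print a shell-safe version of the command instead of running it.
               print(" ".join(shlex.quote(arg) for arg in argv))
        
        The resulting list can be passed to ``subprocess.run`` if the command should be executed rather than printed.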
        
        Download the sequences using the NCBI accession number.
        
        .. code-block:: bash 
        
        	python3 -m luigi --module wespipeline.fastq FastqRetrieval \
        		--FastqRetrieval-paired-end true \
        		--FastqRetrieval-accession-number SRR9209557 \
        		--FastqRetrieval-create-report true
        
        Or from an external URL.
        
        .. code-block:: bash
        
        	python3 -m luigi --module wespipeline.fastq FastqRetrieval \
        		--FastqRetrieval-paired-end true \
        		--FastqRetrieval-compressed false \
        		--FastqRetrieval-accession-number SRR9209557 \
        		--FastqRetrieval-create-report true
        
        Download the reference genome and create a report using FastQC.
        
        .. code-block:: bash
        
        	python3.6 -m luigi --module tasks.reference ReferenceRetrieval \
        		--workers 3 \
        		--ReferenceGenome-ref-url ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.2bit \
        		--ReferenceGenome-from2bit True \
        		--GlobalParams-base-dir ./tfm_experiment \
        		--GlobalParams-log-dir .logs \
        		--GlobalParams-exp-name hg19
        
        Or run the whole analysis, specifying the parameters for each of the steps.
        
        .. code-block:: bash
        
        	python3 -m luigi --module tasks.vcf VariantCalling \
        		--workers 3 \
        		--VariantCalling-use-platypus true \
        		--VariantCalling-use-freebayes true \
        		--VariantCalling-use-samtools false \
        		--VariantCalling-use-gatk false \
        		--VariantCalling-use-deepcalling false \
        		--AlignProcessing-cpus 6 \
        		--FastqAlign-cpus 6 \
        		--FastqAlign-create-report True \
        		--GetFastq-gz-compressed True \
        		--GetFastq-fastq1-url ftp://ftp-trace.ncbi.nih.gov/giab/ftp/data/NA12878/Garvan_NA12878_HG001_HiSeq_Exome/NIST7035_TAAGGCGA_L001_R1_001.fastq.gz \
        		--GetFastq-fastq2-url ftp://ftp-trace.ncbi.nih.gov/giab/ftp/data/NA12878/Garvan_NA12878_HG001_HiSeq_Exome/NIST7035_TAAGGCGA_L001_R2_001.fastq.gz \
        		--GetFastq-from-ebi False \
        		--GetFastq-paired-end True \
        		--ReferenceGenomeRetrieval-ref-url ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.2bit \
        		--ReferenceGenomeRetrieval-from2bit True \
        		--GlobalParams-base-dir ./tfm_experiment \
        		--GlobalParams-log-dir .logs \
        		--GlobalParams-exp-name hg19
        
        Tasks implemented
        ^^^^^^^^^^^^^^^^^
        
        +-----------------+----------------------------+
        | Module          | Task                       |
        +=================+============================+
        | reference       | ReferenceGenomeRetrieval   |
        +-----------------+----------------------------+
        | fastq           | FastqRetrieval             |
        +-----------------+----------------------------+
        | align           | FastqAlignment             |
        +-----------------+----------------------------+
        | processalign    | FastqProcessing            |
        +-----------------+----------------------------+
        | variantcalling  | VariantCalling             |
        +-----------------+----------------------------+
        | processalign    | VariantProcessing          |
        +-----------------+----------------------------+
        
        Acknowledgements
        ^^^^^^^^^^^^^^^^
        
        Special thanks to ...
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/x-rst
