Command-line version

The ncPRO-seq pipeline generates many output files, so it is highly recommended to deploy the ncPRO-seq output architecture before starting.
This step is optional, however. If you don't want to use this output architecture, please change the output paths in the configuration files. To deploy the output architecture, use the following command:
$ APPLI_DIR/bin/ncPRO-deploy -o output_directory

The input files (fastq, bam, csfasta or fasta) must be placed in the rawdata folder. Then, after setting the parameters in the configuration file, run the ncPRO-seq pipeline as follows:

$ cd output_directory
$ APPLI_DIR/bin/ncPRO-seq -c config-ncrna.txt
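
Before launching the run, it can be useful to check that the rawdata folder actually contains input files. The following is only a minimal sketch: count_inputs is a hypothetical helper, not part of ncPRO-seq, and it assumes that input files carry one of the extensions .fastq, .bam, .csfasta or .fasta, which may not match how your files are named.

```shell
#!/bin/sh
# Hypothetical helper (not part of ncPRO-seq): count the files in a folder
# whose extension matches one of the input formats listed above.
# Assumption: the extension reflects the file format.
count_inputs() {
    dir=$1
    n=0
    for f in "$dir"/*.fastq "$dir"/*.bam "$dir"/*.csfasta "$dir"/*.fasta; do
        # An unmatched glob pattern stays literal, so keep only real files.
        if [ -f "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Example use, from inside output_directory:
#   [ "$(count_inputs rawdata)" -gt 0 ] || echo "no input files in rawdata" >&2
```

The function only counts files. It does not inspect their content or decide whether the pipeline should run.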

The ncPRO-seq pipeline is modular and sequential, and the user can specify which analysis steps to run. For instance, the following command line performs only the quality control, the read grouping, and the alignment to the reference genome.

$ APPLI_DIR/bin/ncPRO-seq -c config-ncrna.txt -s processRead \
    -s mapGenome -s mapGenomeStat

The following analysis steps are available:
Table 2: Description of ncPRO-seq '-s' options

ncPRO-seq analysis step (-s option)   Description
processRead                           calculate the read length distribution and the median quality score for each position, and group reads
mapGenome                             run Bowtie for genome mapping
mapGenomeStat                         compute the number of mapped and unmapped reads in the genome
overviewRmsk                          compute read coverage for each repeat family
overviewRfam                          compute read coverage for each ncRNA family
generateNcgff                         create a gff file for a special ncRNA family
ncrnaProcess                          ncRNA family analyses, including read coverage, read length distribution, read coverage in subfamilies, and sequence logo
ucscTrack                             generate a UCSC track for an ncRNA family
sigRegion                             detect significantly enriched regions
html_builder                          build the html report file
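
Since each step is passed with its own -s option, the command line can be assembled from a list of step names. The sketch below is illustrative only: build_ncpro_cmd and NCPRO_STEPS are hypothetical names, not part of ncPRO-seq. The step names are copied from Table 2, and APPLI_DIR is kept as the same placeholder used in the commands above. The function prints the command instead of running it, and rejects any name that is not in the table.

```shell
#!/bin/sh
# Step names copied from Table 2.
NCPRO_STEPS="processRead mapGenome mapGenomeStat overviewRmsk overviewRfam generateNcgff ncrnaProcess ucscTrack sigRegion html_builder"

# Hypothetical helper (not part of ncPRO-seq): print the ncPRO-seq command
# line for a configuration file and a list of analysis steps.
# Usage: build_ncpro_cmd CONFIG STEP [STEP...]
build_ncpro_cmd() {
    config=$1
    shift
    # APPLI_DIR is a placeholder for the installation directory.
    cmd="APPLI_DIR/bin/ncPRO-seq -c $config"
    for step in "$@"; do
        case " $NCPRO_STEPS " in
            *" $step "*) cmd="$cmd -s $step" ;;
            *) echo "unknown step: $step" >&2; return 1 ;;
        esac
    done
    echo "$cmd"
}

# Example use:
#   build_ncpro_cmd config-ncrna.txt processRead mapGenome mapGenomeStat
```

Printing the command first makes it possible to review it before running it from the output directory.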
 

Chongjian Chen 2012-01-26