Command-line version
The ncPRO-seq pipeline generates many output files, so it is highly recommended to deploy the ncPRO-seq output architecture before starting.
However, this step is optional. If you don't want to use this output architecture, please change the output paths in the configuration files.
To deploy the output architecture, use the following command:
$ APPLI_DIR/bin/ncPRO-deploy -o output_directory
The input files (fastq, bam, csfasta or fasta) must be placed in the rawdata folder.
Finally, after setting the parameters in the configuration file, run the ncPRO-seq pipeline as follows:
$ cd output_directory
$ APPLI_DIR/bin/ncPRO-seq -c config-ncrna.txt
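The steps above (deploy, place the inputs, run) can be chained in a small wrapper. The following is only a minimal sketch and is not part of ncPRO-seq: the function names `run` and `ncpro_workflow` and the `DRY_RUN` variable are hypothetical, and the sketch assumes that the rawdata folder sits directly under the deployed output directory and that config-ncrna.txt is found there.

```shell
# Hypothetical wrapper around the commands shown above.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Arguments: installation directory, output directory, one input file.
ncpro_workflow() {
    appli_dir=$1
    out_dir=$2
    input=$3
    # 1. deploy the output architecture
    run "$appli_dir/bin/ncPRO-deploy" -o "$out_dir" &&
    # 2. place the input file in the rawdata folder (assumed location)
    run cp "$input" "$out_dir/rawdata/" &&
    # 3. run the pipeline from inside the output directory
    run cd "$out_dir" &&
    run "$appli_dir/bin/ncPRO-seq" -c config-ncrna.txt
}
```

Calling `ncpro_workflow "$APPLI_DIR" output_directory sample.fastq` (sample.fastq being a placeholder file name) prints the four commands in order; setting DRY_RUN=0 would execute them instead.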
The ncPRO-seq pipeline is modular and sequential. The user can specify the analysis steps to run.
For instance, the following command line performs only the quality control, the read grouping, and the alignment to the reference genome.
$ APPLI_DIR/bin/ncPRO-seq -c config-ncrna.txt -s processRead -s mapGenome -s mapGenomeStat
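Since each step needs its own -s flag, a long step list can be tedious to type. As an illustration only, the hypothetical helper `step_args` below (not provided by ncPRO-seq) turns a list of step names into the repeated `-s` arguments used above.

```shell
# Hypothetical helper: print "-s STEP" for every step name given.
step_args() {
    args=""
    for step in "$@"; do
        args="$args -s $step"
    done
    # drop the leading space before printing
    printf '%s\n' "${args# }"
}

# Example: print the command line used above without running it.
echo "APPLI_DIR/bin/ncPRO-seq -c config-ncrna.txt $(step_args processRead mapGenome mapGenomeStat)"
```

The example only echoes the command; to run it for real, replace the echo with the actual call and leave `$(step_args ...)` unquoted so that the shell splits it into separate arguments.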
The following analysis steps are available:
Table 2: Description of ncPRO-seq '-s' options
ncPRO-seq analysis step (-s option) | Description |
processRead | calculate read length distribution, median quality score for each position, and group reads |
mapGenome | run Bowtie for genome mapping |
mapGenomeStat | compute number of mapped reads and unmapped reads in the genome |
overviewRmsk | compute read coverage for each repeat family |
overviewRfam | compute read coverage for each ncRNA family |
generateNcgff | create gff file for special ncRNA family |
ncrnaProcess | ncRNA family analyses, including read coverage, read length distribution, read coverage in subfamilies, and sequence logo |
ucscTrack | generate UCSC track for ncRNA family |
sigRegion | detect significantly enriched regions |
html_builder | build the html report file |
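Because the step names must be typed exactly, it can help to check them before launching a long run. The sketch below is an assumption-laden illustration, not an ncPRO-seq feature: `NCPRO_STEPS` and `check_steps` are hypothetical names, and the list is simply copied from Table 2.

```shell
# Step names copied from Table 2.
NCPRO_STEPS="processRead mapGenome mapGenomeStat overviewRmsk overviewRfam generateNcgff ncrnaProcess ucscTrack sigRegion html_builder"

# Hypothetical helper: succeeds only if every argument is spelled
# exactly like one of the step names listed in Table 2.
check_steps() {
    for step in "$@"; do
        case " $NCPRO_STEPS " in
            *" $step "*) ;;
            *) echo "unknown step: $step" >&2; return 1 ;;
        esac
    done
    return 0
}
```

For example, `check_steps processRead mapGenome` succeeds, while `check_steps mapgenome` reports the unknown name on standard error and returns a non-zero status.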
Chongjian Chen
2012-01-26