other annotations

Although the "nc" in its name implies a focus on noncoding RNAs, ncPRO-seq is not limited to them: it can analyse any annotation file in gff3 format, such as splice sites or promoter regions of protein-coding genes. Note that each annotation file should contain at least one of the following three tags in the "attributes" column to indicate the name of each item: Name, Alias or ID.
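The requirement on the "attributes" column can be checked before an annotation file is used. The following is only a minimal sketch, not part of ncPRO-seq: the function name lines_missing_name and the example lines are hypothetical, and it assumes the standard nine-column, tab-separated gff3 layout with key=value pairs separated by semicolons.

```python
# Hypothetical helper (not part of ncPRO-seq): report gff3 feature lines
# whose "attributes" column has none of the tags Name, Alias or ID.
NAME_KEYS = {"Name", "Alias", "ID"}


def lines_missing_name(gff3_lines):
    """Return 1-based line numbers of feature lines lacking Name, Alias and ID."""
    missing = []
    for lineno, line in enumerate(gff3_lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        columns = line.split("\t")
        if len(columns) < 9:
            missing.append(lineno)  # no attributes column at all
            continue
        # Collect the tag names found in column 9 (key=value;key=value;...).
        keys = {
            field.split("=", 1)[0].strip()
            for field in columns[8].split(";")
            if field.strip()
        }
        if not keys & NAME_KEYS:
            missing.append(lineno)
    return missing


# Made-up example lines, for illustration only.
example = [
    "##gff-version 3",
    "chr1\texample\tregion\t1100\t1300\t.\t+\t.\tID=site1;Type=donor",
    "chr1\texample\tregion\t1900\t2100\t.\t+\t.\tType=acceptor",
]
print(lines_missing_name(example))
```

Here the third line is reported because its attributes contain neither Name, Alias nor ID.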

Small RNAs have been reported to be enriched at the 3' ends of internal exons (spliRNAs) and at transcription initiation sites (tiRNAs) [15]. To show how the "other annotations" option works, we create gff3 annotation files of both splice donor sites and splice acceptor sites for genomes that have refGene annotation in UCSC [4]. To generate the donor site annotation, we locate the 3' end of every exon except the last one in each gene and extend it by 100 bp upstream and downstream, thereby obtaining regions of 201 bp with the 3' end of the exon at position 101. For acceptor sites, the 5' ends of all exons except the first one in each gene are extended by 100 bp on either side in the same way.
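The window construction described above can be sketched as follows. This is an illustration under stated assumptions, not the script used to build the distributed files: the function name splice_site_windows and the exon coordinates are hypothetical, exon coordinates are assumed to be 1-based and inclusive, and exons are assumed to be numbered from 1 in transcription order (so the order is reversed on the minus strand). Windows that would run past the start of a chromosome are not clipped here.

```python
# Hypothetical sketch: derive donor and acceptor windows from the exons of one gene.
def splice_site_windows(exons, strand, extend=100):
    """Return (donors, acceptors), each a list of (exon_idx, start, end).

    exons  -- list of (start, end) pairs, 1-based inclusive (assumption)
    strand -- "+" or "-"
    extend -- number of bases added on each side of the splice site
    """
    # Transcription order: ascending on "+", descending on "-".
    ordered = sorted(exons, reverse=(strand == "-"))
    donors, acceptors = [], []
    for idx, (start, end) in enumerate(ordered, start=1):
        three_prime = end if strand == "+" else start
        five_prime = start if strand == "+" else end
        if idx < len(ordered):
            # Donor site: 3' end of every exon except the last one.
            donors.append((idx, three_prime - extend, three_prime + extend))
        if idx > 1:
            # Acceptor site: 5' end of every exon except the first one.
            acceptors.append((idx, five_prime - extend, five_prime + extend))
    return donors, acceptors


# Made-up exon coordinates, for illustration only.
exons = [(1000, 1200), (2000, 2300), (3000, 3500)]
donors, acceptors = splice_site_windows(exons, "+")
print(donors)
print(acceptors)
```

Each window spans 2 * 100 + 1 = 201 bp, so the splice site itself sits at position 101 of the window when read along the plus strand.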

Example of the "attributes" column in a splice acceptor annotation file:

GeneName=NM_001083312;Exon_idx=2;Type=acceptor;Extend_base=100;

Example of the "attributes" column in a splice donor annotation file:

GeneName=NM_001083312;Exon_idx=1;Type=donor;Extend_base=100;
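For readers who want to build similar files for other regions, the two "attributes" strings shown above could be assembled as in the sketch below. The function name splice_site_attributes is hypothetical; the tag names and the trailing semicolon simply mirror the examples, and nothing beyond those examples is assumed about the format.

```python
# Hypothetical helper: format the "attributes" column as in the examples above.
def splice_site_attributes(gene_name, exon_idx, site_type, extend_base=100):
    """Return an attributes string such as 'GeneName=...;Exon_idx=...;Type=...;Extend_base=...;'."""
    if site_type not in ("donor", "acceptor"):
        raise ValueError("site_type must be 'donor' or 'acceptor'")
    return (
        f"GeneName={gene_name};Exon_idx={exon_idx};"
        f"Type={site_type};Extend_base={extend_base};"
    )


acceptor_attr = splice_site_attributes("NM_001083312", 2, "acceptor")
donor_attr = splice_site_attributes("NM_001083312", 1, "donor")
print(acceptor_attr)
print(donor_attr)
```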

Chongjian Chen 2012-01-26