MicroRNAs (miRNAs) are endogenous non-coding small RNAs of about 22 nucleotides that play important roles in post-transcriptional regulation of gene expression via mRNA cleavage or translation inhibition. miRNA research covers three major topics: identification of novel miRNAs, recognition of miRNA targets, and transcriptional regulation of miRNA expression. Although several machine learning-based approaches have been developed to identify novel miRNAs from next-generation sequencing (NGS) data, most of these methods require precursor or genomic sequences as references. The lack of genomic sequences is therefore often a limitation in miRNA discovery in non-model organisms, and a systematic approach for identifying novel miRNAs without reference sequences is needed. In this study, an effective method was developed to identify miRNAs from non-model plants based on NGS datasets, using a support vector machine (SVM) trained on several significant structure-related features of mature miRNAs and their passenger strands. The accuracy on the independent test reached 96.61% and 93.04% for dicots (Arabidopsis) and monocots (rice), respectively. Furthermore, the method was applied to real small RNA sequencing data from orchid. Twenty-one predicted orchid miRNAs were selected for experimental validation, and all of them were confirmed by qRT-PCR. This novel approach was also implemented as a user-friendly program called microRPM (microRNA Prediction Model).

 
 

Citation: K. C. Tseng, Y. F. Chiang-Hsieh, H. Pai, C. N. Chow, S. C. Lee, H. Q. Zheng, P. L. Kuo, G. Z. Li, Y. C. Hung, N. S. Lin, and W. C. Chang* (2018) microRPM: a microRNA prediction model based only on plant small RNA sequencing data. Bioinformatics, 34, 1108-1115.

 
 
Download demo files

 Small RNA sequencing reads
 Reference transcript


Prediction model

 Select type of model:
  Without reference   With reference

 Select feature combination:
  Triplet element only (best for “without reference” model)
  Dicer cutting site & Triplet element (best for “with reference” model)
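
 The "triplet element" feature named above can be illustrated with a short sketch. It assumes the commonly used triplet encoding, in which the paired/unpaired state of three adjacent positions is combined with the middle nucleotide to give 32 possible features; the exact encoding used inside microRPM may differ. The function name `triplet_features` and the frequency normalization are illustrative choices, not part of the program, and the input is assumed to be a plain dot-bracket structure such as the one produced by RNAfold.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
# 4 middle nucleotides x 2^3 paired/unpaired patterns = 32 feature keys,
# e.g. "A(((" or "U.(."  (assumed encoding, see the note above)
FEATURE_KEYS = [n + "".join(s) for n in NUCLEOTIDES for s in product("(.", repeat=3)]


def triplet_features(sequence, structure):
    """Return 32 normalized triplet-element frequencies (hypothetical helper).

    sequence  -- RNA (or DNA) sequence
    structure -- dot-bracket string of the same length
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")

    seq = sequence.upper().replace("T", "U")
    # Collapse "(" and ")" into a single "paired" state; anything else is unpaired.
    states = "".join("(" if c in "()" else "." for c in structure)

    counts = dict.fromkeys(FEATURE_KEYS, 0)
    total = 0
    # Slide over every position that has both a left and a right neighbour.
    for i in range(1, len(seq) - 1):
        key = seq[i] + states[i - 1:i + 2]
        if key in counts:  # skips ambiguous bases such as N
            counts[key] += 1
            total += 1

    if total == 0:
        return [0.0] * len(FEATURE_KEYS)
    return [counts[k] / total for k in FEATURE_KEYS]


if __name__ == "__main__":
    # Toy hairpin, not a real miRNA precursor.
    vector = triplet_features("GCAUAGC", "((...))")
    for key, value in zip(FEATURE_KEYS, vector):
        if value > 0:
            print(key, round(value, 3))
```

 A vector of this kind, one per candidate sequence, is the sort of input an SVM classifier could be trained on; the Dicer cutting site feature is not covered by this sketch.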

  

Required programs

 Bowtie
 Vienna RNA package (RNAfold & RNAcofold)
 Trinity
 LibSVM
 libsvm.patch
 Structure RNA sequences (Rfam database)
 
       
  Last update: 2018/05/14