A New Computational Method Based on Heterogeneous Network for Predicting MicroRNA-Disease Associations
MicroRNAs (miRNAs) are a class of small non-coding RNAs that are involved in the development of various complex human diseases. Great effort has been spent over decades to uncover the relations between miRNAs and diseases. Although most known miRNA-disease associations have been discovered by experimental methods, such methods are generally expensive and time-consuming. An alternative approach, using computational methods to predict potential miRNA-disease associations, has attracted many computer scientists in recent years. However, existing computational methods suffer from various limitations that affect their prediction accuracy and applicability. In this paper, we propose a new computational method for predicting reliable miRNA-disease associations. We integrate different biological data sources, such as known miRNA-disease associations, miRNA-miRNA functional similarity, and disease-disease semantic similarity, into a miRNA-disease heterogeneous network. The structural characteristics of this network are represented as a feature vector dataset via meta-paths, and a binary classification problem is formulated. However, because the number of known miRNA-disease associations is very small, we face an imbalanced data classification problem. To address this issue, a clustering-based under-sampling algorithm is proposed. Training classification models with support vector machines (SVMs), we obtained AUC values 2–5% higher than those of previous methods. These results imply that our proposed model could be used to discover reliable miRNA-disease associations in the human genome.
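To make the meta-path idea concrete, the following is a minimal sketch of how feature vectors for miRNA-disease pairs might be derived from the three data sources named above. It is an illustration only: the abstract does not state which meta-paths are used, so the three chosen here (miRNA-miRNA-disease, miRNA-disease-disease, miRNA-disease-miRNA-disease), the function name `meta_path_features`, and the toy matrices are all assumptions. The under-sampling and SVM steps are not shown.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def meta_path_features(assoc, mirna_sim, disease_sim):
    """Map each (miRNA, disease) index pair to a 3-element feature vector.

    Each feature is the summed weight of all paths of one (assumed) type
    that connect the miRNA to the disease in the heterogeneous network.
    """
    # miRNA -> similar miRNA -> disease
    mmd = matmul(mirna_sim, assoc)
    # miRNA -> disease -> similar disease
    mdd = matmul(assoc, disease_sim)
    # miRNA -> disease -> miRNA -> disease
    mdmd = matmul(matmul(assoc, transpose(assoc)), assoc)
    return {
        (i, j): [mmd[i][j], mdd[i][j], mdmd[i][j]]
        for i in range(len(assoc))
        for j in range(len(assoc[0]))
    }


# Toy data (hypothetical): 3 miRNAs, 2 diseases.
# assoc[i][j] = 1 if miRNA i is known to be associated with disease j.
assoc = [[1, 0],
         [0, 1],
         [0, 0]]
# Similarity matrices with zero diagonals, so paths never reuse a self-loop.
mirna_sim = [[0.0, 0.5, 0.2],
             [0.5, 0.0, 0.4],
             [0.2, 0.4, 0.0]]
disease_sim = [[0.0, 0.3],
               [0.3, 0.0]]

features = meta_path_features(assoc, mirna_sim, disease_sim)
# Binary labels for the classification problem: known association or not.
labels = {(i, j): assoc[i][j] for (i, j) in features}
```

In this sketch every unknown pair is labeled 0, which is what produces the class imbalance the abstract mentions: only 2 of the 6 toy pairs are positive, and real datasets are far more skewed.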
This research was supported by the Vietnam Ministry of Education and Training, project B2018-SPH-52.