N6-Methyladenosine (m6A) is the most common RNA methylation in humans, regulating a wide range of cellular phenomena and showing associations with various diseases1. Interactions between RNA-binding proteins and an RNA molecule result in the methylation of a particular nucleotide, with notable effects on RNA stability, function, and cellular localisation. Many experiments have been conducted to identify human m6A sites. Nevertheless, an exhaustive examination across diverse cellular contexts and transcriptomes is laborious and costly, hampering progress in medical applications. Several computational models2,3 have been developed to screen for potential m6A RNA methylation sites. However, their predictive power is currently limited by features that fail to capture the hidden information present in methylated sites.
This study introduces AI-m6ARS, a novel model for the accurate prediction of m6A methylation sites. AI-m6ARS integrates four distinct feature sets: (i) one-hot encodings, (ii) iFeatures, (iii) conservation features, and (iv) geographical features. These feature sets are combined to improve the characterisation of methylated sites within DRACH motifs. Comprehensive negative sample selection and feature selection were also performed to enhance the quality of the training set.
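As an illustration of the first feature set, the following minimal sketch locates DRACH motifs (D = A/G/U, R = A/G, H = A/C/U) in an RNA sequence and one-hot encodes a window around the candidate adenosine. It is not the AI-m6ARS implementation: the function names, the `flank` parameter, and the zero-padding at sequence ends are assumptions made for this example, and the window size used by the model is not stated here.

```python
import re

# DRACH consensus on the RNA alphabet: D = A/G/U, R = A/G, A, C, H = A/C/U.
# The lookahead lets overlapping motifs be reported as separate matches.
DRACH = re.compile(r"(?=([AGU][AG]AC[ACU]))")
ALPHABET = "ACGU"


def normalise(seq):
    """Upper-case the sequence and convert DNA-style T to RNA-style U."""
    return seq.upper().replace("T", "U")


def find_drach_sites(seq):
    """Return 0-based positions of the central A of every DRACH motif."""
    return [m.start() + 2 for m in DRACH.finditer(normalise(seq))]


def one_hot_window(seq, center, flank=2):
    """One-hot encode the window [center - flank, center + flank].

    `flank` is a hypothetical parameter. Positions outside the sequence,
    and any base not in ACGU, are encoded as all-zero rows.
    """
    seq = normalise(seq)
    rows = []
    for i in range(center - flank, center + flank + 1):
        base = seq[i] if 0 <= i < len(seq) else "N"
        rows.append([1 if base == letter else 0 for letter in ALPHABET])
    return rows


if __name__ == "__main__":
    example = "CCGGACUCC"
    for site in find_drach_sites(example):
        print(site, one_hot_window(example, site))
```

Each candidate site yields a (2 × `flank` + 1) by 4 binary matrix, which could be flattened and concatenated with the other feature sets before model training.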
AI-m6ARS demonstrates robust predictive performance, with an area under the receiver operating characteristic curve of 0.86 on a non-redundant blind test set. Consistent results were observed in cross-validation, providing confidence in the robustness and generalisability of AI-m6ARS. Feature importance analysis revealed that the four most important features were all geographical features. AI-m6ARS performed comparably to state-of-the-art models while offering two significant advantages. First, it employs a machine learning pipeline that is both effective and interpretable. Second, it is available as a comprehensive web-based platform at https://biosig.lab.uq.edu.au/ai_m6ars. Our web server provides valuable insights into the landscape of m6A RNA methylation sites in the human genome, facilitating advancements in medical applications.