Discrimination of Golgi proteins through efficient exploitation of hybrid feature spaces coupled with SMOTE and ensemble of support vector machine

Muhammad Tahir, Fazlullah Khan, Mohammad Khalid Imam Rahmani, Vinh Truong Hoang

Research output: Journal Publication › Article › peer-review

5 Citations (Scopus)

Abstract

Many organelles inside and outside a living cell depend on the proper functioning of the Golgi apparatus to operate normally. Its malfunction may lead to many inheritable diseases, such as diabetes and cancer. It is therefore crucial to detect abnormal behavior of the Golgi apparatus early. Accurate discrimination of cis-Golgi from trans-Golgi proteins helps researchers identify the role of Golgi proteins in various diseases and assists pharmacists in drug development. In this work, various hybrid models of Bi-Profile Bayes, Bigram PSSM, Di-Peptide Composition, and Split Amphiphilic Pseudo Amino Acid Composition, combined with the SMOTE oversampling technique, are employed to discriminate Golgi protein types. Multiple linear Support Vector Machines are used to exploit the discrimination power of these models. The proposed prediction system, Golgi-predictor, achieved promising results compared with existing state-of-the-art techniques. Under 10-fold cross-validation, it achieved an accuracy of 97.6%, a sensitivity of 98.8%, a specificity of 96.5%, a G-mean of 97.6%, an MCC of 0.95, and an F-score of 0.97. Under jackknife cross-validation, the corresponding values for accuracy, sensitivity, specificity, G-mean, MCC, and F-score are 96.5%, 97.8%, 95.2%, 96.4%, 0.93, and 0.96, respectively. Moreover, in independent dataset testing, Golgi-predictor demonstrated a significant improvement in performance over other techniques. The proposed methodology aims to support drug designers in the pharmaceutical industry and to assist researchers in bioinformatics and computational biology in predicting the behavior of Golgi proteins.
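
The six scores reported above are all functions of a binary confusion matrix. The following is a minimal sketch of how such scores are conventionally computed, using the standard textbook definitions; the abstract does not state the exact formulas used, so these definitions, the function name `binary_metrics`, and the example counts are assumptions for illustration, not the authors' code or data.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute the six scores named in the abstract from confusion-matrix counts.

    tp/fn: positive-class samples predicted correctly/incorrectly.
    tn/fp: negative-class samples predicted correctly/incorrectly.
    Standard definitions are assumed here.
    """
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    precision = tp / (tp + fp)
    g_mean = math.sqrt(sensitivity * specificity)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": g_mean,
        "mcc": mcc,
        "f_score": f_score,
    }


# Hypothetical counts, chosen only to exercise the function.
scores = binary_metrics(tp=90, fn=10, tn=80, fp=20)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions, the G-mean is the geometric mean of sensitivity and specificity, so the reported 10-fold values of 98.8% and 96.5% give roughly 97.6%, which agrees with the G-mean stated above.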

Original language: English
Pages (from-to): 206028-206038
Number of pages: 11
Journal: IEEE Access
Volume: 8
Publication status: Published - 2020
Externally published: Yes

Keywords

  • Amphiphilic pseudo amino acid composition
  • Bi-profile Bayes
  • Bigram features
  • Golgi proteins
  • Support vector machine
  • Synthetic minority oversampling technique

ASJC Scopus subject areas

  • General Computer Science
  • General Materials Science
  • General Engineering
