Classification of Single Cell Types using Small Sets of Expressed Genes: Comparative Analysis of Supervised Machine Learning Methods

Aleksandar Veljkovic, Mirjana Maljkovic, Nenad Mitic, Sasa Malkov, Minjie Lyu, Xin Lin, Marek Michalewicz, Guanglan Zhang, Vladimir Brusic

Research output: Chapter in Book/Conference proceeding › Conference contribution › peer-review

Abstract

Single cell transcriptomics measures the expression of a large number of genes concurrently in tens of thousands of cells present in a studied biological sample. Good classification results are difficult to obtain because of high data dimensionality and the variability of biological states. We performed a preliminary study to assess the feasibility of using supervised machine learning (SML) methods to classify peripheral blood mononuclear cell (PBMC) types from single cell gene expression data. We analyzed a large PBMC data set (~120,000 cells), selected 47 genes (from 30,698 features) suitable as SML classification features, and performed classification using 20 machine learning algorithms. Data sets represented three sample processing strategies: PBMC separation (two data sets) and experimental cell sorting (two data sets). The accuracy in 5-class classification across the 20 methods was 91-97% (PBMC separation), 97-100% (magnetic-activated cell sorting), and 82-99% (fluorescence-activated cell sorting). Our results indicate that supervised machine learning can feasibly classify cells into major PBMC cell types using a small number of classification features from single cell gene expression data.

Original language: English
Title of host publication: Proceedings - 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021
Editors: Yufei Huang, Lukasz Kurgan, Feng Luo, Xiaohua Tony Hu, Yidong Chen, Edward Dougherty, Andrzej Kloczkowski, Yaohang Li
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 3322-3326
Number of pages: 5
ISBN (Electronic): 9781665401265
Publication status: Published - 2021
Event: 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021 - Virtual, Online, United States
Duration: 9 Dec 2021 → 12 Dec 2021

Publication series

Name: Proceedings - 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021

Conference

Conference: 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021
Country/Territory: United States
City: Virtual, Online
Period: 9/12/21 → 12/12/21

Keywords

  • 10x SCT
  • PBMC
  • classification
  • data mining
  • dimensionality reduction
  • gene expression
  • machine learning
  • transcriptome

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Biomedical Engineering
  • Health Informatics
  • Information Systems and Management
