Predicting Autism Spectrum Disorder Using Pluripotent Stem Cell RNA-Seq Data and Machine Learning

Authors

  • Richard Li, Highland Park High School, TX 75205, USA

Keywords:

Autism Spectrum Disorder (ASD), machine learning algorithms, workflows, tools

Abstract

In this work, gene expression datasets for Autism Spectrum Disorder (ASD) were analyzed with the goal of selecting the genes most strongly associated with the condition and classifying samples with machine learning algorithms. Two publicly available datasets (GSE129806 and GSE214323), both human RNA-seq gene count data, were downloaded from the Gene Expression Omnibus database. Workflows were then developed that combine differential expression analysis, principal component analysis (PCA), gene set enrichment analysis (GSEA) (Subramanian et al., 2005) and gene expression meta-analysis (Toro-Domínguez et al., 2020). The datasets were then passed through pipelines that used machine learning algorithms to build prediction models for classification. The results of this exploratory study suggest that gene expression profiles from pluripotent stem cell samples of individuals with ASD can be used, together with machine learning techniques, to identify a biological signature for ASD. In particular, gene expression meta-analysis across multiple datasets and larger numbers of samples could lead to more practical tools, such as machine learning models and workflows, for detecting ASD at an early age in the general population.
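
The general shape of such a pipeline (normalize counts, select discriminating genes, classify samples) can be illustrated with a minimal sketch. The abstract does not name specific implementations, so nothing here should be read as the author's method: the counts are simulated rather than taken from GSE129806 or GSE214323, a simple difference in mean log-CPM stands in for differential expression analysis, and a nearest-centroid rule stands in for the machine learning models. All function names, the gene indices and the group sizes are hypothetical.

```python
import math
import random

def log_cpm(counts):
    """Convert one sample's raw gene counts to log2 counts-per-million."""
    total = sum(counts)
    return [math.log2(c / total * 1e6 + 1.0) for c in counts]

def mean_columns(rows):
    """Per-gene mean across samples (rows are samples, columns are genes)."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def top_genes(asd, ctrl, k):
    """Rank genes by absolute difference in mean log-CPM between groups.

    Assumption: a crude stand-in for a proper differential expression test.
    """
    diff = [a - c for a, c in zip(mean_columns(asd), mean_columns(ctrl))]
    ranked = sorted(range(len(diff)), key=lambda g: abs(diff[g]), reverse=True)
    return ranked[:k]

def nearest_centroid(sample, centroids, genes):
    """Assign the label whose centroid is closest on the selected genes."""
    def dist(label):
        return sum((sample[g] - centroids[label][g]) ** 2 for g in genes)
    return min(centroids, key=dist)

# --- Simulated toy data (hypothetical, not from the study) ---
random.seed(0)
N_GENES = 50
SHIFTED = {3, 17}  # hypothetical genes with higher counts in the "ASD" group

def simulate(is_asd):
    """Draw raw counts for one sample; shifted genes are 8x higher in ASD."""
    counts = []
    for g in range(N_GENES):
        base = 100 + random.randint(0, 20)
        if is_asd and g in SHIFTED:
            base *= 8
        counts.append(base)
    return counts

asd_train = [log_cpm(simulate(True)) for _ in range(6)]
ctrl_train = [log_cpm(simulate(False)) for _ in range(6)]

# Gene selection and centroids are computed on training samples only.
selected = top_genes(asd_train, ctrl_train, k=2)
centroids = {
    "ASD": mean_columns(asd_train),
    "control": mean_columns(ctrl_train),
}

# Classify a held-out simulated sample.
test_sample = log_cpm(simulate(True))
prediction = nearest_centroid(test_sample, centroids, selected)
print("selected genes:", sorted(selected))
print("prediction:", prediction)
```

Selecting genes on the training samples only, and classifying a sample that was not used for selection, keeps the toy evaluation from leaking information. A real analysis would replace each step with established tools and validate across datasets.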

Published

2023-09-25

How to Cite

Richard Li. (2023). Predicting Autism Spectrum Disorder Using Pluripotent Stem Cell RNA-Seq Data and Machine Learning. Journal of Innovations in Medical Research, 2(9), 41–62. Retrieved from https://www.paradigmpress.org/jimr/article/view/795

Section

Articles