The precision of machine learning models at classifying autism spectrum disorder in adults

(1) Silver Creek High School, (2) Special Education, Santa Clara Unified School District

https://doi.org/10.59720/23-176

Autism spectrum disorder (ASD) is difficult to diagnose correctly because diagnosis relies on behavior analysis, which is highly subjective. To address this issue, we sought a machine learning-based method that diagnoses ASD without behavior analysis or helps reduce misdiagnosis. We compared the mean average precision of several binary classification models (naive Bayes, support vector machine, decision tree, random forest, and k-nearest neighbor) in diagnosing individuals with ASD. Given the multivariable data, we hypothesized that the k-nearest neighbor model would diagnose ASD most accurately because it classifies new data points using existing data points with related values. After training and testing each model on an online dataset, we analyzed each model's mean average precision along with its cross-validation scores. The random forest model was the most accurate at predicting whether an individual had ASD, with a mean average precision of 0.92 and a mean cross-validation score of 0.86. The naive Bayes model was the least accurate, with a mean average precision of 0.80 and a mean cross-validation score of 0.64. Based on these results, the random forest model could aid in reducing the misdiagnosis of ASD. Using the random forest algorithm helps avoid bias in behavioral diagnosis because it classifies ASD from objective screening-test data rather than subjective data gathered by clinicians.
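As a rough illustration of the metric reported above, the following minimal Python sketch computes average precision for one set of predictions and then averages it across several test splits. It is not the authors' code: the function names, the fold structure, and the toy labels and scores are all hypothetical, and the sketch assumes that scores contain no ties and that a label of 1 marks the ASD class.

```python
from statistics import mean


def average_precision(y_true, scores):
    """Average the precision measured at each rank where a positive label appears.

    y_true: list of 0/1 labels (1 is assumed to mean the ASD class).
    scores: list of model confidence scores, higher meaning more likely positive.
    """
    # Rank the examples from highest to lowest score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision among the top `rank` predictions.
            precisions.append(hits / rank)
    if not precisions:
        raise ValueError("y_true contains no positive labels")
    return mean(precisions)


def mean_average_precision(folds):
    """Average the per-fold average precision over (y_true, scores) pairs."""
    return mean(average_precision(y, s) for y, s in folds)


# Hypothetical toy data: two test splits with made-up labels and scores.
toy_folds = [
    ([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]),
    ([1, 1, 0], [0.9, 0.8, 0.1]),
]
toy_map = mean_average_precision(toy_folds)
```

In the first toy split, a negative example is ranked above a positive one, which pulls the average precision below 1; the second split is ranked perfectly. A model comparison like the one described here would compute this quantity for each classifier on the same splits.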
