Articles | Journal of Emerging Investigators

A novel approach for predicting Alzheimer’s disease using machine learning on DNA methylation in blood

Adami et al. | Sep 20, 2023

Here, recognizing the difficulty of tracking the progression of dementia, the authors used machine learning models to distinguish among cognitive normalcy, mild cognitive impairment, and Alzheimer's disease based on blood DNA methylation levels, sex, and age. Using four machine learning models and two dimensionality reduction methods, they achieved an accuracy of 53.33%.

Predicting and explaining illicit financial flows in developing countries: A machine learning approach

Putta et al. | Aug 24, 2025

The authors looked at the ability of different machine learning algorithms to predict the level of financial corruption in different countries.

Predicting smoking status based on RNA sequencing data

Yang et al. | Aug 30, 2024

Given an association between nicotine addiction and gene expression, we hypothesized that genes commonly associated with smoking status would be differentially expressed between smokers and non-smokers. To test this, we analyzed two publicly available datasets that profiled RNA gene expression from brain (nucleus accumbens) and lung tissue taken from patients identified as smokers or non-smokers. We discovered statistically significant differences in the expression of dozens of genes between smokers and non-smokers. To test whether gene expression can be used to predict whether a patient is a smoker or non-smoker, we used gene expression as the training data for a logistic regression or random forest classification model. The random forest classifier trained on lung tissue data showed the most robust results, with area under the curve (AUC) values consistently between 0.82 and 0.93. Both models trained on nucleus accumbens data had poorer performance, with AUC values consistently between 0.65 and 0.7 when using random forest. These results suggest gene expression can be used to predict smoking status using traditional machine learning models. Additionally, based on our random forest model, we proposed KCNJ3 and TXLNGY as two candidate markers of smoking status. These findings, coupled with other genes identified in this study, present promising avenues for advancing applications related to the genetic foundation of smoking-related characteristics.
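
For readers unfamiliar with the AUC values quoted above, the following is a minimal sketch of how an area under the ROC curve can be computed from classifier scores. It is not the authors' code: the function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration. It uses the pairwise definition of AUC, the fraction of (smoker, non-smoker) pairs in which the smoker receives the higher score, with ties counted as half.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC for binary labels (1 = positive, 0 = negative).

    Uses the pairwise definition: the fraction of (positive, negative)
    pairs in which the positive example has the higher score. Ties
    contribute 0.5. Runs in O(P * N) time, fine for small examples.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = smoker, 0 = non-smoker (not from the study).
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the 0.82 to 0.93 and 0.65 to 0.7 ranges should be read.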

The utilization of Artificial Intelligence in enabling the early detection of brain tumors

Haider et al. | Feb 05, 2025

AI analysis of brain scans offers promise for helping doctors diagnose brain tumors. Haider and Drosis explore this field by developing machine learning models that classify brain scans as "cancer" or "non-cancer" diagnoses.

Uncovering the hidden trafficking trade with geographic data and natural language processing

Aqid et al. | Oct 14, 2024

The authors use machine learning to develop an evidence-based detection tool for identifying human trafficking.

Prediction of diabetes using supervised classification

Sun et al. | Mar 17, 2024

The authors develop and test a machine learning algorithm for predicting diabetes diagnoses.

Predicting baseball pitcher efficacy using physical pitch characteristics

Oberoi et al. | Jan 11, 2024

Here, the authors sought to develop a new metric to evaluate the efficacy of baseball pitchers using machine learning models. They found that the frequency of balls was the most predictive feature for their walks and hits per inning pitched (WHIP) metric. While their machine learning models did not identify a defining trait, such as high velocity, spin rate, or pitch type, they found that consistently pitching within the strike zone resulted in significantly lower WHIPs.
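
As a brief illustration of the WHIP statistic mentioned above, the sketch below computes it under its standard definition, walks plus hits divided by innings pitched. The function name `whip` and the sample season totals are hypothetical and are not taken from the study.

```python
def whip(walks, hits, innings_pitched):
    """Walks plus hits per inning pitched; lower values are better."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return (walks + hits) / innings_pitched


# Hypothetical season line: 40 walks and 150 hits over 190 innings.
example_whip = whip(walks=40, hits=150, innings_pitched=190.0)
```

Because walks enter the numerator directly, a pitcher who throws fewer balls tends to allow fewer walks, which is consistent with the finding that pitching within the strike zone is associated with lower WHIPs.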

Environmental contributors of asthma via explainable AI: Green spaces, climate, traffic & air quality

Chen et al. | Aug 12, 2025

This study explored how green spaces, climate, traffic, and air quality (GCTA) collectively influence asthma-related emergency department visits in the U.S. using machine learning models and explainable AI.

Determining viability of image processing models for forensic analysis of hair for related individuals

Wang et al. | Feb 04, 2025

Here, the authors used machine learning to analyze microscopic images of hair, quantifying various features to distinguish individuals, even within families where traditional DNA analysis is limited. The Discriminant Analysis (DA) model achieved the highest accuracy (88.89%) in identifying individuals, demonstrating its potential to improve the reliability of hair evidence in forensic investigations.

Study of neural network parameters in detecting heart disease

Malkevich et al. | Sep 07, 2025

The authors looked at the ability to detect heart disease before the onset of severe clinical symptoms.
