Articles | Journal of Emerging Investigators

Using advanced machine learning and voice analysis features for Parkinson’s disease progression prediction

Narsipur et al. | Aug 06, 2025

The authors looked at the ability to use audio clips to analyze the progression of Parkinson's disease.

Forecasting air quality index: A statistical machine learning and deep learning approach

Pasula et al. | Feb 17, 2025

Here the authors investigated air quality forecasting in India, comparing traditional time series models like SARIMA with deep learning models like LSTM. The research found that SARIMA models, which capture seasonal variations, outperform LSTM models in predicting Air Quality Index (AQI) levels across multiple Indian cities, supporting the hypothesis that simpler models can be more effective for this specific task.

SOS-PVCase: A machine learning optimized lignin peroxidase with polyvinyl chloride (PVC) degrading properties

Ahuja et al. | Sep 30, 2024

The authors looked at the primary structure of lignin peroxidase in an attempt to identify mutations that would improve both the stability and solubility of the peroxidase protein. The goal is to engineer peroxidase enzymes that are stable to help break down polymers, such as PVC, into monomers that can be reused instead of going to landfills.

Diagnosing hypertrophic cardiomyopathy using machine learning models on CMRs and EKGs of the heart

Kolluri et al. | Jul 29, 2024

Here, seeking to develop a method to diagnose hypertrophic cardiomyopathy, which can cause sudden cardiac death, the authors investigated the use of convolutional neural network (CNN) and long short-term memory (LSTM) models to classify cardiac magnetic resonance and electrocardiogram scans. They found that the CNN model had higher accuracy and precision and performed better on other measures, suggesting that machine learning models could be valuable tools to assist physicians in the diagnosis of hypertrophic cardiomyopathy.

The precision of machine learning models at classifying autism spectrum disorder in adults

Raj Kumar et al. | Jun 28, 2024

Autism spectrum disorder (ASD) is hard to diagnose correctly because diagnosis relies on behavior analysis, which is highly subjective. To address this issue, we sought a machine learning-based method that diagnoses ASD without behavior analysis or helps reduce misdiagnosis.

Differential privacy in machine learning for traffic forecasting

Vinay et al. | Dec 21, 2022

In this paper, we measured the privacy budgets and utilities of different differentially private mechanisms combined with different machine learning models that forecast traffic congestion at future timestamps. We expected the artificial neural networks (ANNs) combined with the Staircase mechanism to perform the best at every value in the privacy budget range, especially at medium-high values of the privacy budget. In this study, we used the Autoregressive Integrated Moving Average (ARIMA) and neural network models to forecast and then added differentially private Laplacian, Gaussian, and Staircase noise to our datasets. We tested on two real traffic congestion datasets, experimented with the different models, and examined their utility for different privacy budgets. We found that a favorable combination for this application was neural networks with the Staircase mechanism. Our findings identify the optimal models when dealing with tricky time series forecasting and can be used in non-traffic applications like disease tracking and population growth.
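
To make the idea of adding differentially private noise to a series concrete, here is a minimal sketch of the Laplace mechanism only (the Gaussian and Staircase mechanisms are not shown). It is not the authors' implementation: the function names, the sensitivity of 1.0, the per-value privacy budget, and the traffic values are all assumptions chosen for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize_series(values, epsilon, sensitivity=1.0, seed=None):
    """Add Laplace noise with scale sensitivity/epsilon to each value.

    Assumption: epsilon is spent per value; a real analysis would also
    account for composition across the whole series.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    return [v + laplace_noise(scale, rng) for v in values]


def mean_abs_error(a, b):
    # Simple utility measure: average absolute gap between two series.
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    traffic = [42.0, 40.5, 38.0, 35.5, 37.0, 41.0]  # made-up congestion values
    for eps in (0.1, 1.0, 10.0):
        noisy = privatize_series(traffic, epsilon=eps, seed=0)
        print(f"epsilon={eps}: MAE={mean_abs_error(traffic, noisy):.3f}")
```

A smaller privacy budget means a larger noise scale, so the noisy series drifts further from the original. That is the privacy-utility trade-off the abstract describes measuring.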

A novel CNN-based machine learning approach to identify skin cancers

Rao et al. | Nov 18, 2022

In this study, the authors developed and assessed the accuracy of a machine learning algorithm to identify skin cancers using images of biopsies.

Assessing and Improving Machine Learning Model Predictions of Polymer Glass Transition Temperatures

Ramprasad et al. | Mar 18, 2020

In this study, the authors test whether providing a larger dataset of glass transition temperatures (T_g) to train the machine-learning platform Polymer Genome would improve its accuracy. Polymer Genome is a machine learning-based, data-driven informatics platform for polymer property prediction, and T_g is one property needed to design new polymers in silico. They found that training the model with their larger, curated dataset improved the algorithm's T_g predictions, providing valuable improvements to this useful platform.

Depression detection in social media text: leveraging machine learning for effective screening

Shin et al. | Mar 25, 2025

Depression affects millions globally, yet identifying symptoms remains challenging. This study explored detecting depression-related patterns in social media texts using natural language processing and machine learning algorithms, including decision trees and random forests. Our findings suggest that analyzing online text activity can serve as a viable method for screening mental disorders, potentially improving diagnosis accuracy by incorporating both physical and psychological indicators.

Optimizing data augmentation to improve machine learning accuracy on endemic frog calls

Anand et al. | Mar 09, 2025

The mountain chain of the Western Ghats on the Indian peninsula, a UNESCO World Heritage site, is home to about 200 frog species, 89 of which are endemic. Distinctive to each frog species, their vocalizations can be used for species recognition. Manually surveying frogs at night during the rain in elephant and big cat forests is difficult, so being able to autonomously record ambient soundscapes and identify species is essential. An effective machine learning (ML) species classifier requires substantial training data from this area. The goal of this study was to assess data augmentation techniques on a dataset of frog vocalizations from this region, which has a minimal number of audio recordings per species. Consequently, enhancing an ML model’s performance with limited data is necessary. We analyzed the effects of four data augmentation techniques (Time Shifting, Noise Injection, Spectral Augmentation, and Test-Time Augmentation) individually and their combined effect on the frog vocalization data and the public environmental sounds dataset (ESC-50). The effect of combined data augmentation techniques improved the model's relative accuracy as the size of the dataset decreased. The combination of all four techniques improved the ML model’s classification accuracy on the frog calls dataset by 94%. This study established a data augmentation approach to maximize the classification accuracy with sparse data of frog call recordings, thereby creating a possibility to build a real-world automated field frog species identifier system. Such a system can significantly help in the conservation of frog species in this vital biodiversity hotspot.
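
As a rough illustration of two of the four techniques named above, the sketch below applies time shifting and noise injection to a raw waveform stored as a list of samples. It is not the authors' pipeline: the circular shift, the Gaussian noise at a chosen signal-to-noise ratio, the function names, and the synthetic tone are all assumptions made for this example.

```python
import math
import random


def time_shift(signal, shift):
    # Circularly shift the samples to the right by `shift` positions
    # (assumption: wrap-around rather than zero padding).
    n = len(signal)
    if n == 0:
        return []
    k = shift % n
    return signal[-k:] + signal[:-k] if k else list(signal)


def inject_noise(signal, snr_db, rng):
    # Add zero-mean Gaussian noise scaled to a target signal-to-noise ratio.
    if not signal:
        return []
    power = sum(x * x for x in signal) / len(signal)
    sigma = math.sqrt(power / (10 ** (snr_db / 10)))
    return [x + rng.gauss(0.0, sigma) for x in signal]


def augment(signal, shifts, snrs_db, seed=None):
    # Return the original clip plus one variant per shift and per SNR level.
    rng = random.Random(seed)
    variants = [list(signal)]
    variants += [time_shift(signal, s) for s in shifts]
    variants += [inject_noise(signal, snr, rng) for snr in snrs_db]
    return variants


if __name__ == "__main__":
    # Synthetic 440 Hz tone standing in for a short call recording.
    rate = 8000
    tone = [math.sin(2 * math.pi * 440 * t / rate) for t in range(rate // 10)]
    clips = augment(tone, shifts=[100, 400], snrs_db=[20.0, 10.0], seed=0)
    print(f"{len(clips)} training clips from 1 recording")
```

Each recording yields several training clips, which is how augmentation stretches a dataset with only a few recordings per species.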
