Browse Articles

Forecasting air quality index: A statistical machine learning and deep learning approach

Pasula et al. | Feb 17, 2025

Image credit: Amir Hosseini

The authors investigated air quality forecasting in India, comparing traditional time series models such as SARIMA with deep learning models such as LSTM. They found that SARIMA models, which capture seasonal variations, outperformed LSTM models in predicting Air Quality Index (AQI) levels across multiple Indian cities, supporting the hypothesis that simpler models can be more effective for this specific task.

Diagnosing hypertrophic cardiomyopathy using machine learning models on CMRs and EKGs of the heart

Kolluri et al. | Jul 29, 2024

Image credit: Jesse Orrico

Seeking to develop a method to diagnose hypertrophic cardiomyopathy, which can cause sudden cardiac death, the authors investigated the use of convolutional neural network (CNN) and long short-term memory (LSTM) models to classify cardiac magnetic resonance and electrocardiogram scans. They found that the CNN model achieved higher accuracy and precision and performed better on the other qualities assessed, suggesting that machine learning models could be valuable tools to assist physicians in the diagnosis of hypertrophic cardiomyopathy.

Differential privacy in machine learning for traffic forecasting

Vinay et al. | Dec 21, 2022

In this paper, we measured the privacy budgets and utilities of different differentially private mechanisms combined with different machine learning models that forecast traffic congestion at future timestamps. We expected artificial neural networks (ANNs) combined with the Staircase mechanism to perform best at every value in the privacy budget range, especially at medium-to-high values of the privacy budget. We used Autoregressive Integrated Moving Average (ARIMA) and neural network models for forecasting and added differentially private Laplacian, Gaussian, and Staircase noise to our datasets. We tested two real traffic congestion datasets, experimented with the different models, and examined their utility for different privacy budgets. We found that a favorable combination for this application was neural networks with the Staircase mechanism. Our findings identify the optimal models when dealing with tricky time series forecasting and can be used in non-traffic applications like disease tracking and population growth.
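
As a rough illustration of what adding differentially private noise to a dataset can look like, the sketch below applies the standard Laplace mechanism, where the noise scale is the sensitivity divided by the privacy budget epsilon. The function names, the sensitivity value, and the sample readings are assumptions made for this example; the abstract does not describe the study's actual implementation, and the Gaussian and Staircase mechanisms are not shown.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(values, sensitivity, epsilon, rng=None):
    """Add Laplace noise with scale = sensitivity / epsilon to each value.

    `sensitivity` (how much one record can change a value) is an
    assumption the caller must supply; it is not given in the abstract.
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    return [v + laplace_noise(scale, rng) for v in values]


if __name__ == "__main__":
    # Hypothetical congestion readings, for illustration only.
    readings = [12.0, 15.5, 9.0, 20.0]
    for eps in (0.1, 1.0, 10.0):
        noisy = privatize(readings, sensitivity=1.0, epsilon=eps,
                          rng=random.Random(42))
        print(eps, [round(x, 2) for x in noisy])
```

A smaller privacy budget gives a larger noise scale, which is the privacy-utility trade-off the study examines across budgets.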

Assessing and Improving Machine Learning Model Predictions of Polymer Glass Transition Temperatures

Ramprasad et al. | Mar 18, 2020

In this study, the authors tested whether providing a larger dataset of glass transition temperatures (Tg) to train the machine learning platform Polymer Genome would improve its accuracy. Polymer Genome is a data-driven, machine-learning-based informatics platform for polymer property prediction, and Tg is one property needed to design new polymers in silico. They found that training the model with their larger, curated dataset improved the algorithm's Tg predictions, providing valuable improvements to this useful platform.

Depression detection in social media text: leveraging machine learning for effective screening

Shin et al. | Mar 25, 2025

Depression affects millions globally, yet identifying symptoms remains challenging. This study explored detecting depression-related patterns in social media texts using natural language processing and machine learning algorithms, including decision trees and random forests. Our findings suggest that analyzing online text activity can serve as a viable method for screening mental disorders, potentially improving diagnosis accuracy by incorporating both physical and psychological indicators.

Optimizing data augmentation to improve machine learning accuracy on endemic frog calls

Anand et al. | Mar 09, 2025

Image credit: Anand and Sampath 2025

The mountain chain of the Western Ghats on the Indian peninsula, a UNESCO World Heritage site, is home to about 200 frog species, 89 of which are endemic. Because vocalizations are distinctive to each frog species, they can be used for species recognition. Manually surveying frogs at night during the rain in elephant and big cat forests is difficult, so being able to autonomously record ambient soundscapes and identify species is essential. An effective machine learning (ML) species classifier requires substantial training data from this area. The goal of this study was to assess data augmentation techniques on a dataset of frog vocalizations from this region, which has a minimal number of audio recordings per species. Consequently, enhancing an ML model’s performance with limited data is necessary. We analyzed the effects of four data augmentation techniques (Time Shifting, Noise Injection, Spectral Augmentation, and Test-Time Augmentation) individually and their combined effect on the frog vocalization data and the public environmental sounds dataset (ESC-50). Combining the data augmentation techniques improved the model's relative accuracy as the size of the dataset decreased. The combination of all four techniques improved the ML model’s classification accuracy on the frog calls dataset by 94%. This study established a data augmentation approach to maximize the classification accuracy with sparse data of frog call recordings, thereby creating a possibility to build a real-world automated field frog species identifier system. Such a system can significantly help in the conservation of frog species in this vital biodiversity hotspot.
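
Two of the four techniques named above, Time Shifting and Noise Injection, can be sketched on a raw waveform as follows. This is a minimal illustration under stated assumptions: the circular shift, the Gaussian noise model, the function names, and the sample values are all choices made for this example, since the abstract does not specify how the study implemented either technique.

```python
import random


def time_shift(signal, shift):
    """Circularly shift a waveform by `shift` samples (positive = later).

    Wrapping around is one common variant; padding with silence is
    another. Which one the study used is not stated.
    """
    if not signal:
        return []
    k = shift % len(signal)
    if k == 0:
        return list(signal)
    return list(signal[-k:]) + list(signal[:-k])


def inject_noise(signal, noise_level, rng=None):
    """Add zero-mean Gaussian noise with standard deviation `noise_level`."""
    rng = rng or random.Random()
    return [s + rng.gauss(0.0, noise_level) for s in signal]


def augment(signal, max_shift, noise_level, copies, rng=None):
    """Create `copies` augmented variants of one recording."""
    rng = rng or random.Random()
    variants = []
    for _ in range(copies):
        shifted = time_shift(signal, rng.randint(-max_shift, max_shift))
        variants.append(inject_noise(shifted, noise_level, rng))
    return variants


if __name__ == "__main__":
    # Hypothetical short waveform, for illustration only.
    call = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
    for variant in augment(call, max_shift=3, noise_level=0.05, copies=3,
                           rng=random.Random(7)):
        print([round(x, 2) for x in variant])
```

Each variant keeps the original species label, so a small set of recordings yields a larger training set.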