Browse Articles

A land use regression model to predict emissions from oil and gas production using machine learning

Cao et al. | Mar 24, 2023

Emissions from oil and natural gas (O&G) wells, such as nitrogen dioxide (NO2), volatile organic compounds (VOCs), and ozone (O3), can severely impact the health of nearby communities. In this study, we used O&G activity and wind-carried emissions to quantify the extent to which O&G wells affect the air quality of nearby communities, revealing that NO2, NOx, and NO are correlated with O&G activity. We then developed a novel land use regression (LUR) model using machine learning based on O&G prevalence to predict emissions.

Predicting the factors involved in orthopedic patient hospital stay

D’Souza et al. | Dec 13, 2023

Image credit: Pixabay

Long hospital stays can be stressful for patients for many reasons. We hypothesized that age would be the greatest predictor of length of stay among patients who underwent orthopedic surgery. Our models showed that severity of illness was the factor that contributed most to patient length of stay, followed by the facility in which the patient stayed and the type of procedure the patient underwent.

Predicting asthma-related emergency department visits and hospitalizations with machine learning techniques

Chatterjee et al. | Oct 25, 2021

Seeking to investigate the effects of ambient pollutants on human respiratory health, the authors used machine learning to examine asthma in Los Angeles County, an area with substantial pollution. Using machine learning models and classification techniques, the authors found that nitrogen dioxide and ozone levels were significantly correlated with asthma hospitalizations. Based on an identified seasonal surge in asthma hospitalizations, the authors suggest future directions for improving machine learning models that investigate these relationships.

Predicting smoking status based on RNA sequencing data

Yang et al. | Aug 30, 2024

Image credit: Yang and Stanley 2024

Given an association between nicotine addiction and gene expression, we hypothesized that genes commonly associated with smoking status would be expressed differently in smokers and non-smokers. To test this, we analyzed two publicly available datasets that profiled RNA gene expression in brain (nucleus accumbens) and lung tissue taken from patients identified as smokers or non-smokers. We discovered statistically significant differences in the expression of dozens of genes between smokers and non-smokers. To test whether gene expression can be used to predict whether a patient is a smoker or non-smoker, we used gene expression as the training data for logistic regression and random forest classification models. The random forest classifier trained on lung tissue data showed the most robust results, with area under the curve (AUC) values consistently between 0.82 and 0.93. Both models trained on nucleus accumbens data performed worse, with AUC values consistently between 0.65 and 0.7 when using random forest. These results suggest that gene expression can be used to predict smoking status using traditional machine learning models. Additionally, based on our random forest model, we proposed KCNJ3 and TXLNGY as two candidate markers of smoking status. These findings, coupled with other genes identified in this study, present promising avenues for advancing applications related to the genetic foundation of smoking-related characteristics.
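
For readers unfamiliar with the AUC values quoted above, the following is a minimal sketch of how an AUC can be computed from classifier scores. It uses the pairwise-ranking definition (the fraction of smoker/non-smoker pairs in which the smoker receives the higher score). The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not the authors' code or data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    labels: 1 for the positive class (e.g. smoker), 0 for the negative class.
    scores: classifier scores, higher meaning "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data, not taken from the study.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 0.82 to 0.93 and 0.65 to 0.7 ranges should be read.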

Predicting college retention rates from Google Street View images of campuses

Dileep et al. | Jan 02, 2024

Image credit: Dileep et al. 2024

Every year, around 40% of undergraduate students in the United States discontinue their studies, resulting in a loss of valuable education for students and a loss of money for colleges. Even so, colleges across the nation struggle to discover the underlying causes of these high dropout rates. In this paper, the authors discuss the use of machine learning to find correlations between built-environment factors and the retention rates of colleges. They hypothesized that one way for colleges to improve their retention rates could be to make the physical characteristics of their campuses more pleasing. The authors used image classification techniques to examine images of colleges and correlate features such as colors, cars, and people with higher or lower retention rates. With three possible classes (high, medium, and low retention), a model choosing at random would be correct about 33% of the time. After finding that this 33%, or 0.33, mark always fell outside the 99% confidence intervals built around their models’ accuracies, the authors concluded that their machine learning techniques can be used to find correlations between certain environmental factors and retention rates.
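
The comparison against chance described above can be sketched as follows. The abstract does not say how the authors built their intervals, so this example assumes a normal-approximation (Wald) interval with z = 2.576 for 99% confidence; the function names and the counts used are hypothetical.

```python
import math


def accuracy_confidence_interval(correct, total, z=2.576):
    """Normal-approximation (Wald) interval for a classification accuracy.

    z=2.576 corresponds to a two-sided 99% interval (an assumption here;
    the source does not state which interval method was used).
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def beats_chance(correct, total, n_classes=3, z=2.576):
    """True if the whole interval lies above the random-guess accuracy."""
    chance = 1 / n_classes  # 0.33 for high / medium / low retention
    low, _high = accuracy_confidence_interval(correct, total, z)
    return low > chance


# Hypothetical counts, not the authors' results: 90 of 200 images correct.
low, high = accuracy_confidence_interval(correct=90, total=200)
above_chance = beats_chance(correct=90, total=200)
```

With these made-up counts the accuracy is 0.45 and the lower bound is roughly 0.36, so the 0.33 chance level falls below the interval. A smaller test set would widen the interval and could change that conclusion.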

A novel approach for predicting Alzheimer’s disease using machine learning on DNA methylation in blood

Adami et al. | Sep 20, 2023

Image credit: National Cancer Institute

Recognizing the difficulty of tracking the progression of dementia, the authors used machine learning models to distinguish among cognitive normalcy, mild cognitive impairment, and Alzheimer's disease based on blood DNA methylation levels, sex, and age. Using four machine learning models and two dimensionality reduction methods, they achieved an accuracy of 53.33%.

A novel encoding technique to improve non-weather-based models for solar photovoltaic forecasting

Ahmed et al. | Jun 09, 2023

Several studies have applied machine learning (ML) techniques to forecasting solar photovoltaic power production. Most of these studies use weather data as inputs to predict power production; however, procuring this data raises numerous practical issues. This study proposes models that do not use weather data as inputs but instead use past power production data, a more practical substitute. Our proposed models offer a better, cheaper, and more reliable alternative to current weather-based models.
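
As a rough illustration of what using past power production as model input can look like, the sketch below turns a production series into lagged input windows and next-step targets. This is a generic sliding-window setup, not the encoding technique the authors propose (which the abstract does not describe); the function name and the sample readings are hypothetical.

```python
def make_lag_features(series, n_lags):
    """Build (inputs, targets) pairs from a univariate production series.

    Each input is the previous `n_lags` readings; the target is the reading
    that immediately follows them. No weather variables are involved.
    """
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append(list(series[t - n_lags:t]))
        targets.append(series[t])
    return inputs, targets


# Hypothetical hourly production readings (kW), for illustration only.
production = [0.0, 1.2, 3.4, 2.8, 0.5, 0.0]
X, y = make_lag_features(production, n_lags=3)
# X[0] holds the first three readings and y[0] the fourth.
```

The resulting pairs can be passed to any regression model; the choice of window length is a tuning decision that this sketch leaves open.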

Temperature and Precipitation Responses to a Stratospheric Aerosol Geoengineering Experiment Using the Community Climate System Model 4

Anderson et al. | Aug 19, 2014

We are changing our environment with steadily increasing carbon dioxide emissions, but we might be able to help. The authors use a computer program called the Community Climate System Model 4 to predict the effects of spraying small particles into the atmosphere to reflect away some of the sun's rays. The software predicts that this could reduce the amount of energy the Earth's atmosphere absorbs and may limit, but will not completely counteract, our carbon dioxide production.
