Articles | Journal of Emerging Investigators

How to improve at chess: Uncovering insights using regression analysis

Sivakumar et al. | Apr 25, 2026

The authors examined how different factors related to practicing and playing chess affect a player's rating.

Predicting clogs in water pipelines using sound sensors and machine learning linear regression

Rajawat et al. | Oct 11, 2025

The authors looked at the ability of sound sensors to predict clogged pipes when the sound intensity data are run through a machine learning algorithm.

A land use regression model to predict emissions from oil and gas production using machine learning

Cao et al. | Mar 24, 2023

Emissions from oil and natural gas (O&G) wells such as nitrogen dioxide (NO₂), volatile organic compounds (VOCs), and ozone (O₃) can severely impact the health of communities located near wells. In this study, we used O&G activity and wind-carried emissions to quantify the extent to which O&G wells affect the air quality of nearby communities, revealing that NO₂, NOₓ, and NO are correlated with O&G activity. We then developed a novel land use regression (LUR) model using machine learning based on O&G prevalence to predict emissions.

Jet optimization using a hybrid multivariate regression model and statistical methods in dimuon collisions

Chunduri et al. | Jun 09, 2024

Collisions of heavy ions, such as muons, result in jets and noise. In high-energy particle physics, researchers use jets as crucial event-shaped observable objects to determine the properties of a collision. However, many ionic collisions result in large amounts of energy lost as noise, reducing the efficiency of collisions with heavy ions. The purpose of our study is to analyze the relationships between the properties of muons in a dimuon collision in order to optimize collision conditions and minimize the energy lost as noise. We used principles of Newtonian mechanics at the particle level, allowing us to further analyze different models. We used simple Python algorithms as well as linear regression models, with tools such as scikit-learn, NumPy, and pandas, to help analyze our results. We hypothesized that since the invariant mass, the energy, and the resultant momentum vector are correlated with noise, constraining these inputs optimally will produce scenarios in which the noise of the heavy-ion collision is minimized.

Machine Learning Algorithm Using Logistic Regression and an Artificial Neural Network (ANN) for Early Stage Detection of Parkinson’s Disease

Kar et al. | Oct 10, 2020

Despite the prevalence of Parkinson's disease (PD), diagnosing it is expensive, requires specialized testing, and is often inaccurate. Moreover, diagnosis is often made late in the disease course, when treatments are less effective. Using existing voice data from patients with PD and healthy controls, the authors created and trained two different algorithms: one using logistic regression and another employing an artificial neural network (ANN).

The most efficient position of magnets

Shin et al. | Mar 28, 2024

Here, the authors investigated the most efficient way to position magnets to hold the most pieces of paper on the surface of a refrigerator. Using a regression model along with an artificial neural network, they identified the most efficient arrangement of four magnets as placing them at the vertices of a rectangle.

Exploring the Factors that Drive Coffee Ratings

Agarwal et al. | May 19, 2025

This study explores the factors that influence coffee quality ratings using data from the Coffee Quality Institute. Through a regression model based on gradient descent, the authors aimed to predict coffee ratings (total cup points) and hypothesized that sweetness and the coffee producer would be the most influential factors.

Can the nucleotide content of a DNA sequence predict the sequence accessibility?

Balachandran et al. | Mar 10, 2023

Sequence accessibility is an important factor affecting gene expression. Sequence accessibility, or openness, affects the likelihood that a gene is transcribed and translated into a protein that then performs functions and manifests traits. There are many potential factors that affect the accessibility of a gene. In this study, our hypothesis was that the nucleotide content of a genetic sequence predicts its accessibility. Using a machine learning linear regression model, we studied the relationship between nucleotide content and accessibility.

Analyzing market dynamics and optimizing sales performance with machine learning

Kamat et al. | May 31, 2025

This study uses interpretable machine learning models, lasso and ridge regression with Shapley analysis, to identify key sales drivers for Corporación Favorita, Ecuador’s largest grocery chain. The results show that macroeconomic factors, especially labor force size, have the greatest impact on sales, though geographic and seasonal variables like city altitude and holiday proximity also play important roles. These insights can help businesses focus on the most influential market conditions to enhance competitiveness and profitability.

Using economic indicators to create an empirical model of inflation

Kasera et al. | Dec 01, 2022

Here, seeking to understand the correlation of 50 of the most important economic indicators with inflation, the authors used a rolling linear regression to identify the indicators most significantly correlated with the seasonally adjusted month-over-month Consumer Price Index (CPI). Ultimately, they concluded that the average gasoline price, the U.S. import price index, and the 5-year market-expected inflation rate had the most significant correlation with the CPI.
