Articles | Journal of Emerging Investigators

Using economic indicators to create an empirical model of inflation

Kasera et al. | Dec 01, 2022

Here, seeking to understand how 50 of the most important economic indicators correlate with inflation, the authors used a rolling linear regression to identify the indicators most significantly correlated with the month-over-month, seasonally adjusted Consumer Price Index (CPI). Ultimately, they concluded that the average gasoline price, the U.S. import price index, and 5-year market expected inflation had the most significant correlation with the CPI.
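As a rough illustration of what a rolling linear regression involves, the sketch below fits a separate least-squares line on each consecutive window of observations and reports the slope, intercept, and Pearson correlation for each window. It is not the authors' code: the function names, the window length, and the toy numbers are all assumptions made for this example.

```python
from math import sqrt


def ols_fit(x, y):
    """Return (slope, intercept, r) for the least-squares line y = slope*x + intercept.

    Assumes x is not constant within the sample (otherwise the slope is undefined).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Pearson correlation; defined as 0.0 here when y does not vary.
    r = sxy / sqrt(sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r


def rolling_regression(x, y, window):
    """Fit one line per consecutive window of `window` observations."""
    return [
        ols_fit(x[i:i + window], y[i:i + window])
        for i in range(len(x) - window + 1)
    ]


# Hypothetical toy series (not data from the article).
indicator = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
cpi_change = [2.1, 4.0, 6.2, 7.9, 10.1, 12.0]
fits = rolling_regression(indicator, cpi_change, window=4)

for start, (slope, intercept, r) in enumerate(fits):
    print(f"window starting at {start}: slope={slope:.2f}, r={r:.3f}")
```

Comparing the per-window correlation across many indicators is one plausible way to rank them, though the article itself should be consulted for the exact procedure and significance tests used.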

Comparison of the ease of use and accuracy of two machine learning algorithms – forestry case study

Bhatia et al. | Mar 21, 2021

Machine learning algorithms are becoming increasingly popular for data crunching across a wide range of scientific disciplines. Here, the authors compare two machine learning algorithms with respect to accuracy and user-friendliness and find that random forest algorithms outperform logistic regression when applied to the same dataset.

Predicting smoking status based on RNA sequencing data

Yang et al. | Aug 30, 2024

Given an association between nicotine addiction and gene expression, we hypothesized that genes commonly associated with smoking status would be differentially expressed between smokers and non-smokers. To test whether gene expression varies between smokers and non-smokers, we analyzed two publicly available datasets that profiled RNA gene expression from brain (nucleus accumbens) and lung tissue taken from patients identified as smokers or non-smokers. We discovered statistically significant differences in expression of dozens of genes between smokers and non-smokers. To test whether gene expression can be used to predict whether a patient is a smoker or non-smoker, we used gene expression as the training data for a logistic regression or random forest classification model. The random forest classifier trained on lung tissue data showed the most robust results, with area under the curve (AUC) values consistently between 0.82 and 0.93. Both models trained on nucleus accumbens data had poorer performance, with AUC values consistently between 0.65 and 0.7 when using random forest. These results suggest gene expression can be used to predict smoking status using traditional machine learning models. Additionally, based on our random forest model, we proposed KCNJ3 and TXLNGY as two candidate markers of smoking status. These findings, coupled with other genes identified in this study, present promising avenues for advancing applications related to the genetic foundation of smoking-related characteristics.
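To make the AUC figures above easier to interpret, the sketch below computes the area under the ROC curve directly from its ranking definition: the probability that a randomly chosen positive example (here, a smoker) receives a higher score than a randomly chosen negative one, with ties counted as half. The function name, labels, and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: classifier scores, where higher means "more likely positive".
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predictions for six patients (1 = smoker, 0 = non-smoker).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

Under this definition, an AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why values of 0.82 to 0.93 indicate a substantially better classifier than values of 0.65 to 0.7. The pairwise loop is quadratic in the number of samples; it is written for clarity rather than speed.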

A comparative analysis of machine learning approaches for prediction of breast cancer

Nag et al. | May 11, 2021

Machine learning and deep learning techniques can be used to predict the early onset of breast cancer. The main objective of this analysis was to determine whether machine learning algorithms can be used to predict the onset of breast cancer with more than 90% accuracy. Based on research with supervised machine learning algorithms, Gaussian Naïve Bayes, K Nearest Neighbor, Random Forest, and Logistic Regression were considered because they offer a wide variety of classification methods and also provide high accuracy and performance. We hypothesized that all these algorithms would provide accurate results, and that Random Forest and Logistic Regression would provide better accuracy and performance than Naïve Bayes and K Nearest Neighbor.

The Effect of Various Preparation Methods on the Spoilage Rate of Roma Tomatoes (Solanum lycopersicum)

Cataltepe et al. | Feb 22, 2018

As levels of food waste continue to rise, it is essential to find improved techniques for prolonging the shelf life of produce. The authors aimed to find a simple yet effective method of slowing down spoilage in tomatoes. Linear regression analysis revealed that the tomatoes soaked in salt water and not dried displayed the lowest correlation between time and spoilage, suggesting that this preparation was the most effective.

Developing anticholinergic drugs for the treatment of asthma with improved efficacy

Wong et al. | Jul 05, 2023

Anticholinergics are used in treating asthma, a chronic inflammation of the airways. These drugs block human M1 and M2 muscarinic acetylcholine receptors, inhibiting bronchoconstriction. However, studies have reported complications of anticholinergic usage, such as exacerbated eosinophil production and worsened urinary retention. Modification of known anticholinergics using bioisosteric replacements to increase efficacy could potentially minimize these complications. The present study focuses on identifying viable analogs of anticholinergics to improve binding energy to the receptors compared to current treatment options. Glycopyrrolate (G), ipratropium (IB), and tiotropium bromide (TB) were chosen as parent drugs of interest, due to the presence of common functional groups within the molecules, specifically esters and alcohols. Docking score analysis via AutoDock Vina was used to evaluate the binding energy between drug analogs and the muscarinic acetylcholine receptors. The final results suggest that G-A3, IB-A3, and TB-A1 are the most viable analogs, as binding energy was improved when compared to the parent drug. G-A4, IB-A4, IB-A5, TB-A3, and TB-A4 are also potential candidates, although there were slight regressions in binding energy to both muscarinic receptors for these analogs. By researching the effects of bioisosteric replacements of current anticholinergics, it is evident that there is a potential to provide asthmatics with more effective treatment options.

Understanding the battleground of identity fraud

Basu et al. | Oct 09, 2024

The authors looked at variables associated with identity fraud in the US. They found that the national unemployment rate and online banking usage are among the significant variables that explain identity fraud.

Using two-step machine learning to predict harmful algal bloom risk

Shukla et al. | Jul 04, 2025

Using machine learning to predict the risk of algae bloom

The effect of COVID-19 on the USA house market

Xiao et al. | Nov 19, 2022

COVID-19 has impacted the way many people go about their daily lives, but what are the main factors driving the changes in the housing market, particularly house prices?

Comparative study of machine learning models for water potability prediction

Lee et al. | Mar 31, 2025

The global issue of water quality has led to the use of machine learning models, like ANN and SVM, to predict water potability. However, these models can be complex and resource-intensive. This research aimed to find a simpler, more efficient model for water quality prediction.
