Browse Articles

Using data science along with machine learning to determine the ARIMA model’s ability to adjust to irregularities in the dataset

Choudhary et al. | Jul 26, 2021

Auto-Regressive Integrated Moving Average (ARIMA) models are widely applied to time series data. This statistical model uses past values of a time series to forecast future trends or values, which makes it a key contributor to crime mapping algorithms. However, the models may not reach their full potential when analyzing data with many different patterns. To determine the potential of ARIMA models, our research tests the model on irregularities in the data. Our team hypothesizes that the ARIMA model will be able to adapt to irregularities in the data that do not correspond to a certain trend or pattern. Using crime theft data and an ARIMA model, we evaluated the ARIMA model’s forecasts and how their accuracy differed on days with irregularities in crime.

Tree-Based Learning Algorithms to Classify ECG with Arrhythmias

Sun et al. | Apr 23, 2025

Arrhythmias vary in type and treatment, and ECGs are used to detect them, though human interpretation can be inconsistent. The researchers tested four tree-based algorithms (gradient boosting, random forest, decision tree, and extra trees) on ECG data from over 10,000 patients.

The effect of economic downturns on the frequency of mass shootings

Bhupathi et al. | Jul 11, 2025

Researching gun violence and mass shootings in the U.S. is difficult due to the lack of consistent data collection. Some studies have linked mass shootings to personal financial stress, but little formal research exists on the impact of broader economic conditions. This study hypothesized an inverse relationship between mass shootings and economic performance, using the S&P 500 and unemployment rate as indicators.

Using broad health-related survey questions to predict the presence of coronary heart disease

Chavda et al. | Aug 23, 2024

Coronary heart disease (CHD) is the leading cause of death in the U.S., responsible for nearly 700,000 deaths in 2021, and is marked by artery clogging that can lead to heart attacks. Traditional prediction methods require expensive clinical tests, but a new study explores using machine learning on demographic, clinical, and behavioral survey data to predict CHD.

SmartZoo: A Deep Learning Framework for an IoT Platform in Animal Care

Ji et al. | Aug 07, 2024

Zoos offer educational and scientific advantages but face high maintenance costs and challenges in animal care due to the diverse habits of different species. Challenges include tracking animals, detecting illnesses, and creating suitable habitats. We developed a deep learning framework called SmartZoo to address these issues and enable efficient animal monitoring, condition alerts, and data aggregation. We found that the data generated by our model is closer to real data than random data is, demonstrating that the model can generate data that resembles real-world data.

Analyzing market dynamics and optimizing sales performance with machine learning

Kamat et al. | May 31, 2025

This study uses interpretable machine learning models, lasso and ridge regression with Shapley analysis, to identify key sales drivers for Corporación Favorita, Ecuador’s largest grocery chain. The results show that macroeconomic factors, especially labor force size, have the greatest impact on sales, though geographic and seasonal variables like city altitude and holiday proximity also play important roles. These insights can help businesses focus on the most influential market conditions to enhance competitiveness and profitability.

Part of speech distributions for Grimm versus artificially generated fairy tales

Arvind et al. | Nov 16, 2024

Image credit: Nayalia Y.

Here, the authors wanted to explore mathematical paradoxes in which there are multiple contradictory interpretations or analyses for a problem. They used ChatGPT to generate a novel dataset of fairy tales. They found statistical differences between the artificially generated text and human-produced text based on the distribution of parts of speech.

Correlation between shutdowns and CO levels across the United States.

Gupta et al. | Dec 05, 2021

Concerns regarding the rapid spread of SARS-CoV-2 in early 2020 led company and local governmental officials in many states to ask people to work from home and avoid leaving their homes, measures commonly referred to as shutdowns. Here, the authors investigate how shutdowns affected carbon monoxide (CO) levels in 15 US states using publicly available data. Their results suggest that CO levels decreased as a result of these measures over the course of 2020, a trend that started to reverse after shutdowns ended.

Machine learning on crowd-sourced data to highlight coral disease

Narayan et al. | Jul 26, 2021

Triggered largely by the warming and pollution of oceans, corals are experiencing bleaching and a variety of diseases caused by the spread of bacteria, fungi, and viruses. Identification of bleached/diseased corals enables implementation of measures to halt or retard disease. Benthic cover analysis, a standard metric used in large databases to assess live coral cover, as a standalone measure of reef health is insufficient for identification of coral bleaching/disease. Proposed herein is a solution that couples machine learning with crowd-sourced data – images from government archives, citizen science projects, and personal images collected by tourists – to build a model capable of identifying healthy, bleached, and/or diseased coral.
