Browse Articles

Using data science along with machine learning to determine the ARIMA model’s ability to adjust to irregularities in the dataset

Choudhary et al. | Jul 26, 2021

Auto-Regressive Integrated Moving Average (ARIMA) models are widely applied to time series data. This statistical model uses past observations in a time series to forecast future trends or values, which makes it a key contributor to crime mapping algorithms. However, the models may not perform to their full potential when analyzing data with many different patterns. To determine the potential of ARIMA models, we tested the model on irregularities in the data. We hypothesized that the ARIMA model would be able to adapt to irregularities in the data that do not correspond to a certain trend or pattern. Using theft crime data and an ARIMA model, we evaluated the ARIMA model’s forecasts and how their accuracy differed on days with irregularities in crime.

Tree-Based Learning Algorithms to Classify ECG with Arrhythmias

Sun et al. | Apr 23, 2025

Arrhythmias vary in type and treatment, and ECGs are used to detect them, though human interpretation can be inconsistent. The researchers tested four tree-based algorithms (gradient boosting, random forest, decision tree, and extra trees) on ECG data from over 10,000 patients.

Using broad health-related survey questions to predict the presence of coronary heart disease

Chavda et al. | Aug 23, 2024

Coronary heart disease (CHD) is the leading cause of death in the U.S., responsible for nearly 700,000 deaths in 2021, and is marked by artery clogging that can lead to heart attacks. Traditional prediction methods require expensive clinical tests, but a new study explores using machine learning on demographic, clinical, and behavioral survey data to predict CHD.

SmartZoo: A Deep Learning Framework for an IoT Platform in Animal Care

Ji et al. | Aug 07, 2024

Zoos offer educational and scientific advantages but face high maintenance costs and challenges in animal care due to the diverse habits of different species. Challenges include tracking animals, detecting illnesses, and creating suitable habitats. We developed a deep learning framework called SmartZoo to address these issues and enable efficient animal monitoring, condition alerts, and data aggregation. We found that the data generated by our model is closer to real data than random data is, indicating that the model can generate data that resembles real-world data.

Part of speech distributions for Grimm versus artificially generated fairy tales

Arvind et al. | Nov 16, 2024

Image credit: Nayalia Y.

Here, the authors wanted to explore mathematical paradoxes in which there are multiple contradictory interpretations or analyses for a problem. They used ChatGPT to generate a novel dataset of fairy tales. They found statistical differences between the artificially generated text and human-produced text based on the distribution of parts of speech.

Correlation between shutdowns and CO levels across the United States.

Gupta et al. | Dec 05, 2021

Concerns regarding the rapid spread of SARS-CoV-2 in early 2020 led company and local government officials in many states to ask people to work from home and avoid leaving their homes, measures commonly referred to as shutdowns. Here, the authors investigate how shutdowns affected carbon monoxide (CO) levels in 15 US states using publicly available data. Their results suggest that CO levels decreased as a result of these measures over the course of 2020, a trend that started to reverse after shutdowns ended.

Machine learning on crowd-sourced data to highlight coral disease

Narayan et al. | Jul 26, 2021

Triggered largely by the warming and pollution of oceans, corals are experiencing bleaching and a variety of diseases caused by the spread of bacteria, fungi, and viruses. Identification of bleached or diseased corals enables implementation of measures to halt or slow disease. As a standalone measure of reef health, benthic cover analysis, a standard metric used in large databases to assess live coral cover, is insufficient for identifying coral bleaching or disease. Proposed herein is a solution that couples machine learning with crowd-sourced data (images from government archives, citizen science projects, and personal images collected by tourists) to build a model capable of identifying healthy, bleached, and/or diseased coral.

The gender gap in STEM at top U.S. Universities: change over time and relationship with ranking

Kruus et al. | Jun 25, 2024

The authors address the gender disparity in STEM fields, examining changes in gender diversity across male-dominated undergraduate programs over 19 years at 24 top universities. Analyzing data from NCES IPEDS, they identify STEM as persistently male-dominated but note increasing gender diversity in many disciplines, particularly in recent years. Results indicate that, in disciplines like computer science and mechanical engineering, higher university ranking is weakly correlated with improved gender diversity, suggesting that effective initiatives can mitigate the gender gap in STEM despite ongoing challenges.
