Articles | Journal of Emerging Investigators

Comparison of three large language models as middle school math tutoring assistants

Ramanathan et al. | May 02, 2024

Middle school math forms the basis for advanced mathematical courses leading up to the university level. Large language models (LLMs) have the potential to power next-generation educational technologies, acting as digital tutors to students. The main objective of this study was to determine whether LLMs like ChatGPT, Bard, and Llama 2 can serve as reliable middle school math tutoring assistants on three tutoring tasks: hint generation, comprehensive solution, and exercise creation.

Automated classification of nebulae using deep learning & machine learning for enhanced discovery

Nair et al. | Feb 01, 2024

There are believed to be ~20,000 nebulae in the Milky Way Galaxy. However, humans have only cataloged ~1,800 of them even though we have gathered 1.3 million nebula images. Classification of nebulae is important as it helps scientists understand the chemical composition of a nebula which in turn helps them understand the material of the original star. Our research on nebulae classification aims to make the process of classifying new nebulae faster and more accurate using a hybrid of deep learning and machine learning techniques.

Optimizing Interplanetary Travel Using a Genetic Algorithm

Murali et al. | Oct 28, 2018

In this work, the authors develop an algorithm that solves the problem of efficient space travel between planets. This is a problem that could soon be of relevance as mankind continues to expand its exploration of outer space, and potentially attempt to inhabit it.

Analysis of quantitative classification and properties of X-ray binary systems

Kuppusamy et al. | Oct 09, 2025

The authors looked at variables and their patterns and how those contribute to the properties of X-ray binaries.

Optimizing data augmentation to improve machine learning accuracy on endemic frog calls

Anand et al. | Mar 09, 2025

The mountain chain of the Western Ghats on the Indian peninsula, a UNESCO World Heritage site, is home to about 200 frog species, 89 of which are endemic. Because vocalizations are distinctive to each frog species, they can be used for species recognition. Manually surveying frogs at night during the rain in forests inhabited by elephants and big cats is difficult, so being able to autonomously record ambient soundscapes and identify species is essential. An effective machine learning (ML) species classifier requires substantial training data from this area, but the available dataset of frog vocalizations from this region has a minimal number of audio recordings per species. Consequently, enhancing an ML model’s performance with limited data is necessary. The goal of this study was to assess data augmentation techniques on this dataset. We analyzed the effects of four data augmentation techniques (Time Shifting, Noise Injection, Spectral Augmentation, and Test-Time Augmentation), individually and in combination, on the frog vocalization data and the public environmental sounds dataset (ESC-50). Combining the data augmentation techniques improved the model’s relative accuracy as the size of the dataset decreased. The combination of all four techniques improved the ML model’s classification accuracy on the frog calls dataset by 94%. This study established a data augmentation approach to maximize classification accuracy with sparse frog call recordings, thereby making it possible to build a real-world automated field system for identifying frog species. Such a system can significantly help in the conservation of frog species in this vital biodiversity hotspot.

Can the nucleotide content of a DNA sequence predict the sequence accessibility?

Balachandran et al. | Mar 10, 2023

Sequence accessibility is an important factor affecting gene expression. Sequence accessibility, or openness, affects the likelihood that a gene is transcribed and translated into a protein that performs functions and manifests traits. Many potential factors affect the accessibility of a gene. In this study, our hypothesis was that the nucleotide content of a genetic sequence predicts its accessibility. Using a machine learning linear regression model, we studied the relationship between nucleotide content and accessibility.

Predicting asthma-related emergency department visits and hospitalizations with machine learning techniques

Chatterjee et al. | Oct 25, 2021

Seeking to investigate the effects of ambient pollutants on human respiratory health, here the authors used machine learning to examine asthma in Los Angeles County, an area with substantial pollution. By using machine learning models and classification techniques, the authors identified that nitrogen dioxide and ozone levels were significantly correlated with asthma hospitalizations. Based on an identified seasonal surge in asthma hospitalizations, the authors suggest future directions to improve machine learning modeling to investigate these relationships.

Building a video classifier to improve the accuracy of depth-aware frame interpolation

Jasper et al. | May 24, 2021

In this study, the authors share their work on improving the frame rate of videos to reduce data sent to users with both 2D and 3D footage. This work helps improve the experience for both types of footage!

Comparison of the ease of use and accuracy of two machine learning algorithms – forestry case study

Bhatia et al. | Mar 21, 2021

Machine learning algorithms are becoming increasingly popular for data crunching across a vast area of scientific disciplines. Here, the authors compare two machine learning algorithms with respect to accuracy and user-friendliness and find that random forest algorithms outperform logistic regression when applied to the same dataset.

Using text embedding models as text classifiers with medical data

Goel et al. | Nov 19, 2024

This article describes the classification of medical text data using vector databases and text embedding. Various large language models were used to generate this medical data for the classification task.
