Browse Articles

Gradient boosting with temporal feature extraction for modeling keystroke log data

Barretto et al. | Oct 04, 2024

Image credit: Barretto and Barretto 2024.

Although natural language processing (NLP) has progressed greatly over the last few years, particularly with the development of attention-based models, comparatively little research has addressed the modeling of keystroke log data. State-of-the-art methods handle textual data directly, and while this has produced excellent results, their time complexity and resource usage are high. Additionally, these methods focus solely on the content of a text and fail to incorporate the writing process that produced it. Therefore, we proposed a framework for modeling textual data using keystroke-based features. Such methods pay attention to how a document or response was written, rather than to the final text that was produced. These features are vastly different from the kind of features extracted from raw text but reveal information that is otherwise hidden. We hypothesized that pairing efficient machine learning techniques with keystroke log information would produce results comparable to those of transformer techniques in far less time. Transformers are models that pay more or less attention to the different components of a text sequence, and they currently dominate the field of NLP due to the strong understanding of natural language they display. We showed that models trained on keystroke log data can effectively evaluate the quality of writing and do so in significantly less time than traditional methods. This is significant as it provides a fast and cheap alternative to increasingly large and slow LLMs.

Risk assessment modeling for childhood stunting using automated machine learning and demographic analysis

Sirohi et al. | Sep 25, 2022

Over the last few decades, childhood stunting has persisted as a major global challenge. This study hypothesized that TPOT (Tree-based Pipeline Optimization Tool), an AutoML (automated machine learning) tool, would outperform all pre-existing machine learning models and reveal the positive impact of economic prosperity, strong familial traits, and resource attainability on reducing stunting risk. Feature correlation plots revealed that maternal height, wealth indicators, and parental education were universally important features for determining stunting outcomes approximately two years after birth. These results help inform future research by highlighting how demographic, familial, and socio-economic conditions influence stunting, and they provide medical professionals with a deployable risk assessment tool for predicting childhood stunting.

Investigating ecosystem resiliency in different flood zones of south Brooklyn, New York

Ng et al. | Mar 23, 2024

Image credit: Ng and Zheng et al. 2024

With climate change and rising sea levels, south Brooklyn is exposed to massive flooding and intense precipitation. Previous research discovered that flooding shifts plant species distribution, decreases soil pH, and increases salt concentration, nitrogen, phosphorus, and potassium levels. The authors predicted a decreasing trend from Zone 1 to 6: high-pH, high-salt, and high-nutrient soil in more flood-prone areas, shifting to low-pH, low-salt, and low-nutrient soil in less flood-prone regions. They performed DNA barcoding to identify plant species inhabiting the flood zones, expecting plant salt tolerance and moisture uptake to decrease from Zone 1 to 6. Furthermore, they predicted an increase in invasive species, ultimately resulting in a decrease in biodiversity. After barcoding, they researched existing information regarding each species' invasiveness, ideal soil, pH tolerance, and salt tolerance. They performed soil analyses to identify pH, nitrogen (N), phosphorus (P), and potassium (K) levels. For N and P levels, they discovered a general decreasing trend from Zone 1 to 6, with low and moderate statistical significance, respectively. Previous studies found that soil moisture can increase N and P uptake, helping plants adopt efficient resource-use strategies and reduce water stress from flooding. Although plant characteristics were distributed throughout all zones, demonstrating overall diversity, the soil analyses hinted at a possible rising trend of plants adapting to the increase in flooding. More extensive future research is needed to map these trends comprehensively. Ultimately, investigating trends between flood zones and the prevalence of different species will help guide solutions for weathering climate change and protecting biodiversity in Brooklyn.

Tap water quality analysis in Ulaanbaatar City

Munkhbat et al. | Sep 25, 2022

There have been several issues concerning the water quality in Ulaanbaatar, Mongolia in the past few years. In this study, we collected 28 samples from 6 districts of Ulaanbaatar to check whether the quality of the water supply met the standards of the World Health Organization, the Environmental Protection Agency, and a Mongolian National Standard. Only three samples fully met all the requirements of the global standards. Samples from Zaisan showed higher hardness (>120 ppm) and alkalinity (20–200 ppm) levels than those from the other districts in the city. Overall, the results show that it is important to ensure a safe and accessible water supply in Ulaanbaatar to prevent future water quality issues.
