Browse Articles

Gradient boosting with temporal feature extraction for modeling keystroke log data

Barretto et al. | Oct 04, 2024

Image credit: Barretto and Barretto 2024.

Although there has been great progress in the field of natural language processing (NLP) over the last few years, particularly with the development of attention-based models, less research has gone into modeling keystroke log data. State-of-the-art methods handle textual data directly, and while this has produced excellent results, the time complexity and resource usage of such methods are quite high. Additionally, these methods fail to incorporate the actual writing process when assessing text and instead focus solely on the content. Therefore, we proposed a framework for modeling textual data using keystroke-based features. Such methods pay attention to how a document or response was written, rather than the final text that was produced. These features are vastly different from the kind of features extracted from raw text but reveal information that is otherwise hidden. We hypothesized that pairing efficient machine learning techniques with keystroke log information would produce results comparable to those of transformer techniques (models that pay more or less attention to the different components of a text sequence) in far less time. Transformer-based methods currently dominate the field of NLP due to the strong understanding of natural language they display. We showed that models trained on keystroke log data are capable of effectively evaluating the quality of writing and do so in significantly less time than traditional methods. This is significant as it provides a much-needed fast and cheap alternative to increasingly large and slow LLMs.

Identification of potential therapeutic targets for multiple myeloma by gene expression analysis

Kochenderfer et al. | Apr 26, 2024

Image credit: The authors

A central challenge of cancer therapy is identifying treatments that will effectively target cancer cells while minimizing effects on healthy cells. To identify potential targets for treating multiple myeloma, a frequently incurable cancer, Kochenderfer and Kochenderfer analyze RNA sequencing data from the Cancer Cell Line Encyclopedia to find genes with high expression in multiple myeloma cells and low expression in normal tissues.

Investigating ecosystem resiliency in different flood zones of south Brooklyn, New York

Ng et al. | Mar 23, 2024

Image credit: Ng and Zheng et al. 2024

With climate change and rising sea levels, south Brooklyn is exposed to massive flooding and intense precipitation. Previous research discovered that flooding shifts plant species distribution, decreases soil pH, and increases salt concentration and nitrogen, phosphorus, and potassium levels. The authors predicted a decreasing trend from Zone 1 to Zone 6: high-pH, high-salt, and high-nutrient soil in more flood-prone areas to low-pH, low-salt, and low-nutrient soil in less flood-prone regions. They performed DNA barcoding to identify plant species inhabiting the flood zones, expecting the plants' salt tolerance and moisture uptake to decrease from Zone 1 to Zone 6. Furthermore, they predicted an increase in invasive species, ultimately resulting in a decrease in biodiversity. After barcoding, they researched existing information regarding invasiveness, ideal soil, pH tolerance, and salt tolerance. They performed soil analyses to identify pH, nitrogen (N), phosphorus (P), and potassium (K) levels. For N and P levels, they discovered a general decreasing trend from Zone 1 to Zone 6 with low and moderate statistical significance, respectively. Previous studies found that soil moisture can increase N and P uptake, helping plants adopt efficient resource-use strategies and reduce water stress from flooding. Although characteristics of plants were distributed throughout all zones, demonstrating overall diversity, the soil analyses hinted at the possibility of a rising trend of plants adapting to the increase in flooding. More extensive research is needed to comprehensively map these trends. Ultimately, investigating trends between flood zones and the prevalence of different species will assist in guiding solutions to weathering climate change and protecting biodiversity in Brooklyn.

Predicting baseball pitcher efficacy using physical pitch characteristics

Oberoi et al. | Jan 11, 2024

Image credit: Antoine Schibler

Here, the authors sought to develop a new metric to evaluate the efficacy of baseball pitchers using machine learning models. They found that the frequency of balls was the most predictive feature for their walks plus hits allowed per inning pitched (WHIP) metric. While their machine learning models did not identify a defining trait, such as high velocity, spin rate, or types of pitches, they found that consistently pitching within the strike zone resulted in significantly lower WHIPs.
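
For readers unfamiliar with the metric, WHIP is conventionally computed as walks plus hits divided by innings pitched. The snippet below is a minimal sketch of that standard formula only, not the authors' model; the function name `whip` and the choice to count innings as outs recorded (three outs per inning) are assumptions made for illustration.

```python
def whip(walks: int, hits: int, outs_recorded: int) -> float:
    """Walks plus hits per inning pitched.

    Innings are passed as outs recorded (3 outs = 1 inning), an
    illustrative choice that avoids ambiguous notation such as "6.1".
    """
    if outs_recorded <= 0:
        raise ValueError("outs_recorded must be positive")
    innings_pitched = outs_recorded / 3
    return (walks + hits) / innings_pitched


# Hypothetical season line: 50 walks and 150 hits over 200 innings (600 outs).
print(whip(50, 150, 600))  # 1.0
```

A lower value means fewer baserunners allowed per inning, which is why fewer balls (and so fewer walks) would tend to push it down.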

Predicting college retention rates from Google Street View images of campuses

Dileep et al. | Jan 02, 2024

Image credit: Dileep et al. 2024

Every year, around 40% of undergraduate students in the United States discontinue their studies, resulting in a loss of valuable education for students and a loss of money for colleges. Even so, colleges across the nation struggle to discover the underlying causes of these high dropout rates. In this paper, the authors discuss the use of machine learning to find correlations between built environment factors and the retention rates of colleges. They hypothesized that one way for colleges to improve their retention rates could be to make the physical characteristics of their campuses more pleasing. The authors used image classification techniques to look at images of colleges and correlate certain features like colors, cars, and people to higher or lower retention rates. With three possible options of high, medium, and low retention rates, the probability that a model would reach the right conclusion by simply choosing randomly was 33%. After finding that this 33%, or 0.33 mark, always fell outside of the 99% confidence intervals built around their models’ accuracies, the authors concluded that their machine learning techniques can be used to find correlations between certain environmental factors and retention rates.
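
The comparison against chance described above can be sketched as follows. This is an illustrative example only: the abstract does not say how the confidence intervals were built, so the normal-approximation (Wald) interval, the function names, and the sample counts below are all assumptions.

```python
import math

Z_99 = 2.576  # two-sided 99% critical value of the standard normal


def accuracy_interval(correct: int, total: int, z: float = Z_99):
    """Normal-approximation confidence interval for a classifier's accuracy."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need 0 <= correct <= total and total > 0")
    p = correct / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def chance_outside_interval(correct: int, total: int, n_classes: int = 3) -> bool:
    """True if the random-guess accuracy (1 / n_classes) lies outside the interval."""
    low, high = accuracy_interval(correct, total)
    chance = 1 / n_classes
    return chance < low or chance > high


# Hypothetical counts: 120 of 200 test images classified correctly.
print(accuracy_interval(120, 200))
print(chance_outside_interval(120, 200))
```

With three balanced classes, an interval that lies entirely above 1/3 indicates the model is doing better than random guessing at that confidence level.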

The extent to which storefront alcohol advertising differs by community profile in Michigan

Voyt et al. | May 17, 2023

Image credit: Steve Harvey

Here, recognizing from previous studies that alcohol manufacturers may target ethnic minorities and youths with specific forms of advertisements, the authors considered how alcohol storefronts differ depending on the community they are located in. Specifically, they looked at differences between high- and low-income suburban communities in Metro Detroit. They found that alcohol stores in the low-income areas had more and larger alcohol and malt liquor advertisements per store, in addition to being within 1,000 feet of a school.

Hybrid Quantum-Classical Generative Adversarial Network for synthesizing chemically feasible molecules

Sikdar et al. | Jan 10, 2023

Current drug discovery processes can cost billions of dollars and usually take five to ten years. Researchers have been developing and implementing various computational approaches to search for molecules and compounds in chemical space, which can contain on the order of 10^60 molecules. One solution involves deep generative models, which are artificial intelligence models that learn from nonlinear data by modeling the probability distribution of chemical structures and creating similar data points from the trends they identify. Aiming for faster runtime and greater robustness when analyzing high-dimensional data, we designed and implemented a Hybrid Quantum-Classical Generative Adversarial Network (QGAN) to synthesize molecules.

Differential privacy in machine learning for traffic forecasting

Vinay et al. | Dec 21, 2022

In this paper, we measured the privacy budgets and utilities of different differentially private mechanisms combined with different machine learning models that forecast traffic congestion at future timestamps. We expected the ANNs combined with the Staircase mechanism to perform the best with every value in the privacy budget range, especially with the medium-high values of the privacy budget. In this study, we used the Autoregressive Integrated Moving Average (ARIMA) and neural network models to forecast and then added differentially private Laplacian, Gaussian, and Staircase noise to our datasets. We tested two real traffic congestion datasets, experimented with the different models, and examined their utility for different privacy budgets. We found that a favorable combination for this application was neural networks with the Staircase mechanism. Our findings identify the optimal models when dealing with tricky time series forecasting and can be used in non-traffic applications like disease tracking and population growth.
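
To make the idea of adding differentially private noise concrete, here is a minimal sketch of the textbook Laplace mechanism, in which the noise scale is the query sensitivity divided by the privacy budget epsilon. It is not the authors' pipeline: the function names, the sensitivity value, and the traffic readings are hypothetical, and the Gaussian and Staircase mechanisms are not shown.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(values, sensitivity: float, epsilon: float, seed=None):
    """Add Laplace noise with scale = sensitivity / epsilon to each value."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    return [v + laplace_noise(scale, rng) for v in values]


# Hypothetical congestion readings (vehicles per interval), assumed sensitivity of 1.
readings = [42.0, 57.0, 61.0, 38.0]
print(privatize(readings, sensitivity=1.0, epsilon=0.5, seed=0))
```

A smaller epsilon gives a larger noise scale, which is the privacy-utility trade-off examined across the budget range: stronger privacy comes at the cost of noisier inputs to the forecasting model.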

Effect of hypervitaminosis A in regenerating planaria: A potential model for teratogenicity testing

Bennet et al. | Dec 12, 2022

This research study evaluated the potential use of the flatworm brown planaria (Dugesia tigrina) as an alternative model for teratogenicity testing. In this study, we exposed amputated planaria to varying concentrations of a known teratogen, vitamin A (retinol), for approximately 2 weeks, and evaluated multiple parameters including the formation of blastema and eyes. The results from this study demonstrated that high concentrations of retinol caused defects in head and eye formation in regenerating planaria, with similarities to vitamin A-related teratogenicity findings in mammals. Based on these results, regenerating brown planaria are a promising alternative model for teratogenicity testing, which could be paradigm-shifting as it can reduce cost, time, and pregnant animal use in research.

Combinatorial treatment by siNOTCH and retinoic acid decreases A172 brain cancer cell growth

Richardson et al. | Nov 14, 2022

Treatments inhibiting Notch signaling pathways have been explored by researchers as a new approach for the treatment of glioblastoma, a fast-growing and aggressive brain tumor. Recently, retinoic acid (RA) therapy, which inhibits Notch signaling, has shown a promising effect on inhibiting glioblastoma progression. RA, a metabolite of vitamin A, is very important in embryonic cellular development, where it regulates multiple developmental processes, such as brain neurogenesis. However, high doses of RA treatment have caused many side effects such as headaches, nausea, redness around the injection site, or allergic reactions. Therefore, we hypothesized that a combination treatment of RA and siRNA targeting NOTCH1 (siNOTCH1), the essential gene that activates Notch signaling, would effectively inhibit brain cancer cell proliferation. The aim of the study was to determine, by cell viability assay, whether inhibiting NOTCH1 would inhibit the growth of brain cancer cells. We found that the combination treatment of siNOTCH1 and RA at a low concentration effectively decreased the NOTCH1 expression level compared to the individual treatments. However, the combination treatment significantly decreased the number of live brain cancer cells only at a low concentration of RA. We anticipate that this novel combination treatment can provide a solution to the side effects of chemotherapy.
