Browse Articles

Gradient boosting with temporal feature extraction for modeling keystroke log data

Barretto et al. | Oct 04, 2024

Image credit: Barretto and Barretto 2024.

Although natural language processing (NLP) has progressed greatly over the last few years, particularly with the development of attention-based models, less research has addressed modeling keystroke log data. State-of-the-art methods handle textual data directly, and while this has produced excellent results, their time complexity and resource usage are quite high. Additionally, these methods assess only the content of a text and ignore the writing process that produced it. Therefore, we proposed a framework for modeling textual data using keystroke-based features, which capture how a document or response was written rather than the final text that was produced. These features are vastly different from the kind extracted from raw text but reveal information that is otherwise hidden. Transformer-based methods, models which pay more or less attention to the different components of a text sequence, currently dominate the field of NLP due to the strong understanding of natural language they display. We hypothesized that pairing efficient machine learning techniques with keystroke log information should produce results comparable to transformer techniques in far less time. We showed that models trained on keystroke log data can effectively evaluate the quality of writing, and do so in significantly less time than traditional methods. This is significant as it provides a much-needed fast and cheap alternative to increasingly large and slow LLMs.
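As a rough illustration of what keystroke-based temporal features might look like, the sketch below summarizes a toy keystroke log into a few timing statistics. The log format (timestamp in milliseconds plus an event kind), the feature names, and the 2000 ms pause threshold are all assumptions made for this example; they are not taken from the study.

```python
from statistics import mean

# Hypothetical log format: a list of (timestamp_ms, kind) tuples,
# where kind is "insert" or "delete". This layout is an assumption.
def temporal_features(events, pause_ms=2000):
    """Summarize a keystroke log into simple timing features."""
    times = [t for t, _ in events]
    # Gaps between consecutive keystrokes, in milliseconds.
    gaps = [b - a for a, b in zip(times, times[1:])]
    return {
        "n_events": len(events),
        "total_time_ms": times[-1] - times[0] if times else 0,
        "mean_gap_ms": mean(gaps) if gaps else 0.0,
        "max_gap_ms": max(gaps, default=0),
        # Gaps at or above the (assumed) threshold count as long pauses.
        "n_long_pauses": sum(g >= pause_ms for g in gaps),
        "n_deletions": sum(kind == "delete" for _, kind in events),
    }

# Toy log, invented for illustration.
log = [(0, "insert"), (150, "insert"), (400, "insert"),
       (3400, "delete"), (3600, "insert")]
features = temporal_features(log)
```

A feature dictionary like this, computed once per written response, could then serve as one input row for a gradient boosting model. Which features and which boosting library the authors actually used is not stated in this summary.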

Using NLP to ascertain changes in the fast-fashion industry based on UN sustainable development goals

Chadha et al. | Sep 11, 2023

Image credit: Prudence Earl

Here, the authors sought to evaluate the efforts of fast-fashion clothing companies towards sustainability, specifically with regard to the United Nations Sustainable Development Goals. The authors used natural language processing to investigate the sustainability reports of fast-fashion companies, focusing on terms established by the UN. They found that the most consistently addressed areas were related to sustainable consumption and production, with a focus on health and well-being emerging during the recent pandemic.

The effect of activation function choice on the performance of convolutional neural networks

Wang et al. | Sep 15, 2023

Image credit: Tara Winstead

With advances in technology, artificial intelligence (AI) is now widely applied in society. Machine learning (ML) is a subfield of AI in which a machine improves at certain tasks through experience. This work focuses on the convolutional neural network (CNN), a type of ML model, applied to an image classification task. Specifically, we analyzed how the performance of the CNN changes with the choice of neural activation function.

Part of speech distributions for Grimm versus artificially generated fairy tales

Arvind et al. | Nov 16, 2024

Image credit: Nayalia Y.

Here, the authors compared the part-of-speech distributions of Grimm fairy tales with those of artificially generated fairy tales. They used ChatGPT to generate a novel dataset of fairy tales. They found statistical differences between the artificially generated text and the human-written text based on the distribution of parts of speech.
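To make the idea of comparing part-of-speech distributions concrete, here is a minimal sketch that turns two lists of tags into relative frequencies and computes a Pearson chi-square statistic over their counts. The tag lists are toy data invented for this example, and the choice of a chi-square statistic is an assumption; the summary does not say which statistical test the authors applied.

```python
from collections import Counter

def pos_distribution(tags):
    """Relative frequency of each part-of-speech tag."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}

def chi_square(tags_a, tags_b):
    """Pearson chi-square statistic for a 2 x k table of tag counts."""
    a, b = Counter(tags_a), Counter(tags_b)
    n_a, n_b = sum(a.values()), sum(b.values())
    n = n_a + n_b
    stat = 0.0
    for tag in set(a) | set(b):
        col = a[tag] + b[tag]  # column total for this tag
        for observed, row in ((a[tag], n_a), (b[tag], n_b)):
            expected = row * col / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Toy tag sequences, invented for illustration (not from the study).
human = ["NOUN", "VERB", "NOUN", "ADJ", "NOUN", "VERB", "DET", "NOUN"]
generated = ["NOUN", "ADJ", "ADJ", "VERB", "DET", "ADJ", "NOUN", "ADJ"]

human_dist = pos_distribution(human)
statistic = chi_square(human, generated)
```

A larger statistic indicates that the two tag distributions differ more. In practice the tags would come from a part-of-speech tagger run over each corpus, and the statistic would be compared against a chi-square distribution to judge significance.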

An explainable model for content moderation

Cao et al. | Aug 16, 2023

The authors examined the ability of machine learning algorithms to interpret language, given their increasing use in moderating content on social media. Using an explainable model, they achieved 81% accuracy in distinguishing fake from real news based on the language of posts alone.

Reddit v. Wall Street: Why Redditors beat Wall Street at its own game

Bhakar et al. | Sep 13, 2022

Here, the authors investigated the motivations behind the short squeeze of GameStop stock, in which users of the internet forum Reddit drove a sudden increase in GameStop's stock price at the start of 2021. They combined qualitative and quantitative analyses, tracking activity on the r/WallStreetBets subreddit in relation to mentions of GameStop. With these methods, they found that while the short squeeze was initially driven by financial motivations, emotional motivations later became more important. They suggest that social phenomena can be dynamic and evolve, necessitating mixed-method approaches to study them.
