Articles | Journal of Emerging Investigators

Does language familiarity affect typing speed?

Shin et al. | Aug 23, 2024

In cognitive psychology, typed responses are used to assess thinking skills and creativity, but research on factors influencing typing speed is limited. This study examined how language familiarity affects typing speed, hypothesizing that familiarity with a language would correlate with faster typing. Participants typed faster in English than in Latin, and those unfamiliar with Latin showed a larger discrepancy between the two languages. Latin education level, however, did not significantly affect typing speed. The results highlight the role of language familiarity in typing performance.

A natural language processing approach to skill identification in the job market

Suram et al. | Oct 29, 2024

The authors looked at using machine learning to identify the skills needed to apply for certain jobs, comparing different techniques for parsing the text of job postings. They found that Bidirectional Encoder Representations from Transformers (BERT) performed best.

Large Language Models are Good Translators

Zeng et al. | Oct 16, 2024

Machine translation remains a challenging area in artificial intelligence, with neural machine translation (NMT) making significant strides over the past decade but still facing hurdles, particularly in translation quality due to the reliance on expensive bilingual training data. This study explores whether large language models (LLMs), like GPT-4, can be effectively adapted for translation tasks and outperform traditional NMT systems.

Changes in Aromanian language use and the Aromanian ethnolinguistic group’s reaction to decline

Ganea et al. | Sep 10, 2021

The Aromanian language and culture are quickly declining towards extinction. In this new research article, Ganea and Lascu provide evidence that, although the use of the Aromanian language is less prevalent among younger individuals, participants overwhelmingly support the preservation of Aromanian language and culture.

Comparison of three large language models as middle school math tutoring assistants

Ramanathan et al. | May 02, 2024

Middle school math forms the basis for advanced mathematical courses leading up to the university level. Large language models (LLMs) have the potential to power next-generation educational technologies, acting as digital tutors to students. The main objective of this study was to determine whether LLMs like ChatGPT, Bard, and Llama 2 can serve as reliable middle school math tutoring assistants on three tutoring tasks: hint generation, comprehensive solution, and exercise creation.

Uncovering the hidden trafficking trade with geographic data and natural language processing

Aqid et al. | Oct 14, 2024

The authors use machine learning to develop an evidence-based detection tool for identifying human trafficking.

Rhythmic lyrics translation: Customizing a pre-trained language model using stacked fine-tuning

Chong et al. | May 01, 2023

Neural machine translation (NMT) uses neural network techniques to translate text from one language to another. However, one of the most famous NMT systems, Google Translate, failed to give an accurate English translation of a famous Korean nursery rhyme, "Airplane" (비행기). The authors fine-tuned a pre-trained model first with a dataset from the lyrics domain, and then with a smaller dataset containing the rhythmical properties, to teach the model to translate rhythmically accurate lyrics. This stacked fine-tuning method resulted in an NMT model that could maintain the rhythmical characteristics of lyrics during translation, while single fine-tuned models failed to do so.

Grammatical Gender and Politics: A Comparison of French and English in Political Discourse

Zhang et al. | Jul 07, 2021

Grammatical gender systems are prevalent across many languages, and the presence of such a system is a strong point of distinction between French and English. Some studies attribute to grammatical gender the ability to influence the conceptualization of all nouns (by assigning them gendered attributes), thus affecting people's thoughts on a grand scale. We hypothesized that, due to the influence of a grammatical gender system, French political discourse would show a large difference between the number of masculine and feminine nouns used. Specifically, we predicted that the ratio of feminine to masculine nouns would be larger in French political discourse than in non-political discourse when compared to English discourse. Through linguistic analysis of gendered nouns in French political writing, we found a clear difference between the number of feminine and masculine nouns, signaling a preference for a more “effeminate” language.

Gradient boosting with temporal feature extraction for modeling keystroke log data

Barretto et al. | Oct 04, 2024

Although there has been great progress in the field of natural language processing (NLP) over the last few years, particularly with the development of attention-based models, less research has gone into modeling keystroke log data. State-of-the-art methods handle textual data directly, and while this has produced excellent results, the time complexity and resource usage of such methods are quite high. Additionally, these methods fail to incorporate the actual writing process when assessing text and instead focus solely on the content. Therefore, we proposed a framework for modeling textual data using keystroke-based features. Such methods pay attention to how a document or response was written, rather than to the final text that was produced. These features are vastly different from the kind of features extracted from raw text but reveal information that is otherwise hidden. Transformer-based methods, which pay more or less attention to the different components of a text sequence, currently dominate the field of NLP due to the strong understanding of natural language they display. We hypothesized that pairing efficient machine learning techniques with keystroke log information should produce results comparable to those of transformer techniques in far less time. We showed that models trained on keystroke log data are capable of effectively evaluating the quality of writing and do so in a significantly shorter amount of time than traditional methods. This is significant because it provides a fast and cheap alternative to increasingly large and slow LLMs.

English learner status in Florida public schools is correlated with significantly lower graduation rates

Socas et al. | Oct 30, 2024

The authors explore factors affecting graduation rates of students learning English as a second language across Florida counties.
