Algorithmic trading has been increasingly used by Americans. In this work, we tested whether including the opening, closing, and highest prices in three supervised learning models affected their performance. Indeed, we found that including all three prices decreased the error of the prediction significantly.
Gradient boosting with temporal feature extraction for modeling keystroke log data
Although natural language processing (NLP) has progressed greatly over the last few years, particularly with the development of attention-based models, less research has gone toward modeling keystroke log data. State-of-the-art methods handle textual data directly, and while this has produced excellent results, their time complexity and resource usage are quite high. Additionally, these methods fail to incorporate the actual writing process when assessing text and instead focus solely on the content. Therefore, we proposed a framework for modeling textual data using keystroke-based features. Such methods pay attention to how a document or response was written, rather than to the final text that was produced. These features differ greatly from those extracted from raw text but reveal information that is otherwise hidden. We hypothesized that pairing efficient machine learning techniques with keystroke log information would produce results comparable to those of transformer techniques (models that pay more or less attention to the different components of a text sequence), and in far less time. Transformer-based methods currently dominate the field of NLP because of the strong understanding of natural language they display. We showed that models trained on keystroke log data can effectively evaluate the quality of writing, and do so in significantly less time than traditional methods. This is significant because it provides a fast and cheap alternative to increasingly large and slow LLMs.
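To make the idea of keystroke-based temporal features concrete, the following is a minimal sketch, not the authors' feature set. It assumes a log of `(timestamp_ms, key)` events; the feature names, the `"Backspace"` key label, and the 2000 ms pause threshold are all hypothetical choices made for illustration. Features like these could then be passed to a gradient boosting model.

```python
from statistics import mean


def keystroke_features(events, pause_threshold_ms=2000):
    """Summarize a keystroke log as a small dictionary of temporal features.

    `events` is a list of (timestamp_ms, key) tuples ordered by time.
    The log format and the feature names are assumptions for illustration.
    """
    times = [t for t, _ in events]
    # Inter-key intervals: time elapsed between consecutive key presses.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    deletions = sum(1 for _, key in events if key == "Backspace")
    return {
        "n_events": len(events),
        "mean_gap_ms": mean(gaps) if gaps else 0.0,
        "max_gap_ms": max(gaps) if gaps else 0,
        # A "pause" is any gap at or above the (hypothetical) threshold.
        "n_pauses": sum(1 for g in gaps if g >= pause_threshold_ms),
        "deletion_ratio": deletions / len(events) if events else 0.0,
    }


# Hypothetical five-event log: three quick keys, a long pause, one deletion.
log = [(0, "T"), (150, "h"), (300, "e"), (2800, "Backspace"), (3000, "e")]
features = keystroke_features(log)
```

Note that none of these features depends on the final text, only on how it was typed, which is the property the abstract emphasizes.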
Nintendo Da Vinci: A Novel Control System to Improve Performance in Robotic-Assisted Surgery
Complications of robotic-assisted surgery are on the rise, partly due to surgeons not receiving proper training. Al-Akash and Al-Akash hypothesized that Nintendo JoyCon controls would improve surgical performance compared to the FDA-approved Da Vinci Surgical System in two user groups (doctors and gamers). Their results show that implementing a Nintendo JoyCon control system is associated with improved surgical performance and learning rate compared to the Da Vinci Surgical System.
Predicting smoking status based on RNA sequencing data
Given an association between nicotine addiction and gene expression, we hypothesized that genes commonly associated with smoking status would be expressed differently in smokers and non-smokers. To test this, we analyzed two publicly available datasets that profiled RNA gene expression from brain (nucleus accumbens) and lung tissue taken from patients identified as smokers or non-smokers. We discovered statistically significant differences in the expression of dozens of genes between smokers and non-smokers. To test whether gene expression can be used to predict whether a patient is a smoker or non-smoker, we used gene expression as the training data for a logistic regression or random forest classification model. The random forest classifier trained on lung tissue data showed the most robust results, with area under curve (AUC) values consistently between 0.82 and 0.93. Both models trained on nucleus accumbens data had poorer performance, with AUC values consistently between 0.65 and 0.7 when using random forest. These results suggest gene expression can be used to predict smoking status using traditional machine learning models. Additionally, based on our random forest model, we proposed KCNJ3 and TXLNGY as two candidate markers of smoking status. These findings, coupled with other genes identified in this study, present promising avenues for advancing applications related to the genetic foundation of smoking-related characteristics.
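Since the results above are reported as AUC values, a small sketch of how that metric can be computed may help. This uses the pairwise-ranking definition of AUC (the fraction of smoker/non-smoker pairs in which the smoker receives the higher score, with ties counted as half). The labels and scores below are made-up illustration values, not data from the study, and the function name is hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    `labels` holds 1 for the positive class (e.g. smoker) and 0 otherwise;
    `scores` holds the classifier's predicted score for each sample.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Made-up example: one smoker (score 0.4) is ranked below one non-smoker (0.5).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)  # 8 of 9 pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why the 0.82 to 0.93 range for lung tissue reads as markedly stronger than the 0.65 to 0.7 range for nucleus accumbens.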
Unlocking robotic potential through modern organ segmentation
The authors looked at different models of semantic segmentation to determine which may be best used in the future for segmentation of CT scans to help diagnose certain conditions.
Entropy-based subset selection principal component analysis for diabetes risk factor identification
In this article, the authors developed a strategy to allow earlier diagnosis of diabetes, since earlier diagnosis improves long-term outcomes. They found that BMI, triceps skin fold thickness, and blood pressure are the risk factors with the highest accuracy in predicting diabetes risk.
DNA-SEnet: A convolutional neural network for classifying DNA-asthma associations
In this study, the authors developed a model named DNA Sequence Embedding Network (DNA-SEnet) to classify DNA-asthma associations using their genomic patterns.
Contribution of environmental factors to genetic variation in the Pacific white-sided dolphin
Here the authors sought to understand the effects of different variables that may be tied to pollution and climate change on genetic variation of Pacific white-sided dolphins, a species that is currently threatened by water pollution. Based on environmental data collected alongside a genetic distance matrix, they found that ocean currents had the most significant impact on the genetic diversity of Pacific white-sided dolphins along the Japanese coast.
Predicting the factors involved in orthopedic patient hospital stay
Long hospital stays can be stressful for the patient for many reasons. We hypothesized that age would be the greatest predictor of hospital stay among patients who underwent orthopedic surgery. Through our models, we found that severity of illness was the factor that contributed most to determining patient length of stay. The next two factors were the facility in which the patient was staying and the type of procedure they underwent.
Applying centrality analysis on a protein interaction network to predict colorectal cancer driver genes
In this article, the authors created an interaction map of proteins involved in colorectal cancer to distinguish driver from non-driver genes. That is, they wanted to determine which genes are more likely to drive the development and progression of colorectal cancer and which are present in altered states but not necessarily driving disease progression.
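As a rough illustration of centrality analysis on an interaction network, the sketch below computes degree centrality, one of the simplest centrality measures. The summary does not say which measures the authors used, so the choice of degree centrality is an assumption, and the protein names `P1` to `P4` and the edge list are placeholders rather than real interactions.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree centrality for an undirected interaction network.

    `edges` is an iterable of (protein_a, protein_b) pairs. Each node's score
    is its number of distinct neighbors divided by (n - 1), where n is the
    number of nodes that appear in at least one non-self interaction.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Placeholder network: P1 interacts with every other protein.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]
centrality = degree_centrality(edges)
# Rank proteins from most to least connected.
ranked = sorted(centrality, key=centrality.get, reverse=True)
```

In an analysis of this kind, highly central proteins would be treated as candidates for further study; a high score by itself does not establish that a gene drives disease.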