Using facial recognition as a use case, we attempt to identify sources of bias in a model developed using transfer learning. To do so, we built a model on top of a pre-trained facial recognition model and compared the accuracy of its image classification across age, gender, and race to observe whether the model performed better on some demographic groups than on others. By identifying the bias and its potential sources, this work contributes a unique technical perspective, that of a small-scale developer, to emerging discussions of accountability and transparency in AI.
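The per-group comparison described above can be illustrated with a minimal sketch. It assumes the model's predictions, the true labels, and a demographic group tag are available for each test image; the function names and the toy labels below are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy} for predictions split by a demographic tag."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        if truth == pred:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


def accuracy_gap(per_group):
    """Difference between the best and worst per-group accuracy."""
    return max(per_group.values()) - min(per_group.values())


# Toy, made-up labels purely to show the call pattern (not real results).
y_true = ["a", "b", "a", "b", "a", "b"]
y_pred = ["a", "b", "a", "b", "a", "a"]
groups = ["g1", "g1", "g1", "g2", "g2", "g2"]

per_group = accuracy_by_group(y_true, y_pred, groups)
gap = accuracy_gap(per_group)
```

A large gap between groups is only a starting point: it flags where to look but does not by itself identify whether the bias comes from the pre-trained model or from the fine-tuning data.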
In this work, the authors investigate the accuracy with which two different population growth models can predict population growth over time. They apply the Malthusian law and the Logistic law to US population data from 1951 to 2019. To assess how closely each growth model fits the actual population data, they applied a least-squares curve fit, which revealed that the Logistic law of population growth resulted in a smaller sum of squared residuals. These findings are important for ensuring that the best-fitting population growth models are applied to data, as population forecasting affects a country's economic and social structure.
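As a minimal sketch of the comparison above, the two growth laws and the sum of squared residuals can be written as follows. This assumes the standard closed forms of the Malthusian (exponential) and Logistic laws; the parameter names (p0, r, k) and the synthetic observations are illustrative assumptions, not the US data or the fitted values from the study, and the parameter search performed by a least-squares fit is omitted.

```python
import math


def malthusian(t, p0, r):
    """Malthusian (exponential) growth: P(t) = p0 * exp(r * t)."""
    return p0 * math.exp(r * t)


def logistic(t, p0, r, k):
    """Logistic growth with carrying capacity k:
    P(t) = k / (1 + ((k - p0) / p0) * exp(-r * t))."""
    return k / (1.0 + ((k - p0) / p0) * math.exp(-r * t))


def sum_squared_residuals(model, times, observed, *params):
    """Sum of squared differences between observations and model values."""
    return sum((obs - model(t, *params)) ** 2 for t, obs in zip(times, observed))


# Synthetic observations generated from a logistic curve (hypothetical values,
# chosen only so that the two residual sums can be compared).
times = [0, 10, 20, 30, 40, 50, 60]
observed = [logistic(t, 100.0, 0.05, 500.0) for t in times]

ssr_logistic = sum_squared_residuals(logistic, times, observed, 100.0, 0.05, 500.0)
ssr_malthusian = sum_squared_residuals(malthusian, times, observed, 100.0, 0.05)
```

In a real fit, the parameters of each law would be chosen to minimize its sum of squared residuals before the two minima are compared.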
The surface of the unicellular eukaryote Tetrahymena pyriformis is covered with thousands of hair-like cilia. These cilia are very similar to the cilia of the human olfactory and respiratory tracts, making T. pyriformis a model organism for studying cilia function and pathology. The authors of this study investigated the effect of voltage on T. pyriformis galvanotaxis, the movement towards an electrical stimulus. They observed galvanotaxis towards the cathode at voltages over 4 V, with the response reaching a plateau, indicating the opening of voltage-gated ion channels to trigger movement.
In this study, the authors investigated the biological mechanism underlying the actions of a traditional medicinal plant, Astragalus membranaceus (AM). Using C. elegans as an experimental model, they tested the effects of AM root on heat stress responses. Their results suggest that AM root extract may enhance the activity of endogenous pathways that mediate cellular responses to heat stress.
Lung cancer is highly fatal, largely due to late diagnoses, but early detection can greatly improve survival. This study developed three models to enhance early diagnosis: an MLP for clinical data, a CNN for imaging data, and a hybrid model combining both.
Here the authors hypothesized that reducing folliculin (FLCN) might affect p62 protein levels in the dorsal hippocampus of mice, given their potential functional connection and p62's role in neurodegenerative diseases. Their study, using western blots and a two-way ANOVA on young wild-type mice, found that p62 levels correlated with FLCN expression, but ultimately concluded there's no evidence of a functional connection between FLCN and p62 in this specific model.
The global issue of water quality has led to the use of machine learning models, like ANN and SVM, to predict water potability. However, these models can be complex and resource-intensive. This research aimed to find a simpler, more efficient model for water quality prediction.
Additive manufacturing (AM) is transforming the production of complex metal parts, but challenges like internal cracking can arise, particularly in critical sectors such as aerospace and automotive. Traditional methods to assess cracking susceptibility are costly and time-consuming, prompting the use of machine learning (ML) for more efficient predictions. This study developed a multi-model ML pipeline that predicts solidification cracking susceptibility (SCS) more accurately by considering secondary alloy properties alongside composition. Random Forest models showed the best performance, highlighting a promising direction for future research into SCS quantification.
The purpose of our study was to examine the correlation of glycosylated hemoglobin (HbA1c), blood pressure (BP) readings, and lipid levels with retinopathy. Our main hypothesis was that poor glycemic control, as evidenced by high HbA1c levels, together with high blood pressure and abnormal lipid levels, causes an increased risk of retinopathy. We identified age and HbA1c as the two features most important to the model. This indicates that older patients with poor glycemic control are more likely to have retinopathy.
Given an association between nicotine addiction and gene expression, we hypothesized that genes commonly associated with smoking status would be expressed differently in smokers and non-smokers. To test whether gene expression varies between smokers and non-smokers, we analyzed two publicly available datasets that profiled RNA gene expression from brain (nucleus accumbens) and lung tissue taken from patients identified as smokers or non-smokers. We discovered statistically significant differences in the expression of dozens of genes between smokers and non-smokers. To test whether gene expression can be used to predict whether a patient is a smoker or non-smoker, we used gene expression as the training data for logistic regression and random forest classification models. The random forest classifier trained on lung tissue data showed the most robust results, with area under the curve (AUC) values consistently between 0.82 and 0.93. Both models trained on nucleus accumbens data had poorer performance, with AUC values consistently between 0.65 and 0.7 when using random forest. These results suggest gene expression can be used to predict smoking status using traditional machine learning models. Additionally, based on our random forest model, we proposed KCNJ3 and TXLNGY as two candidate markers of smoking status. These findings, coupled with other genes identified in this study, present promising avenues for advancing applications related to the genetic foundation of smoking-related characteristics.
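For readers unfamiliar with the AUC values reported above, the following minimal sketch computes the area under the ROC curve from its pairwise definition: the probability that a randomly chosen positive example (here, a smoker) receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name and the toy scores are hypothetical; this is not the code or the data used in the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: classifier scores, higher meaning "more likely positive".
    labels: 1 for the positive class, 0 for the negative class.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy scores and labels (made up) to show the call pattern.
toy_auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context for the 0.65 to 0.93 range reported for the two tissues.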