Cystic fibrosis is a genetic disease caused by mutations in the CFTR gene. In this paper, the authors attempt to identify variations in stretches of up to 8 nucleotides in the protein-coding portions of the CFTR gene that are associated with disease development. This would allow screening of newborns, or even fetuses in utero, to determine the likelihood that they will develop cystic fibrosis.
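As a rough illustration of what examining stretches of up to 8 nucleotides can involve, the sketch below counts every substring of length 1 to 8 in a coding sequence. The summary above does not describe the authors' actual method, so the function name `count_kmers` and the toy input are assumptions made for this example only.

```python
from collections import Counter


def count_kmers(seq, max_k=8):
    """Count every substring of length 1..max_k in a nucleotide string.

    Hypothetical helper for illustration; not the authors' pipeline.
    """
    seq = seq.upper()
    counts = Counter()
    for k in range(1, max_k + 1):
        # Slide a window of width k across the sequence.
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


# Toy input, not real CFTR sequence.
kmers = count_kmers("ATGATG", max_k=3)
```

Comparing such counts between sequences from affected and unaffected individuals is one conceivable way to flag short stretches worth closer inspection.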
Uncovering mirror neurons’ molecular identity by single cell transcriptomics and microarray analysis
In this study, the authors used bioinformatic approaches to characterize mirror neurons, which are active both when performing and when seeing certain actions. They also investigated whether mirror neuron impairment was connected to neurodegenerative diseases and psychiatric disorders.
Characterization and Phylogenetic Analysis of the Cytochrome B Gene (cytb) in Salvelinus fontinalis, Salmo trutta and Salvelinus fontinalis X Salmo trutta Within the Lake Champlain Basin
Recent declines in the brook trout population of the Lake Champlain Basin have made the genetic screening of this and other trout species of utmost importance. In this study, the authors collected and analyzed 21 DNA samples from Lake Champlain Basin trout populations and performed a phylogenetic analysis on these samples using the cytochrome b gene. The findings presented in this study may influence future habitat decisions in this region.
A novel approach for predicting Alzheimer’s disease using machine learning on DNA methylation in blood
Here, recognizing the difficulty associated with tracking the progression of dementia, the authors used machine learning models to classify individuals as cognitively normal, having mild cognitive impairment, or having Alzheimer's disease, based on blood DNA methylation levels, sex, and age. With four machine learning models and two dimensionality reduction methods, they achieved an accuracy of 53.33%.
A new therapy against MDR bacteria by in silico virtual screening of Pseudomonas aeruginosa LpxC inhibitors
Here, seeking to address the growing threat of multidrug-resistant (MDR) bacteria, the authors used in silico virtual screening to target MDR Pseudomonas aeruginosa. They focused on LpxC, a key protein in the pathogen's lipid A biosynthesis, and virtually screened 20,000 candidates and 30 derivatives of brequinar. In the end, they identified the candidate with the greatest predicted potential to inhibit the pathogen's lipid A synthesis.
Predicting smoking status based on RNA sequencing data
Given an association between nicotine addiction and gene expression, we hypothesized that genes commonly associated with smoking status would be differentially expressed between smokers and non-smokers. To test this, we analyzed two publicly available datasets that profiled RNA expression in brain (nucleus accumbens) and lung tissue taken from patients identified as smokers or non-smokers. We discovered statistically significant differences in the expression of dozens of genes between smokers and non-smokers. To test whether gene expression can be used to predict whether a patient is a smoker or non-smoker, we used gene expression as the training data for logistic regression and random forest classification models. The random forest classifier trained on lung tissue data showed the most robust results, with area under the curve (AUC) values consistently between 0.82 and 0.93. Both models trained on nucleus accumbens data performed worse, with AUC values consistently between 0.65 and 0.7 when using random forest. These results suggest that gene expression can be used to predict smoking status with traditional machine learning models. Additionally, based on our random forest model, we proposed KCNJ3 and TXLNGY as two candidate markers of smoking status. These findings, together with the other genes identified in this study, present promising avenues for advancing applications related to the genetic foundation of smoking-related characteristics.
The impact of genetic analysis on the early detection of colorectal cancer
Although the 5-year survival rate for colorectal cancer is below 10%, the rate increases to greater than 90% if the cancer is diagnosed early. We hypothesized that non-synonymous single nucleotide variants (SNVs) in a patient's exome sequence would indicate a high genetic risk of developing colorectal cancer.
Tomato disease identification with shallow convolutional neural networks
Plant diseases can cause up to 50% crop yield loss for the popular tomato plant. A mobile device-based method that identifies diseases from photos of symptomatic leaves via computer vision can be more effective due to its convenience and accessibility. A practical mobile solution needs a “shallow” convolutional neural network (CNN): one with few layers, and thus low computational requirements, but with accuracy similar to that of deep CNNs. In this work, we explored whether such a model was possible.
A novel approach to determine which organism best displays Gijswijt's Sequence in its genome
The sequence of nitrogenous bases that make up the DNA of organisms can contain hidden mathematical sequences. Here, the authors used Biopython, a Python library for computational biology, to find an organism that displays Gijswijt’s Sequence in its genome. They found that, among the organisms examined, the common carp best displays Gijswijt’s Sequence in its genome.
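For readers unfamiliar with Gijswijt's Sequence, the sketch below generates its first terms. Each new term is the largest number of times some block repeats consecutively at the end of the sequence so far. The function names are hypothetical, and how the study mapped these integers onto nucleotides is not described in the summary above, so that step is deliberately left out.

```python
def max_repeat_suffix(seq):
    """Largest k such that seq ends with k consecutive copies of some non-empty block."""
    n = len(seq)
    best = 1
    for block_len in range(1, n // 2 + 1):
        block = seq[n - block_len:]
        k = 1
        # Walk backwards, counting how many further copies of the block precede it.
        while ((k + 1) * block_len <= n
               and seq[n - (k + 1) * block_len:n - k * block_len] == block):
            k += 1
        best = max(best, k)
    return best


def gijswijt(n_terms):
    """Return the first n_terms terms of Gijswijt's Sequence (starting from 1)."""
    seq = [1]
    while len(seq) < n_terms:
        seq.append(max_repeat_suffix(seq))
    return seq[:n_terms]


print(gijswijt(10))  # [1, 1, 2, 1, 1, 2, 2, 2, 3, 1]
```

This direct approach rescans the sequence for every new term, which is fine for a few thousand terms but would need optimizing for longer runs.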
Refinement of Single Nucleotide Polymorphisms of Atopic Dermatitis related Filaggrin through R packages
In the United States, 17.8 million people are currently affected by atopic dermatitis (AD), commonly known as eczema. It is characterized by itching and skin inflammation. AD patients are at higher risk for infections, depression, cancer, and suicide. Genetics, environment, and stress are some of the causes of the disease. With the rise of personalized medicine and the acceptance of gene-editing technologies, AD-related variations need to be identified for treatment. Genome-wide association studies (GWAS) have associated the Filaggrin (FLG) gene with AD but have not identified specific problematic single nucleotide polymorphisms (SNPs). This research aimed to refine the known SNPs of FLG so that gene-editing technologies can establish a causal link between specific SNPs and the disease and target those polymorphisms. The research used R and its Bioconductor packages to refine data from the National Center for Biotechnology Information's (NCBI's) Variation Viewer. The algorithm filtered the dataset by coding regions and conserved domains. It also removed synonymous variations and treated non-synonymous, frameshift, and nonsense variations separately. The non-synonymous variations were refined and ordered using the BLOSUM62 substitution matrix. Overall, the analysis removed 96.65% of the data, which was redundant or not the focus of the research, and ordered the remaining relevant data by impact. The code for the project can also be repurposed as a tool for other diseases. The research can help solve the imprecise identification challenge of GWAS. This research is the first step in providing the refined databases required for gene-editing treatment.
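The filtering and ordering steps described above can be sketched as follows. The original work used R and Bioconductor; this Python version is only an illustration under stated assumptions. The record fields (`id`, `kind`, `ref_aa`, `alt_aa`), the variant identifiers, and the function name `refine_variants` are hypothetical. The score table holds only a few BLOSUM62 entries copied by hand, and sorting from lowest score (least conservative substitution) upward is an assumed reading of "ordered by impact".

```python
# A few BLOSUM62 entries copied by hand for illustration only; a real
# analysis would load the full matrix from a library.
BLOSUM62_SUBSET = {
    frozenset("DE"): 2,   # Asp <-> Glu
    frozenset("KR"): 2,   # Lys <-> Arg
    frozenset("IV"): 3,   # Ile <-> Val
    frozenset("GI"): -4,  # Gly <-> Ile
}


def refine_variants(variants, scores=BLOSUM62_SUBSET):
    """Drop synonymous variants, set frameshift/nonsense aside, rank the rest.

    Returns (ranked_missense, other), where ranked_missense is ordered from
    the lowest substitution score (assumed most disruptive) to the highest.
    """
    missense, other = [], []
    for v in variants:
        if v["kind"] == "synonymous":
            continue  # no amino acid change, so removed from the dataset
        if v["kind"] == "missense":
            missense.append(v)
        else:
            other.append(v)  # frameshift and nonsense are handled separately
    missense.sort(key=lambda v: scores[frozenset((v["ref_aa"], v["alt_aa"]))])
    return missense, other


# Hypothetical records, not real FLG variants.
example = [
    {"id": "v1", "kind": "missense", "ref_aa": "I", "alt_aa": "V"},
    {"id": "v2", "kind": "synonymous", "ref_aa": "L", "alt_aa": "L"},
    {"id": "v3", "kind": "missense", "ref_aa": "G", "alt_aa": "I"},
    {"id": "v4", "kind": "nonsense", "ref_aa": "Q", "alt_aa": "*"},
    {"id": "v5", "kind": "missense", "ref_aa": "D", "alt_aa": "E"},
]
ranked, set_aside = refine_variants(example)
print([v["id"] for v in ranked])     # ['v3', 'v5', 'v1']
print([v["id"] for v in set_aside])  # ['v4']
```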