Browse Articles

String analysis of exon 10 of the CFTR gene and the use of Bioinformatics in determination of the most accurate DNA indicator for CF prediction

Carroll et al. | Jul 12, 2020

Cystic fibrosis is a genetic disease caused by mutations in the CFTR gene. In this paper, the authors attempt to identify variations in stretches of up to 8 nucleotides in the protein-coding portions of the CFTR gene that are associated with disease development. Such indicators could allow screening of newborns, or even fetuses in utero, to estimate the likelihood that they will develop cystic fibrosis.
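As a rough illustration of what a string analysis over stretches of up to 8 nucleotides might look like, the sketch below counts every substring of length 1 to 8 in two sequences and reports the ones whose counts differ. The function names and the toy sequences are hypothetical and are not taken from the paper; the authors' actual procedure may differ.

```python
from collections import Counter


def kmer_counts(sequence: str, max_k: int = 8) -> Counter:
    """Count every substring of length 1..max_k in a DNA sequence."""
    seq = sequence.upper()
    counts = Counter()
    for k in range(1, max_k + 1):
        for start in range(len(seq) - k + 1):
            counts[seq[start:start + k]] += 1
    return counts


def differing_kmers(reference: str, variant: str, max_k: int = 8) -> dict:
    """Map each k-mer whose count differs to (reference_count, variant_count)."""
    ref = kmer_counts(reference, max_k)
    var = kmer_counts(variant, max_k)
    return {
        kmer: (ref[kmer], var[kmer])
        for kmer in ref.keys() | var.keys()
        if ref[kmer] != var[kmer]
    }


if __name__ == "__main__":
    # Toy strings for illustration only; these are NOT real CFTR sequences.
    toy_reference = "ATCATCTTTGGT"
    toy_variant = "ATCATTGGT"
    for kmer, (n_ref, n_var) in sorted(differing_kmers(toy_reference, toy_variant).items()):
        print(f"{kmer}: reference={n_ref} variant={n_var}")
```

Counting all lengths up to 8 in one pass keeps the sketch short; for long sequences, restricting the analysis to one k at a time would use less memory.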

Characterization and Phylogenetic Analysis of the Cytochrome B Gene (cytb) in Salvelinus fontinalis, Salmo trutta and Salvelinus fontinalis X Salmo trutta Within the Lake Champlain Basin

Palermo et al. | Jan 24, 2014

Recent declines in the brook trout population of the Lake Champlain Basin have made the genetic screening of this and other trout species of utmost importance. In this study, the authors collected and analyzed 21 DNA samples from Lake Champlain Basin trout populations and performed a phylogenetic analysis on these samples using the cytochrome b gene. The findings presented in this study may influence future habitat decisions in this region.
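The summary does not say which distance measure or tree-building method the authors used. As a minimal sketch of the kind of pairwise comparison that typically underlies a phylogenetic analysis, the code below computes the p-distance (the proportion of differing sites) between aligned sequences. The function names and sample labels are hypothetical, and the sequences are assumed to be already aligned to equal length.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return mismatches / len(seq_a)


def distance_matrix(sequences: dict) -> dict:
    """Pairwise p-distances keyed by (name_1, name_2), names in sorted order."""
    return {
        (n1, n2): p_distance(sequences[n1], sequences[n2])
        for n1, n2 in combinations(sorted(sequences), 2)
    }


if __name__ == "__main__":
    # Made-up fragments for illustration; not real cytb sequences.
    samples = {
        "sample_1": "ATGGCCAACC",
        "sample_2": "ATGGCCAATC",
        "sample_3": "ATGACTAATC",
    }
    for pair, dist in distance_matrix(samples).items():
        print(pair, round(dist, 2))
```

A distance matrix like this could then be passed to a clustering or tree-building routine; that step is left out here because the summary gives no detail on it.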

A novel approach for predicting Alzheimer’s disease using machine learning on DNA methylation in blood

Adami et al. | Sep 20, 2023

Image credit: National Cancer Institute

Here, recognizing the difficulty of tracking the progression of dementia, the authors used machine learning models to classify subjects as cognitively normal, having mild cognitive impairment, or having Alzheimer's disease, based on blood DNA methylation levels, sex, and age. Using four machine learning models and two dimensionality reduction methods, they achieved an accuracy of 53.33%.

A new therapy against MDR bacteria by in silico virtual screening of Pseudomonas aeruginosa LpxC inhibitors

Liu et al. | Apr 27, 2022

Here, seeking to address the growing threat of multidrug-resistant (MDR) bacteria, the authors used in silico virtual screening to target MDR Pseudomonas aeruginosa. They focused on LpxC, a key protein in the pathogen's biosynthesis of lipid A, and virtually screened 20,000 candidates and 30 derivatives of brequinar. In the end, they identified the candidate with the greatest potential to inhibit the pathogen's lipid A synthesis.

Predicting smoking status based on RNA sequencing data

Yang et al. | Aug 30, 2024

Image credit: Yang and Stanley 2024

Given an association between nicotine addiction and gene expression, we hypothesized that genes commonly associated with smoking status would be differentially expressed between smokers and non-smokers. To test this, we analyzed two publicly available datasets that profiled RNA expression in brain (nucleus accumbens) and lung tissue taken from patients identified as smokers or non-smokers. We discovered statistically significant differences in the expression of dozens of genes between smokers and non-smokers. To test whether gene expression can be used to predict whether a patient is a smoker or non-smoker, we used gene expression as the training data for logistic regression and random forest classification models. The random forest classifier trained on lung tissue data showed the most robust results, with area under the curve (AUC) values consistently between 0.82 and 0.93. Both models trained on nucleus accumbens data performed worse, with AUC values consistently between 0.65 and 0.7 when using random forest. These results suggest that gene expression can be used to predict smoking status using traditional machine learning models. Additionally, based on our random forest model, we proposed KCNJ3 and TXLNGY as two candidate markers of smoking status. These findings, coupled with other genes identified in this study, present promising avenues for advancing applications related to the genetic foundation of smoking-related characteristics.
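For readers unfamiliar with the AUC values quoted above, the following minimal sketch computes the area under the ROC curve directly from its rank interpretation: the probability that a randomly chosen positive example (here, a smoker) receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name and the example scores are hypothetical; this is not the authors' code, which is not shown in the summary.

```python
def roc_auc(labels, scores) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 values (1 = positive class, e.g. smoker)
    scores: iterable of classifier scores, higher meaning "more likely positive"
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Made-up labels and scores for illustration only.
    y_true = [1, 0, 1, 0]
    y_score = [0.9, 0.8, 0.4, 0.1]
    print(roc_auc(y_true, y_score))
```

The pairwise loop is quadratic in the number of samples, which is fine for a small illustration; a sort-based computation would be preferable for large datasets.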

The impact of genetic analysis on the early detection of colorectal cancer

Agrawal et al. | Aug 24, 2023

Although the 5-year survival rate for colorectal cancer is below 10%, it increases to greater than 90% if the cancer is diagnosed early. We hypothesized that non-synonymous single nucleotide variants (SNVs) in a patient's exome sequence would serve as an indicator of high genetic risk of developing colorectal cancer.

Tomato disease identification with shallow convolutional neural networks

Trinh et al. | Mar 03, 2023

Plant diseases can cause up to 50% crop yield loss for the popular tomato plant. A mobile device-based method that identifies diseases from photos of symptomatic leaves via computer vision can be more effective due to its convenience and accessibility. To enable a practical mobile solution, a “shallow” convolutional neural network (CNN) with few layers, and thus low computational requirements, but with accuracy similar to that of deep CNNs, is needed. In this work, we explored whether such a model was possible.

Refinement of Single Nucleotide Polymorphisms of Atopic Dermatitis related Filaggrin through R packages

Naravane et al. | Oct 12, 2022

In the United States, 17.8 million people are currently affected by atopic dermatitis (AD), commonly known as eczema. It is characterized by itching and skin inflammation. AD patients are at higher risk for infections, depression, cancer, and suicide. Genetics, environment, and stress are some of the causes of the disease. With the rise of personalized medicine and the acceptance of gene-editing technologies, AD-related variations need to be identified for treatment. Genome-wide association studies (GWAS) have associated the Filaggrin (FLG) gene with AD but have not identified specific problematic single nucleotide polymorphisms (SNPs). This research aimed to refine the known SNPs of FLG so that gene-editing technologies can establish a causal link between specific SNPs and the disease and can target those polymorphisms. The research used R and its Bioconductor packages to refine data from the National Center for Biotechnology Information's (NCBI's) Variation Viewer. The algorithm filtered the dataset by coding regions and conserved domains. It also removed synonymous variations and treated non-synonymous, frameshift, and nonsense variations separately. The non-synonymous variations were refined and ordered using the BLOSUM62 substitution matrix. Overall, the analysis removed 96.65% of the data, which was redundant or not the focus of the research, and ordered the remaining relevant data by impact. The code for the project can also be repurposed as a tool for other diseases. The research can help solve GWAS's imprecise identification challenge. This research is the first step in providing the refined databases required for gene-editing treatment.
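The filtering and ordering steps described above can be sketched roughly as follows. The study itself used R and Bioconductor; this Python version is only an illustration and not the authors' code. The variant identifiers are hypothetical, and the score table is a five-entry excerpt of BLOSUM62, so a real pipeline would load the full matrix from a maintained source.

```python
# Small excerpt of BLOSUM62 scores, keyed by unordered amino-acid pair.
# Assumption: a real analysis would load the complete 20x20 matrix instead.
BLOSUM62_EXCERPT = {
    frozenset("IL"): 2,    # isoleucine <-> leucine, conservative
    frozenset("DE"): 2,    # aspartate <-> glutamate, conservative
    frozenset("KR"): 2,    # lysine <-> arginine, conservative
    frozenset("GW"): -2,   # glycine <-> tryptophan
    frozenset("DL"): -4,   # aspartate <-> leucine, highly dissimilar
}


def substitution_score(ref_aa: str, alt_aa: str) -> int:
    """Look up the BLOSUM62 score for a reference/alternate amino-acid pair."""
    return BLOSUM62_EXCERPT[frozenset((ref_aa, alt_aa))]


def rank_by_impact(variants):
    """Drop synonymous variants, then order the rest from lowest score upward.

    variants: list of (variant_id, ref_aa, alt_aa) tuples.
    A lower BLOSUM62 score means a less conservative substitution, so the
    variants expected to be most disruptive come first.
    """
    non_synonymous = [v for v in variants if v[1] != v[2]]
    return sorted(non_synonymous, key=lambda v: substitution_score(v[1], v[2]))


if __name__ == "__main__":
    # Hypothetical variant records for illustration only.
    toy_variants = [
        ("var_1", "I", "L"),
        ("var_2", "D", "L"),
        ("var_3", "G", "W"),
        ("var_4", "A", "A"),  # synonymous, removed by the filter
    ]
    for variant_id, ref_aa, alt_aa in rank_by_impact(toy_variants):
        print(variant_id, ref_aa, alt_aa, substitution_score(ref_aa, alt_aa))
```

Frameshift and nonsense variations are not scored here, in keeping with the summary's note that they were treated separately from non-synonymous changes.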
