Here, the authors explored how the sale and use of electric vehicles could reduce emissions from the transport industry in Canada. By fitting total electric vehicle sales with an exponential model, the authors predicted the number of electric vehicle sales through 2030 and related that to the average emissions for such vehicles. Ultimately, they found that the sale and use of electric vehicles alone would likely not meet the 45% reduction in emissions from the transport industry suggested by the Canadian government.
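As a rough illustration of what fitting sales with an exponential model can look like, the sketch below fits sales = a * exp(b * t) by ordinary least squares on the logarithm of sales. The function names and the sales figures are hypothetical and are not taken from the study, and the authors may have used a different fitting procedure.

```python
import math

def fit_exponential(years, sales):
    """Fit sales = a * exp(b * (year - years[0])) by least squares on log(sales)."""
    t = [y - years[0] for y in years]
    logs = [math.log(s) for s in sales]
    n = len(t)
    t_mean = sum(t) / n
    log_mean = sum(logs) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (li - log_mean) for ti, li in zip(t, logs))
    b = sxy / sxx                           # growth rate per year
    a = math.exp(log_mean - b * t_mean)     # fitted sales in the first year
    return a, b

def predict(a, b, base_year, year):
    """Extrapolate the fitted curve to a given year."""
    return a * math.exp(b * (year - base_year))

# Synthetic, made-up figures growing 50% per year (NOT data from the study).
years = [2016, 2017, 2018, 2019, 2020]
sales = [1000 * 1.5 ** i for i in range(len(years))]

a, b = fit_exponential(years, sales)
forecast_2030 = predict(a, b, years[0], 2030)
print(f"a = {a:.1f}, b = {b:.4f}, forecast for 2030 = {forecast_2030:.0f}")
```

Fitting in log space weights relative errors rather than absolute ones, so a direct nonlinear fit on the raw sales numbers can give somewhat different parameters on noisy data.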
Browse Articles
A land use regression model to predict emissions from oil and gas production using machine learning
Emissions from oil and natural gas (O&G) wells such as nitrogen dioxide (NO2), volatile organic compounds (VOCs), and ozone (O3) can severely impact the health of communities located near wells. In this study, we used O&G activity and wind-carried emissions to quantify the extent to which O&G wells affect the air quality of nearby communities, revealing that NO2, NOx, and NO are correlated with O&G activity. We then developed a novel land use regression (LUR) model using machine learning based on O&G prevalence to predict emissions.
Jet optimization using a hybrid multivariate regression model and statistical methods in dimuon collisions
Collisions of heavy ions, such as muons, result in jets and noise. In high-energy particle physics, researchers use jets as crucial event-shaped observable objects to determine the properties of a collision. However, many ionic collisions lose large amounts of energy as noise, reducing the efficiency of collisions with heavy ions. The purpose of our study was to analyze the relationships between the properties of muons in a dimuon collision in order to optimize the conditions of dimuon collisions and minimize the energy lost as noise. We used principles of Newtonian mechanics at the particle level, which allowed us to further analyze different models. We used simple Python algorithms as well as linear regression models, with tools such as scikit-learn, NumPy, and pandas, to help analyze our results. We hypothesized that, because the invariant mass, the energy, and the resultant momentum vector are correlated with noise, optimally constraining these inputs would yield scenarios in which the noise of the heavy-ion collision is minimized.
Machine Learning Algorithm Using Logistic Regression and an Artificial Neural Network (ANN) for Early Stage Detection of Parkinson’s Disease
Despite the prevalence of Parkinson’s disease (PD), diagnosing it is expensive, requires specialized testing, and is often inaccurate. Moreover, diagnosis is often made late in the disease course, when treatments are less effective. Using existing voice data from patients with PD and healthy controls, the authors created and trained two different algorithms: one using logistic regression and another employing an artificial neural network (ANN).
The most efficient position of magnets
Here, the authors investigated the most efficient way to position magnets to hold the most pieces of paper on the surface of a refrigerator. Using a regression model along with an artificial neural network, they found that the most efficient arrangement of four magnets places them at the vertices of a rectangle.
Modeling Hartree-Fock approximations of the Schrödinger Equation for multielectron atoms from Helium to Xenon using STO-nG basis sets
The energy of an atom is extremely useful in nuclear physics and reaction mechanism pathway determination but is challenging to compute. This work aimed to synthesize regression models for Pople Gaussian expansions of Slater-type Orbitals (STO-nG) atomic energy vs. atomic number scatter plots to allow for easy approximation of atomic energies without using computational chemistry methods. The data indicated that of the regressions, sinusoidal regressions most aptly modeled the scatter plots.
Can the nucleotide content of a DNA sequence predict the sequence accessibility?
Sequence accessibility is an important factor affecting gene expression. Sequence accessibility, or openness, impacts the likelihood that a gene is transcribed and translated into a protein that then performs its function and manifests a trait. Many potential factors affect the accessibility of a gene. In this study, we hypothesized that the nucleotide content of a genetic sequence predicts its accessibility. Using a machine learning linear regression model, we studied the relationship between nucleotide content and accessibility.
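To make the idea of regressing accessibility on nucleotide content concrete, here is a minimal sketch that uses a single feature, GC fraction, and an ordinary least-squares line. The choice of GC fraction, the helper names, the sequences, and the accessibility scores are all assumptions for illustration. They are not the study's features or data, and the authors' model may well use the fractions of all four nucleotides.

```python
def gc_content(seq):
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

# Made-up sequences and accessibility scores (NOT data from the study).
sequences = ["ATATATAT", "ATGCATAT", "GCGCATAT", "GCGCGCAT", "GCGCGCGC"]
accessibility = [0.1, 0.3, 0.5, 0.7, 0.9]

features = [gc_content(s) for s in sequences]
slope, intercept = fit_line(features, accessibility)
print(f"accessibility ~ {slope:.2f} * GC fraction + {intercept:.2f}")
```

With real data the relationship would be noisy, so a held-out test set and a goodness-of-fit measure such as R² would be needed to judge whether nucleotide content is actually predictive.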
Using economic indicators to create an empirical model of inflation
Here, seeking to understand the correlation of 50 of the most important economic indicators with inflation, the authors used a rolling linear regression to identify the indicators most significantly correlated with the month-over-month, seasonally adjusted Consumer Price Index (CPI). Ultimately, they concluded that the average gasoline price, the U.S. import price index, and the 5-year market expected inflation rate had the most significant correlation with the CPI.
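A rolling linear regression refits the same model on a sliding window of consecutive observations, so the slope and correlation can be tracked over time. The sketch below shows one way to do this for a single indicator. The window length, function names, and numbers are hypothetical and are not taken from the study.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

def rolling_regression(x, y, window):
    """Fit a line on each run of `window` consecutive points."""
    results = []
    for start in range(len(x) - window + 1):
        xs, ys = x[start:start + window], y[start:start + window]
        slope, intercept = fit_line(xs, ys)
        results.append((slope, intercept, pearson_r(xs, ys)))
    return results

# Made-up monthly values (NOT data from the study): a gasoline price series
# and a CPI change constructed to depend on it exactly linearly.
gasoline = [2.0, 2.1, 2.3, 2.2, 2.5, 2.7, 2.6, 2.9]
cpi_change = [0.1 * g + 0.05 for g in gasoline]

windows = rolling_regression(gasoline, cpi_change, window=4)
for slope, intercept, r in windows:
    print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, r = {r:.3f}")
```

Repeating this for each candidate indicator and comparing the per-window correlations is one plausible way to rank indicators. A shorter window reacts faster to regime changes but gives noisier estimates.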
Comparison of the ease of use and accuracy of two machine learning algorithms – forestry case study
Machine learning algorithms are becoming increasingly popular for data analysis across a wide range of scientific disciplines. Here, the authors compare two machine learning algorithms with respect to accuracy and user-friendliness and find that random forest algorithms outperform logistic regression when applied to the same dataset.
Predicting smoking status based on RNA sequencing data
Given an association between nicotine addiction and gene expression, we hypothesized that genes commonly associated with smoking status would be differentially expressed between smokers and non-smokers. To test this, we analyzed two publicly available datasets that profiled RNA gene expression from brain (nucleus accumbens) and lung tissue taken from patients identified as smokers or non-smokers. We discovered statistically significant differences in the expression of dozens of genes between smokers and non-smokers. To test whether gene expression can be used to predict whether a patient is a smoker or non-smoker, we used gene expression as the training data for logistic regression and random forest classification models. The random forest classifier trained on lung tissue data showed the most robust results, with area under curve (AUC) values consistently between 0.82 and 0.93. Both models trained on nucleus accumbens data had poorer performance, with AUC values consistently between 0.65 and 0.7 when using random forest. These results suggest gene expression can be used to predict smoking status using traditional machine learning models. Additionally, based on our random forest model, we proposed KCNJ3 and TXLNGY as two candidate markers of smoking status. These findings, coupled with other genes identified in this study, present promising avenues for advancing applications related to the genetic foundation of smoking-related characteristics.
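The AUC values reported above summarize how well a classifier ranks smokers above non-smokers. As a minimal sketch of how the metric is defined, the function below computes AUC as the fraction of (positive, negative) pairs in which the positive sample receives the higher score, counting ties as half. The function name, labels, and scores are hypothetical, and this is not the authors' pipeline.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking (ties count as 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Made-up labels (1 = smoker, 0 = non-smoker) and predicted probabilities.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context for the roughly 0.65 to 0.93 range reported for the two tissues.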