All manuscripts published in JEI must be hypothesis-driven research. Hypotheses are a crucial part of the scientific thinking process, and professional scientific endeavors revolve around posing and testing hypotheses. We believe that it is important for students who publish with JEI to practice rigorous scientific thinking. This means that manuscripts that merely introduce an invention, methods optimization, or machine learning algorithm/model, no matter how impressive, are not appropriate for JEI. Common examples of unacceptable “hypotheses” for engineering projects are simple predictions that you can build something, or that your device or model will work.
In this guide, we describe some of the best strategies for converting an engineering- or machine learning-based manuscript into a hypothesis-driven one publishable with JEI. We draw on examples of past submissions that either implement one of these strategies or would no longer be acceptable, and we provide guidance on how to revise them.
It is often possible to convert an engineering- or machine learning-based manuscript into a hypothesis-driven one by adding a few experiments, and sometimes just by changing how the work is presented. Here are two strategies for revising manuscripts built around engineering, machine learning, or optimization projects so that they also include a clear, experimentally tested hypothesis, with examples drawn from previous JEI submissions.
This is the best way to use your invention to write a hypothesis-driven manuscript acceptable for JEI publication. Rather than centering your hypothesis on the invention or model itself, your hypothesis should predict a scientific finding, with the invention or model serving as part of the methodology for testing it. Below are examples of JEI manuscripts that pose a hypothesis and perform a series of experiments to test it using their invention, model, or optimization.
Every year, around 800,000 people die by suicide worldwide, and it is the second leading cause of death for individuals aged 15-24 years. The National Survey on Drug Use and Health (NSDUH) is conducted annually to ask the general United States population aged 12 years and older questions about mental health, substance use, and suicidal thoughts. Some of the factors the survey considers are substance use, including need for treatment and disorders, and mental health topics, including major depressive episodes, suicidal ideation and attempts, and mental illness. In our research, we sought to identify associations between suicidal ideation and relevant variables, such as sleep quality, hopelessness, and anxious behavior, by analyzing survey responses from the 2020 NSDUH. Because the survey is dense and no single variable has an obvious direct relationship with suicidal ideation, we aimed to clearly identify and display any associations. We hypothesized that frequent anxious thoughts and behavior (such as fidgeting and restlessness), feelings of sadness, and/or low sleep quality would be the factors most predictive of having suicidal thoughts. Using a random forest classifier, we found that sleep problems were highly predictive of suicidal ideation. Professionals and clinicians should keep these findings in mind when developing suicide prevention efforts, such as identifying and supporting people potentially at risk, intervening to improve sleep quality, and teaching coping skills.
In this work, the authors used a computational algorithm to address a specific scientific question. Their hypothesis was that several factors drawn from a large survey would be most predictive of suicidal thoughts. To test this hypothesis, the authors trained a random forest classifier and found that only one of the hypothesized factors was highly predictive of suicidal thoughts.
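The workflow described above, using a random forest's feature importances to ask which hypothesized variables actually predict an outcome, can be sketched as follows. This is a minimal illustration with invented variable names and synthetic data, not the authors' actual code or the NSDUH data.

```python
# Hypothetical sketch: ranking candidate survey variables by how strongly a
# random forest relies on them to predict an outcome. Variable names and data
# are invented for illustration; the real study used 2020 NSDUH responses.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 1000
# Invented stand-ins for survey responses (higher = more severe).
sleep_problems = rng.integers(0, 5, n)
anxious_thoughts = rng.integers(0, 5, n)
sadness = rng.integers(0, 5, n)
# Synthetic outcome, constructed here to depend mostly on sleep problems.
ideation = (sleep_problems + rng.normal(0, 1, n) > 3).astype(int)

X = np.column_stack([sleep_problems, anxious_thoughts, sadness])
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, ideation)

# Feature importances indicate which variable the model leans on most.
for name, imp in zip(["sleep", "anxiety", "sadness"], clf.feature_importances_):
    print(f"{name}: {imp:.2f}")
```

Because the synthetic outcome was built from the sleep variable, its importance dominates, mirroring how the authors could conclude that one hypothesized factor was far more predictive than the others.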
The purpose of this experiment was to determine if a larger turbine on a small hydropower generator would produce a greater voltage than a smaller turbine. The experiment was conducted in the Salinas River, which flowed at 0.40-0.43 m/s during testing, and in a sink, which flowed at 3.78 liters per minute. Three turbines were used: a small one with scoops sized 1.25" × 1.25", a medium one with scoops sized 1.375" × 1.25", and a large one with scoops sized 1.75" × 1.5". Once in the water source, the instantaneous voltage produced by the generator was recorded each minute for a period of ten minutes, and five tests were conducted with each turbine in both the river and the sink. The small turbine produced the greatest voltage in both settings, generating an average of 1.216 V in the river and 4.06 V in the sink, compared with 0.99 V and 3.738 V for the medium turbine and 0.87 V and 3.419 V for the large turbine. As turbine size increased, the voltage produced decreased, with statistically significant differences between each data set. The original hypothesis was not supported because the large turbine produced the least voltage. This may be because the increased surface area of the scoops meant that the large turbine needed more water to turn, or because the weight of the turbine added resistance.
This manuscript contains two parts: an engineering portion, in which the authors built the portable hydropower generator and turbines and assembled the multimeter and other supplies necessary for testing, and a scientific testing portion, in which the authors performed experiments to compare the effect of turbine size on the voltage produced by a small hydropower generator. This is a good example of how you can use your inventions and engineered devices to test a scientific hypothesis.
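The statistical comparison described above, checking whether voltage differences between turbine sizes are significant, can be sketched with a two-sample t-test. The voltage readings below are invented placeholders, not the authors' data.

```python
# Minimal sketch of comparing two turbines' voltage measurements with an
# independent two-sample t-test. Readings are hypothetical, chosen only to
# illustrate the calculation; the real study used five ten-minute test runs.
from scipy import stats

small_turbine_v = [1.25, 1.19, 1.22, 1.20, 1.24]  # hypothetical readings (V)
large_turbine_v = [0.88, 0.85, 0.87, 0.86, 0.89]  # hypothetical readings (V)

t_stat, p_value = stats.ttest_ind(small_turbine_v, large_turbine_v)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A small p-value (e.g. < 0.05) supports a real difference between turbines.
```

Each pairwise comparison (small vs. medium, medium vs. large) would be tested the same way to support a claim of statistical significance "between each data set."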
Millions of undetonated landmines remain in underdeveloped and developing countries as remnants of past military conflicts. Many modern detection and removal methods used in developed countries cannot be applied safely in these regions because the resources are too expensive, too risky, or unattainable due to a lack of training, practicality, or safety. The solution created in this project is to deploy many "modules," each containing a large inductor coil/metal detector, a GSM chip, a GPS chip, and a standalone microcontroller, and to have all modules send data to one location, where a client-side application parses, processes, and displays it. The client-side application receives location data along with inductor sensor readings to create a heat map overlaid onto satellite imagery that can be updated with new data. The design was successful at fulfilling its goal; the early prototypes of the module worked and performed well with the other hardware. The data from the coil test demonstrated the efficacy of the coil design and metal detector circuit in providing accurate and reliable readings. The application of the project is groundbreaking for humanitarian groups and for undeveloped and developing countries, and it extends beyond the scope of these users.
While the author claimed to have created a portable and cheap method to detect landmines, again, no clear hypothesis was presented or tested. The author hypothesized that they could create a solution to a problem without demonstrating that their system serves its intended purpose. Importantly, this manuscript falls under the category of unacceptable submissions where the author simply hypothesizes that they can build something. Please see below for ways to revise and convert this type of manuscript for publication in JEI.
While this manuscript discusses a very interesting and relevant problem, it lacks a clear hypothesis, and a good portion of the technology system was hypothetical rather than created by the student. However, there is a section within the manuscript where the author creates a custom metal detector coil circuit and tests its ability to detect metal in accordance with UN standards for metal detection. Here, the author demonstrates their ability to build a custom metal-detecting device and validate its effectiveness. Instead of hypothesizing that they have a solution to a large-scale problem, the author could pose a testable hypothesis, for example, a prediction about how a specific change to the coil design would affect its ability to detect metal.
In this case, the author would 1) validate that their coil design is acceptable by UN standards and 2) test how specific changes to its design would affect its ability to detect metal.
If an invention or computational algorithm/model explores a different way to accomplish a goal more efficiently or more accurately, you can design an experiment comparing the factors that may influence its effectiveness or accuracy. For this approach to be acceptable, you must identify the factors you compared and justify, based on informed knowledge of your invention or model, why you expected them to matter. These requirements ensure that you have thought critically about your invention/model.
Collisions of heavy ions, such as muons, result in jets and noise. In high-energy particle physics, researchers use jets as crucial event-shaped observable objects to determine the properties of a collision. However, many ionic collisions result in large amounts of energy lost as noise, thus reducing the efficiency of collisions with heavy ions. The purpose of our study is to analyze the relationships between properties of muons in a dimuon collision to optimize conditions of dimuon collisions and minimize the noise lost. We used principles of Newtonian mechanics at the particle level, allowing us to further analyze different models. We used simple Python algorithms as well as linear regression models with tools such as scikit-learn, NumPy, and Pandas to help analyze our results. We hypothesized that since the invariant mass, the energy, and the resultant momentum vector are correlated with noise, if we constrain these inputs optimally, there will be scenarios in which the noise of the heavy-ion collision is minimized.
In this work, the authors used various computational algorithms to assess how to best minimize the noise generated during collisions of heavy ions. Specifically, the authors hypothesized that three properties of particles correlate with noise, and so they decided to constrain these to obtain models that predict minimum noise generation.
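The modeling step described above, fitting a linear regression that relates particle properties to noise, can be sketched with the same tools the authors name (scikit-learn and NumPy). The data and coefficients below are synthetic placeholders, not physics results.

```python
# Hedged sketch of regressing noise on three particle properties, in the
# spirit of the authors' scikit-learn workflow. All values are synthetic:
# the "true" coefficients 0.5, 0.2, 0.1 are invented for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 500
invariant_mass = rng.uniform(60, 120, n)
energy = rng.uniform(50, 500, n)
momentum = rng.uniform(0, 300, n)
# Synthetic noise constructed to depend linearly on the three inputs.
noise = 0.5 * invariant_mass + 0.2 * energy + 0.1 * momentum + rng.normal(0, 5, n)

X = np.column_stack([invariant_mass, energy, momentum])
model = LinearRegression().fit(X, noise)
print("coefficients:", model.coef_)  # ≈ [0.5, 0.2, 0.1] by construction
print("R^2:", model.score(X, noise))
```

With a fitted model in hand, "constraining the inputs optimally" amounts to searching the input ranges for combinations the model predicts will minimize noise.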
Middle school math forms the basis for advanced mathematical courses leading up to the university level. Large language models (LLMs) have the potential to power next-generation educational technologies, acting as digital tutors to students. The main objective of this study was to determine whether LLMs like ChatGPT, Bard, and Llama 2 can serve as reliable middle school math tutoring assistants on three tutoring tasks: hint generation, comprehensive solution, and exercise creation. Our first hypothesis was that ChatGPT would perform better than Bard and Llama 2 on all three tutoring tasks due to its larger model size (175 billion parameters). Our second hypothesis was that Bard would perform better than Llama 2 in generating comprehensive correct solutions due to its larger model size (137 billion parameters versus 70 billion parameters). We curated medium-difficulty, word-based middle school math problems on algebra, number theory, and counting/probability from The Art of Problem Solving and Khan Academy. A human tutor evaluated the LLMs' performance on each tutoring task. Contrary to our first hypothesis, results showed that ChatGPT did not perform uniformly better than Bard and Llama 2 on all the tasks; ChatGPT outperformed both Bard and Llama 2 only in the comprehensive solution task. Bard did not perform better than Llama 2 in the comprehensive solution task, which does not support our second hypothesis. We conclude that middle school math teachers can use a combination of ChatGPT, Bard, and Llama 2 as assistants based on the specific tutoring task.
Here, the authors had a specific hypothesis in mind – that large language models (LLMs) could serve as digital tutors – and they tested three different LLMs for their reliability on three tutoring tasks. The authors made predictions based on the underlying architecture of each LLM. Finally, the authors tested their hypotheses using a specific set of questions they curated, essentially creating a neutral testing ground for the three LLMs.
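The per-task comparison described above, aggregating a human tutor's ratings by model and task to see which model leads where, can be sketched with a small table. The ratings below are invented for illustration; the real study used a curated problem set and a human-tutor rubric.

```python
# Illustrative sketch (invented scores): averaging a human tutor's ratings
# per model and task, then identifying the best model for each task. This
# mirrors the conclusion that the best assistant depends on the task.
import pandas as pd

ratings = pd.DataFrame({
    "model": ["ChatGPT", "Bard", "Llama 2"] * 2,
    "task": ["hint"] * 3 + ["solution"] * 3,
    "score": [3, 4, 2, 5, 3, 3],  # hypothetical 1-5 tutor ratings
})
mean_scores = ratings.groupby(["model", "task"])["score"].mean().unstack()
best_per_task = mean_scores.idxmax()  # best-scoring model for each task
print(best_per_task)
```

In this toy data, different models top different tasks, which is exactly the kind of result that led the authors to recommend choosing an LLM per tutoring task.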
Breast cancer is one of the most dreaded diseases affecting women and their health. Breast cancer death rates are higher than those for any other cancer aside from lung cancer. Machine learning and deep learning techniques can be used to predict the early onset of breast cancer. The main objective of this analysis was to determine whether machine learning algorithms can be used to predict the onset of breast cancer with more than 90% accuracy. Based on research with supervised machine learning algorithms, Gaussian Naïve Bayes, K Nearest Neighbor, Random Forest, and Logistic Regression were considered because they offer a wide variety of classification methods and also provide high accuracy and performance. We hypothesized that all of these algorithms would provide accurate results, and that Random Forest and Logistic Regression would provide better accuracy and performance than Naïve Bayes and K Nearest Neighbor. The Wisconsin Breast Cancer dataset from the UC Irvine repository was used to compare the four supervised machine learning algorithms. Based on the results, the Random Forest algorithm performed best in malignant prediction (accuracy = 98%), and the Logistic Regression algorithm performed best in benign prediction (accuracy = 99%). All the algorithms performed well in the prediction of benign versus malignant cancer, with more than 90% accuracy based on their F1-score. The study results can be used for further research into cancer prediction using machine learning algorithms.
In this work, the authors tested several machine learning (ML) algorithms for their ability to predict the onset of breast cancer. They hypothesized that all algorithms would be accurate at predicting early-onset breast cancer but that some would perform better than others. Importantly, this type of hypothesis is no longer accepted at JEI. To make it acceptable, the authors would have had to focus their hypothesis on what would make a certain model accurate, based on informed knowledge. For instance, they could reference particular parameters in each ML algorithm that would make it more or less likely to predict breast cancer accurately, and then tune those parameters to test the hypothesis. A simple "these algorithms will work" is not a sufficiently rigorous hypothesis to be considered by JEI.
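For concreteness, the kind of comparison this manuscript ran can be sketched using scikit-learn's bundled copy of the Wisconsin breast cancer dataset and the four algorithm families the authors name. The exact scores will differ from the manuscript's reported accuracies, and the train/test split here is an assumption.

```python
# Hedged sketch of comparing four classifiers on the (scikit-learn bundled)
# Wisconsin breast cancer data by F1-score. Split and defaults are assumptions,
# not the authors' exact setup.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

models = {
    "GaussianNB": GaussianNB(),
    "KNN": KNeighborsClassifier(),
    "RandomForest": RandomForestClassifier(random_state=0),
    "LogisticRegression": LogisticRegression(max_iter=5000),
}
f1_scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    f1_scores[name] = f1_score(y_test, model.predict(X_test))
    print(f"{name}: F1 = {f1_scores[name]:.3f}")
```

Note that a JEI-acceptable version of this project would go further than reporting these numbers: per the guidance above, the hypothesis would predict how specific, justified parameter choices (for example, the forest's number of trees or KNN's neighbor count) affect performance, and the experiment would vary those parameters.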
Feel free to contact the JEI Editorial Staff if you have any more questions about how to write a hypothesis-driven manuscript for JEI. Find the links to the full manuscripts mentioned above, as well as some other acceptable machine learning-based manuscripts below:
Comparison of three large language models as middle school math tutoring assistants
A comparative analysis of machine learning approaches for prediction of breast cancer
Evaluating the effectiveness of machine learning models for detecting AI-generated art
Evaluating machine learning algorithms to classify forest tree species through satellite imagery
These guidelines were last updated on June 24, 2024. The primary reason for our new guidelines on computational algorithm manuscripts is that JEI currently does not have the necessary expertise to evaluate the increasing volume of computational manuscripts. We always strive to provide extensive feedback to our student authors so that they can have a positive and educational experience while publishing their (likely) first scientific work. At the moment, we do not feel that we can do this appropriately for all computational manuscripts.