
“Stanford president resigns after fallout from falsified data in his research”; “Top Cancer Center Seeks to Retract or Correct Dozens of Studies” - these are just a couple of the science stories that have reached popular news outlets in recent years, prompting a slew of commentary with titles like “Why Is There So Much Fraud in Academia?” and “Why Scientific Fraud Is Suddenly Everywhere”. While the topic seems to have only just reached a scale that warrants widespread attention beyond the ivory tower, the issue is not new (1). Do you think you could identify fraudulent data in a paper? Are you confident you would never, even accidentally, publish misleading data? It is easy to say yes, but it can be deceptively tricky. In this post, we’ll go over a few considerations and resources that can help ensure you don’t fall prey to fraudulent data.
Be a control freak
Let’s start at the very beginning: designing and running your experiment. Before you begin, make sure that you are using appropriate controls. Controls are the samples against which you will compare your experimental groups, and they can be classified as “negative” and “positive”. Essentially, negative controls show what an untreated sample looks like in whatever test you run, while positive controls show that your test is working as expected. It is easy to think of controls as less important than your experimental groups, but they deserve the same care and attention. If your experiments are well controlled, you (and others) can have more confidence in the results.
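To make the role of each control concrete, here is a minimal sketch of how they anchor an analysis. All assay values, thresholds, and group names below are invented for illustration; real acceptance criteria depend on your assay.

```python
# Hypothetical assay readings: three replicate wells per group.
negative_control = [0.05, 0.06, 0.04]   # untreated wells: defines the baseline
positive_control = [0.90, 0.95, 0.88]   # known-effect wells: sanity-checks the assay
treated_sample   = [0.40, 0.45, 0.38]   # experimental group

def mean(values):
    return sum(values) / len(values)

baseline = mean(negative_control)

# If the positive control does not clearly exceed baseline, the assay
# itself failed and the experimental result is uninterpretable.
# The 3x threshold here is an arbitrary illustrative cutoff.
assay_worked = mean(positive_control) > 3 * baseline

if assay_worked:
    # Report the experimental group relative to the negative control.
    fold_change = mean(treated_sample) / baseline
    print(f"fold change over negative control: {fold_change:.1f}")
else:
    print("positive control failed; discard this run")
```

The point of the sketch: without the negative control there is no baseline to compare against, and without the positive control a null result could just mean the assay didn't work that day.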
Don’t touch raw data
You’ve run your experiment and collected your data. Now you are ready to analyze. Depending on the type of data you have collected, this may involve some form of data manipulation. Some adjustments during analysis are entirely acceptable and necessary; however, these steps should never be applied to the raw data. Instead, make a copy of the data and apply any processing steps to the copy. Leaving the original data untouched means you (or someone else) can always go back and repeat an analysis to confirm it is reproducible.
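This workflow can be sketched in a few lines: lock the original file, then work only on a copy. The file names, contents, and the "processing step" below are hypothetical placeholders.

```python
import os
import shutil
import stat
import tempfile

workdir = tempfile.mkdtemp()
raw_path = os.path.join(workdir, "raw_measurements.csv")

# Pretend this file came straight off the instrument.
with open(raw_path, "w") as f:
    f.write("sample,signal\nA,0.41\nB,0.38\n")

# 1. Make the raw file read-only so it cannot be edited by accident.
os.chmod(raw_path, stat.S_IREAD)

# 2. Copy the contents and apply all processing to the copy only.
work_path = os.path.join(workdir, "processed_measurements.csv")
shutil.copyfile(raw_path, work_path)  # copies the data, not the read-only flag

with open(work_path, "a") as f:
    f.write("C,0.44\n")  # stand-in for a real processing step

# The raw file is unchanged, so the analysis can always be rerun from scratch.
```

Making the raw file read-only is a cheap guard against accidental edits; for real projects, backing up raw data to a separate location (or versioning it) adds another layer of protection.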
Watch your figures
After you have run your experiment and analyzed the data, it is time to share your findings. Figures play a key role in communicating your results to others, and figures that are easy to read and comprehend go a long way towards that goal. The challenge is condensing what may be years’ worth of data into a few neat figures that fit within a journal’s formatting requirements. Doing so requires making a lot of decisions: Which images do you show? How much of each image? What type of graph should you use? It’s tempting to choose the “prettiest” image or the graph that takes up the least space - but the other question you need to ask yourself is whether the figures give the most representative depiction of the data. Highlighting examples that accurately reflect your data as a whole ensures that you are not misrepresenting your findings to others in the field.
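A small sketch makes the "representative depiction" point concrete. The replicate values below are invented: the two groups share the same mean, so a bar chart of means alone would make them look identical, while showing every point (or at least the spread) reveals how different they really are.

```python
import statistics

# Hypothetical replicate measurements for two groups.
group_a = [5.0, 5.1, 4.9, 5.0]   # tight, consistent replicates
group_b = [1.0, 9.0, 2.5, 7.5]   # wildly scattered replicates

for name, values in [("A", group_a), ("B", group_b)]:
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    # Reporting spread alongside the mean, and plotting individual
    # points in the actual figure, keeps the summary honest.
    print(f"group {name}: mean={mean:.2f}, sd={sd:.2f}, points={values}")
```

Both groups have a mean of 5.0, but group B's standard deviation is roughly 50 times larger, which is exactly the kind of information a summary-only figure hides.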
Reading other papers
It is one thing to ensure that the work you produce is accurate and reproducible; you also need to make sure that the work inspiring your own is done to the same standards. As you read papers and examine figures, keep an eye out for fishy-looking images or data discrepancies. If you are concerned about a paper, check whether others have reported any issues on sites like PubPeer, a database where readers can comment on published papers. Databases like this have certainly helped identify problematic research, but keep in mind that not every mistake is an example of fraud. Sometimes a copy-paste error is just a copy-paste error, and small mistakes do not always alter the overall conclusions of a paper. However, it is important to critically evaluate the data as you assess the authors’ conclusions.
I think we can all agree that falsifying data or publishing irreproducible research is a bad idea, but it happens all too often (1, 2). Being aware of the issue and doing your best to follow best practices in your field are important first steps to reinforcing research integrity.
References:
1. Baker, M. 1,500 scientists lift the lid on reproducibility. Nature 533, 452–454 (2016). https://doi.org/10.1038/533452a
2. Van Noorden, R. More than 10,000 research papers were retracted in 2023 — a new record. Nature 624, 479–481 (2023). https://doi.org/10.1038/d41586-023-03974-8