For most of us, scientific studies are difficult to read and understand. Sure, we can read the abstracts and make some sense of the conclusions. But how do we know if these conclusions are valid? A paper published in a scientific journal is not foolproof, and the media often report its conclusions in a sensational and misleading manner. As I have explained before, scientific studies are the true source of health-based knowledge, and it is best not to rely on second-hand interpretations of them. Therefore, this article will give you some tips and tricks to understand studies quickly and accurately.
Types of Scientific Studies
Some articles do not conduct original research but instead analyse existing research. There are 2 categories of them:
- Meta-Analyses and Systematic Reviews. These studies compile data from many similar studies to provide the best possible answer to a research question. The two overlap heavily: a systematic review follows a formal, predefined protocol for finding and selecting studies, while a meta-analysis statistically pools their results.
- Review Articles. Also called literature reviews, these synthesise and explain many studies in a subfield. They are more descriptive and are useful in providing an overview of a topic. A review article is essentially an evidence-based expert opinion.
Original research is the typical experimental study you would expect. There are 4 main categories:
- Experimental Studies. The researchers take a group of people and split them into 2 groups. The experimental (or test) group gets the treatment, whereas the control group gets nothing or a placebo. If the people are randomly allocated to the 2 groups, the study is called a randomised controlled trial and is considered the highest form of evidence.
- Animal Studies. Strictly speaking, these are also experimental studies. They take a group of animals, normally mice or rats, give half the treatment (experimental group), and give the other half nothing (control group). However, researchers are permitted to do things to mice that they cannot do to humans. Researchers often dissect the mice afterward to see exactly what has happened and measure the changes in the tissues directly. Moreover, they can give high doses of suspected toxins or breed mice with genetic predispositions.
- In Vitro Studies. In Vitro means ‘in the glass’, as these studies are typically conducted in Petri dishes or test tubes. They involve directly looking at how microorganisms and cells behave, often with microscopes. For example, a study may look at how certain carbohydrates affect bacterial fermentation, which has implications for our gut flora.
- Observational Studies. The researchers simply observe a group of people and track certain factors. After a period of time, they examine whether there are correlations between those factors. For example, do people who eat lots of fat get heart disease more often than those who do not? These studies break down into 4 subgroups:
- Prospective Cohort Studies. These studies take a group of people with similar characteristics and observe them over time. Typically, the researchers are looking for who develops diseases and what factors correlate with that.
- Retrospective Cohort Studies. These studies are the same as the above, but the data about the cohort has already been collected. The researchers examine who has diseases at present and what factors in the past may have led to that. Prospective and retrospective refer to when the data was collected, and the distinction makes little practical difference to the reader.
- Case-Control Studies. Researchers take 2 groups of people: those who have a disease (the cases) and those who do not (the controls). Next, they take past data about the groups and attempt to find what factors have led to that result.
- Cross-Sectional Studies. A population is examined at one point in time only. For instance, a study may look at the rates of cancer compared to where the participants live (urban vs. rural).
The Structure of Scientific Studies
The abstract is the most useful part of an article. It summarises the aims, methods, results, and conclusions of an article, and should be read before reading the main text. It is also there to determine if the full article is worth reading. Furthermore, it is useful in surveying what the literature says about a topic, as you can quickly read many of them. Ultimately, a well-designed study can be mostly understood by reading the abstract alone. However, this requires you to trust the researchers as you cannot check the details of a study from the abstract alone.
A good introduction summarises the current research on a topic and says why this particular study needs to be conducted. The typical reason is that it is filling a gap in the knowledge base or clearing up previously controversial results. It can be useful to follow the citations in the introduction to gain a better understanding of the field.
Firstly, the methods section includes details of who the participants in the study are and how they were selected. Secondly, it explains the procedure of the study. This can get very technical in laboratory studies, so looking up terms in Wikipedia can help you understand the methodology.
In contrast, if the study is a literature review or meta-analysis, the methods section will include the search terms used and explain how the articles for inclusion were selected. Typically, researchers use 10-20 keywords, which they put into PubMed. Next, they sift through the abstracts, selecting studies which meet their criteria.
Most commonly, tables and graphs are used to present the results. There is some text as well to provide further details. The results always use statistical tools to check the likelihood of the results being due to chance. These are explained later.
The discussion will tell you what the results mean and also compare them to the existing research in the field. The researchers may also postulate why certain results occurred, especially if they were unexpected. This involves citing evidence from other studies.
Discussion sections are full of citations, and these can lead you on to further interesting research. Academic search engines like Google Scholar and PubMed are far from perfect and often miss important studies. It is essential to utilise the citations in the discussion and introduction sections if you want to become a top-rate researcher (amateur or professional).
The conclusion summarises the overall findings of the study. Many studies do not have a specific conclusion section. In that case, either the last few paragraphs of the discussion section serve as the authors’ conclusion or the entire discussion section is their conclusion.
How to Interpret Scientific Studies
What Journal Is It Published In?
Some journals are much more prestigious than others, so research published in them carries more academic weight. Top journals include the British Medical Journal (BMJ), The Journal of the American Medical Association (JAMA), The Lancet and Nature. A good guide to what counts as a decent journal is its Impact Factor. This is a number assigned to a journal each year. It is calculated as follows: the number of citations received this year by the articles the journal published in the 2 preceding years, divided by the number of articles it published in those 2 years. In short, it shows how important other experts think the journal is (or, more accurately, how important the studies published in that journal are).
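The calculation described above can be sketched in a few lines of Python. The function name and the journal figures are made up purely for illustration; real impact factors are published by the indexing service, not computed by readers.

```python
def impact_factor(citations_this_year: int, articles_prior_two_years: int) -> float:
    """Citations received this year by articles from the 2 preceding years,
    divided by the number of articles published in those 2 years."""
    if articles_prior_two_years <= 0:
        raise ValueError("the journal must have published at least one article")
    return citations_this_year / articles_prior_two_years


# Hypothetical journal: 400 articles over the 2 preceding years,
# which were cited 2,000 times this year.
print(impact_factor(2000, 400))  # 5.0
```

In other words, an impact factor of 5 means the journal's recent articles were cited 5 times each on average this year.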
Studies published in smaller, less well-known journals are not necessarily bad, and the impact factor is only a rough guide. Smaller journals may be more open-minded and have more diverse viewpoints. Sometimes well-established journals can become dogmatic, and only publish articles demonstrating a certain viewpoint. Also, smaller journals may simply be more specialised and have a smaller audience.
Funding and Conflicts of Interest
You should always check who funded the study and whether the authors have any conflicts of interest. This is stated at the bottom of each article. Bias does exist within studies. One systematic review found that studies funded by the pharmaceutical industry were more likely to report outcomes favourable to the funder (the drug company) than studies funded by other sources (1).
I encourage you to follow the citations to check if they support what the author is saying, and also to dig deeper into a topic. More citations are not necessarily better, although authors should provide evidence for any claims they make. If a claim is made without sufficient evidence, it is merely an opinion.
Correlation not Causation
Observational studies cannot show causation, i.e. that one thing caused another. This is because they simply observe correlations between factors, such as when people eat food x they more commonly get disease x. Perhaps it is the sauce that people put on food x that causes the disease and not the food itself. A novel example of this is that murder rates increase when ice cream sales are high. Of course, eating ice cream does not cause people to commit murder. It is in fact the hot weather that makes people go crazy.
Researchers who conduct observational research know this too and try to combat it. It is standard to measure all obvious risk factors for disease, such as smoking, alcohol intake, and age, and then adjust the results to reduce these unwanted influences. Normally, they take it a step further and adjust their results based on every factor they measure. This helps to isolate one factor’s effect from another. So while they cannot strictly infer causation, they can be more certain that factor x produces result x. (A good example of how to use observational research – Epistemology: The PURE Study).
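To make the idea of adjustment concrete, here is a minimal sketch using stratification, which is one simple way to separate a factor from a confounder. All the counts are made up, and the setup (food x, smoking, disease) is hypothetical; real studies usually adjust with regression models rather than by hand.

```python
def risk(cases: int, total: int) -> float:
    """Proportion of a group that developed the disease."""
    return cases / total


def relative_risk(exposed, unexposed) -> float:
    """Each argument is a (cases, total) pair."""
    return risk(*exposed) / risk(*unexposed)


# Made-up (cases, total) counts for eaters and non-eaters of food x,
# split by smoking status. Smokers happen to eat food x more often.
strata = {
    "smokers": {"eaters": (16, 80), "non_eaters": (4, 20)},
    "non_smokers": {"eaters": (1, 20), "non_eaters": (4, 80)},
}


def pooled(group: str):
    """Add the strata together, ignoring smoking status."""
    cases = sum(s[group][0] for s in strata.values())
    total = sum(s[group][1] for s in strata.values())
    return cases, total


# Crude comparison: food x looks harmful (relative risk of about 2.1).
crude_rr = relative_risk(pooled("eaters"), pooled("non_eaters"))

# Comparison within each smoking group: the apparent effect disappears.
stratum_rr = {
    name: relative_risk(s["eaters"], s["non_eaters"]) for name, s in strata.items()
}

print(round(crude_rr, 3))
print(stratum_rr)
```

In this invented data set, food x appears to double the risk of disease only because its eaters are mostly smokers. Once smokers are compared with smokers and non-smokers with non-smokers, the relative risk is 1.0 in both groups.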
Animal vs. Human Studies
Animal studies are often more valuable than we are led to believe. The criticism of ‘but we’re not mice’ is only partly true. We have similar biology to other mammals, and the biology of all living things is linked. However, there are limits to animal studies and they can be misinterpreted. On average, human studies are more useful and preferable to animal studies. (An example of how to use animal studies – What Happens When Lab Rats Eat Cake).
This also holds true for in vitro studies. They offer certain advantages such as direct observation of microorganisms and cells, but they are not necessarily generalisable to humans and should be used with caution.
Short Term vs. Long Term Studies
In general, the longer the study the more useful. Certain things may appear beneficial in the short term, but have detrimental effects in the long term. In contrast, shorter-term studies can be more tightly controlled, and we can infer causation with greater certainty, so there is a trade-off.
Bigger vs. Smaller Studies
On average, the larger the study the better. A good study should be generalisable to the population it seeks to represent. The larger the sample size, the more confident you can be that it represents the target population. For instance, if you were to do a study of men in the UK who smoke, your target population would be roughly 2 million people. However, your sample size may only be 100. A larger sample of 10,000 would be much more representative of the target population.
Moreover, the smaller the sample size, the greater the odds that your results are due to chance. For example, a gambler may hit a winning streak and think that he will continue to win all day. However, if he plays for long enough the odds will even out and he will lose overall: the house always wins. Over a large number of observations, random variables even out, and we can see the true effect.
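A rough way to see how chance shrinks with sample size is the margin of error of a proportion. The sketch below assumes a simple random sample and uses the standard normal approximation; the 50% proportion is made up, and the sample sizes are the 100 and 10,000 from the example above.

```python
import math


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a sample proportion p
    measured in a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)


# Suppose 50% of the sample show some characteristic (a made-up figure).
for n in (100, 10_000):
    print(n, round(margin_of_error(0.5, n), 3))
```

With 100 people the estimate could easily be off by about 10 percentage points either way, whereas with 10,000 people it is off by about 1. Note that this only covers random error: a large sample that was selected in a biased way is still biased.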
Surrogate Markers vs. End Points
Surrogate markers are measurements that are thought to correlate with clinical endpoints. For instance, high LDL cholesterol (surrogate marker) is thought to correlate with heart disease (clinical endpoint). Surrogate markers are things like cholesterol levels, markers of inflammation and cognitive function tests, whereas clinical endpoints are symptoms, diseases, and deaths. However, surrogate markers may not always correlate with endpoints, and many are still being debated, such as LDL cholesterol. Therefore, endpoints are much stronger and more definite evidence than surrogate markers.
Interpretation of Results
The majority of the time, the authors’ interpretation of their results is correct. However, if the results are particularly novel or unexpected, the underlying mechanisms driving these results may be unknown. In this case, the authors will postulate various theories, and since they are experts in the area their suggestions are a good starting place. However, it is important to note there may be other possibilities. Future research often aims at tackling this.
Checking Your Own Bias
Oftentimes we seek to confirm what we already think is true. You must be prepared to read research with an open mind and to discard mistaken conclusions in light of new evidence. That is the very nature of science itself. Our brains are not perfectly rational or objective. It is important to be conscious of your own personal biases and to challenge your conclusions with new evidence.
In addition to personal biases, human beings have a number of uniform biases. These are termed psychological heuristics. For example, we tend to rely too heavily on the first piece of information we are given and value it disproportionately. In one study, children were randomly given a high or low number and then asked to guess the number of jellybeans in a jar. The children adjusted their estimate based on the number they were given (2).
Another heuristic, called the escalation of commitment, explains that when we have invested in a decision, we are more likely to continue to invest in it, even in spite of new evidence that the costs of continued investment outweigh the benefits. This may explain why once we have formed a theory about the world we are reluctant to overturn it, in spite of new evidence to the contrary. Merely knowing about these biases can help reduce their effects. For more information see books such as Thinking, Fast and Slow and Predictably Irrational.
Statistical Terms Explained
Absolute vs. Relative Risk
Your absolute risk is the overall risk of something happening to you, such as suffering a heart attack. Your relative risk is how much your risk increases or decreases depending on whether you take an action, such as taking regular exercise.
To put this into context, suppose the absolute risk of a 60 year old man suffering a heart attack is 10%. If he takes regular exercise, his relative risk decreases by 50%. So, his absolute risk is now 5%: a 50% reduction in relative terms, but only a 5 percentage point reduction in absolute terms. (Note: these figures are made up). This is important to know, as researchers will often report in relative risk, which makes their interventions seem more significant than they are.
Both absolute and relative risk are expressed as decimals. If taking regular exercise decreases your relative risk by 50%, it would be written as 0.5. A relative risk of 1.0 would mean your risk is the same as the risk of the control group in the experiment.
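The arithmetic from the heart attack example can be written out as a short sketch. The function and variable names are my own, and the figures are the same made-up ones as above.

```python
def risk_summary(baseline_risk: float, relative_risk: float):
    """Return the new absolute risk, the absolute risk reduction,
    and the relative risk reduction, all as decimals."""
    new_risk = baseline_risk * relative_risk
    absolute_reduction = baseline_risk - new_risk
    relative_reduction = 1 - relative_risk
    return new_risk, absolute_reduction, relative_reduction


# Made-up figures: 10% baseline risk, relative risk of 0.5 with exercise.
new_risk, absolute_reduction, relative_reduction = risk_summary(0.10, 0.5)

print(f"New absolute risk: {new_risk:.0%}")                     # 5%
print(f"Absolute risk reduction: {absolute_reduction:.0%}")     # 5%
print(f"Relative risk reduction: {relative_reduction:.0%}")     # 50%
```

The same relative risk of 0.5 applied to a baseline risk of 1% would cut the absolute risk by only half a percentage point, which is why it is worth looking for the absolute figures.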
Researchers often use hazard ratios to report the effect of an intervention, like taking a medication, on preventing an outcome, like getting a disease. Roughly speaking, if the hazard ratio is 0.5 for an intervention group, it means that people in the intervention group experienced that outcome at half the rate of those in the control group.
For example, imagine a study measuring the effect of eating fruit and vegetables on cancer, which runs for 1 year. The study finds that those who eat 5 portions of fruit and veg, compared to those who eat 1 portion, have a hazard ratio of 0.9 for getting cancer. This means that during the study, the group eating 5 portions had roughly 10% fewer cases of cancer than those eating 1 portion.
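As a rough illustration of where a figure like 0.9 comes from, the sketch below compares incidence rates in two groups. This is a simplification: it assumes the rate of new cases stays constant over the study, in which case the ratio of rates approximates the hazard ratio. Real studies estimate hazard ratios with survival models, and the case counts here are made up.

```python
def incidence_rate(cases: int, person_years: float) -> float:
    """New cases per person-year of follow-up."""
    return cases / person_years


def rate_ratio(cases_a: int, py_a: float, cases_b: int, py_b: float) -> float:
    """Incidence rate of group A divided by that of group B."""
    return incidence_rate(cases_a, py_a) / incidence_rate(cases_b, py_b)


# Made-up 1 year study with 10,000 people in each group:
# 90 cancer cases among the 5-portion group, 100 among the 1-portion group.
hr_approx = rate_ratio(90, 10_000, 100, 10_000)
percent_change = (hr_approx - 1) * 100

print(f"Approximate hazard ratio: {hr_approx:.2f}")   # 0.90
print(f"Change in rate: {percent_change:+.0f}%")      # -10%
```

A ratio below 1.0 means fewer cases in the intervention group, a ratio above 1.0 means more, and 1.0 means no difference.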
The p value is included alongside the results. It is a number between 0 and 1 that indicates how compatible the results are with the null hypothesis. For example, imagine a study that seeks to measure the effect of a drug. The null hypothesis would be that the drug has no effect, and the alternative hypothesis would be that it does have an effect.
The p value is calculated based on the results. A p value of 0.06 means that if the null hypothesis is true, 6% of studies would get a result at least as extreme as this study did. Note that this is not the same as a 6% chance that the null hypothesis is true. A lower p value is therefore stronger evidence against the null hypothesis. By convention, a p value of 0.05 or lower is treated as statistically significant.
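To show what "at least as extreme" means, here is a small sketch that computes an exact p value for a deliberately simple, hypothetical case: 20 patients take a drug, and under the null hypothesis each is as likely to improve as not (like a coin flip). The scenario and the function name are assumptions for illustration; real studies use a test suited to their design.

```python
from math import comb


def two_sided_binomial_p(successes: int, trials: int) -> float:
    """Exact two-sided p value when the null hypothesis says each trial
    succeeds with probability 0.5."""
    centre = trials / 2
    observed_distance = abs(successes - centre)
    # Count every possible outcome that is at least as far from the
    # expected value as the one we observed, in either direction.
    extreme_outcomes = sum(
        comb(trials, k)
        for k in range(trials + 1)
        if abs(k - centre) >= observed_distance
    )
    # Under the null, all 2**trials success/failure sequences are equally likely.
    return extreme_outcomes / 2 ** trials


# Hypothetical result: 15 of 20 patients improved.
print(round(two_sided_binomial_p(15, 20), 3))  # 0.041
```

Here, if the drug truly did nothing, about 4% of such studies would still see a split of 15 to 5 or more lopsided, so the result just passes the conventional 0.05 threshold.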
Confidence intervals are also included with the results. They are presented as a range with a percentage, such as 95% CI = 22-40. The interval indicates how sure the researchers are that the true value of the population lies within this range. For example, 95% CI = 22-40 means, loosely speaking, that we can be 95% confident the true value of the population lies within the range of 22-40. So, for a given percentage, a smaller range is better, as it indicates greater certainty in the results.
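For readers who want to see how such a range might be produced, the sketch below computes a confidence interval for a mean using the normal approximation. The measurements are made up, and with a sample this small a researcher would more likely use the t-distribution, so treat this as an illustration of the idea only.

```python
import math
import statistics


def mean_confidence_interval(values, confidence: float = 0.95):
    """Normal-approximation confidence interval for the mean of a sample."""
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    # z is about 1.96 for 95% confidence and about 2.58 for 99%.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * standard_error, mean + z * standard_error


measurements = [28, 31, 35, 29, 33, 30, 34, 32, 27, 36]  # made-up data

low95, high95 = mean_confidence_interval(measurements, 0.95)
low99, high99 = mean_confidence_interval(measurements, 0.99)

print(f"95% CI = {low95:.1f}-{high95:.1f}")
print(f"99% CI = {low99:.1f}-{high99:.1f}")
```

Running this shows that asking for more confidence from the same data gives a wider range, which is why ranges should be compared at the same percentage. Collecting more data is what narrows the range.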
- Identify which type of study it is and consider how that affects its usefulness.
- Read the abstract to decide if you want to read the whole study.
- Understand the structure of journal articles to allow you to scan for the key information.
- Take into consideration the size and length of study among the other factors discussed, whilst interpreting the study.
- Check your own bias, and read with an open mind.
- Make sure you understand the common statistical terms used, so you can check how sound the results are.