Joshua Sonkiss, MD
Medical director, Behavioral Health Unit, Fairbanks Memorial Hospital, Fairbanks, AK
Dr. Sonkiss has disclosed that he has no relevant relationships or financial interests in any commercial company pertaining to this educational activity.
As a physician, you’re expected to practice evidence-based medicine. But how can anyone keep up with the latest research? While there are lots of secondary sources of information (including The Carlat Report), reading original research articles allows you to reach your own conclusions about each study. But it can also be daunting.
In this article I’ll discuss a focused approach for identifying and evaluating research most relevant to your practice.
Step 1: Decide What to Read
Scores of new papers appear every day, and no one can read them all. Many clinicians’ eyes glaze over at the thought of reading journal articles, so I recommend that you focus on articles relevant to your own clinical cases. This primes your mind for new information and helps with recall.
If you start with a concise clinical question about a real patient, online sources like PubMed’s Clinical Queries page (www.ncbi.nlm.nih.gov/pubmed/clinical) can make it easy to find relevant articles. Search engines like Google can also guide you toward primary literature, but be aware that some search results are heavy in promotional content and its associated biases.
Once you’ve identified a paper, just skimming the abstract doesn’t cut it. There’s no way to evaluate a study’s caliber—or how well the results apply to your patients—without reading the actual paper. Worse yet, glancing over abstracts can lure you into accepting authors’ sometimes biased conclusions at face value.
Misleading Conclusions
In an infamous example of biased conclusions, the authors of a widely reported meta-analysis dismissed antidepressant efficacy in all but the most severe cases of depression (Kirsch I et al, PLoS Med 2008;5(2):e45).
Another research team reanalyzed the same data and reached a very different conclusion: antidepressants are effective in all but the mildest cases of depression (Vöhringer P & Ghaemi N, Clin Ther 2011;33(12):B49–B61).
Reading only the abstracts, one might be inclined to think that the different conclusions represent solely the biases of the authors. In reality, an analysis of each paper shows that different statistical methods were used in each, and these details can sometimes lead to different conclusions.
Step 2: Get Your Hands on It
You can always plunk down cash, but getting full articles is expensive if you don’t have a system.
If you work at a hospital or university, you probably have access to a medical library where mainstream journal articles are available for free, while less-common publications can be ordered. Some medical libraries “lend” journal articles to physicians in their communities even if they aren’t university-affiliated.
PubMed’s search page (www.pubmed.gov) has direct links to many free articles, and the National Library of Medicine has a page dedicated to finding full-text resources (http://1.usa.gov/1q3bvnw).
Another excellent resource is Google Scholar (scholar.google.com), which scours the Internet for PDF versions of full-text articles, and is a powerful search engine in its own right, comparable to PubMed. Finally, many professional organizations offer online access to their journals as a benefit of membership.
Step 3: Understand the Design
Once you’ve selected an article, you’ll need to identify the study design. For an in-depth review of study designs, check out Clinical Epidemiology: The Essentials by Robert and Suzanne Fletcher (5th edition, Lippincott Williams & Wilkins; 2012).
There are many variations and hybrids, so take a close look at the “methods” section of papers you read. Most published research in psychiatry can be categorized as one of the following types:
Case reports: Someone writes up an interesting case they’ve seen. Case reports generate hypotheses but don’t test them. They are highly susceptible to bias, in part because they often describe the joint occurrence of uncommon events. Case reports rarely describe treatment failures. By definition, they only describe one patient (ie, “N=1”), so case reports almost never provide a basis for altering clinical practice.
Case series: Someone writes up a small number of similar cases. Case series have no control group and don’t test hypotheses, and they suffer the same susceptibility to bias as case reports. However, they reveal patterns among similar patients, and may lead to new hypotheses or suggestions for managing unusual or refractory conditions.
Case-control studies: Researchers select people with a particular outcome (cases) and people without it (controls), then ask both groups about prior exposures. For example, people with or without a current diagnosis of schizophrenia may be asked about exposure to cannabis. Case-control studies are highly susceptible to recall bias. They give an estimate of risk called the odds ratio.
Cohort studies: Groups of people are followed prospectively to see how many people either with or without a particular exposure develop an outcome of interest. For example, people who do and don’t smoke cannabis are followed up after 10 years to see how many in each group developed schizophrenia.
Cohort studies allow calculation of relative risk, but they are prone to misclassification and susceptibility bias. Cohort studies can also evaluate non-randomized treatment effects, for instance, whether cannabis smokers who take antidepressants may be more likely to develop schizophrenia than those who do not.
Some cohort studies obtain information from registries of patient data—for example, all members of an HMO; all VA patients; or all individuals born in Denmark between 1980 and 1989. These are helpful because they deal with “real-world” patients, although they are not randomized. Electronic medical records (EMRs) make registry studies much more feasible.
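The two risk estimates named above, the odds ratio from a case-control study and the relative risk from a cohort study, come down to simple arithmetic on a two-by-two table. The sketch below is only an illustration: the function names and all of the counts are invented for this example and do not come from any real study.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (exposed_cases / unexposed_cases) / (exposed_controls / unexposed_controls)


def relative_risk(exposed_with_outcome, exposed_total,
                  unexposed_with_outcome, unexposed_total):
    """Risk of the outcome in the exposed group divided by risk in the unexposed group."""
    risk_exposed = exposed_with_outcome / exposed_total
    risk_unexposed = unexposed_with_outcome / unexposed_total
    return risk_exposed / risk_unexposed


# Hypothetical case-control counts (invented, not real data):
# 100 cases, 40 of whom report the exposure; 100 controls, 20 of whom report it.
case_control_or = odds_ratio(40, 60, 20, 80)

# Hypothetical cohort counts (also invented):
# 30 of 1,000 exposed people and 10 of 1,000 unexposed people develop the outcome.
cohort_rr = relative_risk(30, 1000, 10, 1000)

print(f"Odds ratio: {case_control_or:.2f}")
print(f"Relative risk: {cohort_rr:.2f}")
```

A case-control study cannot yield a relative risk directly, because the researchers choose how many cases and controls to enroll, so the proportion of subjects with the outcome does not reflect its true frequency. The odds ratio is used in its place.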
Randomized controlled trials (RCTs): Subjects are carefully selected and then randomized to treatment or placebo groups. RCTs evaluate the efficacy of treatment in the short term, but they are costly.
Some RCTs are “open-label,” meaning that subjects and researchers know what’s being given, while in “double-blind” trials neither subjects nor researchers know who’s receiving treatment and who’s receiving a placebo. Blinding can be difficult to accomplish, for example, in studies where one treatment arm receives psychotherapy. All RCTs are subject to selection bias.
Systematic review: This is a review of research designed to answer a specific clinical question, for example, “what is the most effective approach for treating psychotic depression?” The Cochrane Collaboration (www.cochrane.org) produces many high-quality systematic reviews. Systematic reviews impose strict inclusion criteria on the studies they analyze, but publication bias poses a major problem.
Meta-analysis: Researchers use statistical methods to produce a weighted average of treatment effect sizes derived from multiple studies. The weight accorded to each study depends on sample size and quality. Some systematic reviews incorporate meta-analytic methods. Again, publication bias can strongly influence results.
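To make the idea of a weighted average of effect sizes concrete, here is a minimal sketch. It assumes inverse-variance weighting, one common scheme in which more precise studies (usually the larger ones) count for more. The article does not say which weighting method any particular meta-analysis uses, and the effect sizes and standard errors below are invented for illustration.

```python
def pooled_effect(effects, standard_errors):
    """Inverse-variance weighted average of study effect sizes.

    Each study's weight is 1 / SE^2, so studies with smaller standard
    errors pull the pooled estimate toward their own result.
    """
    weights = [1 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = (1 / total_weight) ** 0.5
    return pooled, pooled_se


# Hypothetical effect sizes and standard errors for three studies (invented).
effects = [0.20, 0.50, 0.35]
standard_errors = [0.10, 0.25, 0.15]

pooled, pooled_se = pooled_effect(effects, standard_errors)
print(f"Pooled effect size: {pooled:.2f} (standard error {pooled_se:.2f})")
```

Because the first hypothetical study is the most precise, the pooled estimate sits closer to its result than a simple average of the three would. This is also why publication bias matters: a weighted average can only reflect the studies that made it into the analysis.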
Step 4: Identify Biases
Bias refers to anything that systematically and unexpectedly influences research results. It affects all research, not just the studies carried out by the pharmaceutical industry. There are many classifications of bias, and study designs differ in their susceptibility to each.
Understanding how bias affects internal validity—meaning your confidence that a study accurately identifies cause-and-effect relationships—is indispensable to critically appraising research. (See “Investigating Bias in Research” in this issue for more about bias.)
Step 5: Think About Random Error
Pure chance can throw off study results and render them invalid. In general, effect size and study power determine how likely it is that the results arose purely from chance. Appendectomy for acute appendicitis, for instance, has such a large effect size that its benefit is almost certainly not due to chance.
A larger number of subjects increases a study’s power and reduces the role of chance in clinical trials, but it also introduces more heterogeneity into the population.
Researchers use statistical methods to determine the likelihood their results arose from chance. If that probability falls below an arbitrary threshold, such as p<.05 in most RCTs, the results are said to be statistically significant. Statistical methods go beyond the scope of this article, but Clinical Epidemiology: The Essentials provides an excellent introduction.
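The interplay between sample size and chance can be illustrated with a short calculation. The sketch below uses a two-proportion z-test, which is one simple method chosen here for illustration and not necessarily the one a given trial would use. The response counts are invented. Strictly speaking, the p-value it returns is the probability of seeing a difference at least this large if the two treatments truly did not differ.

```python
import math


def two_proportion_p_value(responders_a, n_a, responders_b, n_b):
    """Two-sided p-value for a difference in response rates (z-test)."""
    rate_a = responders_a / n_a
    rate_b = responders_b / n_b
    # Pooled response rate under the assumption of no true difference.
    pooled = (responders_a + responders_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / standard_error
    # erfc(|z| / sqrt(2)) equals the two-sided tail area of the normal curve.
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical trials with the same response rates (50% vs 40%)
# but different numbers of subjects per arm (all counts invented).
small_trial_p = two_proportion_p_value(25, 50, 20, 50)
large_trial_p = two_proportion_p_value(250, 500, 200, 500)

print(f"50 per arm:  p = {small_trial_p:.3f}")
print(f"500 per arm: p = {large_trial_p:.4f}")
```

With identical response rates, only the larger hypothetical trial falls below the p<.05 threshold. The observed effect is the same in both, which is a reminder that statistical significance reflects sample size as well as the size of the effect.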
Step 6: Appraise the Study
Just because results are touted as statistically significant doesn’t mean they’ll help you help your patients. Validity refers to how legitimate the results of the study actually are, and can be evaluated with the FRISBEE mnemonic, created by Duke University’s residency program. The last E, “equivalence,” may be the most important, as it pertains to external validity, or how well study results will generalize to patients in your practice (Xiong GL & Adams J, Curr Psychiatry 2007;6(12):96).
Although FRISBEE was developed for appraising RCTs, similar concepts can be applied to other study designs. Appraisal worksheets for treatment and diagnostic studies are available on Duke University’s website (bit.ly/1rMWqGc).
Throwing FRISBEEs at Research
Follow-up: Did the study have high drop-out rates or poor follow-up? If so, why?
Randomization: Were subjects randomly allocated to treatment and placebo groups?
Intent-to-treat analysis: How well did the study account for drop-outs? Did researchers analyze data from all subjects who entered the trial?
Similar baseline: Were treatment and placebo groups similar at the start of the trial? (Many research papers include a table comparing the two groups on a variety of measures. They should be comparable.)
Blinding: Were subjects, researchers and other health care personnel blind to treatment? Keep in mind that if the same rater assessed for both side effects and clinical benefit, the rater may be “unblinded” due to knowledge of the patient’s side effects.
Equal treatment: Were the groups treated equally apart from the intervention being studied?
Equivalence to your patient: Are subjects in the study similar to your patient? Subjects in clinical trials usually have few medical or psychiatric comorbidities; how do they compare with the patients in your practice?