Rating scales have been gaining favor in diagnostic assessment of children with psychiatric and/or neurodevelopmental disorders. Many children have trouble talking directly about their behavior, and rating scales can often help clinicians fill in the blanks. While they have their benefits, misuse of, or overreliance on, rating scales can interfere with the assessment process and lead to misdiagnoses, inappropriate interventions, and poor outcomes.
Furthermore, they don’t replace the old-fashioned diagnostic interview, which brings with it alliance-building, subtle reassurance, and cultivation of the doctor-patient relationship. And don’t forget that as part of the assessment, you also want to get the perspective of the child’s parents because they are typically key players in facilitating change.
So how can rating scales be most useful to psychiatrists and patients? Which rating scales should you use? So many questionnaires exist that clinicians may find themselves in a conundrum over which scales to incorporate into their assessments, as well as what to make of those they choose. Let’s look at some of the facts you need to know about scales, as well as the pros and cons of using them when evaluating children.
The Facts About Rating Scales

First, it is important to understand how a rating scale is created. Typically, a group of experts develops a number of potential items that fit the intended construct. For example, if the rating scale is to assess behavioral problems, it might include the item: “My child hits others.” Next, the items are analyzed statistically, using techniques such as factor analysis, to see how the test items naturally cluster together (eg, which items represent “aggression”). The scale developers then do pilot studies to weed out “bad” items and confirm the “good” ones.
When the final version of the rating scale is complete, it is standardized. A sample of people who reflect the U.S. population completes the questionnaire, and normative scores are produced to determine appropriate cut-off scores for what is deemed a clinically significant behavior. Basically, this answers the question, “How often does an average individual engage in this behavior?” The answer is generally broken down by age and gender.
In addition to standardization, test developers also need to measure the psychometric properties of a given scale. Good reliability (how consistent the scale is) and validity (does the scale measure what it’s supposed to) are essential to clinical use of a rating scale. You will usually find information on test development and psychometric properties in the manuals that accompany rating scales, and you can also obtain this information from the medical literature or the publisher.
While you may be tempted to ignore these chapters and jump into administration and interpretation, it is important that you look at these elements to ensure the scale is actually worth giving to your patients.

Next, it is important to understand the structure and limits of each scale. Some scales such as the Behavior Assessment System for Children, Second Edition (BASC-2), Conners Comprehensive Behavior Rating Scales (CBRS), and Personality Inventory for Children, Second Edition (PIC-2) are called multidimensional or omnibus. This type of scale evaluates the individual for several dimensions of behavior or temperament and can screen for several diagnoses. Other scales are more targeted and intended for refining specific diagnoses, such as the Conners Third Edition, known as Conners 3 (for ADHD), or the Gilliam Autism Rating Scale (GARS) (for measuring the severity of autism).
Some rating scales can be used as screens for possible psychopathology or atypical development. Others are designed to measure response to interventions over time, and are meant to be given repeatedly. You can use rating scales to confirm diagnoses, but only when you include other assessment methods such as a clinical interview and direct observation of the child.
You also want to consider the pragmatics of the scales when choosing which ones to use in your practice. Scales vary in the number of questions, the sophistication of the language, and the way the data is recorded and scored (entirely by hand, by computer after input from handwritten forms, or by input into a computer directly by the patient). It is important to choose a scale that a patient can realistically complete in the time available and that staff can score within a reasonable amount of time.
Let’s be honest; these scales are not cheap. So when you consider the use of a new scale, think about how to conserve its use as well. For example, it will be expensive to mail commercial scale forms to patients prior to an intake appointment if the no-show rate is high. In that situation, it might be better to ask the families to come to their appointment early and fill out the questionnaire in the waiting room.
The Benefits of Using Scales

Rating scales offer numerous benefits as an aid for clinical assessment. They are easy to administer and place minimal time demands and burden on the clinician. Scoring is also easy, particularly if you use a computer scoring program. You can gather information on how a child behaves across different settings (eg, home vs school) in a cost-efficient manner without having to interview informants directly. This way, discrepant scores can inform you that specific environments or people may be triggering maladaptive behaviors. With multidimensional questionnaires, you get a lot of “bang for the buck,” since the questions cover a broad domain and can pick up areas of potential psychopathology that you may not have addressed in the clinical intake. This is particularly true given the comorbidity that exists among psychiatric and neurodevelopmental conditions. You can scan specific items and then follow up with the rater on a particular area of concern (eg, hallucinations or self-esteem).
Psychometric properties of rating scales have improved in recent years, and many are now developed using national samples through empirically valid methods. Assuming that scores adequately depict how a child is functioning, composite scores (typically T-scores) are presented in a continuous rather than dichotomous fashion. Line graphs provide a visual representation of the scores and can be helpful in revealing the severity of each domain.
Although clinicians often use a cutoff score to help determine eligibility in a diagnostic category, examining the specific scores can be useful in identifying at-risk behaviors. You can administer rating scales repeatedly across time to provide longitudinal data for a child. This can be particularly useful to track the effectiveness of specific interventions, and can be a great visual aid for a concerned parent. Rating scales meet the IDEA criteria of quantifiable data, and should therefore hold weight in determining a patient’s eligibility for an individual education plan (see CCPR, October 2011, for more detail on this).
What is a T-Score?

T-scores are standardized scores. A score of 50 represents the mean. A difference of 10 from the mean indicates a difference of one standard deviation. Thus, a score of 60 is one standard deviation above the mean, while a score of 30 is two standard deviations below the mean.

Source: University of Chicago Library
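For readers who like to see the arithmetic, here is a minimal sketch of the T-score definition given above (mean of 50, 10 points per standard deviation). The function names and the example norm values are hypothetical and chosen only for illustration; published scales supply their own age- and gender-specific norm tables, and their scoring software may apply additional steps not shown here.

```python
def t_score(raw, norm_mean, norm_sd):
    """Convert a raw score to a T-score (mean 50, SD 10).

    norm_mean and norm_sd describe the normative sample; the values
    used in the example below are made up for illustration.
    """
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return 50 + 10 * (raw - norm_mean) / norm_sd


def sds_from_mean(t):
    """Return how many standard deviations a T-score lies from the mean."""
    return (t - 50) / 10


# Hypothetical norms: mean raw score 15, standard deviation 5.
example_t = t_score(20, norm_mean=15, norm_sd=5)
print(example_t)                 # one SD above the mean
print(sds_from_mean(example_t))  # distance from the mean in SD units
print(sds_from_mean(30))         # a T-score of 30 is two SDs below the mean
```

In this made-up example, a raw score of 20 is one standard deviation above the normative mean, so it maps to a T-score of 60, matching the sidebar.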
Rating scales can objectify symptom severity for third-party payers, and they can show patients and their parents evidence of improvement (or lack thereof). Norms and standardization make for a built-in way to answer the question, “But doesn’t every child do this?” Most important, when the treatment isn’t working, rating scales can help you go back to the beginning and ask, “What did I miss?”
The Limitations of Scales

Despite the numerous benefits of incorporating rating scales into diagnostic assessment, there are pitfalls you should be aware of. First of all, it is far too tempting to use the questionnaires for more than they were intended. You don’t want to use the rating scales as the primary basis for diagnosing a condition and skimp on other important parts of the assessment process. This is particularly hard to resist when the questionnaire has categories that already correspond to specific diagnoses from the DSM-IV-TR. Unfortunately, this shortcut becomes more likely when the time allotted for each patient is brief.
Another limitation of rating scales is that the validity of the instrument depends on the rater’s objectivity in responding to items. For example, some parents who are overwhelmed by their child’s misbehavior may mark “almost always” for virtually every negative behavior, which then results in a T-score well over 70 across all domains. This bias can be difficult to interpret, particularly if responses from other raters are dissimilar. Assessment of behavioral change can also be difficult over the short term, particularly as negative behaviors might remain clinically elevated for some time even if improvement is taking place.
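As a rough illustration of the rater-bias pattern just described, the sketch below flags a profile in which every domain T-score exceeds 70 and lists the domains where two raters disagree. It is a screening aid for the clinician's own review, not a validated validity index. The domain names, the 10-point (one standard deviation) gap, and the function names are all assumptions made for this example; a scale's manual may define its own validity indicators.

```python
def uniformly_elevated(profile, threshold=70):
    """Return True if every domain T-score in the profile exceeds the threshold.

    profile maps a domain name to a T-score. The default of 70 echoes the
    example in the text; it is not a published cutoff.
    """
    return len(profile) > 0 and all(t > threshold for t in profile.values())


def rater_discrepancies(rater_a, rater_b, min_gap=10):
    """List the domains where two raters differ by at least min_gap T-score points.

    min_gap=10 (one standard deviation) is an illustrative assumption.
    Returns a dict of domain -> (rater_a score minus rater_b score).
    """
    shared = rater_a.keys() & rater_b.keys()
    return {
        domain: rater_a[domain] - rater_b[domain]
        for domain in sorted(shared)
        if abs(rater_a[domain] - rater_b[domain]) >= min_gap
    }


# Hypothetical T-scores from a parent form and a teacher form.
parent = {"aggression": 78, "anxiety": 75, "attention": 82}
teacher = {"aggression": 58, "anxiety": 72, "attention": 61}

print(uniformly_elevated(parent))            # every parent-rated domain is above 70
print(uniformly_elevated(teacher))           # teacher ratings are mixed
print(rater_discrepancies(parent, teacher))  # domains worth asking about in a follow-up interview
```

A flagged profile does not mean the rater is wrong. It is a prompt to follow up with the rater and to weigh the scores against the interview and direct observation.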
While rating scales are useful in identifying the frequency and severity of problem behaviors, they do not reveal how particular behaviors affect a child’s daily functioning. Therefore, follow-up interviews, inference, and other assessment tools are necessary to determine which problem behaviors should be the target of intervention.