Strength Testing in Motor Neuron Diseases
Loss of muscle strength is a cardinal feature of all motor neuron diseases. Functional loss over time, including respiratory dysfunction, inability to ambulate, loss of ability to perform activities of daily living, and others, is due, in large part, to decline in strength. Thus, the accurate measurement of limb muscle strength is essential in therapeutic trials to best understand the impact of therapy on vital function. While qualitative strength measurements show declines over time, the lack of reproducibility and linearity of measurement makes qualitative techniques inadequate. A variety of quantitative measures have been developed; all have both positive attributes and limitations. However, with careful training and reliability testing, quantitative measures have proven to be reliable and sensitive indicators of both disease progression and the impact of experimental therapy. Quantitative strength measurements have demonstrated potentially important therapeutic effects in both amyotrophic lateral sclerosis and spinobulbar muscular atrophy, and have been shown to be feasible in children with spinal muscular atrophy. This review surveys the spectrum of qualitative and quantitative strength measurements and examines their utility.
Keywords: Motor neuron disease; Spinal muscular atrophy; Manual muscle testing; TQNE; Hand held dynamometry
A clinical hallmark of motor neuron diseases is a progressive loss of strength. This loss underlies much of the disability that patients encounter, and is a major driver of healthcare costs associated with this constellation of diseases. Although other factors, such as upper motor neuron burden, may alter function, eating, breathing, speaking, ambulation, and fine motor control are all dramatically affected by changes in muscle strength. A wide range of outcome measures have been employed in clinical trials, including survival, functional scales, and measures of specific functions such as vital capacity, sniff nasal inspiratory pressure, timed up and go test, walking distance for a defined period of time, and many others. However, as all of these are, in large part, a function of muscle strength, direct strength measurements have been a part of the vast majority of clinical trials of experimental therapies for motor neuron disease. It is critical, therefore, that strength measurements be sensitive, repeatable, and performed in the same manner across study centers in multicenter clinical trials. This review will discuss the ways that muscle strength has been measured in clinical trials, and address the positive attributes and limitations of the various methods. The relationships between strength measures and other outcome measures will also be addressed.
Methods of Strength Assessment
Manual Muscle Testing
Manual muscle testing (MMT) was first described in 1912 to assess the status of patients with poliomyelitis. In modern clinical settings, strength is most often assessed using the MMT scale established by the Medical Research Council of the Royal College of Physicians and Surgeons. In its original form, this scale grades strength of individual muscles on a scale from 0 to 5, with 0 representing no muscle function and 5 indicating normal strength. Grade 1 implies observation of muscle activation without movement, grade 2 requires the ability to move with gravity eliminated as a force, grade 3 means that a muscle can move a limb against gravity, and grade 4 requires good but not normal muscle power. In clinical trial settings, this scale has sometimes been expanded, either by adding pluses and minuses or, equivalently, by extending it to 10 points with similar anchors, allowing these subjective impressions to be graded somewhat more finely.
MMT strength grading has been used in a number of clinical trial settings. In the phase III trial of riluzole versus placebo in patients with amyotrophic lateral sclerosis (ALS), a statistically significant benefit of riluzole was noted with respect to survival, with a suggestion of a dose-dependent trend toward increased efficacy with higher doses [3, 4]. However, MMT strength grading showed no difference from placebo, nor any hint of a dose effect. In a later study comparing riluzole serum levels to both survival and MMT muscle testing, no effect of riluzole concentration was found on either measure. In a phase III study of minocycline in ALS, a statistically significant trend toward faster progression in the ALS Functional Rating Scale-revised (ALSFRS-R) was noted in patients treated with minocycline versus placebo; there was a trend in the same direction for MMT, but this was not significant. A large, phase II trial of TCH346 in ALS showed a trend toward detriment on most outcome measures assessed but no effect on MMT. Other studies have employed MMT testing in ALS, spinobulbar muscular atrophy, and spinal muscular atrophy (SMA) [8, 9, 10, 11, 12]. However, in most studies, no therapeutic benefit was noted with any outcome measure, so that the relative sensitivity of MMT testing compared with other measures could not be assessed. One study that suggested a therapeutic benefit in spinobulbar muscular atrophy has been recently reported; in this study, clenbuterol treatment was associated with a benefit as measured by the 6-min walk test, while no effect was noted in strength as measured with MMT testing.
To evaluate the properties of MMT testing as an outcome measure in ALS, the Great Lakes ALS Consortium performed a multicenter natural history study, evaluating patients with ALS longitudinally every 3 months for 1 year. Eighteen muscle groups were evaluated bilaterally for a total of 36 muscles tested; evaluators all attended a training course to maximize consistency of measurement and technique. A 10-point scale was used. Coefficient of variation of rate of change [CoV(r)] was calculated for single muscles, as well as averages from 2 to 36 muscle groups. CoV(r) is a measure that incorporates variability from reliability of measurement, as well as intrinsic variability of progression from patient to patient. Not surprisingly, CoV(r) improved as more muscles were averaged together; with 36 muscle groups, CoV(r) was as good as or better than that of many other ALS outcome measures, including the commonly used ALSFRS-R and vital capacity.
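The text does not give a formula for CoV(r). As a rough illustration only, the sketch below assumes it is the standard deviation of per-patient rates of change divided by the absolute mean rate, with each rate taken as a least-squares slope over visits. The function names and the visit data are hypothetical, not taken from the consortium study.

```python
from statistics import mean, stdev


def rate_of_change(months, scores):
    """Least-squares slope of score versus time (score units per month)."""
    t_bar, s_bar = mean(months), mean(scores)
    sxx = sum((t - t_bar) ** 2 for t in months)
    sxy = sum((t - t_bar) * (s - s_bar) for t, s in zip(months, scores))
    return sxy / sxx


def averaged_scores(muscle_series):
    """Average several muscles' scores visit by visit for one patient."""
    return [mean(visit) for visit in zip(*muscle_series)]


def cov_of_rate(slopes):
    """Assumed definition: SD of per-patient slopes / |mean slope|."""
    return stdev(slopes) / abs(mean(slopes))


# Hypothetical data: visits every 3 months for 1 year, two muscles per patient.
months = [0, 3, 6, 9, 12]
patients = [
    [[10, 9, 8, 7, 6], [10, 10, 9, 9, 8]],
    [[9, 9, 8, 8, 7], [10, 9, 9, 8, 8]],
    [[10, 8, 7, 5, 4], [9, 8, 6, 5, 3]],
]
slopes = [rate_of_change(months, averaged_scores(p)) for p in patients]
print(cov_of_rate(slopes))
```

Under this assumed definition, a smaller value means that decline is more uniform relative to its average size, which is why averaging more muscles would be expected to improve it.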
Quantitative Muscle Testing
Although the abovementioned studies show that MMT testing can be performed reliably in patients with motor neuron diseases, the measure itself is subjective, and it is clear that the scaling does not meet the requirements of an interval scale. The fact that several studies suggesting therapeutic change using other outcome measures did not show effects on MMT muscle strength testing also raises questions regarding the sensitivity of this measure with respect to its ability to detect meaningful therapeutic benefit. For these reasons, a method to measure muscle strength quantitatively is potentially attractive.
A variety of quantitative measures have been employed, some using strain gauges and others with hand-held myometers [15, 17]. However, the first well-defined method to be used in clinical trials of motor neuron disease was developed by Munsat and Andres [16, 18, 19, 20, 21]. Named the Tufts Quantitative Neuromuscular Evaluation (TQNE), the entire instrument included measurements of muscle strength, pulmonary function, and timed motor tasks. With respect to muscle strength, standardized patient positions were defined, and isometric strength was measured with a strain gauge that was repositioned around the patient so as to be orthogonal to each muscle group tested. The full battery included 9 muscle groups measured bilaterally, plus handgrip measured with a separate grip dynamometer.
Longitudinal studies of patients with ALS demonstrated a number of important findings. First, though careful patient positioning and rigorous evaluator training resulted in reproducible measurements for individual muscles over time, reliability was increased if certain muscle groups were considered together. To determine which muscle groups could most effectively be combined, Andres et al. performed a factor analysis on strength data from single studies of 176 patients with ALS, examining how different muscle groups were intercorrelated. This analysis showed that muscles from the arms could effectively be combined, as could muscles from the legs. As absolute strength for different muscles can be vastly different, each muscle strength measurement was linearly transformed to a z score, using ALS population means and SDs for every muscle. The z scores for muscles within a body region were then averaged to yield a more global value called a megascore. Declines in megascores for the arms and legs were, in general, very linear over time; however, significant differences were noted both from patient to patient and from one area of the body to another. Decline in leg strength was slightly slower than decline in arm strength in patients with ALS.
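The z-score transformation and regional averaging described above can be sketched in a few lines. The muscle names and the reference means and SDs below are invented for illustration and are not the ALS population values used in the TQNE studies.

```python
from statistics import mean


def z_score(value, pop_mean, pop_sd):
    """Express a raw strength value in SD units relative to a reference population."""
    return (value - pop_mean) / pop_sd


def megascore(measurements, reference):
    """Average the z scores of all muscles measured in one body region.

    measurements: {muscle: raw strength}
    reference:    {muscle: (population mean, population SD)}
    """
    return mean(z_score(v, *reference[m]) for m, v in measurements.items())


# Hypothetical reference values (arbitrary force units), for illustration only.
arm_reference = {
    "elbow_flexion": (20.0, 5.0),
    "elbow_extension": (15.0, 4.0),
}
arm_visit = {"elbow_flexion": 25.0, "elbow_extension": 11.0}
print(megascore(arm_visit, arm_reference))
```

Because each muscle is rescaled by its own mean and SD before averaging, a strong muscle does not dominate the regional score simply by producing larger absolute forces.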
Several studies have directly compared quantitative strength testing with MMT grading. Andres et al. compared progression in patients with ALS using both TQNE and MMT grading, and noted that reproducibility and sensitivity to decline were both strikingly greater using quantitative evaluations. The Great Lakes ALS Consortium evaluated both the effect of averaging strength in multiple muscles and the relative sensitivity to change with MMT testing and TQNE. Not surprisingly, the characteristics of the measure improved with the number of muscles evaluated for both TQNE and MMT. However, for any given number of muscles averaged, the characteristics of TQNE were superior to those of MMT grading.
Quantitative isometric strength using the TQNE system has been used in several multicenter ALS trials. In a trial of celecoxib in ALS, no significant differences between celecoxib and placebo were found, including for muscle strength. As previously noted, leg strength declined slightly more slowly than arm strength. In a trial of topiramate in ALS, only arm strength was measured. A deleterious effect was found for topiramate on all measures; this effect was not statistically significant for the ALSFRS-R or for vital capacity, but was significant for the arm megascore. The percent difference over time between treatment groups was greatest for the arm megascore compared with other measures. A small, phase II study comparing talampanel 50 mg orally 3 times daily to placebo in 60 patients over 9 months showed a trend toward reduced strength loss in patients treated with talampanel; a subsequent phase III study using MMT testing did not confirm this finding.
Both natural history studies and clinical trial data show that quantitative strength testing using the TQNE apparatus provided high-quality, reproducible data on muscle strength. However, in the clinical trial setting, several aspects of testing proved problematic. First, the apparatus was quite large, requiring a full examination room. Many clinical trial sites found that committing a full room for testing that occurred occasionally at most was too great an investment. Second, to test all of the muscles suggested in the original evaluation required that patients assume a variety of positions on a physical therapy table, including fully supine and fully prone. Such position changes were fatiguing for many patients, and the appropriate positioning was not possible for patients who had orthopnea. Thus, as the trial progressed, an increasing number of patients were unable to complete the evaluation. In addition, a trained physical therapist or other clinician was required to perform the test.
To address these issues, the use of a hand-held dynamometer was proposed. Such dynamometers have been in frequent use in a variety of clinical situations, primarily to assess recovery after stroke or injury. In spinal cord-injured patients, use of a hand-held myometer was much more reliable than MMT testing. In the clinic, quantitative strength measurements with a hand-held device have been useful in a variety of neuromuscular diseases [17, 27]. In general, however, reproducibility was not as good as for TQNE in most cases, and issues were raised about variability of both patient and evaluator positioning, as well as the fact that, for strong muscles, the strength of the evaluator might be less than that of the patient. Another source of variability lay in the fact that some protocols required a “break” in position while performing testing; that is, in order to perform a test successfully, the evaluator must overcome the muscle force exerted by the patient. This requires more evaluator strength than the “make” maneuver, in which the evaluator simply matches the strength exerted by the patient. However, in normal volunteers, there is <3% difference between forces measured in the make versus break technique, suggesting that evaluator muscle strength may be less of a source of variability than originally proposed.
While studies of quantitative strength using hand-held dynamometry (HHD) have provided important insights into the pattern of disease progression in ALS, and HHD has been used frequently in ALS clinical trials, strength measurement has been questioned as an important clinical trial endpoint for several reasons. First, rate of decline in extremity muscle strength does not strongly correlate with survival. The reason for this poor correlation, however, is clear. Death in ALS is almost always due to respiratory failure, an aspect of loss of strength not captured in extremity measurements. Second, power analyses comparing sample sizes required to show meaningful effects in different outcome measures have suggested that the ALSFRS-R has the potential to show a statistically significant difference with fewer patients per group than other measures, including quantitative muscle strength and vital capacity [36, 38]. These estimates of power are derived primarily from 2 factors: rate of change in the measure over time, and variability across patients. While these are important factors, neither addresses the question of whether a measure is sensitive to change as a function of a specific therapeutic agent. For example, a recent phase II trial of tirasemtiv, an agent intended to influence muscle strength, showed a robust effect on muscle strength and no effect on ALSFRS-R. Finally, a demonstration of a clinical effect on a functional rating scale provides little insight into what the effect actually is; the ALSFRS-R is a 12-item scale encompassing a range of functions so that an effect on this measure may or may not be clinically meaningful. For all of these reasons, it seems clear that assessment of muscle strength in a disease characterized by progressive weakness should be considered an important component of clinical trial design.
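To see how rate of change and between-patient variability drive such power estimates, consider the textbook normal-approximation formula for comparing mean slopes between two equal groups. This is a generic sketch, not the method used in the cited power analyses; the function name, the two-sided alpha of 0.05, and the 80% power default are assumptions. If a treatment slows the mean rate by a fraction f, the required number per group is approximately 2(z_alpha + z_beta)^2 (CoV/f)^2, where CoV is the SD of the rate divided by its mean.

```python
from math import ceil
from statistics import NormalDist


def patients_per_group(cov_rate, slowing, alpha=0.05, power=0.80):
    """Approximate patients per arm needed to detect a fractional slowing of decline.

    cov_rate: SD of per-patient rate of change divided by |mean rate|
    slowing:  assumed treatment effect as a fraction of the mean rate (0.25 = 25%)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 * (cov_rate / slowing) ** 2)


# Hypothetical comparison of two outcome measures with different variability.
for label, cov in [("measure A", 0.6), ("measure B", 1.0)]:
    print(label, patients_per_group(cov, slowing=0.25))
```

The calculation depends only on the natural history of the measure. It contains no term for how strongly a particular agent acts on that measure, which is the limitation the paragraph above points out.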
Quantitative strength measurement using HHD has also been used in studies of other motor neuron diseases. A small, open-label trial of valproic acid in patients with SMA types III/IV aged 17 years and older showed an increase in strength as measured by HHD. However, a placebo-controlled trial of valproic acid in adults with SMA showed no benefit, although quantitative strength testing was found to be reliable and reproducible, and its properties suggested that it would be a good measure in future trials. Similarly, a crossover study of growth hormone enrolled patients with type II/III SMA aged 6 to 36 years; HHD was again found to be reliable and reproducible, although the results suggested no effect.
Children as young as 5 were enrolled in a natural history study of quantitative strength testing in SMA prior to the onset of a clinical trial . Interrater reliability was higher for upper than lower limbs but was quite good for both (intraclass correlation of 0.92–0.98 for upper limb muscles and >0.85 for all lower limb muscles except ankle dorsiflexion). Interrater reliability was also very good with an intraclass correlation of > 0.91 for all muscles.
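The intraclass correlations quoted above summarize how much of the total variance in scores reflects true differences between subjects. The text does not say which form of the statistic was used; the sketch below assumes the two-way random-effects, absolute-agreement, single-measurement form, commonly written ICC(2,1) after Shrout and Fleiss. The function name and the example ratings are illustrative only.

```python
from statistics import mean


def icc_2_1(ratings):
    """ICC(2,1) for a complete table where ratings[i][j] is subject i scored by rater j."""
    n, k = len(ratings), len(ratings[0])
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(col) for col in zip(*ratings)]

    # Two-way ANOVA sums of squares: subjects (rows), raters (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical strength readings: 4 subjects, each tested by 2 evaluators.
readings = [[12.0, 12.5], [20.0, 19.0], [8.5, 9.0], [15.0, 15.5]]
print(icc_2_1(readings))
```

In this form, a systematic offset between evaluators lowers the coefficient, so it penalizes an evaluator who consistently reads higher or lower than a colleague as well as random disagreement.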
Despite the clear advantages of quantitative muscle testing using HHD over MMT strength grading, concern is still expressed over the possibility that, for very strong muscles, the strength of the patient may be greater than that of the evaluator, such that the evaluator's strength is being measured rather than the patient's. To address this issue, Andres et al. have recently developed a modification of the TQNE system, in which patients are seated in a chair and exert effort against an immobile strain gauge rather than an evaluator. Called ATLIS, the apparatus measures 6 muscle groups bilaterally (grip, elbow and knee flexion and extension, and ankle dorsiflexion). In a study of 432 normal subjects, ATLIS was highly reproducible, and evaluations could be rapidly obtained without the multiple different subject positions required for TQNE. Regression equations were established for males and females that described changes in muscle strength with age for the 6 muscle groups tested. Such datasets will be extremely valuable for scaling values obtained from diseased patients over a wide range of neuromuscular disorders. Whether the restricted group of muscles reduces the overall quality of a combined muscle measure is yet to be determined.
In summary, strength measures have been used over many years to assess therapeutic benefit in clinical trials for motor neuron diseases. Quantitative measurements have clear advantages over qualitative muscle testing, and a range of techniques are now being incorporated into clinical trials. The use of such measurements has led to increased understanding of common patterns of disease progression in ALS; quantitative strength testing should be incorporated into any trial evaluating an agent intended either to slow motor dysfunction or to cause improvement. Available tools are not perfect; for example, it is currently not possible to distinguish weakness caused by lesions at different levels of the neuraxis, from muscle to motor cortex. The ability to determine objectively the source of weakness would be of great value and should be a subject of future research.
Required Author Forms
Disclosure forms provided by the authors are available with the online version of this article.
References
- 2. Medical Research Council. Aids to the examination of the peripheral nervous system. War Memorandum. London: HMSO; 1943.
- 9. Young SD, Montes J, Kramer SS, et al. Six-minute walk test is reliable and valid in spinal muscular atrophy. Muscle Nerve 2016 Mar 25.
- 18. Andres P, W H, Finison L, Conlon T, Felmus M, Munsat T. Quantitative motor assessment in amyotrophic lateral sclerosis. Neurology 1986;36:937–941.