In 2015, WHO proposed a new vision for healthy ageing in its World Report on Ageing and Health. The report defines the goal of healthy ageing as helping people in “developing and maintaining the functional ability that enables well-being”. Functional ability is determined by the intrinsic capacity of the individual, relevant environmental characteristics and the interactions between them. Although actions at all levels of society are vital to foster healthy ageing, realigning health systems towards building and maintaining the intrinsic capacity and functional abilities of adults in the second half of life has been identified as an immediate priority.

Significant loss of intrinsic capacity (functioning) in older adults is characterized by the manifestation of common problems, such as difficulties walking at usual pace, loss of muscle mass and strength and mobility impairments [1]. At present, most health care professionals lack guidance or training to recognize and manage declines in physical capacities in older age. However, research evidence suggests that even in less resourced health care settings, health care professionals can be trained to detect declines in physical capacities (clinically expressed as mobility impairments) and deliver effective interventions to prevent and delay progression [2].

In 2017, WHO launched the Guidelines on community-level interventions to manage declines in intrinsic capacity. The primary audience for these WHO-ICOPE guidelines on community-level interventions is health care providers working in primary and secondary care settings. Guidance on the assessment of physical performance in daily clinical practice can be helpful for geriatricians and for physicians working with frail elderly subjects.

Considering the large number of tools available to measure physical function in older adults, including muscle mass and strength, WHO requested support from the World Health Organization Collaborating Center for Public Health Aspects of Musculoskeletal Health and Aging to undertake an initial review and consultation towards the identification of the most appropriate tools. As measures of muscle strength and physical performance are increasingly used for research and practice, a review and identification of appropriate tools to evaluate them is timely. The findings presented in this paper are the result of a fruitful collaboration between experts of the European Society for Clinical and Economic Aspects of Osteoporosis, Osteoarthritis and Musculoskeletal Diseases (ESCEO) and the World Health Organization Collaborating Center for Public Health Aspects of Musculoskeletal Health and Aging. This paper provides an overview of the different methods available and applicable in clinical settings. The WHO Strategy aims to encourage clinicians and general practitioners (GPs) working in primary health care settings to identify declines in muscle strength and physical performance in older adults in a timely manner. This process should be followed by a precise diagnosis and the start of interventions in affected individuals [3]. The respective measures can be categorized as primary or secondary prevention in older persons who are at high risk of future care dependence.

It is important to clarify the terms muscle function and physical performance. Muscle function is underpinned by three concepts: muscle strength, muscle power and muscle endurance. Muscle strength refers to “the amount of force a muscle can produce with a single maximal effort”. It should be differentiated from muscle power, which is defined as “the ability to exert a maximal force in as short a time as possible, as in accelerating, jumping and throwing implements”, and from muscle endurance, which is defined as “the ability of muscles to exert force against resistance over a sustained period of time”. The concept of physical performance has become confused over time and merits a new vision. When first defined, physical performance measures were used to objectively assess how an individual performed different activities of daily living (ADLs) or physical tasks in the clinic, as opposed to ADL scales based on asking questions about the ability to perform such tasks. However, in recent decades, this concept has gradually changed, and measures of physical performance are now mostly related to ambulation and transfers. In fact, some measures of physical performance (e.g. gait speed) have become part of the definitions of frailty and sarcopenia. No definition of this updated concept of physical performance exists. After long discussion, the group developed a definition of physical performance, describing it as “an objectively measured whole body function related with mobility”. Physical performance goes beyond muscle function measures: it is a multidimensional concept that involves many other body organs and systems (bones, balance and other neurological inputs, cardiovascular aspects, motivation, etc.) (Fig. 1). It is thus linked to the WHO concept of intrinsic capacity in its aspects related to mobility.
Impairments in physical performance may be evident well before disability starts (as defined by inability to perform ADLs), so their assessment allows detection of vulnerability in the first steps of the disabling cascade. Physical performance is different from physical capacity, where the notion of capacity to achieve is central.

Fig. 1 Hierarchy of loss of physical function

In order to make this paper easy to consult for clinicians, the authors chose to focus on tests that clinicians are likely to use in their daily practice instead of presenting a review of all available tests in the literature, including some not suitable for routine assessments. The instruments included in this paper were selected by consensus of the authors after reviewing all the evidence on the tools available. They took into account multiple criteria, such as the availability of the tool and the presence of validated measurement protocols and cut-offs. Ultimately, they agreed to present the most applicable tests for clinical assessments.


As in previous initiatives [4,5,6], the European Society for Clinical and Economic Aspects of Osteoporosis and Osteoarthritis (ESCEO) working group on frailty and sarcopenia consists of clinical scientists and experts in the field of musculoskeletal diseases. Different members of the ESCEO working group were asked to review the literature on (1) tools to assess muscle strength and power in daily practice: strengths and weaknesses (JB); (2) objective assessment of muscle strength and power (CS); (3) tools to assess physical performance in daily practice: strengths and weaknesses (ACJ); and (4) can reference standards in the assessment of muscle strength, muscle power and physical performance be developed? (RF). Randomized controlled trials, prospective studies, systematic reviews and meta-analyses published before September 2017 were sought on PubMed and Scopus using the following search terms: (1) MeSH terms searched: Aged, Muscle skeletal, Muscle strength, Physical functional performance, Gait, Walk test, Walking, Sarcopenia, Reference standards, Reference Values; (2) Additional terms searched in either title, abstract or keywords: muscle power, physical function, endurance test. Additional studies were identified through the expertise of the members of the working group and by a manual search of the reference sections of relevant articles and existing reviews. Each member prepared a list of the most important papers based on their review of the literature and then created a set of preliminary recommendations. The subsequent step was a face-to-face meeting of the whole group to make amendments and discuss results. Shared conclusions were reached and finalized during the creation of the current manuscript, whose drafts were circulated by mail (Fig. 2).

Measurement of Muscle Function

Muscle Strength

There are few well-validated techniques to measure muscle strength. Strength of both the upper and lower limbs can be assessed, and the two have been shown to be highly correlated [7,8,9,10]. In clinical settings and for the diagnosis of sarcopenia and frailty, grip strength is the measure of choice for the assessment of overall muscle strength, as it has been shown to be a surrogate for lower extremity muscle strength and is easier to measure [5, 10, 11]. Indeed, grip strength measurement only requires a handheld dynamometer, has standardized measurement protocols and has robust and validated cut-off values, which is not the case for all lower limb muscle strength measures. The common use of handgrip strength in daily clinical practice was highlighted recently in a survey of 255 clinicians from 55 countries: 66.4% of the sample used handgrip strength, compared with smaller proportions using other muscle strength assessments (e.g. leg press 24.2%, chest press 9.4%, isokinetic parameters 7.4% of the sample) [12]. For these reasons, this consensus paper focused only on grip strength as the measure of muscle strength, following the recommendation of the experts involved in this paper.

Grip Strength


Handgrip strength is the most widely used method for the measurement of muscle strength. It is recognized to be easily applicable both in research and in clinical settings [5]. Handgrip strength is usually measured during isometric muscular contraction.

How to Measure Grip Strength in Clinical Practice?

  • Procedure The measurement is easy to perform, the device is portable and of acceptable cost, and no specially trained user is required. Because of its ease of application, grip strength measurement can be used in clinical practice and can thus be applied in a large sample of older adults, symptomatic or asymptomatic, to identify those with low muscle strength. It is recommended to use a standardized measurement protocol for grip strength, such as the Southampton protocol proposed by Roberts et al. [13, 14]. Briefly, standardized conditions for the test include seating the subject in a standard chair with forearms resting flat on the chair arms. The testing nurse or physician should demonstrate the use of the dynamometer and show that gripping very tightly registers the best score. Six measures should be taken, 3 with each arm. Ideally, the patients should be encouraged to squeeze as hard and as tightly as possible for 3–5 s during each of the 6 trials; usually the highest reading of the 6 measurements is reported as the final result.

  • Time of administration 5 min.

  • Equipment A well-calibrated handheld dynamometer. Different types exist: hydraulic dynamometers [e.g. Jamar, which is usually considered the gold standard for this measurement, with units in kilograms (kg) or pounds (lbf)], pneumatic dynamometers [e.g. Martin vigorimeter, with units in millimetres of mercury (mmHg) or pounds per square inch (psi)], which measure grip pressure, mechanical dynamometers [e.g. Harpenden dynamometer, with units in kg or lbf] and strain dynamometers [with units in Newtons of force (N)]. Dynamometers have to be calibrated appropriately by the manufacturer prior to use [15].

Performance Characteristics

  • Highly feasible [16].

  • Test–retest reliability in older adults is good (ICC ≥ 0.85) [17, 18].

  • Inter-rater reliability in older adults is excellent (ICC 0.95–0.98) [16, 19].

  • Responsiveness One study proposed a minimal change of 6 kg (13.2 lb) to be considered as significant [20]. Data on sensitivity of change in grip strength to interventions are still rather limited and inconsistent [18, 21].

  • Floor effects For patients with upper extremity impairment and/or affected by rheumatoid arthritis, hand osteoarthritis or carpal tunnel syndrome, grip strength measure may not be an accurate reflection of muscle strength and may lead to underestimations. The design of the Jamar dynamometer may be the reason for this [22] and a pneumatic dynamometer may be a good alternative for these patients. With the Martin vigorimeter for example, patients try to squeeze rubber balls (available in three sizes) following the same protocol as that described for the Jamar dynamometer. However, the comparison between these two devices is limited given the different unit of measure provided by them.

  • Note that the absolute values and the precision of grip strength measurement are influenced by hand position, hand size and dominance [23], body position [24], verbal encouragement [25] and patient motivation. For this reason, the application of a standardized protocol is highly recommended [13].

Reference Range

  • A variety of normative data have been proposed, ranging from 16 to 21 kg for women and 26 to 30 kg for men [10, 26,27,28,29]. It should also be noted that values adjusted for BMI or height also exist. For example, the EWGSOP [27] proposed general values for grip strength (< 30 kg for men and < 20 kg for women) but also BMI-dependent values (e.g. ≤ 29 kg for men with a BMI ≤ 24 kg/m2; ≤ 30 kg for men with a BMI between 24.1 and 28 kg/m2; ≤ 32 kg for men with a BMI > 28 kg/m2; ≤ 17 kg for women with a BMI ≤ 23 kg/m2; ≤ 17.3 kg for women with a BMI between 23.1 and 26 kg/m2; ≤ 18 kg for women with a BMI between 26.1 and 29 kg/m2 and finally ≤ 21 kg for women with a BMI > 29 kg/m2).

  • Low grip strength is consistently associated with poor outcomes: care dependence, falls, fractures and mortality [30,31,32,33,34,35].
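The scoring rule and the EWGSOP cut-offs quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the BMI bands are treated as contiguous (for example, a BMI of 24.05 kg/m2 falls in the 24.1–28 band), and the general cut-offs are applied when no BMI is supplied.

```python
from typing import Optional, Sequence


def best_grip_kg(left: Sequence[float], right: Sequence[float]) -> float:
    """Return the highest of the six readings (three per hand), in kg."""
    if len(left) != 3 or len(right) != 3:
        raise ValueError("expected three readings per hand")
    return max(list(left) + list(right))


def is_low_grip(grip_kg: float, sex: str, bmi: Optional[float] = None) -> bool:
    """Flag low grip strength using the cut-offs quoted in the text.

    Without BMI: < 30 kg (men) or < 20 kg (women).
    With BMI: the BMI-dependent values, which are inclusive (<=).
    """
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female'")
    if bmi is None:
        return grip_kg < (30.0 if sex == "male" else 20.0)
    # (upper BMI bound of the band, grip cut-off in kg); bands assumed contiguous
    if sex == "male":
        bands = [(24.0, 29.0), (28.0, 30.0), (float("inf"), 32.0)]
    else:
        bands = [(23.0, 17.0), (26.0, 17.3), (29.0, 18.0), (float("inf"), 21.0)]
    for upper_bmi, cutoff_kg in bands:
        if bmi <= upper_bmi:
            return grip_kg <= cutoff_kg
    raise ValueError("invalid BMI")


best = best_grip_kg([24.0, 26.5, 25.0], [27.0, 28.5, 26.0])
print(best, is_low_grip(best, "male"), is_low_grip(best, "male", bmi=26.0))
```

The sketch deliberately leaves the choice between general and BMI-dependent cut-offs to the caller, since the text presents both without recommending one over the other.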

Muscle Power

Compared to muscle strength, power concerns work rate (work done per unit time). In healthy older people, muscle power declines earlier and faster than muscle mass and strength [36]. Leg power has been shown to be highly correlated with physical performance tests such as gait speed, chair stand test and stair-climb time [37], and several comparative studies have found that muscle power is a better predictor of mortality than muscle strength [38]. Therefore, muscle power is often proposed as the primary therapeutic target for resistance training interventions in older adults. Muscle power can be assessed across a range of muscle groups, but the leg press and knee extension exercises are most often used. Maximal strength is quantified through the 1RM (one-repetition maximum), i.e. the highest resistance at which the subject can complete the exercise once. To find the 1RM, the exercise is repeated several times at increasing resistance until the subject fails to complete a single repetition. However, there is a distinct lack of standardization across studies, and different equipment and measurement techniques have been used [39]. The most common muscle power exercises, leg press and knee extension, show good reliability and validity [40]. A reference range for the different measures of muscle power is not yet available. Muscle power assessments can hardly be considered useful in daily practice for three reasons: measuring muscle power requires complex and sometimes expensive machines; the applicability of this measure is compromised in clinical settings (training needed for both clinicians and subjects, time-consuming procedures, no standardized protocols available, etc.); and standardized cut-off points defining low muscle power have not been agreed.
For this reason, based on an agreement of the experts involved in this paper, measurement of muscle power will not be exhaustively discussed in the present paper.

Measurement of Physical Performance

Many tests are described in the literature to measure physical performance of older adults. Considering the applicability of these tests in clinical practice but also their performance characteristics, the experts involved in this paper agreed to focus on some of the available physical performance tests. They therefore chose to present and encourage the measurement of gait speed, chair stand test, short physical performance battery (SPPB) test and the timed up and go (TUG) test.

Gait Speed Test

Two main types of gait speed tests exist; the short-distance walk tests (2.4-m distance, 4-m distance, 6-m distance and 10-m distance) and the long-distance walk tests (400-m walk test and 6-min walk test). Long-distance walk tests require a corridor of at least 20 m as well as a minimum time for execution of 15 min. These tests may be extremely useful in discriminating different categories of risk among older individuals in healthy conditions. Besides measuring physical performance in older adults, these tests also evaluate endurance of the subjects. Nevertheless, short walk tests can be used as surrogates for long-distance walk tests to measure the overall functional status in older adults [41]. Indeed, the 4-m gait speed test, for example, has been shown to be highly predictive of the ability to perform the 400-m walk test in older adults [41, 42]. Taking this into account and because short-walk tests are clearly more readily applicable in clinical settings, the experts of this group agreed to focus on short walk tests for the current review and strongly recommend these tests to assess physical performance of older adults in routine practice.

Short Walk Measures of Usual Gait Speed


Gait speed measurement is probably one of the most widely used tools in clinical practice for the assessment of physical performance. It is generally measured over a short distance (2.4 m, 4 m, 6 m or 10 m), with the 4-m distance being the most commonly used short-walk test validated in older adults.

How to Measure Gait Speed in Clinical Practice?

  • Procedure Short-distance walk tests are applicable in clinics and in GPs' offices, but some training for the testing staff is required. This measure is also highly acceptable for participants and health professionals [43, 44]. Some tentative recommendations for a protocol of administration have been proposed following a systematic review [45]: (1) use a static start, with timing commencing when the foot first touches the floor after the line; (2) use usual or comfortable pace as the standard, with fast pace used as appropriate for specific research questions; (3) report the walking protocol in detail, including pace instructions, verbal or other encouragement, and specific timing procedures.

  • Time of administration 95 ± 20 s.

  • Equipment 4-m long flat floor devoid of obstacles and a chronograph. This test can therefore be administered in restricted areas.

Performance Characteristics

  • Test–retest reliability Excellent test–retest reliability over 4- and 10-m distances has been shown for healthy older adults (ICC values ranging from 0.96 to 0.98) [46]. Test–retest reliability of gait speed has also been assessed in populations with comorbidities, such as patients with stroke [47], patients with COPD [48] and patients in cardiac rehabilitation [49]. It has been demonstrated to be highly reliable in each of these populations.

  • Inter-rater reliability Very strong inter-rater reliability has been reported in older patients with COPD, with an ICC value of 0.99 (95% CI 0.98–0.99) [48, 50].

  • Responsiveness the 4-m gait speed is responsive to clinically meaningful changes with 0.05 m/s denoting a small change (i.e. clinically detectable, potentially important) and 0.1 m/s indicating a substantial change (clinically detectable, definitely important) [51].

  • Floor effect A floor effect is obvious in subjects unable to walk. However, as it has been shown in a systematic review, there is a broad range of people for whom timed walking is a valid and sensitive outcome measure (patients with cancer, neurological problems, osteoarthritis, fractures, etc.) [45]. A ceiling effect has also been reported in certain populations. For example, it is doubtful whether a potential increase of muscle mass will result in an improvement in gait speed in well-functioning subjects with a high baseline walking speed.

Reference Range

  • Timed usual gait has been shown to be highly predictive of future care dependence [52] and of other adverse health events such as severe mobility limitation or mortality [53]. However, it has also been demonstrated that poor performance in other tests of lower extremity function (standing balance test and chair rise test) has comparable prognostic value [54].

  • Cut-off < 0.8 m/s for 4-m distance identifies subjects with poor physical performance [10]. A systematic review including 3 studies with 3261 participants revealed a very high sensitivity for the < 0.8 m/s cut-off for identifying frailty (Se = 0.99), as well as high negative predictive value (NPV = 0.99), but also a moderate specificity (Sp = 0.64) and a low positive predictive value (PPV = 0.26) [55].

  • Cut-off < 1 m/s for a 6-m distance identifies older persons at high risk of health-related negative events [54].

  • Other cut-offs, adapted for example to the height of participants, have also been proposed by Fried et al. over a distance of 15 ft (4.572 m) [17, 73].
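As an illustration of how the figures above fit together, the following Python sketch converts a timed 4-m walk into a gait speed, applies the < 0.8 m/s cut-off and grades a change between two visits against the 0.05 and 0.1 m/s values. The function names are hypothetical, and treating those two values as inclusive lower bounds on the absolute change is an assumption of this sketch rather than something stated in the cited work.

```python
def gait_speed_m_per_s(distance_m: float, time_s: float) -> float:
    """Usual gait speed as distance walked divided by time taken."""
    if distance_m <= 0 or time_s <= 0:
        raise ValueError("distance and time must be positive")
    return distance_m / time_s


def poor_performance_4m(time_s: float) -> bool:
    """Apply the < 0.8 m/s cut-off quoted for the 4-m distance."""
    return gait_speed_m_per_s(4.0, time_s) < 0.8


def classify_change(delta_m_per_s: float) -> str:
    """Grade a change in 4-m gait speed (assumed inclusive thresholds)."""
    magnitude = abs(delta_m_per_s)
    if magnitude >= 0.1:
        return "substantial"
    if magnitude >= 0.05:
        return "small"
    return "below small-change threshold"


# A patient who needs 6 s to cover 4 m walks at about 0.67 m/s.
print(round(gait_speed_m_per_s(4.0, 6.0), 2), poor_performance_4m(6.0))
```

Because 4 m in exactly 5 s gives 0.8 m/s, that patient is not flagged: the cut-off in the text is strict.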

Chair Stand Test


The 30-second (30-s) chair stand test (CST) developed by Rikli and Jones [56] is one of the most important clinical tests of physical performance because it measures lower body power, balance and endurance and relates them to the most demanding daily life activities. The 30-s CST has been widely used in many studies not only to evaluate functional fitness levels [57] but also to monitor training [58,59,60] and rehabilitation [61]. Another version of the chair stand test is also well known (as it is embedded in the SPPB) and consists of recording the amount of time needed to complete five sit-to-stand manoeuvres. That version has been shown to be a predictor of falls but has some limitations. It has a restricted capacity to assess a wide variation in ability, which is relevant in older people, since some older adults cannot complete the five attempts and are therefore not assigned a score (floor effect). Its utility is therefore limited in subjects suffering from moderate to severe mobility limitations. Consequently, for older populations, the literature favours time-based protocols such as the 30-s chair stand test.

How to Measure the 30-s CST in Clinical Practice?

  • Procedure: Classically, the 30-s CST consists of manually counting the number of total sit-stand-sit cycles completed during the 30 s of the test. This test is highly feasible in clinical practice and may therefore be recommended as a measure of physical performance.

  • Time of administration 1–2 min.

  • Equipment A chair with a straight back without arm rests and a stopwatch.

Performance Characteristics

  • Test–retest reliability Good test–retest reliability (0.84 for men to 0.92 for women) in healthy older adults [62], as well as excellent reliability for subjects with knee arthroplasty [63], subjects with mild-to-moderate dementia [64] and hospitalized patients with stroke [65].

  • Inter-rater reliability Limited data are available, but very strong inter-rater reliability has been reported in older adults in nursing homes with mild to moderate dementia (perfect ICC of 1) [64].

  • Responsiveness Very limited data available. Only a minimal detectable change (MDC) value of 3.49 in older people with dementia [66] has been defined. Thus, an improvement of more than 3.49 sit-stand-sit cycles during the 30-s chair stand test is considered to be a true change in performance with 95% confidence.

  • Floor effect Unlike the five sit-to-stand test, this test is not affected by a floor effect, since subjects unable to perform it are attributed a score of 0. Some authors argue that the 30-s CST protocol makes it possible to assess wider variations in ability levels than the five-time sit-to-stand test, as it avoids potential floor effects [67].

Reference Range

  • Limited data in literature.

  • One study reports an association between the test and the risk of falling in a population of older nursing home residents [68].

  • Normative values have been proposed for two particular populations: Hong Kong older adults and US older adults. For example, the normative value for Hong Kong older adults aged 70–74 years is a mean of 10.1 ± 3.8 stands in 30 s, compared with 13 stands in 30 s according to US norms [69].
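The minimal detectable change reported for the 30-s CST lends itself to a simple check of whether a follow-up result exceeds measurement error. The Python sketch below is illustrative only: the function name is hypothetical, and the default MDC of 3.49 cycles comes from a population of older people with dementia, so applying it to other populations is an assumption.

```python
MDC_95_CYCLES = 3.49  # value quoted in the text for older people with dementia


def is_true_improvement(baseline_cycles: int, follow_up_cycles: int,
                        mdc: float = MDC_95_CYCLES) -> bool:
    """Return True when the gain in sit-stand-sit cycles exceeds the MDC."""
    if baseline_cycles < 0 or follow_up_cycles < 0:
        raise ValueError("cycle counts cannot be negative")
    return (follow_up_cycles - baseline_cycles) > mdc


# Going from 8 to 12 cycles (+4) exceeds 3.49; going from 8 to 11 (+3) does not.
print(is_true_improvement(8, 12), is_true_improvement(8, 11))
```

Since cycles are counted as whole numbers, an MDC of 3.49 means in practice that a gain of at least four cycles is needed.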

Short Physical Performance Battery (SPPB)


The short physical performance battery (SPPB) originally developed at the National Institute on Aging for use in the Established Population for the Epidemiologic Studies of the Elderly (EPESE) [70] is the most widely used physical performance test battery that has been applied in clinical and research settings including randomized controlled trials. This battery of tests has been validated in large-scale epidemiological studies and evaluates lower extremity functional performance using timed measures of standing balance, gait speed and lower extremity strength. Moreover, both the global score and its individual elements may be analysed separately in different clinical or research settings.

How to Measure the SPPB in Clinical Daily Practice?

  • Time of administration 10 min.

  • Equipment a 4-m track, ground marks, a chronometer and straight-backed chair.

  • Procedure Applicable in research and in clinics as well as in GPs' offices, but training is required. The full detailed instructions can be downloaded for free from the web but are summarized below. This test takes longer to administer than the chair rising test alone or the gait speed test alone but is nevertheless feasible and can be recommended as a screening test for poor physical performance and risk of sarcopenia.

The three following tests should be administered successively, in the order presented below:

  1. Balance test

    For the balance test, the subject is asked to hold three increasingly challenging standing positions for 10 s each: (1) a side-by-side position; (2) semi-tandem position (the heel of one foot beside the big toe of the other foot); (3) tandem position (the heel of one foot in front of and touching the toes of the other foot).

  2. Walking speed test

    For the walking speed test, the subject is asked to walk at his/her usual pace over a 4-m course (originally 8 feet or 2.4 m). He/she is instructed to stand with both feet touching the starting line and to start walking after a verbal command. The subject is allowed to use walking aids (cane, walker, or other walking aid) if necessary, but no assistance by another person can be provided. Timing begins when the starting command is given, and time in seconds needed to complete the entire distance is recorded. The faster of two walks is usually considered in computing the SPPB score.

  3. Repeated chair stands test

    The repeated chair stands test is performed using a straight-backed chair, placed with its back against a wall. The subject is first asked to stand from a sitting position without using his/her arms. If he/she is able to perform the task, he/she is then asked to stand up and sit down five times, as quickly as possible, with arms folded across the chest. The time to complete five stands is recorded.

    These three physical performance subtasks are used to calculate a summary score. The score of each of the three tests ranges from 0 to 4, where 4 indicates the best result and 0 the worst result. Therefore, a summary score ranging from 0 (worst performers) to 12 (best performers) is calculated by adding the categorical results derived from the three timed physical performance subtasks.
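The summary score described above is a plain sum of three categorical subscores, which the following Python sketch makes explicit. The function name is hypothetical, and the conversion of raw times into 0–4 subscores is not covered here because the text does not give the scoring tables.

```python
def sppb_summary_score(balance: int, gait_speed: int, chair_stands: int) -> int:
    """Sum the three SPPB subscores (each 0-4) into a 0-12 summary score."""
    subscores = {
        "balance": balance,
        "gait_speed": gait_speed,
        "chair_stands": chair_stands,
    }
    for name, value in subscores.items():
        # Each subtask is scored from 0 (worst) to 4 (best).
        if not isinstance(value, int) or not 0 <= value <= 4:
            raise ValueError(f"{name} subscore must be an integer from 0 to 4")
    return sum(subscores.values())


# Full marks on balance, 3 on gait speed and 2 on chair stands gives 9 of 12.
print(sppb_summary_score(4, 3, 2))
```

Rejecting out-of-range subscores early keeps an entry error in one subtask from silently producing a plausible-looking total.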

Performance Characteristics

  • Test–retest reliability has been shown to be good to excellent (ICC ranging from 0.83 to 0.92 for measures made 1 week apart) [71,72,73]. The reproducibility of the SPPB, already very good, can nevertheless be enhanced through the use of standardized equipment and an appropriate standard operating procedure.

  • Inter-rater reliability has been shown to be excellent (ICC 0.91) among acutely admitted older medical patients [16]. Data on healthy populations are limited.

  • Responsiveness the SPPB is responsive to clinically meaningful changes [71, 74] with 0.5 points denoting a small change (i.e. clinically detectable, potentially important) and 1 point denoting a substantial change (clinically detectable, definitely important) [51].

  • Ceiling effects The SPPB may have ceiling effects for high-functioning and very fit older adults (who will score 12 points). For research, a more challenging SPPB has been proposed, for example in the Health, Aging and Body Composition (Health ABC) Study, with time to walk 400 m measured instead of the 4-m distance and with the times for the three standard balance tests extended from 10 to 30 s [54]. This approach is relevant to assess the wide range of functioning at baseline among fit older participants in cohort studies but is unlikely to be useful in clinical practice.

  • Floor effects it is still not clear how to score a subject unable to walk and therefore unable to perform the SPPB test correctly. Floor effects of 0 points may be observed in this specific case.

Reference Range

  • Scores obtained on a 12-point summary scale provide a gradient of functional decline that has been shown to be valid and reliable in predicting the future risk of mobility impairment, care dependence, institutionalization, hospital admission and mortality [71, 75, 76].

  • A cut-off point of ≤ 10 points has been shown to be a strong predictor of the loss of ability to walk 400 m, with a sensitivity of 0.69 and a specificity of 0.84 [42].

  • A cut-off point of ≤ 8 points has been shown to be associated with mobility-related disability [52].

  • A very low SPPB score (0–6 points) has been shown to be associated with increased risk of death [75, 77].

Timed-Get-Up-and-Go Test


The timed-get-up-and-go (TUG) test is a physical performance measure that has been mainly used to assess gait and dynamic balance [78]. It is a single test measuring the time a person takes to complete a complex series of different motor tasks.

How to Measure the TUG Test in Clinical Daily Practice?

  • Procedure Applicable in clinical settings and GPs' offices and in different populations (with frailty, Parkinson’s disease, cognitive impairment, recent joint surgery, osteoarthritis, etc.); little training is necessary.

  • TUG requires the subject to stand up from a chair, walk three meters, turn around, return and sit down again. The stopwatch is started on the word “go” and stopped at the moment the subject sits down. The person wears regular footwear, uses his/her customary walking aid and walks at her/his usual pace. No physical assistance is given. A practice trial is given first, and the average time of two consecutive trials is recorded.

  • Complete instructions can be downloaded online, and a video provided by the Centers for Disease Control and Prevention (CDC) is also available online.

  • Time of administration 2–3 min.

  • Equipment Chair with armrest, 3-m track, ground marks, stopwatch.
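The recording rule in the procedure above (average of two consecutive timed trials after one practice trial) can be sketched as follows in Python. The function names are hypothetical. The 14 s value is the fall-risk cut-off reported in the reference range for this test; treating a time equal to the cut-off as at risk is an assumption of this sketch, since the text does not say whether the threshold is inclusive.

```python
def tug_time_s(trial_1_s: float, trial_2_s: float) -> float:
    """Average the two timed trials; the practice trial is not counted."""
    if trial_1_s <= 0 or trial_2_s <= 0:
        raise ValueError("trial times must be positive")
    return (trial_1_s + trial_2_s) / 2


def at_risk_of_falls(tug_s: float, cutoff_s: float = 14.0) -> bool:
    """Flag a TUG time at or above the cut-off (inclusive by assumption)."""
    return tug_s >= cutoff_s


# Trials of 13.0 s and 16.0 s average to 14.5 s, which is above the cut-off.
recorded = tug_time_s(13.0, 16.0)
print(recorded, at_risk_of_falls(recorded))
```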

Performance Characteristics

  • Test–retest reliability Ranges from moderate (ICC 0.53) [79] to good (ICC 0.97–0.99) [78, 80, 81] across studies.

  • Inter-rater reliability excellent in the older population (ICC = 0.99 [78], ICC = 0.98 [82]).

  • Responsiveness Not well defined in older populations. A reduction in time greater than or equal to 0.8, 1.4 and 1.2 s on the TUG has been determined to be the minimal clinically important difference (MCID) for patients suffering from hip osteoarthritis [83].

  • Floor and ceiling effects the TUG does not suffer from ceiling effects [84]. Floor effects are observed, as for each test involving a person’s walking ability.

Reference Range

  • The time to perform the task is compared to normative values for age, gender, and research-based guidelines that measure increased risk of falls and functional decline. The performance of the patient can also be summarized into a five-point scaled score [85].

  • A TUG score of 14 s is sensitive (87%) and specific (87%) for identifying older individuals who are at risk for falls [82].

  • In the systematic review of Clegg et al. [55], which assessed, among others, the accuracy of the TUG test for identifying frailty, the cut-off of > 10 s showed a sensitivity of 0.93 (with a negative predictive value of 0.99) and a specificity of 0.62 (with a positive predictive value of only 0.17).
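
The gap between the high sensitivity and the low positive predictive value reported above follows from Bayes' rule when the condition is uncommon in the tested population. The sketch below is a hedged illustration of that relationship, not a reanalysis of the cited review: the function name `predictive_values` is hypothetical, and the prevalence of 7.7% is an assumption (the text does not state it), chosen only because it yields values close to those reported.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence.

    All three inputs are proportions; prevalence must lie strictly
    between 0 and 1 so that both groups are represented.
    """
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")

    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)

    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity are taken from the text above;
# the prevalence of 0.077 is an ASSUMED value, not a reported one.
ppv, npv = predictive_values(0.93, 0.62, prevalence=0.077)
```

Under this assumed prevalence, most positive results come from the much larger non-frail group, which is why a positive TUG result calls for further diagnostic workup while a negative one is reassuring.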

Practice Recommendations

After a review of the literature and discussion during the working group meeting, the following statements and recommendations on the use of measures of muscle strength and physical performance are proposed:

  1. Muscle strength and physical performance measures are rarely assessed in daily practice compared to other clinical or biochemical parameters. However, limited strength and performance result in physical limitations, which are strong predictors of adverse health outcomes such as care dependence, falls, fractures, hospitalization and death. Therefore, the experts involved in this position paper strongly encourage clinicians to routinely assess strength and physical performance in older adults. Unfortunately, the available data do not yet reveal the ideal frequency of repeated measures. Evidence is still needed in this regard.

  2. Measurements of muscle strength and function are facilitated by their non-invasiveness and by the short time needed to perform them.

  3. Many different tests to measure muscle strength and physical performance have been described, although not all have the same level of evidence or are easy to use in routine clinical practice. The experts involved in this paper summarized the tests most widely applicable in daily practice. For measuring muscle strength, the experts advise the use of a handheld dynamometer to assess handgrip strength. For measuring physical performance, the experts advise the assessment of 4-m gait speed, the Short Physical Performance Battery (SPPB) test or the Timed Up and Go test.

  4. The choice of tool should be guided by several parameters:

    a. The purpose of the assessment should be central to the clinician’s choice. For evaluating an intervention, it is most important to have a responsive tool that can detect the impact of the intervention on muscle strength and function. Based on the experts’ review of the literature, only grip strength, the SPPB test and gait speed show sufficient responsiveness in the general older population. Good responsiveness of other tools has also been shown in specific populations (e.g. people with dementia or COPD). For practical purposes, it is essential to have a cheap and easy-to-administer tool that is applicable to the majority of older adults. The availability of validated normative values for a large group of patients, and from different countries, is also essential. Based on the experts’ review of the literature, validated normative values are mainly available for grip strength, gait speed and the SPPB test. For diagnostic purposes, the objective is somewhat different and therefore requires a tool with robust validity and reliability. Based on the experts’ review of the literature, all the tools presented in this paper show satisfactory validity and reliability.

    b. The population’s characteristics will also determine the choice of test. Given the heterogeneity of the population seen in daily practice, it is important to choose a tool that is practicable in various patient groups. For older people who have significant losses of capacity or ability, or who are robust, clinicians should be aware of floor and ceiling effects, respectively. Based on the experts’ review of the literature, each test involving a person’s walking ability is prone to floor effects. This is also the case for grip strength, particularly in some patients with local disease, for example advanced arthritis. Some adaptations of the tests have been proposed in order to characterize the wide range of functioning in these patients.

    c. Applicability in clinical settings (cost, time required for the examination, need for training, complexity of equipment, etc.). Based on the experts’ review of the literature, the two most applicable tools in clinical settings are grip strength measured by a handheld dynamometer and the 4-m gait speed. Neither method requires complex or expensive equipment or special training of clinicians; both are cheap, quick and easy to administer; and, mainly for grip strength measurement, standardized validated assessment protocols are available.

    d. The performance characteristics of the tools (test–retest reliability, inter-rater reliability, responsiveness, floor and ceiling effects, etc.). Based on the experts’ review of the literature, all the proposed tools seem reliable but only some of them have already been evaluated for responsiveness. The SPPB test and the 4-m gait speed test seem to be the most robust tools in terms of responsiveness.

    e. The prognostic value of the tests for adverse clinical outcomes. Numerous normative data and cut-off points for predicting outcomes have been proposed for the different tools available, both in healthy older populations and in populations with specific comorbidities. Based on the experts’ review of the literature, there is currently substantial evidence that threshold values for handgrip strength, SPPB and 4-m gait speed are strong predictors of adverse health outcomes.

  5. Based on the available evidence, the applicability of the tools in clinical practice, the time and equipment required, the performance characteristics of the tools and the availability of robust cut-off points, the experts advise grip strength to measure muscle strength and 4-m gait speed or the SPPB test to measure physical performance in daily practice. Subjects with low strength or performance should receive an additional diagnostic workup to achieve a full diagnosis of the underlying condition (sarcopenia, frailty or other problems).