Background

Post-exercise muscle soreness – a dull, aching sensation following unaccustomed muscular exertion – is usually said to follow an inverted U-shaped curve over time. Typical statements found in the literature are that soreness: "develops" 24 to 48 hours after exercise [1]; "is usually perceived approximately 24 hours after the exercise bout [and] may linger for an additional 24 to 48 hours" [2]; and "peaks 24–48 hours after exercise" [3].

These claims are well supported by data from studies that have measured soreness at various times following an exercise bout. For example, High et al. [4] assessed soreness at 24 and 48 hours after eccentric contractions of the elbow flexors; soreness scores were higher at the 48-hour follow-up. Similarly, Byrnes et al. [5] reported that muscle soreness steadily increased when measured 6, 18 and 42 hours following 30 minutes of downhill treadmill running. Soreness has also been found to be higher at 48 hours than at 24 hours in untreated controls undertaking bench-stepping [2] and to peak at 48 hours in three separate studies of eccentric exercise of the elbow flexors [6–8]. In our own studies of soreness following bench-stepping, soreness scores peaked at 48 hours [9] (see figure 1). Perhaps the only study inconsistent with these findings was that of Donelly et al., who found that soreness induced by downhill running on a treadmill was slightly higher at 24 than at 48 hours [3]. Even so, that study reported a classic inverted U-shaped curve, with soreness low immediately following exercise, highest at 24 and 48 hours and falling again at 72 hours. Such findings are so prevalent that post-exercise muscle soreness is generally described in the literature as "delayed-onset" muscle soreness.

Figure 1

Time course of muscle soreness following long-distance running and bench-stepping

The studies above involved exercise undertaken in a laboratory setting. With rare exceptions [10], post-exercise soreness has not been investigated in athletes undertaking their chosen sport. Whilst analyzing data from a study of long-distance running, we noticed that soreness seemed to peak immediately and then reduce gradually over time (see figure 1). This time course was similar to that in two previous studies of long-distance runners for which we had obtained original data [11, 12]. We therefore decided to examine the hypothesis that the time course of post-exercise soreness is different for long-distance running than for bench-stepping by examining data from three clinical trials we had completed. If the time course of soreness resulting from an "artificial" laboratory-based exercise (bench-stepping) is different from that resulting from a sport (long-distance running), it is plausible that these two methods of soreness induction have different physiological consequences.

Methods

We compared data from a study in which soreness was induced by a bench-stepping regime with those from a trial in which soreness resulted from long-distance running. In the bench-stepping trial, healthy volunteers undertook a 10-minute period of bench-stepping while carrying a backpack containing sandbags equivalent to 10% of their body weight. This is a previously validated method of soreness induction [2]. Only untrained individuals were eligible. Subjects were excluded if they had regularly participated in vigorous exercise in the previous six months. "Vigorous exercise" was defined as playing a sport or undertaking an activity designed to increase physical fitness; "regular participation" was defined as undertaking the activity three or more times a week for more than three consecutive weeks and for more than 15 minutes at a time. We included data from 82 subjects in two trials in which the eligibility criteria and method of soreness induction were identical [9]. The running trial studied 400 runners expecting soreness after a long-distance run [13]. Subjects took part in a number of different runs. Mean race length was 21.4 miles (SD 7.79). The shortest race was just under 2 miles (one subject) and the longest 50 miles (one subject), but 95% of the races were between 6.2 and 26.2 miles. Approximately two-thirds of participants ran races of marathon length. The trials were approved by the joint University College London / University College London Hospital committees on the ethics of human research and written informed consent was received from participants.

In both the bench-stepping and running trials, differences in muscle soreness between treatment groups were small and did not reach statistical significance. It therefore seemed appropriate to analyze the data from each trial as a whole, ignoring treatment assignment.

The method of outcome measurement was similar in both trials: a 7-point Likert scale of muscle soreness (see figure 2) taken every 12 hours for five days. The only slight difference was that the first measure was taken at 12 hours after exercise in the bench-stepping trials but at a set time, 9 pm, in the running study. Most participants in the running trial undertook marathon races that started at 9 am and took an average of 4 hours to complete, so the 9 pm measure fell about 8 hours after the finish. In other words, follow-up times were at 12, 24, 36, 48 hours and so on in the bench-stepping trial but at approximately 8, 20, 32, 44, ..., 116 hours in the running study. The Likert scores were summed to produce a total five-day score.
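The follow-up schedule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names are hypothetical, and the count of ten measures per subject is an assumption inferred from "every 12 hours for five days".

```python
def follow_up_hours(first_hour, interval=12, n_measures=10):
    """Hours after exercise at which soreness is recorded.

    n_measures=10 is an assumption: one measure every 12 hours for five days.
    """
    return [first_hour + interval * i for i in range(n_measures)]


def total_score(likert_scores):
    """Sum the Likert scores across all follow-ups to give the five-day total."""
    return sum(likert_scores)


# Bench-stepping: first measure 12 hours after exercise.
bench_hours = follow_up_hours(12)

# Running: a 9 am start plus about 4 hours of running gives a finish near 1 pm,
# so the first 9 pm measure falls roughly 8 hours after exercise.
running_hours = follow_up_hours(8)

print(bench_hours)
print(running_hours)
```

The running schedule ends at 116 hours, matching the approximate times given in the text; the two schedules are offset by about 4 hours at every follow-up.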

Figure 2

Likert Scale of Muscle Soreness

The ages and total 5-day soreness scores for the two trials were compared using the Wilcoxon rank-sum test. The male : female ratios of the two trials were compared using the χ² test.
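As a reminder of what a χ² comparison of two proportions involves, the sketch below computes the Pearson χ² statistic for a 2 × 2 table. It is not the authors' analysis (which was run in STATA): the function name is hypothetical, no continuity correction is applied, and the counts in the example are invented purely for illustration and are not data from either trial.

```python
def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic (1 degree of freedom, no continuity
    correction) for the 2 x 2 table [[a, b], [c, d]].

    Assumes every row and column total is non-zero.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Invented counts for illustration only (rows: two groups; columns: male, female).
chi2 = pearson_chi2_2x2(10, 20, 20, 10)
print(round(chi2, 2))  # 6.67
```

With one degree of freedom, a statistic above 3.84 corresponds to p < 0.05.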

To test the hypothesis that the time course of post-exercise soreness differed between trials, we examined three different analysis models. The purpose of doing so was to determine whether the results depended on the particular model used. In the first model, a subject was classed as experiencing delayed soreness ("DS1") if the soreness score at the third or fourth follow-up time (approximately 36 and 48 hours post-exercise) was higher than the soreness score at both the first and second follow-ups. In an alternative model, a subject was classed as experiencing "DS2" if they met any of the criteria of the first model or if soreness at the second follow-up was higher than at the first follow-up. The χ² test was used for differences between groups. An ANOVA was also conducted with time, trial and the interaction between time and trial as covariates. Analysis was conducted using the statistical software package STATA 6 (Stata Corporation, College Station, Texas 77840 USA). p < 0.05 was considered to be statistically significant.
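The two delayed-soreness definitions can be written as simple rules on a subject's first four soreness scores. The sketch below is one reading of those definitions, not the authors' STATA code: the function names and example score sequences are hypothetical, and it assumes complete data for the first four follow-ups (the text does not say how missing values were handled).

```python
def is_ds1(scores):
    """DS1: the score at the third or fourth follow-up is higher than the
    scores at both the first and the second follow-ups."""
    s1, s2, s3, s4 = scores[:4]
    # "s3 or s4 exceeds both s1 and s2" is the same as comparing the maxima.
    return max(s3, s4) > max(s1, s2)


def is_ds2(scores):
    """DS2: meets DS1, or the second follow-up score is higher than the first."""
    return is_ds1(scores) or scores[1] > scores[0]


# Hypothetical score sequences for illustration only (not trial data).
rising_then_falling = [1, 2, 4, 3]   # peak at the third follow-up
falling_from_start = [5, 4, 3, 2]    # peak at the first follow-up
early_rise_only = [2, 3, 3, 2]       # second exceeds first, no later peak

for s in (rising_then_falling, falling_from_start, early_rise_only):
    print(s, is_ds1(s), is_ds2(s))
```

By construction every DS1 subject is also DS2, so DS2 is the more inclusive definition; the third example shows a subject who counts as delayed under DS2 only.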

Results

There were statistically significant differences between the two trials for age, 5-day soreness and male : female ratio (see table 1). Subjects in the bench-stepping trial were younger, more likely to be female and suffered higher total soreness, though the differences between groups were not large.

Table 1 Comparison of age, soreness and male : female ratio

The number of people in each trial experiencing delayed soreness is given in table 2. Regardless of the model used, many more subjects in the bench-stepping trial experienced delayed soreness than in the running trial. The difference between groups is statistically significant (χ² = 41, p << 0.0001 for DS1; χ² = 65, p << 0.0001 for DS2). In the ANOVA model, time and time by trial were highly significant (F = 60, p << 0.0001 and F = 14, p << 0.0001 respectively). Trial alone was not associated with soreness (F = 0.1, p = 0.7). These analyses suggest that the time course of soreness differed between the bench-stepping and running trials.

Table 2 Frequency of delayed soreness (defined in two different ways, see text) following each trial

Exploratory analyses were undertaken to see whether other differences between the two trials could have been responsible for the different time course of post-exercise soreness in each. First, a sub-group analysis including only runners in races of the same length, marathon distance, continued to find differences between running and bench-stepping (χ² = 71, p << 0.0001 for DS2). Results would be unaffected even if the sample was restricted to runners in a single race, the London marathon. Second, age, sex, total soreness and trial were entered into a logistic regression with DS2 as the dependent variable. Backwards stepwise regression was used, with a p value of 0.05 as the criterion for keeping a variable in the model. The final model included only age and trial. The coefficient for age was small and did not substantially affect the relation between trial and DS2, which remained significant (p << 0.0001).

Discussion

The time course of soreness experienced by participants in the bench-stepping trials is typical of that described in the literature. Subjects in the running trial, however, generally recorded peak soreness at the first follow-up, rather than experiencing delayed soreness. The difference in time course of soreness between the two conditions was highly statistically significant and robust to exploratory analyses of the possible effect of confounding variables.

The two trials differed in intensity, duration and nature of the exercise and the training status of the subjects. It is not possible to state definitively which of these factors or combination thereof led to the different time course of muscle soreness. However, training status seems unlikely: the inverted U time course of "delayed" muscle soreness has been reported in trained as well as untrained subjects undertaking exercises such as bench-stepping. In perhaps the only direct comparison of soreness in trained and untrained subjects [10], the time course for subjective soreness was remarkably similar in each group, though trained subjects experienced a smaller rise in markers of muscle damage.

It seems more likely that the type of exercise – intensity, duration or type of contraction – influences time course. Most researchers have started from the observation that soreness is most severe after eccentric contractions [14] and have designed regimes which involve short, intense, exhaustive, eccentric exercise, typically of a single muscle group. A typical example would be Rodenburg et al. [6, 7] who designed a complex apparatus involving a pulley system in which subjects slowly lowered a weight by elbow extension. So that subjects did not undertake exercise other than this single eccentric contraction, the weights were lifted back into place by a team of student volunteers (Rodenburg, personal communication). Long-distance running, like most forms of sporting activity, differs from the exercises used in this and other post-exercise soreness studies because it involves a large number of different muscle groups, each of which is subject to a variety of different forms of contraction. It seems plausible that short, intense, exhaustive, eccentric exercise produces physiological changes in muscle that are distinct from exercise involving mixed types of contraction taking place with less intensity over a greater period of time.

One note of caution is that the comparison between the bench-stepping and running trials was not randomized. It may be worth replicating this study as a parallel trial with randomized assignment to different forms of soreness induction. That said, there is a persuasive consistency in the results from different studies. In all three trials of long-distance running, mean soreness was highest at the first follow-up. In all of the numerous laboratory-based studies, soreness showed an inverted U-shaped curve over time.

The results of this study suggest that research in the laboratory setting does not necessarily generalize to settings in which trained athletes engage in their chosen sport. Whether there are characteristics of exercise other than the time course of post-exercise soreness that are different in laboratory and natural sports settings is for other researchers to determine. Further research might also examine the time course of objective correlates of muscle soreness, such as performance decrement, following different forms of exercise.