The Runner’s WOD

The most important skill in the early days of human development was running: catching prey or getting away from a predator. This statistical investigation takes apart the anatomical components of running and looks at the variables that make or break a good runner. In my previous post, Weightlifter’s Normative, I came upon some athlete data… let’s start at the bottom.

Deadlift

8,635 men, 2,380 women. ±5% jitter. Original data clipped to 25kg – 300kg, 40s – 150s.

This amorphous cluster tells us that men, as a group, are faster than women; that the fastest women deadlift around 125kg and the fastest men around 180kg; and that a higher deadlift seems to correspond to faster running, but only up to a point. Further analysis is complicated by the fact that our data is a sparse, kidney-shaped blob. To extract [more] quantifiable insights, the analysis must internalize the functional meaning of these exercises. Namely, what matters for survival is that you outrun your fellow cave(wo)man to get the food, or outrun her while bolting from a sabertooth, thus not becoming the food. Whether you are ahead by one second or one minute is irrelevant. Consider the following experiment:

  • Two individuals sprint 400m.
  • If the winner’s deadlift is greater than the loser’s
    • record it as [win-win, |deadlift difference|];
  • If the winner’s deadlift is less than the loser’s
    • record it as [win-lose, |deadlift difference|];
  • Repeat for every pair of individuals.

Plot win-win percentage vs. |deadlift difference|. In other words: plot the victory probability of a stronger athlete vs. his/her [leg] strength advantage. This [computational] experiment produces an impressive amount of data because there are 8,635 men and 2,380 women available for this calculation. Any two individuals constitute a valid race: 37 million male pairs and 2.8 million female pairs. Recall that the number of pairs in n elements is given by n(n-1)/2.  The plot below encompasses 40 million races.
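The experiment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the plots: the function names, the (deadlift, time) tuple layout, the 1kg bin width, and the decision to skip tied pairs are all hypothetical choices of mine.

```python
from collections import defaultdict
from itertools import combinations


def pair_count(n):
    """Number of distinct pairs (races) among n athletes: n(n-1)/2."""
    return n * (n - 1) // 2


def win_win_by_difference(athletes, bin_width=1.0):
    """Tally every pairwise race by absolute deadlift difference.

    athletes: list of (deadlift_kg, sprint_seconds) tuples (assumed layout).
    Returns {bin_start_kg: (win_win_fraction, races_in_bin)}.
    Pairs with equal times or equal deadlifts are skipped; the text does
    not say how ties are handled, so this is an assumption.
    """
    wins = defaultdict(int)
    races = defaultdict(int)
    for (lift_a, time_a), (lift_b, time_b) in combinations(athletes, 2):
        if time_a == time_b or lift_a == lift_b:
            continue
        bin_start = int(abs(lift_a - lift_b) // bin_width) * bin_width
        races[bin_start] += 1
        # The lower time wins; "win-win" means the winner also lifts more.
        if (time_a < time_b) == (lift_a > lift_b):
            wins[bin_start] += 1
    return {b: (wins[b] / races[b], races[b]) for b in sorted(races)}


# Race counts quoted in the text.
print(pair_count(8635))  # 37277295 male pairs
print(pair_count(2380))  # 2831010 female pairs
```

A pure-Python double loop like this is slow for tens of millions of pairs; a vectorized version would be the practical choice, but the logic is the same.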

±5% jitter is added to counteract rounding and recording biases. A mean filter with a radius of 3 points is applied to the results. Transparent bands indicate the uncertainty of a population measurement (a.k.a. standard error), which shrinks roughly as 1/√n, where n is the number of races. Translucent, inverted semi-parabolas are logarithmic counts of the number of races for every deadlift difference, corresponding to the axis on the right.
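For readers who want to reproduce the post-processing, here is a rough sketch of the three steps named in this caption. Several details are my assumptions rather than facts from the analysis: the jitter is drawn uniformly, the filter window is truncated at the ends of the series, and the standard error uses the binomial formula, which treats races as independent even though each athlete appears in many pairs.

```python
import math
import random


def jitter(value, fraction=0.05, rng=random):
    """Scale value by a factor drawn uniformly from [1 - fraction, 1 + fraction]."""
    return value * (1.0 + rng.uniform(-fraction, fraction))


def mean_filter(values, radius=3):
    """Replace each point with the mean of the points within `radius` of it.

    The window is truncated at both ends of the list (an assumed edge rule).
    """
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - radius): i + radius + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def standard_error(p, n):
    """Binomial standard error of a win fraction p measured over n races."""
    return math.sqrt(p * (1.0 - p) / n)


# With 5,000 races and a 50% win rate the band is about +/- 0.7 percentage points.
print(round(standard_error(0.5, 5000), 4))  # 0.0071
```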

The trends deteriorate and can no longer be trusted after the race count drops below 5k or so. Removing all data generated from fewer than 5k races yields:

8,635 men, 2,380 women. ±5% jitter. Original data clipped to 25kg – 300kg, 40s – 150s; results clipped to 5,000 races or more.  3-point radius mean filter applied to the results.

The first feature to note is that both trends begin at (0, 50%). This desirable characteristic indicates that the samples are random and unbiased: if two athletes of the same gender have identical deadlifts, ignoring all else, neither has a statistical advantage in a 400m race.

As the deadlift difference between two athletes increases, so does the stronger athlete’s chance of winning the race. One may wonder why the curves aren’t steeper: why, for example, does a 100kg deadlift advantage correspond to only a ~ 20% increase in victory probability for men? Three answers: athlete height, body mass liability, and the fact that running isn’t entirely below the belt; in ascending order of importance.

Athlete Height

Athlete height is a correlate of leg length. Longer legs = longer stride = more distance covered per breath and heartbeat. Common wisdom suggests that taller athletes are better sprinters. Does this assertion stand up to mathematical scrutiny? Kind of…

400m_vs_height

8,760 men, 2,403 women. ±5% jitter. Original data clipped to 135cm – 250cm, 40s – 150s; results clipped to 5,000 races or more. 3-point radius mean filter applied to the results.

For women, height correlates with a modest increase in sprinting speed. Height appears to be irrelevant for men, which is peculiar and deserves an explanation.

If I am magically lengthened by 10%, the taller me will be able to keep up with the old me only if he is endowed with an amount of muscle commensurate with his new mass, which increases by ~ 33% because volume is proportional to length cubed. It is safe to assert that when it comes to running, men in this data set scale proportionally, while women scale super-proportionally. That is, as men get taller they are, on average, just strong enough to keep up with their shorter counterparts. Women’s strength increases at a rate greater than necessary to merely keep up: women get much stronger with increased height; again, on average.
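The ~ 33% figure follows directly from cubing the length factor. A tiny check, assuming uniform scaling in all three dimensions and constant density (the function name is mine):

```python
def mass_scale(length_scale):
    """Mass factor for a body scaled uniformly by length_scale at constant density."""
    return length_scale ** 3


increase = mass_scale(1.10) - 1.0
print(f"{increase:.1%}")  # 33.1%
```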

Body Mass

For the most part, body mass is not a runner’s friend. However, it doesn’t appear to matter much up to a difference of ~ 30lb, and acts as a handicap thereafter, especially for women.

400m_vs_weight

8,821 men, 2,275 women. ±5% jitter. Original data clipped to 40kg – 150kg, 40s – 150s; results clipped to 5,000 races or more. 3-point radius mean filter applied to the results.

Upper Body Strength and Endurance

Put your hands in your pockets; now run. Primates don’t swing their arms while walking and running because it looks cool; we do it because bipedal locomotion is little more than a series of controlled falls. To control these falls, we must constantly control our center of mass. The faster we run, the more adjustments are needed per unit time. Every movement of the leg is countered by, or synchronized with, a movement of the arm and a slight twist of the torso. If the arms and the torso can’t move as fast as the legs, the upper body imposes a speed limit regardless of leg strength.

A kipping pull-up encompasses the coordination of the upper and the lower body, core strength, arm strength, and cardiovascular endurance.

400m_vs_pullups

6,621 men, 1,548 women. ±5% jitter for the sprint time, none for the number of pull-ups. Original data clipped to 1 – 100 (athletes unable to do one kipping pull-up are excluded from this analysis), 40s – 150s; results clipped to 5,000 races or more. 3-point radius mean filter applied to the results.

Ignoring all other athlete parameters, unbroken kipping pull-ups appear to be a better predictor of sprinting ability than a deadlift maximum.

Correlation vs. Causation

One would be wise to question whether the correlations presented are meaningful. Would it not stand to reason for more experienced athletes to be able to lift more weight, do more kipping pull-ups unbroken, and run faster? If so, then we are merely looking at the athletes’ temporal progression. Much like plotting a person’s height vs. vocabulary: both grow naturally, but neither causes the other. [Artificially] increasing one will have no effect on the other.

Chicken vs. Egg

If we assume that deadlift, kipping pull-ups, and running are anatomically related, in which direction is the causation arrow pointing? Are we looking at athletes who can do a lot of kipping pull-ups because they’re great runners, or athletes who are exceptional runners because they deadlift a lot of weight?

Part of the answer can be gleaned from the origin of the data: CrossFit. Sprinting, or running of any kind, is not emphasized in the CrossFit community to the same extent as, say, kipping pull-ups. CrossFit athletes are rarely instructed on proper speed or distance running techniques. Therefore, sprinting performance in the CrossFit community is at least in part an effect of the CrossFit training, and not a component thereof.

More convincing evidence comes from the contour plots of the 400m sprint victory probability as a function of the deadlift and the kipping pull-up differences:

Women’s 400m Sprint Victory Probability

Men’s 400m Sprint Victory Probability

6,490 men, 1,516 women. ±5% jitter for the sprint time and the deadlift, none for the number of pull-ups. Original data clipped to 1 – 100 (athletes unable to do one kipping pull-up are excluded from this analysis), 40s – 150s, 25kg – 300kg; results clipped to at least 5% of the maximum number of races per cell measuring 1s by 1kg.  10-point radius mean filter applied to the results. Thick contour is the 50% mark, the contour of no statistical advantage.
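A two-variable version of the pairwise experiment might look like the sketch below. Treat it as a guess at the general approach rather than the method used for these plots: the tuple layout, the orientation of each pair toward the stronger deadlifter, the 1kg by 1 pull-up cell size, and the function name are all assumptions of mine. Jitter and the 10-point mean filter are omitted.

```python
from collections import defaultdict
from itertools import combinations


def victory_grid(athletes, min_share=0.05):
    """Victory probability per (deadlift difference, pull-up difference) cell.

    athletes: list of (deadlift_kg, pullups, sprint_seconds) tuples (assumed).
    Each pair is oriented so the first athlete has the larger deadlift
    (pull-ups break deadlift ties), so the deadlift difference is >= 0.
    Cells holding fewer than min_share of the busiest cell's races are dropped.
    """
    wins = defaultdict(int)
    races = defaultdict(int)
    for a, b in combinations(athletes, 2):
        if a[2] == b[2]:
            continue  # skip tied races (assumption)
        if (a[0], a[1]) < (b[0], b[1]):
            a, b = b, a
        cell = (int(a[0] - b[0]), a[1] - b[1])  # 1 kg by 1 pull-up cells
        races[cell] += 1
        if a[2] < b[2]:  # the lower time wins
            wins[cell] += 1
    if not races:
        return {}
    threshold = min_share * max(races.values())
    return {cell: wins[cell] / n for cell, n in races.items() if n >= threshold}
```

Each value in the returned dictionary is the fraction of races in that cell won by the athlete with the deadlift advantage; contouring those values over the grid gives plots of the kind described here.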

Non-trivially slanted contours are a strong indication that the deadlift and the pull-up differences contribute to sprinting speed, as opposed to solely correlating with it.

Consider the violet reference points. For women, being able to deadlift 70kg more than her opponent and do a few more kipping pull-ups provides the same statistical advantage, 70%, as being able to do 24 more kipping pull-ups than her equally-deadlifting opponent. For men, 70% statistical victory can be achieved by being able to do 25 more kipping pull-ups or just 12 more with a 100kg deadlift advantage.

The data clearly shows that when it comes to running, the upper body can compensate for the shortcomings of the legs and vice versa. This is so because the speed of bipedal (and quadrupedal) locomotion depends on two variables: stride length and stride frequency. An athlete with strong legs can make long strides that do not require as much coordination of the upper body. An athlete lacking leg strength can take shorter strides at a higher frequency. Stride length is mostly limited by the muscles in the legs; stride frequency is mostly limited by upper body strength and endurance.

Conclusion

Short of, or in complement to, actual running, it would behoove anyone aspiring to lower sprint times to practice kipping pull-ups and deadlifts, in that order. Especially the former, because it is much easier to get 10 more kipping pull-ups than to deadlift 100 more pounds. As it pertains to sprinting, an increase of one kipping pull-up is equivalent to a ~ 7kg deadlift increase for men and ~ 4kg for women:

∆ 1 kipping pull-up ≈ ∆ 10 lb deadlift for women
∆ 1 kipping pull-up ≈ ∆ 15 lb deadlift for men
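The kilogram and pound figures in this section can be cross-checked with a quick conversion. The helper name is mine and the conversion factor is the standard 1 kg ≈ 2.20462 lb:

```python
KG_TO_LB = 2.20462  # pounds per kilogram


def kg_to_lb(kg):
    """Convert kilograms to pounds."""
    return kg * KG_TO_LB


for group, kg in (("men", 7), ("women", 4)):
    print(f"{group}: {kg} kg = {kg_to_lb(kg):.1f} lb")
# men: 7 kg = 15.4 lb
# women: 4 kg = 8.8 lb
```

The men’s figure matches 15 lb closely; the women’s 4kg is nearer 9 lb, so the 10 lb figure reads as a rounding.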

A Word Of Caution

When one looks at, for example, the victory probability vs. height graph and notes a flat line for men, the takeaway is not that height has no effect on sprinting speed. The correct interpretation: it is not possible to use a man’s height alone to place a wager on his performance in a 400m race. Such an interpretation is valid only for the data analyzed; it neither necessarily extends to the entire population of Earth nor is ruled out for it.
