Yesterday I introduced my modelling process for using historical betting market prices to come up with expected prices for races. You can read more here, but it essentially takes each rider's probability of winning, as implied by betting market prices race by race, and attempts to attribute that probability to various variables: climbing difficulty, bunch sprint finish, etc. In the end, I came up with a list of favorites for Stage 1 of the Tour de France based solely on their past prices to win races.
This model was built on nearly 1000 races from 2020 to 2024, which provides a fantastic resource not only to model the opinion of the betting market, but also to measure what factors influence a rider’s performance relative to the betting market’s expectations. In short, the market gives us a list of riders expected to perform well and poorly in a stage. While prices to win the race certainly do not map one-to-one onto the order in which the market expects riders to finish, they are a decent proxy which incorporates the opinions of bettors, recent performance, past performance, estimates of team tactics, and many other factors.
Modelling finish position by betting market Win price
I set up a model to predict a rider's finish position in a race given just one input: the implied probability from the betting market (which I log transformed, so that 10% and 1% differ by a factor of two on the model's scale, -2.3 versus -4.6, rather than a factor of ten). I also log transformed the finish position such that finish 1st = 0, finish 3rd = 1.1, finish 7th = 1.9, finish 20th = 3.0. This model asks “given the Win price for a rider, what is that rider's expected finish position?”
The results are significant, with an R^2 of about 0.24, meaning about 24% of the variance in log finish position is explained by pre-race Win odds. Not bad. The best-fit equation is log(finish_position) = 1.67 - 0.35 * log(implied_probability).
This indicates that a large favorite like Tadej Pogacar (2.0) should finish in about 8th-9th position on average tomorrow, while the 15th favorite - Paul Lapeira - has an expected finish of about 25th. Fabio Jakobsen is the longest-priced rider for tomorrow and is expected to finish 126th on average.
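To make the two transforms concrete, here is a minimal Python sketch. It assumes natural logarithms, which is what the quoted finish-position values (0, 1.1, 1.9, 3.0) imply; the function names are my own and are not taken from the model code.

```python
import math

def log_finish(position: int) -> float:
    """Natural log of finishing position, so 1st place maps to 0."""
    return math.log(position)

def log_probability(implied_probability: float) -> float:
    """Natural log of the market's implied win probability (as a fraction)."""
    return math.log(implied_probability)

# Reproduce the finish-position examples quoted in the text.
finish_examples = {pos: round(log_finish(pos), 1) for pos in (1, 3, 7, 20)}
print(finish_examples)  # {1: 0.0, 3: 1.1, 7: 1.9, 20: 3.0}

# On the log scale, 1% vs 10% is the same step as 10% vs 100%.
step_low = log_probability(0.10) - log_probability(0.01)
step_high = log_probability(1.00) - log_probability(0.10)
print(round(step_low, 3), round(step_high, 3))  # 2.303 2.303
```

The second print shows why the log helps here: the gap between a 1% and a 10% rider is treated as equal to the gap between a 10% rider and a certainty, instead of being swamped by it.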
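As a rough illustration of how the fitted line turns a win probability into a finish position, the sketch below plugs the rounded coefficients quoted above into a small Python function. The function name and the example probabilities are hypothetical, and because the coefficients are rounded (and I am assuming natural logs and probabilities expressed as fractions), the outputs will not exactly match figures produced by the full model.

```python
import math

INTERCEPT = 1.67   # rounded intercept from the fitted equation
SLOPE = -0.35      # rounded coefficient on log(implied_probability)

def expected_finish(implied_probability: float) -> float:
    """Back-transform the predicted log finish position to a position."""
    log_finish = INTERCEPT + SLOPE * math.log(implied_probability)
    return math.exp(log_finish)

# Hypothetical riders at 5% and 1% implied probability to win.
finish_5pct = expected_finish(0.05)
finish_1pct = expected_finish(0.01)
print(round(finish_5pct, 1), round(finish_1pct, 1))
```

Note that exponentiating a prediction made on the log scale gives something closer to a typical (median-like) finish than a strict arithmetic average, so treat the result as a ballpark.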
Here are the implied finish positions for the top riders in the betting market tomorrow.
Modelling the impact of other variables
Now that we have a baseline for the expected finishing position of a rider based on their pre-race betting market price, we can examine a rider’s performance relative to that expected finishing position based on other factors.
One very interesting variable is the impact of temperature. Various forecasts of tomorrow’s race between Florence and Rimini place the expected temperature in the high 20s or 30s Celsius (e.g., the San Luca racebook site shows 31 Celsius as an average for the day). That is significantly hotter than the average race, and 31 Celsius would be at about the 89th percentile for heat across all races. We can analyze a rider’s ability to cope with the heat and deliver results relative to expectation by incorporating temperature into the basic model above.
I centered the temperature on 21 degrees Celsius (the median across all races), so we’re looking for the impact of relatively hotter or colder temperatures on results. The model call looks like lmer(log(finish_position) ~ log(implied_probability) + (0 + temperature | rider)), which gives each rider their own temperature slope on top of the shared market-price effect.
The results have essentially the same intercept and coefficient for implied probability. The per-rider coefficients for temperature are small; they range from about +0.04 to -0.03 per degree Celsius. Giulio Ciccone - a climber for the Lidl-Trek team - suffers the most from the heat, with a coefficient of 0.038. That means if he is 5% to win a race he is expected to finish 14th in average temperature (21 C), 21st in hot temperatures (31 C) and 10th in cool temperatures (11 C).
Applying the model output to tomorrow’s prices, using 31 C as the expected temperature and comparing against the finishing positions expected at an average 21 C, we get the following for the top riders in the market.
Ciccone suffers a drop of about 11% relative to his average-temperature expectation, while Philipsen gets a 7% boost. Healy (+5%) and Pidcock (+3%) are two others who get better results in hot weather. Pogacar struggles by comparison (6% worse). In fact, among the four major GC contenders, only Vingegaard is not expected to get worse results in hotter weather. The 3rd favorite, Bettiol, has also struggled relative to expectations in the heat.
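The Ciccone arithmetic can be checked with a few lines of Python. This is only a sketch: it assumes the rider coefficient applies per degree Celsius on the log finish scale (which is what the quoted 14th / 21st / 10th figures imply), and the function name is mine.

```python
import math

MEDIAN_TEMP_C = 21.0  # centering temperature quoted in the text

def temperature_adjusted_finish(baseline_finish: float,
                                rider_temp_coef: float,
                                temp_c: float) -> float:
    """Scale a baseline finish position by a rider's temperature effect.

    Adding coef * (temp - 21) to log(finish) is the same as multiplying
    the finish position by exp(coef * (temp - 21)).
    """
    centered_temp = temp_c - MEDIAN_TEMP_C
    return baseline_finish * math.exp(rider_temp_coef * centered_temp)

# Ciccone: expected 14th at 21 C, temperature coefficient 0.038.
hot = temperature_adjusted_finish(14, 0.038, 31)   # 10 C hotter than median
cool = temperature_adjusted_finish(14, 0.038, 11)  # 10 C cooler than median
print(round(hot, 1), round(cool, 1))  # 20.5 9.6
```

These land close to the 21st and 10th places quoted above; the small gap presumably comes from using the rounded coefficient and a rounded baseline of 14th.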