Friday, January 28, 2011

2010 Form Indicator

As a natural extension to the race prediction tool (http://fsgl.wni-sec.com/RacePredictions.aspx), I thought it would be good to produce a figure that gives an indication of an athlete’s performance over a whole season. As a starting point, I decided to produce a theoretical average 10 km time based on a year’s results over 10 km, 15 km, 21.1 km and the marathon. Using the same algorithm as the predictor tool, but considering races from the whole season instead of just recent form, the following figures came up for some of the regular runners from the club. The 10 km figure from the race predictor tool is included for comparison.
Athlete      Predicted 10 km time based on recent form    2010 form indicator (10 km)
Jean Marc    35.14                                        35.21
James        37.33                                        38.12
Nico         37.31                                        37.55
Gerald       36.37                                        37.54
Mirelle      37.21                                        37.41
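
To make the idea of a season-wide 10 km figure concrete, here is a minimal sketch in Python. It is not the site's actual algorithm: the site uses ratios based on McMillan's values, whereas this sketch swaps in Riegel's well-known formula (T2 = T1 * (D2 / D1) ^ 1.06) to convert each result to a 10 km equivalent before averaging. The function names, the distance list and the minutes.seconds formatting are assumptions made for illustration.

```python
from statistics import mean

# Assumption: Riegel's exponent stands in for the site's McMillan-based ratios.
RIEGEL_EXPONENT = 1.06

# Road distances (km) counted towards the form indicator, as listed in the post.
ELIGIBLE_DISTANCES_KM = (10.0, 15.0, 21.1, 42.195)


def equivalent_10k_seconds(distance_km, time_seconds):
    """Convert a race time to a theoretical 10 km time using Riegel's formula."""
    return time_seconds * (10.0 / distance_km) ** RIEGEL_EXPONENT


def season_form_indicator(results):
    """Average the 10 km equivalents of a season's eligible results.

    `results` is a list of (distance_km, time_seconds) pairs.
    Returns None when the athlete has no eligible races.
    """
    equivalents = [
        equivalent_10k_seconds(distance_km, time_seconds)
        for distance_km, time_seconds in results
        if distance_km in ELIGIBLE_DISTANCES_KM
    ]
    return mean(equivalents) if equivalents else None


def format_mm_ss(seconds):
    """Format a time in seconds as minutes.seconds, e.g. 2280 -> '38.00'."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}.{secs:02d}"
```

Calling `season_form_indicator` with a year's results gives a time in seconds that `format_mm_ss` can turn into the notation used in the table. Results over other distances are simply ignored.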

For a first attempt, I’m quite pleased with the results as they are not a million miles away from what I would have guessed. The next race between James and Nico should be interesting, although Gerald should still be way out in front. I’m looking forward to the road season starting again so I can test and refine the algorithm.
One obvious future problem is that if someone jogs around a race, maybe to accompany a slower runner, the statistics will be distorted. A solution could be to exclude results that differ from an athlete’s average performance by more than 2 or 3 standard deviations. This is not too complicated to achieve but means I’ll have to start refining the SQL to ensure query speed is not affected.
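
The exclusion rule could look something like the following Python sketch. It assumes each result has already been converted to a comparable figure (for example a 10 km equivalent in seconds); the function name and the default threshold are hypothetical, and the real version would live in the SQL.

```python
from statistics import mean, stdev


def exclude_outliers(times_seconds, max_deviations=2.0):
    """Drop results further than `max_deviations` sample standard
    deviations from the athlete's average. Works in both directions,
    so an unusually fast result is dropped as well as a jogged one."""
    if len(times_seconds) < 3:
        # Too few results to say anything useful about the spread.
        return list(times_seconds)
    average = mean(times_seconds)
    spread = stdev(times_seconds)
    if spread == 0:
        # All results identical, so nothing can be an outlier.
        return list(times_seconds)
    return [
        t for t in times_seconds
        if abs(t - average) <= max_deviations * spread
    ]
```

One thing to watch: with the sample standard deviation, no single result can lie more than (n - 1) / sqrt(n) deviations from the mean of n results. A threshold of 2 can therefore only ever exclude something once an athlete has at least 6 results, and a threshold of 3 needs at least 11, so the lower figure is probably the more practical choice for a single season.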

Tuesday, January 25, 2011

Prediction Tool

The site now has a race prediction tool. It is similar to the McMillan Running Calculator except that, instead of asking for a reference time as input, it looks at each athlete’s recent performances and makes estimates accordingly. For obvious reasons, only road races over 10, 15 and 21.1 km are considered when establishing that input. I suppose I should also include marathon performances, and will do so in a future version.
There is also a confidence indicator for the accuracy of the results. The more recent an athlete’s last few races, the higher the confidence in the prediction. Given the lack of road races at the moment, it’s unlikely that many predictions will have a high confidence level.
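
As a rough illustration of a recency-based confidence indicator, here is a minimal Python sketch. The number of races considered, the 60 and 180 day thresholds and the level names are all made up for the example and are not the values the site uses.

```python
from datetime import date


def confidence_level(race_dates, today, races_considered=3):
    """Rate confidence by the average age of the most recent races.

    Hypothetical thresholds: up to 60 days old is 'high', up to
    180 days is 'medium', anything older is 'low'.
    """
    recent = sorted(race_dates, reverse=True)[:races_considered]
    if not recent:
        return "none"
    average_age_days = sum((today - d).days for d in recent) / len(recent)
    if average_age_days <= 60:
        return "high"
    if average_age_days <= 180:
        return "medium"
    return "low"
```

For example, `confidence_level([date(2010, 10, 1)], date(2011, 1, 25))` returns `"medium"`, because the only race is 116 days old.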
The ratios used for making predictions, once a base performance level has been established, are based on McMillan’s values, but the intention is to also have a set of ratios based on performances from within the club. This is not expected to produce significantly better results, as not enough athletes compete regularly over the full range of distances to produce a large enough sample. If the sample were large enough, it would also be possible to provide a difficulty level for each race and hence make even more accurate predictions.
Of course, the tool is only a guide and can’t know about external factors such as weather, injuries, hangovers, etc. If the predictions are wildly inaccurate, please comment below and I’ll check the algorithm.