Tuesday, February 1, 2011

Version 0.5 - Improved Stats and a Better Prediction Tool

Version 0.5 is now released with the following new features and enhancements.

· Improvements to the Race Predictor Tool to include a 5 km estimate.
· A rewrite of the Best Performances page to allow searching by category, distance, sex and year.
· A change to the predictor tool algorithm to produce more accurate results.
· A season review (2010) table on the prediction page, i.e. theoretical average times based on results over a whole season.
· Some format changes to make the site display better in Chrome and Firefox.
· Several bug fixes to the results page, including the sort order of races when viewing per athlete.
· An updated athlete list.

Friday, January 28, 2011

2010 Form Indicator

As a natural extension to the race prediction tool (http://fsgl.wni-sec.com/RacePredictions.aspx), I thought it would be good to produce a figure that indicates an athlete’s performance over a whole season. As a starting point, I decided to produce a theoretical average 10 km time based on a year’s results over 10 km, 15 km, 21.1 km and the marathon. Using the same algorithm as the predictor tool, but considering races from the whole season instead of just recent form, gave the following figures for some of the regular runners from the club. The 10 km figure from the race predictor tool is included for comparison.
Athlete      Predicted 10 km time based on recent form    2010 form indicator (10 km)
Jean Marc    35.14                                        35.21
James        37.33                                        38.12
Nico         37.31                                        37.55
Gerald       36.37                                        37.54
Mirelle      37.21                                        37.41
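
For anyone curious how a season figure like this could be computed, here is a minimal Python sketch. It is not the site's code: the function names are made up for illustration, and the distance scaling uses Riegel's well-known formula (exponent 1.06) as a stand-in for the ratios the site actually uses. Each eligible road race is converted to a 10 km equivalent and the equivalents are averaged.

```python
from statistics import mean

# Assumption: Riegel's commonly quoted exponent, used here as a stand-in
# for the site's own distance ratios.
RIEGEL_EXPONENT = 1.06

# Road distances (km) that count towards the indicator, as listed in the post.
ELIGIBLE_DISTANCES_KM = (10.0, 15.0, 21.1, 42.195)


def equivalent_10k_seconds(distance_km, time_seconds):
    """Scale one race result to a 10 km equivalent time in seconds."""
    return time_seconds * (10.0 / distance_km) ** RIEGEL_EXPONENT


def form_indicator(results):
    """Average 10 km equivalent over a season.

    `results` is a list of (distance_km, time_seconds) tuples.
    Returns None when the athlete has no eligible races.
    """
    eligible = [(d, t) for d, t in results if d in ELIGIBLE_DISTANCES_KM]
    if not eligible:
        return None
    return mean(equivalent_10k_seconds(d, t) for d, t in eligible)


def format_time(seconds):
    """Format seconds as minutes.seconds, matching the table above."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}.{secs:02d}"
```

A call such as `format_time(form_indicator(season_results))` would then give a figure in the same style as the table, where `season_results` is a hypothetical list of one athlete's races for the year.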

For a first attempt, I’m quite pleased with the results as they are not a million miles away from what I would have guessed. The next race between James and Nico should be interesting, although Gerald should still be way out in front. I’m looking forward to the road season starting again so I can test and refine the algorithm.
One obvious future problem is that if someone jogs around a race, maybe to accompany a slower runner, the statistics will be distorted. A solution could be to exclude results that differ from an athlete’s average performance by more than 2 or 3 standard deviations. This is not too complicated to achieve but means I’ll have to start refining the SQL to ensure query speed is not affected.
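
The filtering idea is easy to try outside the database first. The sketch below is only an illustration in Python, not the SQL the site would need. It assumes every result has already been converted to a 10 km equivalent time in seconds so that races over different distances are comparable, and the function name and default threshold are hypothetical.

```python
from statistics import mean, stdev


def exclude_outliers(times, max_deviations=2.0):
    """Drop times further than `max_deviations` standard deviations from the mean.

    `times` holds 10 km equivalent times in seconds. With fewer than three
    results there is too little data to judge, so everything is kept.
    """
    if len(times) < 3:
        return list(times)
    average = mean(times)
    spread = stdev(times)  # sample standard deviation
    if spread == 0:
        return list(times)
    return [t for t in times if abs(t - average) <= max_deviations * spread]
```

One thing to watch: a single slow run inflates the standard deviation itself, so with ten or fewer results no single time can ever be more than 3 sample standard deviations from the mean. For a typical season a threshold nearer 2 is probably the practical choice.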

Tuesday, January 25, 2011

Prediction Tool

The site now has a race prediction tool. It is similar to the McMillan Running Calculator except that, instead of asking for a reference time as input, it looks at each athlete's recent performances and makes estimates accordingly. For obvious reasons, only road races over 10, 15 and 21.1 km are considered for the initial input value. I suppose I should also include marathon performances, and will do so in a future version.
There is also a confidence indicator as to the accuracy of results. The more recent the last few races of each athlete, the higher the confidence level of an accurate prediction. Given the lack of road races at the moment, it’s unlikely that too many predictions will have a high confidence level.
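
To make the idea concrete, here is a rough Python sketch of a recency-based confidence indicator. Everything in it is an assumption for illustration: the function name, the use of the last three races, and the 60 and 180 day thresholds are invented and are not the values the site uses.

```python
from datetime import date


def confidence_level(race_dates, today, races_considered=3):
    """Return "high", "medium" or "low" from the age of the latest races.

    `race_dates` is a list of datetime.date objects for one athlete.
    The thresholds below are illustrative guesses only.
    """
    recent = sorted(race_dates, reverse=True)[:races_considered]
    if len(recent) < races_considered:
        return "low"  # not enough races to judge form
    average_age_days = sum((today - d).days for d in recent) / len(recent)
    if average_age_days <= 60:
        return "high"
    if average_age_days <= 180:
        return "medium"
    return "low"
```

With a scheme like this, an athlete whose last three road races were all in the autumn would drift from "high" to "medium" over the winter, which matches the behaviour described above.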
The ratios used for making predictions, once a base performance level has been established, are based on McMillan’s values but it is intended to also have a set of ratios based on performances from within the club. This is not expected to produce significantly better results as not enough athletes compete regularly over the full range of distances in order to produce a large enough sample. If there were, it would also be possible to provide a difficulty level for each race and hence make even more accurate predictions.
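
As an illustration of how ratio-based prediction works once a base 10 km level is known, here is a small Python sketch. It does not reproduce McMillan's values. Instead it derives a ratio table from Riegel's formula (time scales with distance to the power 1.06), which is a different and much simpler model, and the function names are hypothetical.

```python
# Assumption: Riegel's exponent as a stand-in for McMillan's or club-derived ratios.
RIEGEL_EXPONENT = 1.06

# Distances (km) to predict for: 5 km, 10 km, 15 km, half marathon, marathon.
TARGET_DISTANCES_KM = (5.0, 10.0, 15.0, 21.1, 42.195)


def ratio_table(base_km=10.0):
    """Map each target distance to its time ratio relative to the base distance."""
    return {d: (d / base_km) ** RIEGEL_EXPONENT for d in TARGET_DISTANCES_KM}


def predict_times(base_10k_seconds):
    """Predict a time in seconds for each target distance from a base 10 km time."""
    return {d: base_10k_seconds * ratio for d, ratio in ratio_table().items()}
```

Swapping in a different set of ratios, for example one fitted to club results, would only mean replacing `ratio_table` with a lookup of the fitted values.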
Of course the tool is only a guide and can’t know about external factors such as weather, injuries, hangovers etc. If the predictions are wildly inaccurate please comment below and I’ll check the algorithm.

Wednesday, December 29, 2010

v0.2 Released

Version 0.2 of the site is finally released. It's not much like what I had planned and instead contains a much improved interface for results and performances. As well as looking better, both sections now have click-through, or drill-down. For example, from the initial Best Performances page you can select best performances for a race and then, from there, best performances for an athlete.

As much as I would like to spend some time working on the stats page for v0.3, I really need to rework the FFA interface, to get in more data and also to restructure the database somewhat.

Wednesday, December 8, 2010

FFA Interface

The site now has a basic interface that allows importing race data from the FFA site. It’s been more difficult than expected due to the weird way the FFA presents its data, especially for distances. I’ve so far imported races for the last 6 months of 2010, although there are a few missing due to a bug in the importer. I’ll import the missing race data and the data for the rest of 2010 in the next few days. Going forward, things should be much easier.

Next up is to work a bit on the presentation which is “bare minimum” at the moment. I’ve got a new race selector tool half developed so hope to get it out there soon.

Monday, November 29, 2010

Results web site v0.1 release

Due to some unforeseen spare time, caused by a bad back, I've been developing a web site that automatically crawls around the internet to retrieve and store race results for all the runners at my club, the Foulées de Saint Germain en Laye. The more visible part of the project is the front end that allows the results to be presented in various formats and can be found at http://FSGL.wni-sec.com.

It’s still very basic but it will get better as time goes by. I need to work on the presentation as well, not least putting it into French.

I’d like to emphasize that my motivation for the site is, to misquote Mark Zuckerberg,

“Sometimes you do something because you like building things”.

Anyhow, please feel free to leave comments, suggestions, general abuse etc... Any criticism is acceptable, but remember, I know where you live, or can probably find out.