Ultimate Tennis Statistics aims to become the ultimate tennis statistics destination for die-hard tennis fans. It tries to provide all kinds of statistics on Open Era men's tennis through a simple yet effective web GUI. If you have any suggestions for additional features or tweaks to the current features, please e-mail me at firstname.lastname@example.org or log a GitHub issue.
More information about individual features can be found on the Glossary page.
Some useful tips can be found on the Tips page.
The statistics are based on data from the open source tennis data repository by Jeff Sackmann, with some corrections and additions where data is wrong or lacking.
Even with these corrections and additions, there are still small errors and missing data, most notably for many tournaments between 1968 and 1972.
Rankings from before the official ATP rankings started in the 1973 season are estimated and are also still incomplete.
In-match statistics (Aces, DFs, Serve %...) have been recorded since 1991, so this data is available on UTS only for the 1991 season onward.
Data is updated every Monday morning with new rankings and newly completed tournaments, followed by re-computation of aggregated data and records. In-progress tournament forecasts are updated every 15 minutes. Data is realigned with Jeff's repository once a year in the tennis off-season.
Qualifying, Challenger and Futures matches are currently not included. Including them is a medium-to-remote plan: #328
Women's tennis data is not included; for now it is only in the very remote plans: #289
Please provide feedback on data as well at email@example.com or GitHub.
Court Speed Index tries to estimate a tournament event's court speed based on overall Ace %, Service Points Won % and Service Games Won %, averaged in a way that eliminates the dependency on the participating players' dominant playing styles (for more details see the Glossary page).
A lot of content on this website is based on the 'GOAT' Points formula, which quantifies tennis players' achievements throughout their careers and makes it possible to compare players from different eras. The 'GOAT' Points formula assigns 'GOAT' points to players for tournament results, ATP and Elo rankings and various important achievements. For a visual description of the 'GOAT' Points formula please click:
Ultimate Tennis Statistics is powered by open-source software:
The Ultimate Tennis Statistics logo is courtesy of and was designed by firstname.lastname@example.org.
Ultimate Tennis Statistics and Tennis Crystal Ball source code is licensed under the Apache 2.0 License.
'GOAT' Points formula, customizations of Elo Ratings for tennis, Tennis Crystal Ball Match Prediction, Tournament Forecast and other algorithms by Ultimate Tennis Statistics are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
In short: Attribution is required. Non-commercial use only.
If you like this website and want to support it, please consider a small donation to support the project.
All donations will be used only for paying the web hosting bill.
About Tournament Forecast
Tournament Forecast is driven by individual Match Prediction. In each round, probabilities for each match in the draw are calculated using the Neural Network Match Prediction Algorithm.
Based on these probabilities, the chances of the probable matchups in later tournament rounds are calculated. Finally, the probability of a player winning the title is calculated as the product of the probabilities of winning each of the rounds.
If the round is far off, like a semi-final or final, there are many potential opponents, and the probabilities of the player beating each of them are calculated.
For example, the probability of winning the title depends on the probability of the player reaching the final as well as the probabilities of all players in the other half of the draw reaching the final, multiplied by the probabilities of the player winning the final against each of them.
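As a small worked illustration, consider a hypothetical four-player draw in which A plays B and C plays D in the semi-finals. The notation and all numbers below are invented for this example only and are not taken from the site:

```latex
P(A \text{ wins title})
  = P(A \text{ beats } B) \cdot
    \bigl[\, P(C \text{ beats } D) \cdot P(A \text{ beats } C)
         + P(D \text{ beats } C) \cdot P(A \text{ beats } D) \,\bigr]

% With made-up values P(A beats B) = 0.6, P(C beats D) = 0.7,
% P(A beats C) = 0.5 and P(A beats D) = 0.8:
P(A \text{ wins title}) = 0.6 \cdot (0.7 \cdot 0.5 + 0.3 \cdot 0.8)
                        = 0.6 \cdot 0.59 = 0.354
```

The bracketed term is the chance of winning the final, averaged over both possible finalists from the other half of the draw.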
Tracking Tournament Progress
As a tournament progresses, the outcomes of some matches become known, so the match probabilities are set to 100% for the winner and 0% for the loser.
Elo Ratings are recalculated after each round, and the Elo rating points earned or lost through wins and losses in all previous rounds (including the current round if the match is finished) are presented in brackets.
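For orientation, here is a minimal sketch of the textbook Elo update. It is not the customized Elo variant used by Ultimate Tennis Statistics; the K-factor of 32 and the function names are assumptions made for illustration.

```python
from typing import Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Textbook Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def elo_update(winner: float, loser: float, k: float = 32.0) -> Tuple[float, float]:
    """Return (new_winner_rating, new_loser_rating) after one match.

    k is a hypothetical constant; the site's customized K-factor is not
    described in this text.
    """
    delta = k * (1.0 - expected_score(winner, loser))
    return winner + delta, loser - delta


if __name__ == "__main__":
    new_w, new_l = elo_update(2000.0, 2000.0)
    # Show the rating with the points earned or lost in brackets.
    print(f"winner: {new_w:.0f} ({new_w - 2000.0:+.0f})")
    print(f"loser:  {new_l:.0f} ({new_l - 2000.0:+.0f})")
```

In this sketch the winner gains exactly what the loser drops, and an upset win over a higher-rated opponent moves both ratings more than an expected win does.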
Sometimes, when initial tournament draws come out, they include unknown qualifiers. The probability of a player beating an unknown qualifier is determined by a variation of the Match Prediction algorithm that uses the average Elo Rating and ATP ranking points of the qualifiers as well as winning percentages vs qualifiers, overall and by surface, level, etc.
Let's denote the probability that player A wins a match against player B as $Pm_{A\,\mathrm{vs}\,B}$.
These probabilities determine the probability of each player reaching the second round: $Pr_A(R2) = Pm_{A\,\mathrm{vs}\,B}$.
The probability of player A reaching the next round R+1 is calculated as follows:
$$Pr_A(R+1) = Pr_A(R) \cdot \sum_{N=1}^{n} \left( Pr_N(R) \cdot Pm_{A\,\mathrm{vs}\,N} \right)$$
This means that the probability of player A reaching the next round R+1 equals the probability of player A reaching round R multiplied by the weighted sum of the probabilities of player A beating each of his n potential opponents in round R. The weight of each potential opponent is the probability of that opponent reaching round R.
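The recursion above can be sketched in a few lines of Python. This is an illustrative sketch only, assuming a full draw whose size is a power of two and listed in bracket order. The `forecast` function, the `strengths` table and the ratio-based `pm` are hypothetical stand-ins; on the site the match probabilities come from the Neural Network Match Prediction Algorithm.

```python
from typing import Callable, Dict, List


def forecast(draw: List[str], pm: Callable[[str, str], float]) -> List[Dict[str, float]]:
    """Return reach[r][player], the probability of reaching round r (0-based).

    draw lists the players in bracket order; pm(a, b) is the probability
    that a beats b. The last entry holds the title probabilities.
    """
    n = len(draw)
    assert n >= 2 and n & (n - 1) == 0, "draw size must be a power of two"
    reach = [{p: 1.0 for p in draw}]  # everyone is present in the first round
    block = 1  # size of the sub-bracket a player has come through so far
    while block < n:
        current: Dict[str, float] = reach[-1]
        nxt: Dict[str, float] = {}
        for i, a in enumerate(draw):
            # Potential opponents in this round come from the sibling sub-bracket.
            start = ((i // block) ^ 1) * block
            opponents = draw[start:start + block]
            # Weighted sum: opponent's chance to be there times A's chance to beat him.
            weighted = sum(current[o] * pm(a, o) for o in opponents)
            nxt[a] = current[a] * weighted
        reach.append(nxt)
        block *= 2
    return reach


if __name__ == "__main__":
    # Made-up strengths; pm is a simple ratio model used only for this demo.
    strengths = {"A": 4.0, "B": 1.0, "C": 3.0, "D": 2.0}
    pm = lambda a, b: strengths[a] / (strengths[a] + strengths[b])
    for player, p in forecast(["A", "B", "C", "D"], pm)[-1].items():
        print(f"{player}: {p:.3f}")
```

A finished match can be folded in by wrapping `pm` so that it returns 1.0 for the known winner and 0.0 for the loser, which matches the 100% and 0% rule for completed matches.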
About Tennis Crystal Ball Match Prediction Algorithm
Match Prediction is based on players' previous results and track records.
Previous results are analyzed by the Neural Network algorithm with ~60 neurons, each covering a different feature of the players, such as Elo Ratings, ATP Points, Head-to-Head ratios, Winning Percentages and Recent Form, varied by surface, tournament level, round, tournament, recency, match or set ratios, vs rank, vs hand, vs backhand...
Match win probabilities given by each of the features (neurons) are then combined by the neural network using different weights.
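The exact way the network combines the features is not described here. As a rough illustration only, one simple way to combine per-feature win probabilities is a normalized weighted average. The `combine` function, the feature names, the probabilities and the weights below are all made up for this sketch.

```python
from typing import Dict


def combine(feature_probs: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-feature win probabilities for one player.

    Features that have no entry in `weights` are ignored (weight 0).
    """
    total_weight = sum(weights.get(name, 0.0) for name in feature_probs)
    if total_weight == 0.0:
        raise ValueError("no feature has a non-zero weight")
    weighted = sum(p * weights.get(name, 0.0) for name, p in feature_probs.items())
    return weighted / total_weight


if __name__ == "__main__":
    # Hypothetical per-feature probabilities that player A beats player B.
    probs = {"elo": 0.70, "recent_form": 0.60, "h2h": 0.50}
    # Hypothetical weights; a real network would learn these from data.
    weights = {"elo": 3.0, "recent_form": 1.0, "h2h": 1.0}
    print(f"combined win probability: {combine(probs, weights):.2f}")
```

Because the weights are normalized, the combined value stays between the smallest and largest per-feature probability as long as all weights are non-negative.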
Training and Tuning
The Neural Network is trained on historical data to maximize the prediction rate and to determine optimal feature weights.
In order to further increase prediction accuracy, the Neural Network is trained specifically for different surfaces, resulting in different feature weights per surface.
During training, some neurons are found to be useless and are removed from the network, so about 40 neurons remain.
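How the useless neurons are identified is not described here. One plausible criterion, shown purely as an assumption, is to drop features whose trained weight falls below a threshold, separately for each surface. The `prune` function, the weights and the threshold below are invented for this sketch.

```python
from typing import Dict


def prune(weights: Dict[str, float], threshold: float) -> Dict[str, float]:
    """Keep only the features whose absolute weight reaches the threshold."""
    return {name: w for name, w in weights.items() if abs(w) >= threshold}


# Hypothetical per-surface weights; these are not the values used by the site.
weights_by_surface = {
    "hard": {"elo": 3.0, "recent_form": 1.0, "h2h": 0.8, "vs_hand": 0.02},
    "grass": {"elo": 3.0, "recent_form": 0.05, "h2h": 0.6, "win_pct": 1.5},
}

# Prune each surface independently, so a feature can survive on one surface
# and be removed on another.
pruned = {
    surface: prune(weights, threshold=0.1)
    for surface, weights in weights_by_surface.items()
}

if __name__ == "__main__":
    for surface, weights in pruned.items():
        print(surface, sorted(weights))
```

Pruning per surface fits the idea of training a separate set of feature weights for each surface.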
Primary and Secondary Probability Contributors
Elo Ratings, overall, by surface and by set, are the primary contributors to the match prediction, followed by the recent form, H2H and winning percentages.
Elo Rating neurons individually give high prediction rates, but when they are combined with the recent form, H2H and various winning percentages, the prediction accuracy increases even further.
However, the importance of the secondary contributors is very surface dependent. For example, on grass, recent form is pretty much irrelevant, because the momentum of form is often disturbed by the surface adaptation and because the grass season is relatively short. Instead, winning percentages are more important on grass than on other surfaces.