Ultimate Tennis Statistics aims to become the ultimate tennis statistics destination for die-hard tennis fans. It tries to provide all kinds of statistics for men's tennis in the Open Era through a simple yet effective web GUI. If you have any suggestions for additional features or tweaks to the current features, please e-mail me at mcekovic@gmail.com or log a GitHub issue.
More info about feature details can be found on the Glossary page.
Some useful tips can be found on the Tips page.
Data Source
The statistics are based on data from the open-source tennis data repository by Jeff Sackmann, with some corrections and additions where the data is wrong or lacking.
Even with these corrections and additions, there are certainly still some errors, most notably for tournaments and rankings in the sixties and seventies.
Rankings for seasons before the official ATP rankings started in 1973 are estimated and incomplete.
In-Match Statistics
In-match statistics (Aces, DFs, Serve %, etc.) have only been recorded since 1991, so this data is available on UTS only for seasons from 1991 onward.
Data Update
Data is updated every Monday morning with new rankings and newly completed tournaments, followed by re-computation of aggregated data and records. In-progress tournament forecasts are updated every 15 minutes. Data is realigned with Jeff Sackmann's repository once a year in the tennis off-season.
Non-ATP Data
Qualifying, Challenger and Futures matches are currently not included. Including them is a medium-to-remote plan: #328
Women's (WTA) tennis data is not included. For now, it is only a very remote plan: #289
Raw Data
For raw data, please check out Jeff Sackmann's tennis data repository or the Ultimate Tennis Statistics pre-populated database Docker image mcekovic/uts-database.
Data Feedback
Feedback on the data is also welcome at mcekovic@gmail.com or on GitHub.
Grand Slam | Tour Finals | Alt. Finals | Masters | Olympics | ATP 500 | ATP 250
Masters
Masters tournament classification follows the ATP from the 1990 season onward.
Unfortunately, using the top group within the Grand Prix Super Series as Masters tournaments for earlier seasons is not practical because the group size varies significantly from season to season (from 2 to 30).
Instead, the 9 tournaments per season with the strongest participation of top players, based on player ranks, are considered Masters tournaments.
Tour Finals and Alternative Tour Finals
In addition to official Tour Finals tournaments, Dallas WCT Finals (1971-1989), Grand Slam Cup (1990-1999) and Tennis Champions Classic (1970, 1971) are considered alternative Tour Finals and are weighted a little less than official Tour Finals, the same as Masters tournaments.
ATP 500 / 250
The separation of ATP 500/Championship Series from ATP 250/World Series for seasons from 1990 onward follows the ATP.
For seasons before 1990, the 11 tournaments per season with the strongest participation based on player ranks are considered ATP 500 tournaments.
For seasons 1968 and 1969, to compensate for the lack of Tour Finals, the 16 strongest tournaments per season are considered ATP 500 tournaments.
Pepsi Grand Slam (1976-1981), WCT Challenge Cup (1976-1980) and Seasonal WCT Finals (1972, 1982) are also considered as ATP 500 tournaments.
Indoor: Hard (i) Clay (i) Carpet
Very Fast
Fast
Medium Fast
Medium
Medium Slow
Slow
Very Slow
Court Speed Index tries to estimate the court speed of a tournament event based on overall Ace %, Service Points Won % and Service Games Won %, averaged in a way that eliminates the dependency on the dominant playing styles of the participating players (for more details see the Glossary page).
A lot of content on this website is based on the 'GOAT' Points formula, which quantifies tennis players' achievements throughout their careers and makes it possible to compare players from different eras. The 'GOAT' Points formula assigns 'GOAT' points to players for tournament results, ATP and Elo rankings and various important achievements. For a visual description of the 'GOAT' Points formula please click:
Please do not perform massive parallel crawling of the Ultimate Tennis Statistics website!
This is a very efficient and optimized website. You can see for yourself that even the most complex pages with very complex tennis statistics data open very quickly.
However, it is hosted on very modest hardware (1 CPU Core, 2 GB RAM), so massive parallel crawling will seriously hamper the browsing experience of regular visitors.
This is a free website with sort-of-premium content; please do not misuse it!
If you need raw data, please consider Jeff Sackmann's repository or the mcekovic/uts-database Docker image of the pre-populated Ultimate Tennis Statistics database.
Ultimate Tennis Statistics is powered by the Open-Source Software:
Linux | PostgreSQL | Java | Spring Boot |
The Ultimate Tennis Statistics logo was designed by and is courtesy of crazysesirdzija@gmail.com.
Ultimate Tennis Statistics and Tennis Crystal Ball source code is available at GitHub and is licensed under the Apache 2.0 License.
'GOAT' Points formula, customizations of Elo Ratings for tennis, Tennis Crystal Ball Match Prediction, Tournament Forecast and other algorithms by Ultimate Tennis Statistics are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
In short: Attribution is required. Non-commercial use only.
About Tournament Forecast
Tournament Forecast is driven by individual Match Prediction. In each round, the probabilities for each match in the draw are calculated using the Neural Network Match Prediction Algorithm.
Based on these probabilities, the chances of the probable matchups in later tournament rounds are calculated. Finally, the probability of a player winning the title is calculated as the product of the probabilities of winning each round.
If the round is far off, like the semi-final or final, there are many potential opponents, and the probabilities of the player beating each of them are calculated.
For example, the probability of winning the title depends on the probability of the player reaching the final and on the probabilities of every player in the other half of the draw reaching the final, each multiplied by the probability of the player beating that opponent in the final.
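As an illustrative worked case (the numbers are invented and not taken from any real forecast), consider a four-player draw with semi-finals A vs B and C vs D. Writing P(X beats Y) for an individual match prediction, the title probability for A could be expanded as:

```latex
$$
P(\text{A wins title}) = P(\text{A beats B}) \cdot
  \bigl[\, P(\text{C beats D}) \cdot P(\text{A beats C})
       + P(\text{D beats C}) \cdot P(\text{A beats D}) \,\bigr]
$$
% Assumed inputs: P(A beats B) = 0.6, P(C beats D) = 0.7,
%                 P(A beats C) = 0.5, P(A beats D) = 0.8
$$
P(\text{A wins title}) = 0.6 \cdot (0.7 \cdot 0.5 + 0.3 \cdot 0.8) = 0.6 \cdot 0.59 = 0.354
$$
```

Each potential final opponent contributes in proportion to how likely that opponent is to reach the final.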
Tracking Tournament Progress
As a tournament progresses, the outcomes of some matches become known, so the match probabilities are set to 100% for the winner and 0% for the loser.
Elo Ratings are recalculated after each round, and the Elo rating points earned or lost through wins and losses in all previous rounds (including the current round if the match is finished) are shown in brackets.
Unknown Qualifiers
Sometimes initial tournament draws are released with unknown qualifiers. The probability of a player beating an unknown qualifier is determined by a variation of the Match Prediction algorithm that uses the average Elo Rating and ATP ranking points of the qualifiers as well as winning percentages vs qualifiers, overall and by surface, level, etc.
Mathematical details
Let's denote the probability that player A wins a match over player B as $Pm_{A \text{ vs } B}$.
These match probabilities directly give the probability of each player reaching the second round: $Pr_A(R2) = Pm_{A \text{ vs } B}$.
The probability of player A reaching the next round R+1 is calculated this way, where N runs over the n potential opponents of player A in round R:
$$Pr_A(R+1) = Pr_A(R) \cdot \sum_{N=1}^{n} \left( Pr_N(R) \cdot Pm_{A \text{ vs } N} \right)$$
This means that the probability of player A reaching the next round R+1 equals the probability of player A reaching round R multiplied by the weighted sum of the probabilities of player A beating each potential opponent in round R.
The weight of each potential opponent is the probability of that opponent reaching round R.
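The recurrence above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the site's actual implementation: the function name `reach_probabilities`, the `match_prob` callback and the Elo-style stand-in predictor with made-up ratings are all hypothetical, the draw size is assumed to be a power of two with unique player names, and byes are not handled.

```python
from typing import Callable, Dict, List


def reach_probabilities(
    draw: List[str],
    match_prob: Callable[[str, str], float],
) -> List[Dict[str, float]]:
    """Per round, the probability that each player reaches that round.

    draw: players in bracket order (hypothetical input format).
    match_prob(a, b): probability that a beats b (hypothetical callback).
    Entry 0 is the first round; the last entry holds title probabilities.
    """
    n = len(draw)
    if n < 2 or n & (n - 1):
        raise ValueError("draw size must be a power of two")

    reach = {player: 1.0 for player in draw}  # everyone starts in round 1
    rounds = [reach]
    block = 1  # number of draw slots feeding each side of a match
    while block < n:
        nxt = {}
        for i, a in enumerate(draw):
            # The sibling block holds A's potential opponents in this round.
            start = ((i // block) ^ 1) * block
            opponents = draw[start:start + block]
            # Weighted sum: opponent's reach probability times A's win chance.
            win = sum(reach[o] * match_prob(a, o) for o in opponents)
            nxt[a] = reach[a] * win
        rounds.append(nxt)
        reach = nxt
        block *= 2
    return rounds


if __name__ == "__main__":
    # Made-up ratings and a plain Elo expectation as a stand-in predictor.
    elo = {"A": 2100, "B": 1950, "C": 2050, "D": 1900}

    def elo_prob(a: str, b: str) -> float:
        return 1.0 / (1.0 + 10 ** ((elo[b] - elo[a]) / 400))

    for player, p in reach_probabilities(list(elo), elo_prob)[-1].items():
        print(f"{player}: {p:.1%}")
```

A finished match could be modelled in this sketch by having `match_prob` return 1.0 for the winner and 0.0 for the loser. As long as `match_prob(a, b) + match_prob(b, a) == 1`, the title probabilities sum to 1.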
About Tennis Crystal Ball Match Prediction Algorithm
Match Prediction is based on players' previous results and track records.
Previous results are analyzed by the Neural Network algorithm with ~60 neurons for different features about players, like Elo Ratings, ATP Points, Head-to-Head ratios, Winning Percentages and Recent Form, varied by surface, tournament level, round, tournament, recency, match or set ratios, vs rank, vs hand, vs backhand...
Match win probabilities given by each of the features (neurons) are then combined by the neural network using different weights.
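One simple way to picture this combination step is a normalized weighted average of the per-feature probabilities. The sketch below is an assumption made purely for illustration: the text does not say how the network actually combines its neurons, and the function name, feature names and weights are hypothetical.

```python
from typing import Mapping


def combine_feature_probabilities(
    feature_probs: Mapping[str, float],
    weights: Mapping[str, float],
) -> float:
    """Weighted average of per-feature win probabilities (illustrative only)."""
    total_weight = sum(weights[name] for name in feature_probs)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(weights[name] * p for name, p in feature_probs.items())
    return weighted / total_weight


if __name__ == "__main__":
    # Hypothetical features and weights, not the trained values.
    probs = {"elo": 0.70, "recent_form": 0.55, "h2h": 0.50}
    weights = {"elo": 3.0, "recent_form": 1.0, "h2h": 1.0}
    print(round(combine_feature_probabilities(probs, weights), 3))
```

Giving a feature a weight of zero removes its influence on the result in this sketch.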
Training and Tuning
The Neural Network is trained on historical data to maximize the prediction rate and to determine the optimal feature weights.
To further increase prediction accuracy, the Neural Network is trained separately for different surfaces, resulting in different feature weights per surface.
During training, some neurons are found to be useless and are removed from the network, so about 40 neurons remain.
Primary and Secondary Probability Contributors
Elo Ratings (overall, by surface and by set) are the primary contributors to the match prediction, followed by recent form, H2H and winning percentages.
Elo Rating neurons individually give high prediction rates, but when they are combined with recent form, H2H and various winning percentages, the prediction accuracy increases even further.
However, the importance of the secondary contributors is very surface dependent. For example, on grass, recent form is pretty much irrelevant because the momentum of form is often disturbed by the surface adaptation and because the grass season is relatively short.
Instead, winning percentages are more important on grass than on other surfaces.
Mental Toughness Rating
Mental Toughness Rating compares players in pressure situations: Mental Toughness Rating = Mental Points won / Mental Points lost
Mental Points
Mental Points are weighted pressure situations: Mental Point = 2 * Best-of-3 Deciding Set + 4 * Best-of-5 Deciding Set + 2 * Final Match + Non-Deciding Set Tie-Break + 2 * Deciding Set Tie-Break
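The two formulas above translate directly into code. The following is a minimal sketch: the `PressureCounts` record, its field names and the sample counts are hypothetical, and raising an error when no Mental Points were lost is an assumption, since the text does not say how that case is handled.

```python
from dataclasses import dataclass


@dataclass
class PressureCounts:
    """Counts of pressure situations (hypothetical record layout)."""
    bo3_deciding_sets: int = 0
    bo5_deciding_sets: int = 0
    finals: int = 0
    non_deciding_tie_breaks: int = 0
    deciding_tie_breaks: int = 0


def mental_points(c: PressureCounts) -> int:
    """Apply the weights from the Mental Points formula."""
    return (2 * c.bo3_deciding_sets
            + 4 * c.bo5_deciding_sets
            + 2 * c.finals
            + c.non_deciding_tie_breaks
            + 2 * c.deciding_tie_breaks)


def mental_toughness_rating(won: PressureCounts, lost: PressureCounts) -> float:
    """Mental Points won divided by Mental Points lost."""
    lost_points = mental_points(lost)
    if lost_points == 0:
        # Assumption: the source does not define the rating in this case.
        raise ValueError("rating is undefined when no Mental Points were lost")
    return mental_points(won) / lost_points


if __name__ == "__main__":
    # Invented sample counts for illustration.
    won = PressureCounts(10, 5, 3, 20, 4)
    lost = PressureCounts(8, 4, 2, 15, 3)
    print(f"{mental_toughness_rating(won, lost):.3f}")
```

A rating above 1 in this sketch means the player has won more weighted pressure situations than they have lost.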