About Tournament Forecast
Tournament Forecast is driven by individual Match Prediction. In each round, the probability of each match in the draw is calculated using the Neural Network Match Prediction Algorithm.
Based on these probabilities, the chances of each possible matchup in the later tournament rounds are calculated. Finally, the probability of a player winning the title is calculated as the product of the probabilities of winning in each of the rounds.
For later rounds, such as the semi-final or final, there are many potential opponents, and the probability of the player beating each of them is calculated.
For example, the probability of winning the title depends on the probability of the player reaching the final and on the probabilities of all players in the other half of the draw reaching the final, each multiplied by the probability of the player beating that opponent in the final.
Tracking Tournament Progress
As a tournament progresses, the outcomes of some matches become known, so the match probabilities are set to 100% for the winner and 0% for the loser.
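One way to picture this step is as a thin wrapper around the match model: finished matches return a fixed 1.0 or 0.0, and every other match falls through to the prediction. This is only a minimal sketch; `with_known_results`, the `p_match` callback and the `winners` mapping are hypothetical names introduced for illustration, not part of the actual implementation.

```python
from typing import Callable, Dict, FrozenSet

MatchModel = Callable[[str, str], float]

def with_known_results(p_match: MatchModel,
                       winners: Dict[FrozenSet[str], str]) -> MatchModel:
    """Wrap a match model so that finished matches return 1.0 or 0.0.

    p_match -- p_match(a, b) is the predicted probability that a beats b
    winners -- maps frozenset({a, b}) to the name of the player who won
    """
    def p(a: str, b: str) -> float:
        winner = winners.get(frozenset((a, b)))
        if winner is None:
            return p_match(a, b)          # not played yet: use the prediction
        return 1.0 if winner == a else 0.0  # played: outcome is certain
    return p

# Hypothetical usage: a flat 60% model, with one match already decided.
model = with_known_results(lambda a, b: 0.6,
                           {frozenset(("Player A", "Player B")): "Player B"})
print(model("Player A", "Player B"), model("Player B", "Player A"))
```

Keeping the known results outside the model means the same forecast calculation can be rerun unchanged after every round.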
Elo Ratings are recalculated after each round, and the Elo rating points earned or lost through wins and losses in all previous rounds (including the current round if the match is finished) are shown in brackets.
Sometimes the initial tournament draw is published with unknown qualifiers. The probability of a player beating an unknown qualifier is determined by a variation of the Match Prediction algorithm that uses the average Elo Rating and ATP ranking points of the qualifiers, as well as winning percentages vs qualifiers, overall and by surface, level, etc.
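The text does not say which Elo variant or K-factor is used, so the following is only a sketch of how points earned in a round could be computed. It assumes the classic Elo formula with a fixed K-factor of 32; the function names and the value of `k` are illustrative assumptions.

```python
from typing import Tuple

def elo_expected(r_a: float, r_b: float) -> float:
    """Classic Elo expected score of a player rated r_a against one rated r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def elo_update(r_winner: float, r_loser: float,
               k: float = 32.0) -> Tuple[float, float, float]:
    """Return (new winner rating, new loser rating, points transferred).

    k is an assumed constant K-factor; real rating systems often vary it.
    """
    delta = k * (1.0 - elo_expected(r_winner, r_loser))
    return r_winner + delta, r_loser - delta, delta

# Hypothetical usage: accumulate the points a player earns over two wins.
rating, earned = 2000.0, 0.0
for opponent_rating in (1800.0, 1950.0):
    rating, _, delta = elo_update(rating, opponent_rating)
    earned += delta
print(f"{rating:.0f} ({earned:+.0f})")
```

The running total in `earned` plays the role of the bracketed figure: the points gained or lost so far in the tournament.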
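Let $Pm_{A \text{ vs } B}$ denote the probability that player A wins a match against player B.
For first-round opponents A and B, this match probability directly gives the probability of reaching the second round: $Pr_{A}(R_2) = Pm_{A \text{ vs } B}$.
The probability of player A reaching the next round $R+1$ is calculated this way:
$$Pr_{A}(R+1) = Pr_{A}(R) \cdot \sum_{N=1}^{n} Pr_{N}(R) \cdot Pm_{A \text{ vs } N}$$
where the sum runs over the $n$ potential opponents $N$ of player A in round $R$.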
This means that the probability of player A reaching the next round R+1 is the probability of player A reaching round R multiplied by the weighted sum of the probabilities of player A beating each of his potential opponents in round R. The weight of each potential opponent is the probability of that opponent reaching round R.
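The recurrence above can be sketched in a few lines of Python. This is a minimal illustration, not the site's implementation: it assumes a full knockout draw whose size is a power of two (no byes), and the `forecast` function, the `p_match` callback and the sample strengths are hypothetical.

```python
from typing import Callable, Dict, List

def forecast(draw: List[str],
             p_match: Callable[[str, str], float]) -> List[Dict[str, float]]:
    """Propagate reach probabilities round by round through a knockout draw.

    draw    -- players in bracket order; length must be a power of two
    p_match -- p_match(a, b) is the probability that a beats b
    Returns a list whose entry k maps each player to the probability of
    reaching round k+1; the last entry is the probability of winning the title.
    """
    n = len(draw)
    assert n >= 2 and n & (n - 1) == 0, "draw size must be a power of two"
    reach = {p: 1.0 for p in draw}      # everyone is in round 1
    rounds = [reach]
    block = 1                           # size of the bracket section already won
    while block < n:
        nxt = {}
        for i, a in enumerate(draw):
            # Potential opponents in this round: the neighbouring bracket section.
            start = ((i // block) ^ 1) * block
            opponents = draw[start:start + block]
            # reach(R+1) = reach(R) * sum over N of reach_N(R) * p_match(a, N)
            nxt[a] = reach[a] * sum(reach[b] * p_match(a, b) for b in opponents)
        rounds.append(nxt)
        reach = nxt
        block *= 2
    return rounds

# Hypothetical usage with made-up strengths and a simple ratio model.
strength = {"A": 4.0, "B": 1.0, "C": 2.0, "D": 2.0}
model = lambda a, b: strength[a] / (strength[a] + strength[b])
title = forecast(["A", "B", "C", "D"], model)[-1]
print({player: round(p, 3) for player, p in title.items()})
```

As long as `p_match(a, b) + p_match(b, a) == 1` for every pair, the title probabilities in the last entry sum to 1, which is a convenient sanity check.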
About Tennis Crystal Ball Match Prediction Algorithm
Match Prediction is based on players' previous results and track records.
Previous results are analyzed by a Neural Network algorithm with ~60 neurons, each covering a different player feature, such as Elo Rating, Surface Elo Rating, ATP Points, Recent Form, Head-to-Head ratios and Winning Percentages varied by surface, tournament level, tournament, round, recency, match or set ratios, vs rank, vs hand, vs backhand, etc.
Match win probabilities given by each of the features (neurons) are then combined by the neural network using different weights.
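How the network actually combines the per-feature probabilities is not specified here. As a rough sketch, assuming the simplest possible scheme (a normalized weighted average), the combination could look like this; the `combine` function, the feature names and the weights are made up for illustration.

```python
from typing import Dict

def combine(feature_probs: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-feature win probabilities.

    feature_probs -- win probability suggested by each feature
    weights       -- non-negative weight of each feature (assumed values)
    """
    total = sum(weights[name] for name in feature_probs)
    if total <= 0:
        raise ValueError("weights of the supplied features must sum to a positive value")
    return sum(weights[name] * p for name, p in feature_probs.items()) / total

# Hypothetical usage: Elo weighted three times as heavily as head-to-head.
p = combine({"elo": 0.7, "h2h": 0.5}, {"elo": 3.0, "h2h": 1.0})
print(round(p, 2))
```

Normalizing by the total weight keeps the result between 0 and 1 whenever every input probability is in that range.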
Training and Tuning
The Neural Network is trained on historical data to maximize the prediction rate and to determine the optimal feature weights.
To further increase prediction accuracy, the Neural Network is trained separately for each surface, resulting in different feature weights per surface.
During training, some neurons are found to be useless and are removed from the network, so about 40 neurons remain.
Primary and Secondary Probability Contributors
Elo Ratings (overall, by surface and by set) are the primary contributors to the match prediction, followed by recent form, H2H and winning percentages.
Elo Rating neurons individually give high prediction rates, but combining them with recent form, H2H and various winning percentages increases the prediction accuracy even further.
However, the importance of the secondary contributors is very surface dependent. On grass, for example, recent form is largely irrelevant, because the momentum of form is often disturbed by the adaptation to the surface and because the grass season is relatively short. Instead, winning percentages are more important on grass than on other surfaces.