Singular Value Decomposition (SVD) Tutorial
Singular Value Decomposition (SVD) Tutorial - How does SVD work? PDF Print E-mail
Article Index
Singular Value Decomposition (SVD) Tutorial
How does SVD work?
Extending the SVD with More Factors
All Pages

Extending the SVD with More Factors

More complicated matrices cannot be completely predicted just by using one set of factors as we have done. In that case, we have to introduce a second set of factors to refine our predictions. To do that, we subtract our predicted scores from the actual scores, getting the residual scores. Then we find a second set of HoleDifficulty2 and PlayerAbility2 numbers that best predict the residual scores.

Rather than guessing HoleDifficulty and PlayerAbility factors and subtracting predicted scores, there exist powerful algorithms than can calculate SVD factorizations for you. Let's look at the actual scores from the first 9 holes of the 2007 Players Championship as played by Phil, Tiger, and Vijay.


Hole Par Phil Tiger Vijay
1 4 4 4 5
2 5 4 5 5
3 3 3 3 2
4 4 4 5 4
5 4 4 4 4
6 4 3 5 4
7 4 4 4 3
8 3 2 4 4
9 5 5 5 5

The 1-D SVD factorization of the scores is shown below. To make this example easier to understand, I have incorporated the ScaleFactor into the PlayerAbility and HoleDifficulty vectors so we can ignore the ScaleFactor for this example.


Phil Tiger Vijay
3.95 4.64 4.34
4.27 5.02 4.69
2.42 2.85 2.66
3.97 4.67 4.36
3.64 4.28 4.00
3.69 4.33 4.05
3.33 3.92 3.66
3.08 3.63 3.39
4.55 5.35 5.00
=
HoleDifficulty
4.34
4.69
2.66
4.36
4.00
4.05
3.66
3.39
5.00
*
PlayerAbility
Phil Tiger Vijay
0.91 1.07 1.00

Notice that the HoleDifficulty factor is almost the average of that hole for the 3 players. For example hole 5, where everyone scored 4, does have a factor of 4.00. However hole 6, where the average score is also 4, has a factor of 4.05 instead of 4.00. Similarly, the PlayerAbility is almost the percentage of par that the player achieved, For example Tiger shot 39 with par being 36, and 39/36 = 1.08 which is almost his PlayerAbility factor (for these 9 holes) of 1.07.

Why don't the hole averages and par percentages exactly match the 1-D SVD factors? The answer is that SVD further refines those numbers in a cycle. For example, we can start by assuming HoleDifficulty is the hole average and then ask what PlayerAbility best matches the scores, given those HoleDifficulty numbers? Once we have that answer we can go back and ask what HoleDifficulty best matches the scores given those PlayerAbility numbers? We keep iterating this way until we converge to a set of factors that best predict the score. SVD shortcuts this process and immediately give us the factors that we would have converged to if we carried out the process.

One very useful property of SVD is that it always finds the optimal set of factors that best predict the scores, according to the standard matrix similarity measure (the Frobenius norm). That is, if we use SVD to find the factors of a matrix, those are the best factors that can be found. This optimality property means that we don't have to wonder if a different set of numbers might predict scores better.

Now let's look at the difference between the actual scores and our 1-D approximation. A plus difference means that the actual score is higher than the predicted score, a minus difference means the actual score is lower than the prediction. For example, on the first hole Tiger got a 4 and the predicted score was 4.64 so we get 4 - 4.64 = -0.64. In other words, we must add -0.64 to our prediction to get the actual score.

Once these differences have been found, we can do the same thing again and predict these differences using the formula HoleDifficulty2 * PlayerAbility2. Since these factors are trying to predict the differences, they are the 2-D factors and we have put a 2 after their names (ex. HoleDifficulty2) to show they are the second set of factors.


Phil Tiger Vijay
0.05 -0.64 0.66
-0.28 -0.02 0.31
0.58 0.15 -0.66
0.03 0.33 -0.36
0.36 -0.28 0.00
-0.69 0.67 -0.05
0.67 0.08 -0.66
-1.08 0.37 0.61
0.45 -0.35 0.00
=
HoleDifficulty2
-0.18
-0.38
0.80
0.15
0.35
-0.67
0.89
-1.29
0.44
*
PlayerAbility2
Phil Tiger Vijay
0.82 -0.20 -0.53

There are some interesting observations we can make about these factors. Notice that hole 8 has the most significant HoleDifficulty2 factor (-1.29). That means that it is the hardest hole to predict. Indeed, it was the only hole on which none of the 3 players made par. It was especially hard to predict because it was the most difficult hole relative to par (HoleDifficulty - par) = (3.39 - 3) = 0.39, and yet Phil birdied it making his score more than a stroke below his predicted score (he scored 2 versus his predicted score of 3.08). Other holes that were hard to predict were holes 3 (0.80) and 7 (0.89) because Vijay beat Phil on those holes even though, in general, Phil was playing better.



 
Joomla Templates by Joomlashack