Latent Semantic Analysis (LSA) Tutorial

Part 4 - Clustering by Color

We can also turn the numbers into colors. For instance, here is a color display that corresponds to the first 3 dimensions of the Titles matrix that we showed above. It contains exactly the same information, except that blue shows negative numbers, red shows positive numbers, and numbers close to 0 are white. For example, Title 9, which is strongly positive in all 3 dimensions, is also strongly red in all 3 dimensions.
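The mapping from numbers to colors can be sketched in a few lines of Python. This is only an illustration of the idea, not the code used to produce the display: the function name `value_to_rgb` and the `max_abs` scaling parameter are assumptions introduced here, and `max_abs` is assumed to be greater than 0.

```python
def value_to_rgb(value, max_abs):
    """Map a number to an (r, g, b) tuple on a blue-white-red scale.

    Hypothetical helper: negative values fade from white to blue,
    positive values fade from white to red, and 0 is plain white.
    max_abs is the magnitude that should appear fully saturated.
    """
    # Strength of the color, clamped to the range 0..1.
    strength = min(abs(value) / max_abs, 1.0)
    # The two "other" channels fade from 255 (white) down to 0 (saturated).
    fade = round(255 * (1.0 - strength))
    if value >= 0:
        return (255, fade, fade)   # white -> red
    return (fade, fade, 255)       # white -> blue


print(value_to_rgb(0.0, 1.0))    # (255, 255, 255), white
print(value_to_rgb(1.0, 1.0))    # (255, 0, 0), full red
print(value_to_rgb(-1.0, 1.0))   # (0, 0, 255), full blue
```

A linear fade is just one reasonable choice; any diverging color scale centered on 0 conveys the same information.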

[Figure: Top 3 Dimensions of Book Titles]

We can use these colors to cluster the titles. We ignore the first dimension for clustering because every title is red in it, so it cannot separate one title from another. The second dimension splits the titles into two groups.

Dim2    Titles
red     6-7, 9
blue    1-5, 8

Using the third dimension, we can split each of these groups again the same way. For example, looking at the third dimension, title 6 is blue, but title 7 and title 9 are still red. Doing this for both groups, we end up with these 4 groups.

Dim2    Dim3    Titles
red     red     7, 9
red     blue    6
blue    red     2, 4-5, 8
blue    blue    1, 3
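The grouping procedure amounts to keying each title by the sign of its value in the chosen dimensions. Here is a minimal sketch in Python. The coordinates in `coords` are made-up placeholder values, NOT the actual numbers from the Titles matrix; only their signs were chosen to match the colors described above. The names `color` and `cluster_by_color` are also assumptions introduced for this example.

```python
from collections import defaultdict


def color(value):
    """Reduce a number to its color: red if positive, blue otherwise."""
    return "red" if value > 0 else "blue"


def cluster_by_color(coords, dims):
    """Group titles by their colors in the given (zero-based) dimensions."""
    groups = defaultdict(list)
    for title, values in coords.items():
        key = tuple(color(values[d]) for d in dims)
        groups[key].append(title)
    return dict(groups)


# HYPOTHETICAL (dim1, dim2, dim3) values per title number.
# Placeholders only: the signs follow the text, the magnitudes are invented.
coords = {
    1: (0.3, -0.4, -0.2),
    2: (0.4, -0.5, 0.3),
    3: (0.3, -0.3, -0.4),
    4: (0.5, -0.6, 0.2),
    5: (0.4, -0.4, 0.1),
    6: (0.3, 0.2, -0.5),
    7: (0.2, 0.3, 0.2),
    8: (0.3, -0.2, 0.4),
    9: (0.7, 0.8, 0.6),
}

# Index 0 is dimension 1, which we skip because every title is red there.
by_dim2 = cluster_by_color(coords, dims=(1,))
by_dim2_dim3 = cluster_by_color(coords, dims=(1, 2))

for key, titles in sorted(by_dim2_dim3.items()):
    print(key, titles)
```

With these placeholder values, `by_dim2` reproduces the two-group split and `by_dim2_dim3` reproduces the four groups in the table. Adding another dimension to `dims` would split the groups further in the same way.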

It’s interesting to compare this table with what we get when we graph the results in the next section.



 