This article continues the previous article on distance-based clustering using Cosine Similarity. Read the previous post first so that you can follow all the steps.
Using Euclidean Distance For Clustering
Follow Step 1 to Step 3 from the previous post.
Now we have to calculate the distance between each pair of these documents. There are two methods: Euclidean Distance and Cosine Similarity.
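Mathematically, Euclidean Distance (ED) can be represented as:
$$D(q, p) = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2}$$
where D is the distance between documents q and p, q_i and p_i are the values of the ith term for q and p in the Term Incidence Matrix, and n is the number of terms.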
For example, if we want to calculate the distance between documents d2 and d3, then:
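The actual rows for d2 and d3 come from the Term Incidence Matrix built in the previous post, which is not reproduced here. As a minimal sketch of the calculation, the Python snippet below uses made-up binary vectors for d2 and d3 (an assumption for illustration only, not the values from the previous post) and a hypothetical helper named euclidean_distance:

```python
import math


def euclidean_distance(q, p):
    """Return the Euclidean distance between two equal-length term vectors."""
    if len(q) != len(p):
        raise ValueError("vectors must have the same number of terms")
    # Sum the squared per-term differences, then take the square root.
    return math.sqrt(sum((qi - pi) ** 2 for qi, pi in zip(q, p)))


# Hypothetical rows of a term incidence matrix (1 = term present, 0 = absent).
# These are illustrative values, NOT the ones from the previous post.
d2 = [1, 0, 1, 1, 0]
d3 = [0, 0, 1, 0, 1]

distance = euclidean_distance(d2, d3)
print(f"D(d2, d3) = {distance:.4f}")  # prints "D(d2, d3) = 1.7321"
```

With these sample vectors the two documents differ in three terms, so the distance is the square root of 3. For binary incidence vectors in general, the squared distance equals the number of terms on which the two documents disagree.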