Clustering with Euclidean distance

This article continues the previous article on distance-based clustering using Cosine Similarity. Read the previous post first so you can follow all the steps.

Using Euclidean Distance For Clustering

Follow Step 1 to Step 3 from the previous post.

Now we have to calculate the distance between these documents. There are two methods: Euclidean Distance and Cosine Similarity.

Mathematically, Euclidean Distance (ED) can be represented as:

$$D(q, p) = \sqrt{\sum_{i=1}^{n} (q_i - p_i)^2}$$

Where D is the distance between documents q and p, q_i and p_i are the entries for the ith term of q and p in the Term Incidence Matrix, and n is the number of terms.
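As a minimal sketch, the distance can be computed in Python as the square root of the sum of squared term-by-term differences. The function name `euclidean_distance` and the two vectors `doc_a` and `doc_b` are hypothetical and chosen only for illustration; they are not the documents from the previous post.

```python
import math


def euclidean_distance(q, p):
    """Return the Euclidean distance between two equal-length term vectors."""
    if len(q) != len(p):
        raise ValueError("both documents must be described over the same terms")
    # Sum the squared difference for each term, then take the square root.
    return math.sqrt(sum((qi - pi) ** 2 for qi, pi in zip(q, p)))


# Hypothetical term incidence vectors (1 = term present, 0 = term absent).
# These are illustrative values, not rows of the matrix from the previous post.
doc_a = [1, 0, 1, 1, 0]
doc_b = [1, 1, 0, 1, 0]

distance = euclidean_distance(doc_a, doc_b)
print(f"Distance between doc_a and doc_b: {distance:.4f}")
```

With binary incidence vectors, the squared distance is simply the number of terms on which the two documents disagree, so identical documents get a distance of 0.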

For example, if we want to calculate distance between documents d2 and d3 then:
