Data Mining: Classification VS Clustering (cluster analysis)

For someone who is new to data mining, classification and clustering can seem similar because both kinds of algorithm essentially “divide” a dataset into sub-datasets. But there is a difference between them, and in this blog post we’ll see exactly what it is:

CLASSIFICATION
  • We have a training set containing data that has been previously categorized
  • Based on this training set, the algorithm finds the category that each new data point belongs to
Since a training set exists, we describe this technique as supervised learning.
Example: We use a training dataset in which customers are labeled as churned or not churned. Based on this training set, we can classify whether a new customer will churn or not.

CLUSTERING
  • We do not know in advance which characteristics make data points similar
  • Using statistical concepts, we split the dataset into sub-datasets such that each sub-dataset contains “similar” data
Since no training set is used, we describe this technique as unsupervised learning.
Example: We use a dataset of customers and split it into sub-datasets of customers with “similar” characteristics. This information can then be used to market a product to a specific customer segment identified by the clustering algorithm.
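To make the contrast concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy customer data (monthly spend, support calls), the function names, and the choice of a nearest-centroid classifier and a very simple k-means loop. These are just two of many possible algorithms, not the only way to do classification or clustering.

```python
from statistics import mean

# Hypothetical toy data: (monthly_spend, support_calls) per customer.
# The labels are what make this a training set (supervised case).
train = [
    ((20.0, 5.0), "churn"),
    ((25.0, 4.0), "churn"),
    ((80.0, 1.0), "stay"),
    ((90.0, 0.0), "stay"),
]

def centroid(points):
    """Mean point of a list of equal-length tuples."""
    return tuple(mean(axis) for axis in zip(*points))

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit_nearest_centroid(labelled):
    """Classification: learn one centroid per known category."""
    groups = {}
    for point, label in labelled:
        groups.setdefault(label, []).append(point)
    return {label: centroid(pts) for label, pts in groups.items()}

def classify(model, point):
    """Assign a new point to the category with the closest centroid."""
    return min(model, key=lambda label: sq_dist(model[label], point))

def kmeans(points, k, iterations=10):
    """Clustering: group unlabelled points, starting from the first k points."""
    centers = list(points[:k])
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(centers[i], p))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [centroid(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return clusters

# Supervised: the categories come from the labelled training set.
model = fit_nearest_centroid(train)
prediction = classify(model, (22.0, 6.0))

# Unsupervised: same customers, labels thrown away; groups are discovered.
unlabelled = [point for point, _ in train]
segments = kmeans(unlabelled, k=2)

print(prediction)
print(segments)
```

Notice that the classifier returns a named category it learned from the training set, while the clustering step only returns groups of similar customers. Giving those groups a meaning (for example, “low spend, many support calls”) is left to whoever looks at them.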

If you want to learn more about data mining, check out the free book in PDF format: “Mining of Massive Datasets”.
