What is Unsupervised Learning?
Unsupervised learning algorithms are used with the goal of recognizing patterns and/or structures in data or extracting information. They are therefore used descriptively and do not require "labeled" data sets. Thus, they are well suited for exploratory data analyses.
The three most commonly cited uses of unsupervised ML algorithms are clustering, association analysis, and dimension reduction.
The first and best-known application of unsupervised ML algorithms is clustering. The algorithm divides the data set into groups (clusters) based on the similarity of the data points and thus finds a structure in the data. An example is assigning apples, strawberries and bananas to three clusters based on characteristics such as shape and color. The algorithm does not know that these are different types of fruit; it simply assigns each fruit to a cluster based on similarity.
The second area of application for unsupervised ML algorithms is association analysis. In very simplified terms, the algorithm identifies objects that frequently occur together in a data set. A practical example is shopping cart analysis (also called market basket analysis), which examines which products are purchased together.
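To make the shopping cart idea concrete, here is a minimal sketch that counts how often each pair of products appears in the same basket. The baskets, the function name pair_support, and the 0.5 threshold are made up for illustration; real association analysis typically uses dedicated algorithms and additional measures beyond this simple co-occurrence share.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping carts, invented for this example.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "cereal"},
    {"milk", "bread"},
]


def pair_support(baskets):
    """Return the share of baskets in which each product pair occurs together."""
    counts = Counter()
    for basket in baskets:
        # Sort the items so each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}


support = pair_support(baskets)

# Keep only pairs that occur together in at least half of all baskets
# (the threshold is an arbitrary choice for this sketch).
frequent = {pair: s for pair, s in support.items() if s >= 0.5}
print(frequent)
```

In this toy data only bread and butter pass the threshold, because they share three of the five baskets.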
The last area of application of unsupervised ML algorithms is dimension reduction. The basic goal is to reduce the complexity of the data set, typically the number of features, while preserving as much of its information content as possible.
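As an illustration, the sketch below reduces two features to one by projecting the data onto the direction along which it varies most, which is the idea behind principal component analysis (PCA). PCA is not named in the text above; it is used here only as one common example of dimension reduction. The data and the function name reduce_2d_to_1d are assumptions, and the closed-form angle formula only applies to the two-dimensional case.

```python
import math


def reduce_2d_to_1d(points):
    """Project 2D points onto their direction of largest variance.

    Returns the unit direction, the 1D coordinates, and the share of
    the total variance that the single new coordinate retains.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 covariance matrix.
    sxx = sum(x * x for x, _ in centered) / n
    syy = sum(y * y for _, y in centered) / n
    sxy = sum(x * y for x, y in centered) / n

    # Angle of the principal axis (closed form for the 2x2 case).
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    ux, uy = math.cos(theta), math.sin(theta)

    # One coordinate per point: its position along the principal axis.
    scores = [x * ux + y * uy for x, y in centered]
    retained = (sum(s * s for s in scores) / n) / (sxx + syy)
    return (ux, uy), scores, retained


# Hypothetical data: two strongly correlated features.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
direction, scores, retained = reduce_2d_to_1d(data)
print(direction, scores, retained)
```

Because the two features in this toy data are almost perfectly correlated, a single coordinate retains nearly all of the variance. With weakly correlated features, more information would be lost.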
A well-known and commonly used unsupervised learning algorithm is K-Means, a clustering method that assigns each data point to the nearest of k cluster centers.
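The following is a minimal sketch of K-Means in plain Python, not a reference implementation. It alternates between assigning each point to its nearest cluster center and moving each center to the mean of its assigned points. The data points, the function names, and the deterministic "farthest-first" initialization are assumptions chosen to keep the example reproducible; library implementations usually use randomized initialization schemes instead.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def farthest_first_init(points, k):
    """Pick k starting centers: the first point, then repeatedly the point
    farthest from all centers chosen so far (an assumption of this sketch)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, max_iterations=100):
    centers = farthest_first_init(points, k)
    labels = []
    for _ in range(max_iterations):
        # Assignment step: each point gets the index of its nearest center.
        labels = [
            min(range(k), key=lambda i: squared_distance(p, centers[i]))
            for p in points
        ]
        # Update step: each center moves to the mean of its assigned points.
        new_centers = []
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            new_centers.append(mean_point(members) if members else centers[i])
        if new_centers == centers:  # no center moved: converged
            break
        centers = new_centers
    return centers, labels


# Hypothetical 2D data with three visibly separate groups.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
    (9.0, 1.0), (9.1, 1.2), (8.9, 0.9),
]
centers, labels = kmeans(points, k=3)
print(labels)
```

The algorithm only returns numeric cluster labels. What each cluster means, for example which fruit it corresponds to, is not part of the output.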
While unsupervised learning has many advantages, it also brings some challenges. On large data sets, the calculations can be time-consuming and resource-intensive. In addition, because there are no labels to compare against, it is difficult to trace and verify the results of unsupervised learning algorithms. Finally, the outputs always have to be interpreted by humans: the algorithm finds groups or patterns but does not say what they mean.