Cluster and Distance Measure
Segmentation; Unsupervised learning
Clustering is the assignment of objects to groups of similar objects (clusters). The objects are typically described as vectors of features (also called attributes). So if one has n attributes, object x is described as a vector (x1,..,xn). Attributes can be numerical (scalar) or categorical. The assignment can be hard, where each object belongs to one cluster, or fuzzy, where an object can belong to several clusters with a probability. The clusters can be overlapping, though typically they are disjoint. Fundamental in the clustering process is the use of a distance measure.
In the clustering setting, a distance (or equivalently a similarity) measure is a function that quantifies the similarity between two objects.