Unsupervised learning
In unsupervised learning we start with a data matrix, with one row per unit (sample) and one column per variable. Each variable can be:
Quantitative, e.g. weight, height, number of children, …;
Qualitative, e.g. college major, profession, gender, …
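As a rough illustration of such a data matrix, the sketch below stores a few units as rows and labels each variable as quantitative or qualitative. The values, the variable names, and the helper `variable_type` are all made up for this example; the rule "every value is a number" is only a simplifying assumption.

```python
# Hypothetical toy data matrix: each dict is one unit (row),
# each key is one variable (column). All values are invented.
data = [
    {"weight": 70.5, "height": 175.0, "children": 2, "major": "biology", "profession": "teacher"},
    {"weight": 82.0, "height": 181.5, "children": 0, "major": "physics", "profession": "engineer"},
    {"weight": 58.3, "height": 162.0, "children": 3, "major": "history", "profession": "lawyer"},
]


def variable_type(rows, name):
    """Label a variable 'quantitative' if every value is a number, else 'qualitative'."""
    values = [row[name] for row in rows]
    is_number = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "quantitative" if is_number else "qualitative"


types = {name: variable_type(data, name) for name in data[0]}
for name, kind in types.items():
    print(f"{name}: {kind}")
```

This rule would mislabel a qualitative variable that happens to be coded with numbers (a postal code, say), so in practice the type usually has to be declared rather than inferred.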
Goals of unsupervised learning#
Given such a data matrix, our goal is to:
Find meaningful relationships between the variables or units, e.g. by correlation analysis.
Find interpretable low-dimensional representations of the data that make it easy to visualize the variables and units, e.g. with PCA, ICA, Isomap, or locally linear embedding.
Find meaningful groupings of the data, i.e. clustering.
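The three goals above can be sketched on a tiny, made-up data matrix using only the Python standard library. This is an illustrative sketch, not a recommended implementation: the data are invented, the first principal component is approximated by power iteration on the sample covariance matrix, and the grouping step is a bare-bones 2-means run on the one-dimensional scores. All function names are hypothetical.

```python
import math

# Invented data matrix: 6 units (rows) by 3 variables (columns).
X = [
    [1.0, 2.1, 0.5],
    [2.0, 3.9, 0.4],
    [3.0, 6.2, 0.6],
    [8.0, 15.8, 0.5],
    [9.0, 18.1, 0.4],
    [10.0, 20.2, 0.6],
]


def column(X, j):
    return [row[j] for row in X]


def pearson(a, b):
    """Pearson correlation between two variables (goal 1)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def first_principal_component(X, iters=100):
    """Leading PCA direction and 1-D scores via power iteration (goal 2)."""
    n, p = len(X), len(X[0])
    means = [sum(column(X, j)) / n for j in range(p)]
    Z = [[row[j] - means[j] for j in range(p)] for row in X]  # centered data
    # Sample covariance matrix (p x p).
    C = [[sum(z[j] * z[k] for z in Z) / (n - 1) for k in range(p)] for j in range(p)]
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(C[j][k] * v[k] for k in range(p)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(z[j] * v[j] for j in range(p)) for z in Z]
    return v, scores


def two_means(points, iters=10):
    """Minimal 2-means clustering of 1-D points (goal 3)."""
    centers = [points[0], points[-1]]  # simplistic initialization
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1 for p in points
        ]
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels


r01 = pearson(column(X, 0), column(X, 1))  # strongly related variables
r02 = pearson(column(X, 0), column(X, 2))  # weakly related variables
v, scores = first_principal_component(X)
labels = two_means(scores)

print("corr(var0, var1):", round(r01, 3))
print("corr(var0, var2):", round(r02, 3))
print("first PC direction:", [round(x, 3) for x in v])
print("cluster labels:", labels)
```

In this toy example the first two variables move together, so a single direction captures most of the variation, and the one-dimensional scores separate the first three units from the last three.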
Unsupervised learning is sometimes referred to in Statistics as exploratory data analysis.
Striking example#
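1387 European subjects were genotyped, and the genetic differences between them (SNPs) were measured.
These differences can be used to form a distance between subjects.
When the subjects are plotted according to this distance, the picture looks surprisingly close to a map of Europe.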
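To make the idea of a distance between subjects concrete, here is a minimal sketch on invented genotypes. It assumes each SNP is coded as an allele count (0, 1, or 2) and uses the mean absolute difference across SNPs as the distance. Both the coding and the distance are assumptions for illustration; the notes do not say which distance was used in the study. The subject labels and the helper `genetic_distance` are hypothetical.

```python
# Invented genotypes: 4 subjects, 6 SNPs, each coded as an allele count 0/1/2.
genotypes = {
    "A1": [0, 1, 2, 0, 1, 0],
    "A2": [0, 1, 2, 0, 0, 0],
    "B1": [2, 0, 0, 1, 2, 2],
    "B2": [2, 0, 0, 2, 2, 2],
}


def genetic_distance(g1, g2):
    """Mean absolute difference in allele counts (an assumed, simple choice)."""
    return sum(abs(a - b) for a, b in zip(g1, g2)) / len(g1)


names = list(genotypes)
# Pairwise distance matrix between subjects.
D = [[genetic_distance(genotypes[a], genotypes[b]) for b in names] for a in names]

for name, row in zip(names, D):
    print(name, [round(d, 2) for d in row])
```

A matrix like `D` is the kind of input that a low-dimensional embedding method would take in order to place subjects on a two-dimensional plot.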