Supervised and Unsupervised Learning
In supervised learning: we have access to $n$ features
$x_1, x_2, \ldots, x_n$ measured on $m$ observations. The goal is to
predict an associated response variable $y$ (also measured on those
$m$ observations) using $x_1, x_2, \ldots, x_n$. We write $\hat{y}$
for the predicted value of $y$.
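A minimal sketch of the supervised setting, assuming simulated data and ordinary least squares as the predictor (neither is prescribed by the slide; the coefficients in `true_w` and the 150/50 split are hypothetical choices for illustration). It fits on some observations and predicts the response on the rest.

```python
import numpy as np

rng = np.random.default_rng(0)

m, n = 200, 3                          # m observations, n features
X = rng.normal(size=(m, n))            # row i holds x_1, ..., x_n for observation i
true_w = np.array([1.5, -2.0, 0.5])    # hypothetical coefficients used to simulate y
y = X @ true_w + 0.1 * rng.normal(size=m)   # observed response with small noise

# Hold out the last 50 observations to check the predictions.
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

# Ordinary least squares with an intercept column.
A_train = np.column_stack([np.ones(len(X_train)), X_train])
coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

# Predicted response y_hat on the held-out observations.
A_test = np.column_stack([np.ones(len(X_test)), X_test])
y_hat = A_test @ coef

mse = np.mean((y_test - y_hat) ** 2)
print(f"held-out mean squared error: {mse:.4f}")
```

The key point is that the response `y` is available during fitting, so the quality of `y_hat` can be measured directly against held-out values.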
In unsupervised learning: we only have access to
$x_1, x_2, \ldots, x_n$ measured on $m$ observations; there is no
response variable. The goal is to discover patterns and interesting
structure in the measurements of $x_1, x_2, \ldots, x_n$. For
instance:
How can we visualise the data effectively?
Can we find subgroups among the observations?
Can we find subgroups among the features and use them for
dimensionality reduction?
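The questions above can be sketched on simulated data. This is an illustration under stated assumptions, not a prescribed method: principal component analysis (via the SVD) stands in for visualisation and dimensionality reduction, and a basic k-means loop stands in for finding subgroups among the observations. The two-group data set and the helper `kmeans` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

m, n = 100, 5                          # m observations, n features
# Simulated data: two groups of 50 observations with different means.
X = np.vstack([
    rng.normal(loc=0.0, size=(m // 2, n)),
    rng.normal(loc=4.0, size=(m // 2, n)),
])

# Dimensionality reduction / visualisation: PCA via SVD of the centred data.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T                      # (m, 2) scores on the first two components
explained = S**2 / np.sum(S**2)        # share of variance per component


def kmeans(X, k, n_iter=20, seed=0):
    """Basic Lloyd's algorithm: alternate assignment and centre updates."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every observation to every centre, shape (m, k).
        dist = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):    # leave an empty cluster's centre unchanged
                centres[j] = X[labels == j].mean(axis=0)
    return labels, centres


# Subgroups among the observations: no response variable is used.
labels, centres = kmeans(X, k=2)

print("variance share of first two components:", explained[:2].round(3))
print("cluster sizes:", np.bincount(labels, minlength=2))
```

Plotting the two columns of `Z` against each other, coloured by `labels`, gives a two-dimensional view of the five-dimensional data.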