Here I summarize some problem types explained in a course taught by Alex Smola.
I must say again that I'm writing about things I'm learning. I do not completely understand some of these topics.
1. Binary Classification:
It's probably the most frequently studied problem in machine learning. Given a pattern $x$ drawn from a domain $\mathcal{X}$, estimate which value an associated binary random variable $y \in \{\pm1\}$ will assume.
2. Multiclass Classification:
It's an extension of binary classification. The difference is that now $y$ can take one of $n$ values, $y \in \{1, \ldots, n\}$.
3. Structured Estimation:
It goes beyond simple multiclass estimation by assuming that the labels $y$ have some additional structure that can be used in the estimation process. This is still unclear to me and may be explained in another post.
4. Regression:
Here the goal is to estimate a real-valued variable $y \in \mathbb{R}$ given a pattern $x$.
5. Novelty Detection:
It describes the problem of determining "unusual" observations given a set of past measurements. This is also still unclear to me and may be explained in another post.
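To make the last item a bit more concrete for myself, here is a minimal sketch of one possible reading of novelty detection. It assumes that "unusual" means "more than `k` sample standard deviations away from the mean of the past measurements". That rule, the function name `is_novel`, the threshold `k`, and the toy data are all my own assumptions, not the definition used in the course.

```python
from statistics import mean, stdev


def is_novel(past, x, k=3.0):
    """Return True if x lies more than k sample standard deviations
    from the mean of the past measurements (an assumed, simple rule)."""
    mu = mean(past)
    sigma = stdev(past)  # sample standard deviation; needs at least two points
    if sigma == 0:
        # All past measurements are identical: anything different is unusual.
        return x != mu
    return abs(x - mu) > k * sigma


# Hypothetical past measurements, clustered around 10.
past_measurements = [9.8, 10.1, 10.0, 9.9, 10.2]

print(is_novel(past_measurements, 10.05))  # close to the past values
print(is_novel(past_measurements, 14.0))   # far from the past values
```

Note that no labels $y$ are involved here: the rule is built only from past observations, which is what seems to separate this problem from classification and regression.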
Now some learning-related concepts:
We may be given a sequence of $(x^{(i)}, y^{(i)})$ pairs in which each $y^{(i)}$ has to be estimated right away, as soon as $x^{(i)}$ arrives. This is called online learning.
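A minimal sketch of what such an online loop could look like, assuming binary labels $y \in \{\pm1\}$ and using the classic perceptron update as the learner. The perceptron is my own choice for illustration, and the function name `online_perceptron` and the toy stream are hypothetical. The point is only the order of operations: predict first, then see the true label, then update.

```python
def online_perceptron(stream, dim):
    """Process (x, y) pairs one at a time, with y equal to +1 or -1.

    For each pair: predict from the current weights, then compare with
    the true label and update only if the prediction was wrong.
    Returns the final weights, the final bias, and the mistake count.
    """
    w = [0.0] * dim
    b = 0.0
    mistakes = 0
    for x, y in stream:
        # Estimate y before the true label is used.
        score = sum(wj * xj for wj, xj in zip(w, x)) + b
        y_hat = 1 if score > 0 else -1
        # The true label is now available: update on a mistake.
        if y_hat != y:
            mistakes += 1
            w = [wj + y * xj for wj, xj in zip(w, x)]
            b += y
    return w, b, mistakes


# Hypothetical stream of (pattern, label) pairs.
stream = [
    ((2.0, 1.0), 1),
    ((-1.0, -2.0), -1),
    ((1.5, 0.5), 1),
    ((-2.0, -1.0), -1),
]
w, b, mistakes = online_perceptron(stream, dim=2)
print(w, b, mistakes)
```

The model is never fitted on the whole dataset at once. It only ever sees one pair at a time, which is the contrast with the batch setting.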
1. Offline/Batch Learning
Here there are two collections
\[
\begin{aligned}
X &:= \{x^{(1)}, x^{(2)}, \ldots, x^{(m)}\}\\
Y &:= \{y^{(1)}, y^{(2)}, \ldots, y^{(m)}\}
\end{aligned}
\]
which are used to estimate $y$ for a set of unseen data
$X' = \{x'^{(1)}, x'^{(2)}, \ldots, x'^{(m')}\}$
This is called induction.
If we are allowed to know $X'$ already at the time of constructing the model, this is commonly referred to as transduction.
2. Active Learning
This is the setting where we are allowed to choose $X$ ourselves for the purpose of model building.
3. Learning/Estimation with Missing Variables
This is the setting where we don't have full information about $X$, e.g., some of the coordinates of the $x^{(i)}$ might be missing.
4. Covariate Shift Correction
This is the setting where the sets $X$ and $X'$ come from different data sources.
5. Co-training
This is the setting where we are given observations stemming from two problems at the same time, with the side information that both problems are somehow related.
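Going back to the batch setting in item 1, here is a minimal sketch of induction: a model is built from $X$ and $Y$ only, and is then used to estimate $y$ for the unseen $X'$. I use a 1-nearest-neighbour rule with squared Euclidean distance purely as an illustration. The function name `predict_1nn`, the variable `X_new` (standing in for $X'$), and the toy data are my own assumptions.

```python
def predict_1nn(X, Y, X_new):
    """For each pattern in X_new, return the label of its nearest
    neighbour in X (squared Euclidean distance, first match on ties)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    predictions = []
    for x_new in X_new:
        # Index of the training pattern closest to x_new.
        nearest = min(range(len(X)), key=lambda i: sq_dist(X[i], x_new))
        predictions.append(Y[nearest])
    return predictions


# Hypothetical training collections X and Y, with labels in {-1, +1}.
X = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
Y = [-1, -1, 1, 1]

# Hypothetical unseen data X'.
X_new = [(0.5, 0.5), (5.5, 4.0)]

print(predict_1nn(X, Y, X_new))
```

Here `X_new` plays no role in building the model, which is what makes this induction. In transduction, `X_new` would already be available while the model is being constructed.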