The risk of an optimal model describes an empirical method for determining the optimal model \( \htmlId{tooltip-optimalModel}{\hat{f}} \) for a given problem. It accomplishes this task by evaluating all candidate models \( \htmlId{tooltip-model}{h} \) from the hypothesis space \( \htmlId{tooltip-hypothesisSpace}{\mathcal{H}} \) on a sampled dataset and selecting the model with the minimum risk.
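As a reading aid, the selection rule described above can be sketched in symbols. This form assumes that the risk being minimized is the empirical risk \( R^{emp} \) computed on the sampled dataset, using the notation of the worked example further down the page; it is a standard way of writing such a rule and is not necessarily the exact equation intended by the page.

\[ \hat{f} = \underset{h \in \mathcal{H}}{\operatorname{argmin}} \; R^{emp}(h) \]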
Symbol | Description |
\(\hat{f}\) | This symbol denotes the optimal model for a problem. |
\(y\) | This symbol stands for the ground truth of a sample. In supervised learning this is often paired with the corresponding input. |
\(\mathcal{H}\) | This is the symbol representing the set of possible models. |
\(h\) | This symbol denotes a model in machine learning. |
\(u\) | This symbol denotes the input of a model. |
The symbol \( \mathcal{H} \) denotes the set of possible models, often from a particular class like "polynomials of any degree" or "multi-layer perceptron networks". For any learning algorithm, \( \mathcal{H} \) indicates the space where an optimal model may be found.
The symbol \(\hat{f}\) denotes the optimal model for a problem. Among all models in the hypothesis space, it yields the lowest risk \( \htmlId{tooltip-risk}{R} \) over pairs of inputs and outputs. The goal of machine learning is to adjust \( \htmlId{tooltip-model}{h} \) until it matches \(\hat{f}\), or comes as close to it as possible.
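The idea of evaluating candidate models on sampled input-output pairs and keeping the one with the lowest risk can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the tiny dataset, the three candidate models standing in for the hypothesis space, the squared-error loss, and the definition of empirical risk as the average loss over the samples.

```python
from statistics import mean

# Hypothetical sampled dataset of (input u, ground truth y) pairs.
samples = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

# A small, finite stand-in for the hypothesis space H (illustrative only;
# real hypothesis spaces are usually far larger or infinite).
candidates = {
    "h_1": lambda u: 1.0 * u,
    "h_2": lambda u: 2.0 * u,
    "h_3": lambda u: 3.0 * u,
}


def squared_loss(prediction, target):
    # Assumed loss function: squared difference between output and ground truth.
    return (prediction - target) ** 2


def empirical_risk(h, data):
    # Assumed definition: average loss of model h over the sampled (u, y) pairs.
    return mean(squared_loss(h(u), y) for u, y in data)


# Evaluate every candidate and keep the one with the minimum empirical risk.
risks = {name: empirical_risk(h, samples) for name, h in candidates.items()}
best_name = min(risks, key=risks.get)
```

Here `best_name` holds the name of the candidate whose empirical risk is smallest on this dataset. With a different loss function or a different sample, another candidate could be selected.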
Suppose we have the following models, with their empirical risks calculated on an arbitrary dataset of samples:
\[\begin{align*}\htmlId{tooltip-risk}{R}^{emp}(\htmlId{tooltip-model}{h}_1) &= 3 \\\htmlId{tooltip-risk}{R}^{emp}(\htmlId{tooltip-model}{h}_2) &= 2.3 \\\htmlId{tooltip-risk}{R}^{emp}(\htmlId{tooltip-model}{h}_3) &= 6\end{align*}\]
Applying the selection rule described above, we observe that the optimal model \( \htmlId{tooltip-optimalModel}{\hat{f}} \) is the model \( \htmlId{tooltip-model}{h} \) with the lowest empirical risk.
Since 2.3 is the smallest of the three values, we obtain \( \htmlId{tooltip-optimalModel}{\hat{f}} = \htmlId{tooltip-model}{h}_2 \).
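The same selection can be reproduced in a few lines of Python. This is only a sketch: the dictionary keys are hypothetical labels for the three models, and the values are the empirical risks from the example.

```python
# Empirical risks from the worked example, keyed by hypothetical model labels.
empirical_risks = {"h_1": 3.0, "h_2": 2.3, "h_3": 6.0}

# Select the label whose empirical risk is smallest.
f_hat = min(empirical_risks, key=empirical_risks.get)

print(f_hat)  # h_2
```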