Bias-variance Decomposition
The notation used is as follows:
Symbol | Meaning |
---|---|
\(\mathcal D\) | the dataset |
\(x\) | a sample |
\(y_\mathcal D\) | the observed value of \(x\) in \(\mathcal D\), affected by noise |
\(y\) | the true value of \(x\) |
\(\bar y\) | the mean of the true values |
\(f\) | the model learned from \(\mathcal D\) |
\(f(x)\) | the prediction of \(f\) for \(x\) |
\(\bar f(x)\) | the expected prediction of \(f\) for \(x\) |
\(l(f(x), y_\mathcal D)\) | the loss function, chosen to be squared error |
Assuming that the observation noise averages to \(0\), i.e. \(E[y - y_\mathcal D] = 0\), and that the two factors in each cross term are uncorrelated, so that the expectation of their product factorizes, the expected error is
\[
\begin{aligned}
E_{x \sim \mathcal D}&[l(f(x), y_\mathcal D)] = E[(f(x) - y_\mathcal D)^2] = E\{[(f(x) - \bar f(x)) + (\bar f(x) - y_\mathcal D)]^2\} \\
&= E[(f(x) - \bar f(x))^2] + E[(\bar f(x) - y_\mathcal D)^2] + 2E[(f(x) - \bar f(x))(\bar f(x) - y_\mathcal D)] \\
&= E[(f(x) - \bar f(x))^2] + E\{[(\bar f(x) - y) + (y - y_\mathcal D)]^2\} \\
&\quad + 2\underbrace{E[f(x) - \bar f(x)]}_{0} E[\bar f(x) - y_\mathcal D] \\
&= E[(f(x) - \bar f(x))^2] + E[(\bar f(x) - y)^2] + E[(y - y_\mathcal D)^2] + 2E[(\bar f(x) - y)(y - y_\mathcal D)] \\
&= E[(f(x) - \bar f(x))^2] + E[(y - y_\mathcal D)^2] + E\{[(\bar f(x) - \bar y) + (\bar y - y)]^2\} \\
&\quad + 2E[\bar f(x) - y] \underbrace{E[y - y_\mathcal D]}_{0} \\
&= E[(f(x) - \bar f(x))^2] + E[(y - y_\mathcal D)^2] + E\{[(\bar f(x) - \bar y) + (\bar y - y)]^2\} \\
&= E[(f(x) - \bar f(x))^2] + E[(y - y_\mathcal D)^2] + E[(\bar f(x) - \bar y)^2] + E[(\bar y - y)^2] + 2E[(\bar f(x) - \bar y)(\bar y - y)] \\
&= E[(f(x) - \bar f(x))^2] + E[(y - y_\mathcal D)^2] + E[(\bar f(x) - \bar y)^2] + E[(\bar y - y)^2] \\
&\quad + 2E[\bar f(x) - \bar y]\underbrace{E[\bar y - y]}_{0} \\
&= \underbrace{E[(f(x) - \bar f(x))^2]}_{\text{variance}} + \underbrace{E[(\bar f(x) - \bar y)^2]}_{\text{bias}^2} + \underbrace{E[(y - y_\mathcal D)^2]}_{\text{noise}} + \underbrace{E[(\bar y - y)^2]}_{\text{scatter}}
\end{aligned}
\]

5 ways to achieve right balance of Bias and Variance in ML model | by Niwratti Kasture | Analytics Vidhya | Medium
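The decomposition can be checked numerically. The sketch below is only an illustration under assumptions that are not in the text above: the true function is taken to be \(\sin\), the model is a polynomial fitted with NumPy, the noise is Gaussian with zero mean, and the function name `decompose` and all of its parameters are hypothetical. It checks the intermediate three-term form of the derivation, \(E[(f(x) - \bar f(x))^2] + E[(\bar f(x) - y)^2] + E[(y - y_\mathcal D)^2]\), i.e. the line before the middle term is split further around \(\bar y\).

```python
import numpy as np


def decompose(degree=3, n_train=30, n_trials=2000, noise_std=0.3, seed=0):
    """Estimate total error, variance, squared bias and noise by simulation.

    Assumptions (illustrative only): true function sin(x), polynomial model,
    zero-mean Gaussian observation noise.
    """
    rng = np.random.default_rng(seed)
    true_fn = np.sin                        # assumed true value y
    x_test = np.linspace(0.0, np.pi, 50)    # fixed samples x to evaluate at
    y_true = true_fn(x_test)

    preds = np.empty((n_trials, x_test.size))   # f(x) for each dataset D
    y_obs = np.empty((n_trials, x_test.size))   # noisy observations y_D

    for t in range(n_trials):
        # Draw a fresh noisy training set D and learn f from it.
        x_train = rng.uniform(0.0, np.pi, n_train)
        y_train = true_fn(x_train) + rng.normal(0.0, noise_std, n_train)
        coeffs = np.polyfit(x_train, y_train, degree)
        preds[t] = np.polyval(coeffs, x_test)
        # Independent noisy observations at the evaluation samples.
        y_obs[t] = y_true + rng.normal(0.0, noise_std, x_test.size)

    f_bar = preds.mean(axis=0)                  # \bar f(x)

    total = np.mean((preds - y_obs) ** 2)       # E[(f(x) - y_D)^2]
    variance = np.mean((preds - f_bar) ** 2)    # E[(f(x) - \bar f(x))^2]
    bias_sq = np.mean((f_bar - y_true) ** 2)    # E[(\bar f(x) - y)^2]
    noise = np.mean((y_true - y_obs) ** 2)      # E[(y - y_D)^2]
    return total, variance, bias_sq, noise


if __name__ == "__main__":
    total, variance, bias_sq, noise = decompose()
    print("total error          :", total)
    print("variance+bias^2+noise:", variance + bias_sq + noise)
```

The two printed numbers should agree up to sampling error, because the cross terms only vanish in expectation. Changing `degree` shifts error between the variance and squared-bias terms, while the noise term stays near `noise_std ** 2`.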