MSE is a commonly used error metric. But is it justified on principled grounds? In this post we show that minimising the mean-squared error (MSE) is not just vaguely intuitive, but emerges from maximising the likelihood under a linear Gaussian model. Defining the terms. Linear Gaussian model: assume the data is described by the linear model $y = \mathbf{w}^\top \mathbf{x} + \epsilon$, where $\epsilon \sim \mathcal{N}(0, \sigma^2)$. Assume …
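To make the claim concrete, here is a minimal numerical sketch rather than the post's own derivation. It assumes a one-dimensional model $y = wx + \epsilon$ with Gaussian noise of known standard deviation; the names `true_w`, `sigma`, and the grid of candidate weights are illustrative choices. Under these assumptions the negative log-likelihood is a constant plus a positive multiple of the MSE, so both are minimised by the same weight.

```python
import math
import random


def mse(w, xs, ys):
    """Mean squared error of the 1-D linear predictor y_hat = w * x."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gaussian_nll(w, xs, ys, sigma):
    """Negative log-likelihood under y = w * x + eps, eps ~ N(0, sigma^2)."""
    n = len(xs)
    const = 0.5 * n * math.log(2 * math.pi * sigma ** 2)  # does not depend on w
    squared_error = sum((y - w * x) ** 2 for x, y in zip(xs, ys))
    return const + squared_error / (2 * sigma ** 2)


# Synthetic data from the assumed model (values chosen only for illustration).
random.seed(0)
true_w, sigma = 2.0, 0.5
xs = [random.uniform(-1, 1) for _ in range(200)]
ys = [true_w * x + random.gauss(0, sigma) for x in xs]

# Search the same grid of candidate weights with both criteria.
candidates = [i / 100 for i in range(0, 401)]  # w in [0, 4]
w_mse = min(candidates, key=lambda w: mse(w, xs, ys))
w_mle = min(candidates, key=lambda w: gaussian_nll(w, xs, ys, sigma))

print("argmin MSE:", w_mse)
print("argmin NLL:", w_mle)
```

The noise level `sigma` only rescales and shifts the objective, which is why it has no effect on the minimising weight.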

## Maximum Likelihood as minimising KL Divergence

Sometimes you come across connections that are simple and beautiful. Here’s one of them! What the terms mean. Maximum likelihood is a common approach to estimating the parameters of a model. An example of model parameters could be the coefficients $\mathbf{w}$ in a linear regression model $y = \mathbf{w}^\top \mathbf{x} + \epsilon$, where $\epsilon$ is Gaussian noise (i.e. it’s random). Here we choose parameter values that maximise the …
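One standard way to see the connection, which may differ from the post's own derivation, is to compare the average negative log-likelihood with the KL divergence from the empirical data distribution to the model. The sketch below assumes a Bernoulli (coin-flip) model with parameter `theta` and a small made-up dataset; all names are illustrative. The average negative log-likelihood equals the entropy of the empirical distribution plus the KL divergence, and the entropy does not depend on `theta`, so the two criteria pick the same parameter.

```python
import math


def avg_nll(theta, data):
    """Average negative log-likelihood of 0/1 data under Bernoulli(theta)."""
    return -sum(math.log(theta if x == 1 else 1 - theta) for x in data) / len(data)


def kl_bernoulli(p, q):
    """KL( Bernoulli(p) || Bernoulli(q) ), using the convention 0 * log 0 = 0."""
    total = 0.0
    for p_i, q_i in ((p, q), (1 - p, 1 - q)):
        if p_i > 0:
            total += p_i * math.log(p_i / q_i)
    return total


def entropy_bernoulli(p):
    """Entropy of Bernoulli(p) in nats."""
    return -sum(p_i * math.log(p_i) for p_i in (p, 1 - p) if p_i > 0)


# Made-up observations: 7 ones out of 10 flips.
data = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
p_hat = sum(data) / len(data)  # empirical frequency of ones

# Candidate parameters strictly inside (0, 1) so every log is finite.
thetas = [i / 100 for i in range(1, 100)]
theta_mle = min(thetas, key=lambda t: avg_nll(t, data))
theta_kl = min(thetas, key=lambda t: kl_bernoulli(p_hat, t))

print("maximum likelihood estimate:", theta_mle)
print("KL-minimising parameter:   ", theta_kl)
```

Both searches land on the empirical frequency `p_hat`, where the KL divergence is zero.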