I’m just taking Andrew Ng’s Machine Learning course on Coursera, and in the Normal Equation lesson I encountered a matrix formula to compute **regression coefficients** without any explanation of how to arrive at it.

There are blogs on the Internet that prove this formula in full detail; here I just want to share an easy and handy way to explain, and to memorize, it.

So we have a matrix $latex X$ as the **design matrix** (one row per training example), and $latex Y$ is the output vector of size m. We want to find the parameter vector $latex \Theta$ so that:

$latex X \Theta = Y$

All we want to do is isolate $latex \Theta$ on the left side (just like we isolate x when solving an equation). To do that, we want to “bring” whatever multiplies $latex \Theta$ over to the right side, and we can do that only when that factor is an invertible square matrix (makes sense, right?).

Since $latex X$ is not a square matrix, we multiply both sides by its **transpose matrix** $latex X^{T}$. Now we have:

$latex X^{T}X\Theta = X^{T}Y$

Now $latex X^{T}X$ is a square matrix, so (assuming it is invertible) we can “bring” it to the right side by multiplying both sides by its inverse:

$latex \Theta = (X^{T}X)^{-1}X^{T}Y$

And that’s it!
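To see the formula in action, here is a minimal sketch in NumPy on a hypothetical toy dataset (the data values are made up for illustration): it builds a design matrix with a bias column, applies the normal equation directly, and cross-checks the result against NumPy’s built-in least-squares solver.

```python
import numpy as np

# Hypothetical toy data: m = 4 examples, a bias column of ones plus one feature.
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
Y = np.array([2.0, 4.1, 6.0, 8.1])

# Normal equation: Theta = (X^T X)^{-1} X^T Y
theta = np.linalg.inv(X.T @ X) @ X.T @ Y

# Cross-check against NumPy's least-squares solver.
theta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(theta, theta_lstsq)
```

Note that in practice you would use a solver like `np.linalg.lstsq` (or `np.linalg.solve` on $latex X^{T}X\Theta = X^{T}Y$) rather than forming the inverse explicitly, which is slower and numerically less stable.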
