# Linear regression

Linear regression is a training procedure based on a linear model. The model makes a prediction by computing a weighted sum of the input features, plus a constant called the bias term (also called the intercept term):

$$ \hat{y} = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n $$

This can be written more compactly in vector notation for $m$ instances. The model then becomes:

$$
\begin{bmatrix}
\hat{y}^{(1)} \\ \hat{y}^{(2)} \\ \vdots \\ \hat{y}^{(m)}
\end{bmatrix}
=
\begin{bmatrix}
1 & x_1^{(1)} & x_2^{(1)} & \cdots & x_n^{(1)} \\
1 & x_1^{(2)} & x_2^{(2)} & \cdots & x_n^{(2)} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_1^{(m)} & x_2^{(m)} & \cdots & x_n^{(m)}
\end{bmatrix}
\begin{bmatrix}
\theta_0 \\ \theta_1 \\ \theta_2 \\ \vdots \\ \theta_n
\end{bmatrix}
$$

Resulting in:

$$ \hat{y} = h_\theta(X) = X\theta $$

**Now that we have our model, how do we train it?**

Training the model means adjusting its parameters to minimize a cost function. The most common performance measure of a regression model is the Mean Squared Error (MSE). Therefore, to train a Linear Regression model, you need to find the value of $\theta$ that minimizes the MSE:

$$ MSE(X,h_\theta) = \frac{1}{m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right)^2 $$

$$ MSE(X,h_\theta) = \frac{1}{m} \sum_{i=1}^{m} \left( x^{(i)}\theta - y^{(i)} \right)^2 $$

$$ MSE(X,h_\theta) = \frac{1}{m} \left( X\theta - y \right)^T \left( X\theta - y \right) $$

where $x^{(i)}$ is the $i$-th row of $X$.

# The normal equation

To find the value of $\theta$ that minimizes the cost function, there is a closed-form solution that gives the result directly. This is called the **Normal Equation**. It can be found by differentiating the *MSE* with respect to $\theta$ and setting the result equal to zero:

$$ \nabla_\theta MSE = \frac{2}{m} X^T \left( X\theta - y \right) = 0 $$

$$ \hat{\theta} = (X^T X)^{-1} X^{T} y $$

$$ \text{Temp} = \theta_0 + \theta_1 \cdot t $$

```python
import pandas as pd

# Load the dataset from a CSV file in the working directory
df = pd.read_csv('data.csv')
# In a notebook, a bare expression on the last line displays the DataFrame
df
```
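
The normal equation $\hat{\theta} = (X^T X)^{-1} X^{T} y$ and the matrix form of the MSE given earlier can be sketched in NumPy as follows. This is a minimal illustration, not the notebook's own implementation. The function names `normal_equation` and `mse` are hypothetical. The data is synthetic and noise-free, generated with assumed values $\theta_0 = 4$ and $\theta_1 = 3$, and is unrelated to `data.csv`. The sketch solves the linear system $(X^T X)\theta = X^T y$ with `np.linalg.solve` instead of forming the inverse explicitly. The result is the same when $X^T X$ is invertible, and this is usually more numerically stable.

```python
import numpy as np


def add_bias_column(X):
    """Prepend a column of ones so that theta_0 acts as the bias term."""
    return np.c_[np.ones((X.shape[0], 1)), X]


def normal_equation(X, y):
    """Return the theta that minimizes the MSE for features X and targets y."""
    X_b = add_bias_column(X)
    # Solve (X^T X) theta = X^T y; equivalent to theta = (X^T X)^{-1} X^T y
    return np.linalg.solve(X_b.T @ X_b, X_b.T @ y)


def mse(X, y, theta):
    """Matrix form of the MSE: (1/m) (X theta - y)^T (X theta - y)."""
    residuals = add_bias_column(X) @ theta - y
    return (residuals @ residuals) / len(y)


# Synthetic, noise-free data (assumed values, for illustration only)
rng = np.random.default_rng(0)
t = rng.uniform(0, 10, size=(50, 1))   # a single feature, m = 50 instances
y = 4.0 + 3.0 * t[:, 0]                # y = theta_0 + theta_1 * t

theta_hat = normal_equation(t, y)
print("theta_hat:", theta_hat)
print("MSE:", mse(t, y, theta_hat))
```

Because the synthetic targets lie exactly on a line, the recovered parameters should match the assumed values up to floating-point error, and the MSE should be close to zero. With real, noisy data the MSE at $\hat{\theta}$ would be positive but still minimal among all linear models.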