# Linear regression
Linear regression is a training procedure based on a linear model. The model makes a prediction by simply computing a weighted sum of the input features plus a constant term called the bias term (also known as the intercept term):
$$ \hat{y}=\theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n$$
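For concreteness, here is a minimal sketch of this prediction for a single instance, using made-up values for the parameters and features:

```python
# Minimal sketch of a single prediction, with illustrative (made-up) values.
theta = [2.0, 0.5, -1.0]   # theta_0 (bias), theta_1, theta_2
x = [3.0, 4.0]             # x_1, x_2

y_hat = theta[0] + sum(t * x_i for t, x_i in zip(theta[1:], x))
print(y_hat)               # 2.0 + 0.5*3.0 - 1.0*4.0 = -0.5
```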
This can be written more compactly in vector notation for $m$ instances. The model then becomes:
$$
\begin{bmatrix}
\hat{y}^{(0)} \\
\hat{y}^{(1)} \\
\hat{y}^{(2)} \\
\vdots \\
\hat{y}^{(m)}
\end{bmatrix}
=
\begin{bmatrix}
1 & x_1^{(0)} & x_2^{(0)} & \cdots & x_n^{(0)} \\
1 & x_1^{(1)} & x_2^{(1)} & \cdots & x_n^{(1)} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_1^{(m)} & x_2^{(m)} & \cdots & x_n^{(m)}
\end{bmatrix}
\begin{bmatrix}
\theta_0 \\
\theta_1 \\
\theta_2 \\
\vdots \\
\theta_n
\end{bmatrix}
$$
Resulting in:
$$\hat{y}= h_\theta(x) = x \theta $$
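A small NumPy sketch of this vectorized prediction, assuming illustrative data (a design matrix with a prepended column of ones and an arbitrary parameter vector):

```python
import numpy as np

# Illustrative data: m = 3 instances, n = 2 features
X_raw = np.array([[3.0, 4.0],
                  [1.0, 2.0],
                  [5.0, 0.0]])
X = np.c_[np.ones(len(X_raw)), X_raw]   # prepend a column of ones for theta_0
theta = np.array([2.0, 0.5, -1.0])      # theta_0, theta_1, theta_2

y_hat = X @ theta                       # one prediction per instance
print(y_hat)                            # [-0.5  0.5  4.5]
```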
**Now that we have our model, how do we train it?**
Training the model means adjusting its parameters to reduce the error, that is, to minimize the cost function. The most common performance measure for a regression model is the Mean Squared Error (MSE). Therefore, to train a linear regression model, you need to find the value of $\theta$ that minimizes the MSE:
$$ MSE(X,h_\theta) = \frac{1}{m} \sum_{i=1}^{m} \left(\hat{y}^{(i)}-y^{(i)} \right)^2$$
$$ MSE(X,h_\theta) = \frac{1}{m} \sum_{i=1}^{m} \left( x^{(i)}\theta-y^{(i)} \right)^2$$
$$ MSE(X,h_\theta) = \frac{1}{m} \left( x\theta-y \right)^T \left( x\theta-y \right)$$
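The summation form and the matrix form give the same number. A short sketch with illustrative data (the design matrix already includes the bias column of ones):

```python
import numpy as np

# Illustrative data; X already includes the bias column of ones
X = np.array([[1.0, 3.0, 4.0],
              [1.0, 1.0, 2.0],
              [1.0, 5.0, 0.0]])
theta = np.array([2.0, 0.5, -1.0])
y = np.array([0.0, 1.0, 4.0])
m = len(y)

residuals = X @ theta - y
mse_sum = np.sum(residuals ** 2) / m     # summation form
mse_mat = (residuals @ residuals) / m    # matrix form: r^T r / m
print(mse_sum, mse_mat)                  # identical: 0.25 0.25
```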
# The normal equation
To find the value of $\theta$ that minimizes the cost function, there is a closed-form solution that gives the result directly. This is called the **Normal Equation**, and it can be obtained by differentiating the *MSE* with respect to $\theta$ and setting the derivative equal to zero:
$$\hat{\theta} = (X^T X)^{-1} X^{T} y $$
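A sketch of the Normal Equation on synthetic data generated as $y = 4 + 3x_1$ plus Gaussian noise, so the recovered parameters should come out close to 4 and 3:

```python
import numpy as np

rng = np.random.default_rng(42)
m = 100
x1 = 2 * rng.random(m)                        # single feature
y = 4 + 3 * x1 + rng.normal(size=m)           # noisy linear target

X = np.c_[np.ones(m), x1]                     # add the bias column
theta_hat = np.linalg.inv(X.T @ X) @ X.T @ y  # (X^T X)^{-1} X^T y
print(theta_hat)                              # approximately [4, 3]
```

In practice, `np.linalg.lstsq` or the pseudoinverse `np.linalg.pinv` is preferred over explicitly inverting $X^T X$, since that matrix may be singular or numerically ill-conditioned.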
As an example, consider fitting a model that predicts temperature as a linear function of a single feature $t$:

$$ Temp = \theta_0 + \theta_1 \, t $$
```python
import pandas as pd

# Load the dataset and inspect it
df = pd.read_csv('data.csv')
df
```
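Assuming the file contains a time column and a temperature column (the names `t` and `Temp` below are placeholders for whatever the actual headers are), the model above could be fitted with the Normal Equation like this:

```python
import numpy as np
import pandas as pd

# The column names 't' and 'Temp' are assumptions about data.csv
df = pd.read_csv('data.csv')
X = np.c_[np.ones(len(df)), df['t'].to_numpy()]   # bias column plus the feature t
y = df['Temp'].to_numpy()

theta_hat = np.linalg.inv(X.T @ X) @ X.T @ y      # [theta_0, theta_1]
print(theta_hat)
```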