# Linear regression
Linear regression is a training procedure based on a linear model. The model makes a prediction by simply computing a weighted sum of the input features plus a constant term called the bias term (also known as the intercept term):
$$ \hat{y}=\theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n$$
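For concreteness, here is a minimal sketch of this prediction for a single instance, using made-up values for the parameters and features:

```python
# Minimal sketch of a single prediction, with illustrative (made-up) values.
theta = [2.0, 0.5, -1.0]   # theta_0 (bias), theta_1, theta_2
x = [3.0, 4.0]             # x_1, x_2

y_hat = theta[0] + sum(t * x_i for t, x_i in zip(theta[1:], x))
print(y_hat)               # 2.0 + 0.5*3.0 - 1.0*4.0 = -0.5
```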
This can be written more compactly in vector notation for $m$ instances. The model then becomes:
$$
\begin{bmatrix}
\hat{y}^{(0)} \\
\hat{y}^{(1)} \\
\hat{y}^{(2)} \\
\vdots \\
\hat{y}^{(m)}
\end{bmatrix}
=
\begin{bmatrix}
1 & x_1^{(0)} & x_2^{(0)} & \cdots & x_n^{(0)} \\
1 & x_1^{(1)} & x_2^{(1)} & \cdots & x_n^{(1)} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_1^{(m)} & x_2^{(m)} & \cdots & x_n^{(m)}
\end{bmatrix}
\begin{bmatrix}
\theta_0 \\
\theta_1 \\
\theta_2 \\
\vdots \\
\theta_n
\end{bmatrix}
$$
Resulting in:
$$\hat{y}= h_\theta(x) = x \theta $$
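A small NumPy sketch of this vectorized prediction, assuming illustrative data (a design matrix with a prepended column of ones and an arbitrary parameter vector):

```python
import numpy as np

# Illustrative data: m = 3 instances, n = 2 features
X_raw = np.array([[3.0, 4.0],
                  [1.0, 2.0],
                  [5.0, 0.0]])
X = np.c_[np.ones(len(X_raw)), X_raw]   # prepend a column of ones for theta_0
theta = np.array([2.0, 0.5, -1.0])      # theta_0, theta_1, theta_2

y_hat = X @ theta                       # one prediction per instance
print(y_hat)                            # [-0.5  0.5  4.5]
```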
**Now that we have our model, how do we train it?**
Training the model means adjusting its parameters to reduce the error, that is, to minimize the cost function. The most common performance measure for a regression model is the Mean Squared Error (MSE). Therefore, to train a linear regression model, you need to find the value of $\theta$ that minimizes the MSE:
$$ MSE(X,h_\theta) = \frac{1}{m} \sum_{i=1}^{m} \left(\hat{y}^{(i)}-y^{(i)} \right)^2$$
$$ MSE(X,h_\theta) = \frac{1}{m} \sum_{i=1}^{m} \left( x^{(i)}\theta-y^{(i)} \right)^2$$
$$ MSE(X,h_\theta) = \frac{1}{m} \left( x\theta-y \right)^T \left( x\theta-y \right)$$
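The summation form and the matrix form give the same number. A short sketch with illustrative data (the design matrix already includes the bias column of ones):

```python
import numpy as np

# Illustrative data; X already includes the bias column of ones
X = np.array([[1.0, 3.0, 4.0],
              [1.0, 1.0, 2.0],
              [1.0, 5.0, 0.0]])
theta = np.array([2.0, 0.5, -1.0])
y = np.array([0.0, 1.0, 4.0])
m = len(y)

residuals = X @ theta - y
mse_sum = np.sum(residuals ** 2) / m     # summation form
mse_mat = (residuals @ residuals) / m    # matrix form: r^T r / m
print(mse_sum, mse_mat)                  # identical: 0.25 0.25
```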
# The normal equation
To find the value of $\theta$ that minimizes the cost function, there is a closed-form solution that gives the result directly. This is called the **Normal Equation**, and it can be obtained by differentiating the *MSE* with respect to $\theta$ and setting the derivative equal to zero:
$$\hat{\theta} = (X^T X)^{-1} X^{T} y $$
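A sketch of the Normal Equation on synthetic data generated as $y = 4 + 3x_1$ plus Gaussian noise, so the recovered parameters should come out close to 4 and 3:

```python
import numpy as np

rng = np.random.default_rng(42)
m = 100
x1 = 2 * rng.random(m)                        # single feature
y = 4 + 3 * x1 + rng.normal(size=m)           # noisy linear target

X = np.c_[np.ones(m), x1]                     # add the bias column
theta_hat = np.linalg.inv(X.T @ X) @ X.T @ y  # (X^T X)^{-1} X^T y
print(theta_hat)                              # approximately [4, 3]
```

In practice, `np.linalg.lstsq` or the pseudoinverse `np.linalg.pinv` is preferred over explicitly inverting $X^T X$, since that matrix may be singular or numerically ill-conditioned.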
As an example, consider fitting a model that predicts temperature as a linear function of a single feature $t$:

$$ Temp = \theta_0 + \theta_1 \, t $$
```python
import pandas as pd

# Load the dataset and inspect it
df = pd.read_csv('data.csv')
df
```
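Assuming the file contains a time column and a temperature column (the names `t` and `Temp` below are placeholders for whatever the actual headers are), the model above could be fitted with the Normal Equation like this:

```python
import numpy as np
import pandas as pd

# The column names 't' and 'Temp' are assumptions about data.csv
df = pd.read_csv('data.csv')
X = np.c_[np.ones(len(df)), df['t'].to_numpy()]   # bias column plus the feature t
y = df['Temp'].to_numpy()

theta_hat = np.linalg.inv(X.T @ X) @ X.T @ y      # [theta_0, theta_1]
print(theta_hat)
```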