{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Linear regression\n", "\n", "Linear regression is a training procedure based on a linear model. The model makes a prediction by computing a weighted sum of the input features, plus a constant called the bias term (also called the intercept term):\n", "\n", "$$ \\hat{y}=\\theta_0 + \\theta_1 x_1 + \\theta_2 x_2 + \\cdots + \\theta_n x_n$$\n", "\n", "This can be written more compactly in vector notation for $m$ samples. The model then becomes:\n", "\n", "$$ \n", " \\begin{bmatrix}\n", " \\hat{y}^{(1)} \\\\\n", " \\hat{y}^{(2)} \\\\\n", " \\vdots \\\\\n", " \\hat{y}^{(m)}\n", " \\end{bmatrix}\n", " =\n", " \\begin{bmatrix}\n", " 1 & x_1^{(1)} & x_2^{(1)} & \\cdots & x_n^{(1)} \\\\\n", " 1 & x_1^{(2)} & x_2^{(2)} & \\cdots & x_n^{(2)} \\\\\n", " \\vdots & \\vdots & \\vdots & \\ddots & \\vdots \\\\\n", " 1 & x_1^{(m)} & x_2^{(m)} & \\cdots & x_n^{(m)}\n", " \\end{bmatrix}\n", " \\begin{bmatrix}\n", " \\theta_0 \\\\\n", " \\theta_1 \\\\\n", " \\theta_2 \\\\\n", " \\vdots \\\\\n", " \\theta_n\n", " \\end{bmatrix}\n", "$$\n", "\n", "or, writing $X$ for the $m \\times (n+1)$ matrix of features:\n", "\n", "$$\\hat{y}= h_\\theta(X) = X \\theta $$\n", "\n", "**Now that we have our model, how do we train it?**\n", "\n", "Training the model means adjusting the parameters to minimize a cost function that measures the prediction error. The most common performance measure of a regression model is the Mean Squared Error (MSE). 
Therefore, to train a Linear Regression model, you need to find the value of $\\theta$ that minimizes the MSE:\n", "\n", "$$ MSE(X,h_\\theta) = \\frac{1}{m} \\sum_{i=1}^{m} \\left(\\hat{y}^{(i)}-y^{(i)} \\right)^2$$\n", "\n", "$$ MSE(X,h_\\theta) = \\frac{1}{m} \\sum_{i=1}^{m} \\left( x^{(i)}\\theta-y^{(i)} \\right)^2$$\n", "\n", "In matrix form, with $x^{(i)}$ the $i$-th row of $X$:\n", "\n", "$$ MSE(X,h_\\theta) = \\frac{1}{m} \\left( X\\theta-y \\right)^T \\left( X\\theta-y \\right)$$\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "# The normal equation\n", "\n", "To find the value of $\\theta$ that minimizes the cost function, there is a closed-form solution that gives the result directly. This is called the **Normal Equation**. It is obtained by differentiating the *MSE* with respect to $\\theta$ and setting the result equal to zero:\n", "\n", "$$\\hat{\\theta} = (X^T X)^{-1} X^{T} y $$\n", "\n", "With a single feature $t$, the model reduces to:\n", "\n", "$$ Temp = \\theta_0 + \\theta_1 t $$\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>0</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>24.218</td></tr>\n",
"    <tr><th>1</th><td>23.154</td></tr>\n",
"    <tr><th>2</th><td>24.347</td></tr>\n",
"    <tr><th>3</th><td>24.411</td></tr>\n",
"    <tr><th>4</th><td>24.411</td></tr>\n",
"    <tr><th>...</th><td>...</td></tr>\n",
"    <tr><th>295</th><td>46.357</td></tr>\n",
"    <tr><th>296</th><td>46.551</td></tr>\n",
"    <tr><th>297</th><td>46.519</td></tr>\n",
"    <tr><th>298</th><td>46.551</td></tr>\n",
"    <tr><th>299</th><td>46.583</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>300 rows × 1 columns</p>\n",
"</div>
" ], "text/plain": [ " 0\n", "0 24.218\n", "1 23.154\n", "2 24.347\n", "3 24.411\n", "4 24.411\n", ".. ...\n", "295 46.357\n", "296 46.551\n", "297 46.519\n", "298 46.551\n", "299 46.583\n", "\n", "[300 rows x 1 columns]" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "df = pd.read_csv('data.csv')\n", "df" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "ename": "NameError", "evalue": "name 'df' is not defined", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[1], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[43mdf\u001b[49m)\n", "\u001b[0;31mNameError\u001b[0m: name 'df' is not defined" ] } ], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.5" } }, "nbformat": 4, "nbformat_minor": 2 }