{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Linear regression\n",
"\n",
"Linear regression fits a linear model to data. The model makes a prediction by computing a weighted sum of the input features plus a constant called the bias term (also called the intercept term):\n",
"\n",
"$$ \\hat{y}=\\theta_0 + \\theta_1 x_1 + \\theta_2 x_2 + \\cdots + \\theta_n x_n$$\n",
"\n",
"For $m$ samples this is written more compactly in matrix form, with one row per sample and a leading column of ones that multiplies the bias term:\n",
"\n",
"$$ \n",
" \\begin{bmatrix}\n",
" \\hat{y}^{(1)} \\\\ \n",
" \\hat{y}^{(2)} \\\\\n",
" \\hat{y}^{(3)} \\\\\n",
" \\vdots \\\\\n",
" \\hat{y}^{(m)}\n",
" \\end{bmatrix}\n",
" =\n",
" \\begin{bmatrix}\n",
" 1 & x_1^{(1)} & x_2^{(1)} & \\cdots & x_n^{(1)} \\\\\n",
" 1 & x_1^{(2)} & x_2^{(2)} & \\cdots & x_n^{(2)} \\\\\n",
" \\vdots & \\vdots & \\vdots & \\ddots & \\vdots \\\\\n",
" 1 & x_1^{(m)} & x_2^{(m)} & \\cdots & x_n^{(m)}\n",
" \\end{bmatrix}\n",
" \\begin{bmatrix}\n",
" \\theta_0 \\\\\n",
" \\theta_1 \\\\\n",
" \\theta_2 \\\\\n",
" \\vdots \\\\\n",
" \\theta_n\n",
" \\end{bmatrix}\n",
"$$\n",
"\n",
"In compact form:\n",
"\n",
"$$\\hat{y}= h_\\theta(X) = X \\theta $$\n",
"\n",
"**Now that we have our model, how do we train it?**\n",
"\n",
"Training the model means adjusting its parameters to minimize a cost function. The most common performance measure for a regression model is the mean squared error (MSE), so training a linear regression model means finding the value of $\\theta$ that minimizes the MSE, where $x^{(i)}$ denotes the $i$-th row of $X$:\n",
"\n",
"$$ MSE(X,h_\\theta) = \\frac{1}{m} \\sum_{i=1}^{m} \\left(\\hat{y}^{(i)}-y^{(i)} \\right)^2$$\n",
"\n",
"$$ MSE(X,h_\\theta) = \\frac{1}{m} \\sum_{i=1}^{m} \\left( x^{(i)}\\theta-y^{(i)} \\right)^2$$\n",
"\n",
"$$ MSE(X,h_\\theta) = \\frac{1}{m} \\left( X\\theta-y \\right)^T \\left( X\\theta-y \\right)$$\n"
]
},
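{
"cell_type": "markdown",
"metadata": {},
"source": [
"The matrix form of the prediction and of the MSE can be checked numerically. The cell below is a minimal sketch on made-up numbers: `X_demo`, `theta_demo` and `y_demo` are illustrative names and values, not part of the dataset used later in this notebook.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Hypothetical toy data: 4 samples, 1 feature, plus a leading column of ones for the bias term\n",
"X_demo = np.array([[1.0, 0.0],\n",
"                   [1.0, 1.0],\n",
"                   [1.0, 2.0],\n",
"                   [1.0, 3.0]])\n",
"theta_demo = np.array([1.0, 2.0])        # [theta_0, theta_1], chosen arbitrarily\n",
"y_demo = np.array([1.5, 2.5, 5.5, 6.5])  # made-up targets\n",
"\n",
"# Prediction: y_hat = X theta\n",
"y_hat = X_demo.dot(theta_demo)\n",
"\n",
"# MSE as a sum over samples and in matrix form; both give the same number\n",
"m = len(y_demo)\n",
"residual = y_hat - y_demo\n",
"mse_sum = np.sum(residual ** 2) / m\n",
"mse_matrix = residual.T.dot(residual) / m\n",
"print(y_hat, mse_sum, mse_matrix)"
]
},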
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"# The normal equation\n",
"\n",
"There is a closed-form solution for the value of $\\theta$ that minimizes the cost function. It is called the **normal equation**, and it is obtained by differentiating the *MSE* with respect to $\\theta$ and setting the result equal to zero:\n",
"\n",
"$$\\hat{\\theta} = (X^T X)^{-1} X^{T} y $$\n",
"\n",
"In the example below, the model is a straight line in the sample index $t$:\n",
"\n",
"$$ \\mathrm{Temp} = \\theta_0 + \\theta_1 t $$\n"
]
},
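{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a brief aside on where the normal equation comes from, here is a sketch of the derivation, assuming $X^T X$ is invertible. Differentiating the matrix form of the MSE with respect to $\\theta$ gives\n",
"\n",
"$$ \\nabla_{\\theta} MSE = \\frac{1}{m} \\nabla_{\\theta} \\left( X\\theta-y \\right)^T \\left( X\\theta-y \\right) = \\frac{2}{m} X^T \\left( X\\theta-y \\right) $$\n",
"\n",
"Setting this gradient to zero and solving for $\\theta$:\n",
"\n",
"$$ X^T X \\theta = X^T y \\quad \\Rightarrow \\quad \\hat{\\theta} = (X^T X)^{-1} X^{T} y $$\n"
]
},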
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0 24.218\n",
"1 23.154\n",
"2 24.347\n",
"3 24.411\n",
"4 24.411\n",
" ... \n",
"295 46.357\n",
"296 46.551\n",
"297 46.519\n",
"298 46.551\n",
"299 46.583\n",
"Name: 0, Length: 300, dtype: float64"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas as pd\n",
"\n",
"# Load the measurements; the target series is the column named '0'\n",
"df = pd.read_csv('data.csv')\n",
"y = df['0']\n",
"y"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[<matplotlib.lines.Line2D at 0x10dca4890>]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import matplotlib.pyplot as plt\n",
"\n",
"# Plot the raw series as red dots against the sample index\n",
"plt.plot(y, '.r')"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 1., 0.],\n",
" [ 1., 1.],\n",
" [ 1., 2.],\n",
" [ 1., 3.],\n",
" [ 1., 4.],\n",
" [ 1., 5.],\n",
" [ 1., 6.],\n",
" [ 1., 7.],\n",
" [ 1., 8.],\n",
" [ 1., 9.],\n",
" [ 1., 10.],\n",
" [ 1., 11.],\n",
" [ 1., 12.],\n",
" [ 1., 13.],\n",
" [ 1., 14.],\n",
" [ 1., 15.],\n",
" [ 1., 16.],\n",
" [ 1., 17.],\n",
" [ 1., 18.],\n",
" [ 1., 19.],\n",
" [ 1., 20.],\n",
" [ 1., 21.],\n",
" [ 1., 22.],\n",
" [ 1., 23.],\n",
" [ 1., 24.],\n",
" [ 1., 25.],\n",
" [ 1., 26.],\n",
" [ 1., 27.],\n",
" [ 1., 28.],\n",
" [ 1., 29.],\n",
" [ 1., 30.],\n",
" [ 1., 31.],\n",
" [ 1., 32.],\n",
" [ 1., 33.],\n",
" [ 1., 34.],\n",
" [ 1., 35.],\n",
" [ 1., 36.],\n",
" [ 1., 37.],\n",
" [ 1., 38.],\n",
" [ 1., 39.],\n",
" [ 1., 40.],\n",
" [ 1., 41.],\n",
" [ 1., 42.],\n",
" [ 1., 43.],\n",
" [ 1., 44.],\n",
" [ 1., 45.],\n",
" [ 1., 46.],\n",
" [ 1., 47.],\n",
" [ 1., 48.],\n",
" [ 1., 49.],\n",
" [ 1., 50.],\n",
" [ 1., 51.],\n",
" [ 1., 52.],\n",
" [ 1., 53.],\n",
" [ 1., 54.],\n",
" [ 1., 55.],\n",
" [ 1., 56.],\n",
" [ 1., 57.],\n",
" [ 1., 58.],\n",
" [ 1., 59.],\n",
" [ 1., 60.],\n",
" [ 1., 61.],\n",
" [ 1., 62.],\n",
" [ 1., 63.],\n",
" [ 1., 64.],\n",
" [ 1., 65.],\n",
" [ 1., 66.],\n",
" [ 1., 67.],\n",
" [ 1., 68.],\n",
" [ 1., 69.],\n",
" [ 1., 70.],\n",
" [ 1., 71.],\n",
" [ 1., 72.],\n",
" [ 1., 73.],\n",
" [ 1., 74.],\n",
" [ 1., 75.],\n",
" [ 1., 76.],\n",
" [ 1., 77.],\n",
" [ 1., 78.],\n",
" [ 1., 79.],\n",
" [ 1., 80.],\n",
" [ 1., 81.],\n",
" [ 1., 82.],\n",
" [ 1., 83.],\n",
" [ 1., 84.],\n",
" [ 1., 85.],\n",
" [ 1., 86.],\n",
" [ 1., 87.],\n",
" [ 1., 88.],\n",
" [ 1., 89.],\n",
" [ 1., 90.],\n",
" [ 1., 91.],\n",
" [ 1., 92.],\n",
" [ 1., 93.],\n",
" [ 1., 94.],\n",
" [ 1., 95.],\n",
" [ 1., 96.],\n",
" [ 1., 97.],\n",
" [ 1., 98.],\n",
" [ 1., 99.],\n",
" [ 1., 100.],\n",
" [ 1., 101.],\n",
" [ 1., 102.],\n",
" [ 1., 103.],\n",
" [ 1., 104.],\n",
" [ 1., 105.],\n",
" [ 1., 106.],\n",
" [ 1., 107.],\n",
" [ 1., 108.],\n",
" [ 1., 109.],\n",
" [ 1., 110.],\n",
" [ 1., 111.],\n",
" [ 1., 112.],\n",
" [ 1., 113.],\n",
" [ 1., 114.],\n",
" [ 1., 115.],\n",
" [ 1., 116.],\n",
" [ 1., 117.],\n",
" [ 1., 118.],\n",
" [ 1., 119.],\n",
" [ 1., 120.],\n",
" [ 1., 121.],\n",
" [ 1., 122.],\n",
" [ 1., 123.],\n",
" [ 1., 124.],\n",
" [ 1., 125.],\n",
" [ 1., 126.],\n",
" [ 1., 127.],\n",
" [ 1., 128.],\n",
" [ 1., 129.],\n",
" [ 1., 130.],\n",
" [ 1., 131.],\n",
" [ 1., 132.],\n",
" [ 1., 133.],\n",
" [ 1., 134.],\n",
" [ 1., 135.],\n",
" [ 1., 136.],\n",
" [ 1., 137.],\n",
" [ 1., 138.],\n",
" [ 1., 139.],\n",
" [ 1., 140.],\n",
" [ 1., 141.],\n",
" [ 1., 142.],\n",
" [ 1., 143.],\n",
" [ 1., 144.],\n",
" [ 1., 145.],\n",
" [ 1., 146.],\n",
" [ 1., 147.],\n",
" [ 1., 148.],\n",
" [ 1., 149.],\n",
" [ 1., 150.],\n",
" [ 1., 151.],\n",
" [ 1., 152.],\n",
" [ 1., 153.],\n",
" [ 1., 154.],\n",
" [ 1., 155.],\n",
" [ 1., 156.],\n",
" [ 1., 157.],\n",
" [ 1., 158.],\n",
" [ 1., 159.],\n",
" [ 1., 160.],\n",
" [ 1., 161.],\n",
" [ 1., 162.],\n",
" [ 1., 163.],\n",
" [ 1., 164.],\n",
" [ 1., 165.],\n",
" [ 1., 166.],\n",
" [ 1., 167.],\n",
" [ 1., 168.],\n",
" [ 1., 169.],\n",
" [ 1., 170.],\n",
" [ 1., 171.],\n",
" [ 1., 172.],\n",
" [ 1., 173.],\n",
" [ 1., 174.],\n",
" [ 1., 175.],\n",
" [ 1., 176.],\n",
" [ 1., 177.],\n",
" [ 1., 178.],\n",
" [ 1., 179.],\n",
" [ 1., 180.],\n",
" [ 1., 181.],\n",
" [ 1., 182.],\n",
" [ 1., 183.],\n",
" [ 1., 184.],\n",
" [ 1., 185.],\n",
" [ 1., 186.],\n",
" [ 1., 187.],\n",
" [ 1., 188.],\n",
" [ 1., 189.],\n",
" [ 1., 190.],\n",
" [ 1., 191.],\n",
" [ 1., 192.],\n",
" [ 1., 193.],\n",
" [ 1., 194.],\n",
" [ 1., 195.],\n",
" [ 1., 196.],\n",
" [ 1., 197.],\n",
" [ 1., 198.],\n",
" [ 1., 199.],\n",
" [ 1., 200.],\n",
" [ 1., 201.],\n",
" [ 1., 202.],\n",
" [ 1., 203.],\n",
" [ 1., 204.],\n",
" [ 1., 205.],\n",
" [ 1., 206.],\n",
" [ 1., 207.],\n",
" [ 1., 208.],\n",
" [ 1., 209.],\n",
" [ 1., 210.],\n",
" [ 1., 211.],\n",
" [ 1., 212.],\n",
" [ 1., 213.],\n",
" [ 1., 214.],\n",
" [ 1., 215.],\n",
" [ 1., 216.],\n",
" [ 1., 217.],\n",
" [ 1., 218.],\n",
" [ 1., 219.],\n",
" [ 1., 220.],\n",
" [ 1., 221.],\n",
" [ 1., 222.],\n",
" [ 1., 223.],\n",
" [ 1., 224.],\n",
" [ 1., 225.],\n",
" [ 1., 226.],\n",
" [ 1., 227.],\n",
" [ 1., 228.],\n",
" [ 1., 229.],\n",
" [ 1., 230.],\n",
" [ 1., 231.],\n",
" [ 1., 232.],\n",
" [ 1., 233.],\n",
" [ 1., 234.],\n",
" [ 1., 235.],\n",
" [ 1., 236.],\n",
" [ 1., 237.],\n",
" [ 1., 238.],\n",
" [ 1., 239.],\n",
" [ 1., 240.],\n",
" [ 1., 241.],\n",
" [ 1., 242.],\n",
" [ 1., 243.],\n",
" [ 1., 244.],\n",
" [ 1., 245.],\n",
" [ 1., 246.],\n",
" [ 1., 247.],\n",
" [ 1., 248.],\n",
" [ 1., 249.],\n",
" [ 1., 250.],\n",
" [ 1., 251.],\n",
" [ 1., 252.],\n",
" [ 1., 253.],\n",
" [ 1., 254.],\n",
" [ 1., 255.],\n",
" [ 1., 256.],\n",
" [ 1., 257.],\n",
" [ 1., 258.],\n",
" [ 1., 259.],\n",
" [ 1., 260.],\n",
" [ 1., 261.],\n",
" [ 1., 262.],\n",
" [ 1., 263.],\n",
" [ 1., 264.],\n",
" [ 1., 265.],\n",
" [ 1., 266.],\n",
" [ 1., 267.],\n",
" [ 1., 268.],\n",
" [ 1., 269.],\n",
" [ 1., 270.],\n",
" [ 1., 271.],\n",
" [ 1., 272.],\n",
" [ 1., 273.],\n",
" [ 1., 274.],\n",
" [ 1., 275.],\n",
" [ 1., 276.],\n",
" [ 1., 277.],\n",
" [ 1., 278.],\n",
" [ 1., 279.],\n",
" [ 1., 280.],\n",
" [ 1., 281.],\n",
" [ 1., 282.],\n",
" [ 1., 283.],\n",
" [ 1., 284.],\n",
" [ 1., 285.],\n",
" [ 1., 286.],\n",
" [ 1., 287.],\n",
" [ 1., 288.],\n",
" [ 1., 289.],\n",
" [ 1., 290.],\n",
" [ 1., 291.],\n",
" [ 1., 292.],\n",
" [ 1., 293.],\n",
" [ 1., 294.],\n",
" [ 1., 295.],\n",
" [ 1., 296.],\n",
" [ 1., 297.],\n",
" [ 1., 298.],\n",
" [ 1., 299.]])"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import numpy as np\n",
"\n",
"# Design matrix: a column of ones (bias term) next to the sample index 0..n-1\n",
"n = len(y)\n",
"x = np.linspace(0, n - 1, n)\n",
"X = np.c_[np.ones(n), x]\n",
"X"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([25.70275643, 0.07850281])"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Normal equation: theta = (X^T X)^{-1} X^T y\n",
"theta = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)\n",
"theta"
]
},
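{
"cell_type": "markdown",
"metadata": {},
"source": [
"Explicitly inverting $X^T X$ works for this small problem, but it can be numerically fragile when the columns of $X$ are nearly collinear. One alternative, which is not what this notebook uses, is NumPy's least-squares solver `np.linalg.lstsq`, which minimizes the same squared error without forming the inverse. The sketch below uses synthetic data so that it runs on its own; `t_demo`, `y_demo` and `X_demo` are made-up names and values.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Synthetic, noise-free line y = 3 + 0.5 t (illustrative values only)\n",
"t_demo = np.arange(10, dtype=float)\n",
"y_demo = 3.0 + 0.5 * t_demo\n",
"X_demo = np.c_[np.ones(len(t_demo)), t_demo]\n",
"\n",
"# Normal equation with an explicit inverse, as in the cell above\n",
"theta_normal = np.linalg.inv(X_demo.T.dot(X_demo)).dot(X_demo.T).dot(y_demo)\n",
"\n",
"# Least-squares solver; rcond=None selects NumPy's current default cutoff\n",
"theta_lstsq, residuals, rank, sing_vals = np.linalg.lstsq(X_demo, y_demo, rcond=None)\n",
"\n",
"# Both approaches should agree up to floating-point error\n",
"print(theta_normal, theta_lstsq)\n",
"print(np.allclose(theta_normal, theta_lstsq))"
]
},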
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Evaluate the fitted line at the training inputs and compare it with the data\n",
"ypre = X.dot(theta)\n",
"plt.plot(x, ypre, '*-r', label='model')\n",
"plt.plot(x, y, '.k', label='data')\n",
"plt.legend()\n",
"plt.show()"
]
},
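{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Polynomial model\n",
"\n",
"The normal equation also fits a polynomial in $t$, because the model is still linear in the parameters $\\theta$. The cell below fits a quadratic:\n",
"\n",
"$$ y = \\theta_0+\\theta_1 t+\\theta_2 t^2$$"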
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 2.28848082e+01, 1.35240024e-01, -1.89756565e-04])"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Quadratic design matrix with columns 1, t, t^2, solved with the same normal equation\n",
"X = np.c_[np.ones(len(x)), x, x*x]\n",
"theta = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)\n",
"theta"
]
},
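{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a sanity check on the polynomial fit, the same coefficients can be obtained from `np.polyfit`. This is a hedged cross-check on synthetic data, not a step from the original analysis; `t_demo` and `y_demo` are made-up. Note that `np.polyfit` returns the highest power first, so the result is reversed to match the order $\\theta_0, \\theta_1, \\theta_2$.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Synthetic quadratic y = 2 + 0.5 t - 0.01 t^2 (made-up coefficients)\n",
"t_demo = np.linspace(0.0, 10.0, 21)\n",
"y_demo = 2.0 + 0.5 * t_demo - 0.01 * t_demo ** 2\n",
"\n",
"# Normal equation on the design matrix with columns 1, t, t^2\n",
"X_demo = np.c_[np.ones(len(t_demo)), t_demo, t_demo ** 2]\n",
"theta_demo = np.linalg.inv(X_demo.T.dot(X_demo)).dot(X_demo.T).dot(y_demo)\n",
"\n",
"# np.polyfit lists the highest power first; reverse to get theta_0, theta_1, theta_2\n",
"theta_polyfit = np.polyfit(t_demo, y_demo, 2)[::-1]\n",
"\n",
"# The two coefficient vectors should match up to floating-point error\n",
"print(theta_demo, theta_polyfit)"
]
},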
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Evaluate the fitted quadratic on 50 evenly spaced points between 0 and 300\n",
"Xnew1 = np.linspace(0, 300, 50)\n",
"Xnew = np.c_[np.ones(len(Xnew1)), Xnew1, Xnew1*Xnew1]\n",
"ypred = Xnew.dot(theta)\n",
"plt.plot(Xnew1, ypred, '*-r', label='model')\n",
"plt.plot(x, y, '.g', label='data')\n",
"plt.legend()\n",
"plt.show()"
]
},
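{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Batch gradient descent\n",
"\n",
"Instead of solving for $\\theta$ in closed form, batch gradient descent repeatedly steps against the gradient of the MSE computed on the full training set, with step size (learning rate) $\\eta$:\n",
"\n",
"$$ \\theta_{new} = \\theta - \\eta \\nabla_{\\theta} MSE(\\theta) $$\n",
"\n",
"$$ \\nabla_{\\theta} MSE(\\theta) = \\frac{2}{m}X^T(X\\theta-y) $$"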
]
},
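{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of the update rule above, on synthetic data so that the cell runs on its own. The learning rate `eta`, the iteration count `n_iterations`, and the rescaling of the input to $[0, 1]$ are assumptions chosen for illustration. Without some feature scaling, a raw index running up to 299 would need a much smaller learning rate to avoid divergence.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Synthetic data: a noisy line in a rescaled input t in [0, 1] (made-up values)\n",
"rng = np.random.default_rng(0)\n",
"m = 100\n",
"t = np.linspace(0.0, 1.0, m)\n",
"y_demo = 4.0 + 3.0 * t + 0.1 * rng.standard_normal(m)\n",
"X_demo = np.c_[np.ones(m), t]\n",
"\n",
"eta = 0.1               # learning rate (assumed)\n",
"n_iterations = 5000     # number of full-batch updates (assumed)\n",
"theta_gd = np.zeros(2)  # start from theta = 0\n",
"\n",
"for _ in range(n_iterations):\n",
"    # Gradient of the MSE over the whole batch: (2/m) X^T (X theta - y)\n",
"    gradient = 2.0 / m * X_demo.T.dot(X_demo.dot(theta_gd) - y_demo)\n",
"    theta_gd = theta_gd - eta * gradient\n",
"\n",
"# Closed-form normal-equation solution for comparison; the two should be close\n",
"theta_ne = np.linalg.inv(X_demo.T.dot(X_demo)).dot(X_demo.T).dot(y_demo)\n",
"print(theta_gd, theta_ne)"
]
},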
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}