{
"cells": [
{
"cell_type": "markdown",
"id": "eb335a6d",
"metadata": {},
"source": [
"# Linear regression\n",
"\n",
"The linear regression is a training procedure based on a linear model. The model makes a prediction by simply computing a weighted sum of the input features, plus a constant term called the bias term (also called the intercept term, $\\theta_0$):\n",
"\n",
"$$ \\hat{y}=\\theta_0 x_0 + \\theta_1 x_1 + \\theta_2 x_2 + \\cdots + \\theta_n x_n$$\n",
"\n",
"\n",
"This can be written more easy by the using vector notation form:\n",
"\n",
"$$\\hat{y}= h_\\theta(x) = \\theta^T x$$\n",
"\n",
"---\n",
"**Now that we have our model, how do we train it?**\n",
"\n",
"*Please, consider that training the model means adjusting the parameters to reduce the error or minimizing the cost function.* \n",
"\n",
"The most common performance measure of a regression model is the **Mean Square Error (MSE)**. Therefore, to train a Linear Regression model, you need to find the value of $\\theta$ that minimizes the MSE:\n",
"\n",
"\n",
"$$ MSE(X,h_\\theta) = \\frac{1}{m} \\sum_{i=1}^{m} \\left( \\theta^T x^{(i)}-y^{(i)} \\right)^2$$\n",
"\n",
"\n",
"$$ MSE(X,h_\\theta) = \\frac{1}{m} \\sum_{i=1}^{m} \\left(\\hat{y}^{(i)}-y^{(i)} \\right)^2$$\n",
"\n",
"$$ MSE(X,h_\\theta) = \\frac{1}{m} \\left(\\hat{y}-y \\right)^2$$"
]
},
{
"cell_type": "markdown",
"id": "12cf820d",
"metadata": {},
"source": [
"# The normal equation\n",
"\n",
"To find the value of $\\theta$ that minimizes the cost function, there is a closed-form or direct solution that gives the result. This is called the **Normal Equation**; and can be found it by deriving the *MSE* equation as a function of $\\theta$ and making it equals to zero:\n",
"\n",
"\\begin{eqnarray*}\n",
"\t\\frac{\\partial J(\\theta)}{\\partial \\theta} = 0\\\\\n",
"\t\\frac{\\partial J(\\theta)}{\\partial \\theta}=\\frac{1}{m}\\left(\\theta x-y \\right)^2=\\frac{1}{m}\\left(\\theta x-y \\right)^T\\left(\\theta x-y \\right)\\\\\n",
"\t\\frac{1}{m}\\left[(\\theta x)^T-y^T \\right] \\left[\\theta x-y \\right]\n",
"\\end{eqnarray*}"
]
},
{
"cell_type": "markdown",
"id": "9139b1c9",
"metadata": {},
"source": [
"just considers that:\n",
"\\begin{align*}\n",
"\t(A^T)^T = A\\\\\n",
"\t(A+B)^T = A^T + B^T\\\\\n",
"\t(kA)^T = kA^T\\\\\n",
"\t(AB)^T =A^TB^T\n",
"\\end{align*}"
]
},
{
"cell_type": "markdown",
"id": "5b069092",
"metadata": {},
"source": [
"just considers that $(\\theta x)^Ty=y^T(\\theta x)$\n",
"\\begin{align*}\n",
"\t0=\\frac{\\partial\\,}{\\partial \\theta}\\frac{1}{m}\\left[(\\theta x)^T\\theta x - (\\theta x)^Ty-y^T\\theta x+y^Ty \\right]\\\\\n",
"\\end{align*}"
]
},
{
"cell_type": "markdown",
"id": "ae45a858",
"metadata": {},
"source": [
"$$0=\\frac{\\partial\\,}{\\partial \\theta}\\frac{1}{m}\\left[(\\theta x)^T\\theta x - 2(\\theta x)^Ty+y^Ty \\right]$$"
]
},
{
"cell_type": "markdown",
"id": "ec27d3cd",
"metadata": {},
"source": [
"$$0=\\frac{1}{m}\\left[2\\theta x^T x - 2(x)^Ty+0 \\right]$$"
]
},
{
"cell_type": "markdown",
"id": "7c8c8cfe",
"metadata": {},
"source": [
"$$0=\\frac{2}{m}\\left[\\theta x^T x - (x)^Ty+0 \\right]$$"
]
},
{
"cell_type": "markdown",
"id": "c06b74fd",
"metadata": {},
"source": [
"$$\\hat{\\theta} = (X^T X)^{-1} X^{T} y $$"
]
},
{
"cell_type": "markdown",
"id": "1315356e",
"metadata": {},
"source": [
"## A basic example\n",
"\n",
"First try to implement the linear regressor by only using two instances or points, $(x^0, y^0)$ and $(x^1,y^1)$, thus:\n",
"\n",
"$$\\hat{y}^{(i)}=\\theta_0 x_0+ \\theta_1 x_1^{(i)}$$\n",
"\n",
"but, considers that $x_0=1$, then:\n",
"\n",
"$$\\hat{y}^{(i)}=\\theta_0 + \\theta_1 x_1^{(i)}$$\n",
"\n",
"then, to write it in vector form\n",
"\n",
"$$\n",
" \\hat{y}^0=\n",
" \\begin{bmatrix}\n",
" \\theta_0 & \\theta_1\n",
" \\end{bmatrix} \n",
" \\begin{bmatrix}\n",
" 1 \\\\ \n",
" x_0\n",
" \\end{bmatrix}\n",
"$$"
]
},
{
"cell_type": "markdown",
"id": "54dda83d",
"metadata": {},
"source": [
"Now, we need a matrix form to manage all data:\n",
"\n",
"$$\n",
" \\begin{bmatrix}\n",
" \\hat{y}^0 \\\\ \n",
" \\hat{y}^1\\\\\n",
" \\end{bmatrix}\n",
" =\n",
" \\begin{bmatrix}\n",
" \\theta_0 + \\theta_1 x^0_1 \\\\\n",
" \\theta_0 + \\theta_1 x^0_1 \\\\\n",
" \\end{bmatrix}\n",
"$$"
]
},
{
"cell_type": "markdown",
"id": "d86618e4",
"metadata": {},
"source": [
"$$\n",
" \\begin{bmatrix}\n",
" \\hat{y}^0 \\\\ \n",
" \\hat{y}^1\\\\\n",
" \\end{bmatrix}\n",
" =\n",
" \\begin{bmatrix}\n",
" 1 & x^0_1 \\\\\n",
" 1 & x^1_1 \\\\\n",
" \\end{bmatrix}\n",
" \\begin{bmatrix}\n",
" \\theta_0 \\\\\n",
" \\theta_1 \\\\\n",
" \\end{bmatrix}\n",
"$$\n",
"\n",
"Now, let's compute $\\theta$ using the normal equation:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "d2670f5b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[1.000e+00, 1.000e-03],\n",
" [1.000e+00, 2.196e+00]])"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import numpy as np\n",
"X0 = np.array([[0.001],\n",
" [2.196]])\n",
"X = np.c_[np.ones(2),X0]\n",
"y = np.array([4.314825, 10.877373])\n",
"X"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "d95f739b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([4.31183523, 2.9897713 ])"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"theta = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)\n",
"theta"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "b241fb39",
"metadata": {},
"outputs": [],
"source": [
"t0 = theta[0]\n",
"t1 = theta[1]"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "152366f0",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAh8AAAGdCAYAAACyzRGfAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAadUlEQVR4nO3de4xU9fn48Wdg3YUSdrzCsnVEtCoWrRdUAtIoFiWUGPUPq4RavMXGYJSaajWpt6BdMUZNvyFqTQoKKtqo2NZW4g2NigoCDVqrooh4AWvVHUA71uX8/jDsz5WbqzOf2dl9vZITOmfOzHk2p5PzdubsbC7LsiwAABLpVe0BAICeRXwAAEmJDwAgKfEBACQlPgCApMQHAJCU+AAAkhIfAEBSddUe4Os2btwY7733XvTv3z9yuVy1xwEAvoEsy2LdunXR3NwcvXpt+72NLhcf7733XhQKhWqPAQB8C6tXr47dd999m9t0ufjo379/RHw5fGNjY5WnAQC+iWKxGIVCof08vi1dLj42fdTS2NgoPgCgxnyTSyZccAoAJCU+AICkxAcAkFSn4+Opp56K448/PpqbmyOXy8W8efM63H///ffHcccdF7vsskvkcrlYtmxZmUYFALqDTsfHhg0b4qCDDooZM2Zs9f7Ro0fH9OnTv/NwAED30+nfdhk/fnyMHz9+q/efdtppERHx1ltvfeuhAIDuq+q/alsqlaJUKrXfLhaLVZwGAKi0ql9w2tLSEvl8vn3x7aYA0L1VPT4uvfTSaG1tbV9Wr15d7ZEAgAqqenw0NDS0f5upbzUFgMpavHhxHHPMMbF48eKqzVD1+AAA0rnjjjviiSeeiNmzZ1dthk5fcLp+/fpYsWJF++2VK1fGsmXLYuedd4499tgjPvroo3j77bfjvffei4iIV199NSIimpqaoqmpqUxjAwDf1KpVq+LDDz+MXC4X99xzT0REzJ07NyZPnhxZlsWuu+4agwcPTjZPLsuyrDMPWLBgQYwZM2az9ZMnT45Zs2bFrFmz4owzztjs/iuuuCKuvPLK7T5/sViMfD4fra2tPoIBgDL46h97y+VykWVZ+7+bdDIHNtOZ83en46PSxAcAlNedd94Zp59+enzxxReb3VdXVxezZs2KSZMmfad9dOb8XfXv+QAAKmvSpEmx//77x/Dhwze77/nnn49DDz006TwuOAWAHqRXr14d/q3KDFXbMwCQzIABA6KpqSmGDx8et9xySwwfPjyamppiwIAByWdxzQcA9BClUinq6+vbLzb9/PPPo6GhoSzP7ZoPAGAzXw2NXC5XtvDoLB+7AABJiQ8AICnxAQAkJT4AgKTEBwCQlPgAAJISHwBAUuIDAEhKfAAASYkPACAp8QEAJCU+AICkxAcAkJT4AACSEh8AQFLiAwBISnwAAEmJDwAgKfEBACQlPgCApMQHAJCU+AAAkhIfAEBS4gMASEp8AABJiQ8AICnxAQAkJT4AgKTEBwCQlPgAAJISHwBAUuIDAEhKfAAASYkPACAp8QEAJNXp+Hjqqafi+OOPj+bm5sjlcjFv3rwO92dZFpdffnkMGjQo+vbtG2PHjo3XX3+9XPMCADWu0/GxYcOGOOigg2LGjBlbvP+6666L3//+93HLLbfE888/H/369Ytx48bFf//73+88LABQ++o6+4Dx48fH+PHjt3hflmVx0003xW9/+9s44YQTIiLijjvuiIEDB8a8efPi1FNP/W7TAgA1r6zXfKxcuTLWrFkTY8eObV+Xz+djxIgRsXDhwi0+plQqRbFY7LAAAN1XWeNjzZo1ERExcODADusHDhzYft/XtbS0RD6fb18KhUI5RwIAupiq/7bLpZdeGq2tre3L6tWrqz0SAFBBZY2PpqamiIhYu3Zth/Vr165tv+/rGhoaorGxscMCAHRfZY2PIUOGRFNTUzz22GPt64rFYjz//PMxcuTIcu4KAKhRnf5tl/Xr18eKFSvab69cuT
KWLVsWO++8c+yxxx4xderUuPrqq2OfffaJIUOGxGWXXRbNzc1x4oknlnNuAKBGdTo+Fi9eHGPGjGm/feGFF0ZExOTJk2PWrFlx8cUXx4YNG+Kcc86JTz75JEaPHh0PP/xw9OnTp3xTAwA1K5dlWVbtIb6qWCxGPp+P1tZW138AQI3ozPm76r/tAgD0LOIDAEhKfAAASYkPACAp8QEAJCU+AICkxAcAkJT4AACSEh8AQFLiAwBISnwAAEmJDwAgKfEBACQlPgCApMQHAJCU+AAAkhIfAEBS4gMASEp8AABJiQ8AICnxAQAkJT4AgKTEBwCQlPgAAJISHwBAUuIDAEhKfAAASYkPACAp8QEAJCU+AICkxAcAkJT4AACSEh8AQFLiAwBISnwAAEmJDwAgKfEBACQlPgCApMQHAJCU+AAAkqpIfKxbty6mTp0agwcPjr59+8aoUaNi0aJFldgVAFBjKhIfZ599djzyyCMxe/bsWL58eRx33HExduzYePfddyuxOwCghuSyLMvK+YSfffZZ9O/fPx588MGYMGFC+/rhw4fH+PHj4+qrr97m44vFYuTz+WhtbY3GxsZyjgYAVEhnzt9lf+fjiy++iLa2tujTp0+H9X379o2nn3663LsDAGpM2eOjf//+MXLkyJg2bVq899570dbWFnPmzImFCxfG+++/v9n2pVIpisVihwUA6L4qcs3H7NmzI8uy+P73vx8NDQ3x+9//PiZOnBi9em2+u5aWlsjn8+1LoVCoxEgAQBdR9ms+vmrDhg1RLBZj0KBBccopp8T69evjoYce6rBNqVSKUqnUfrtYLEahUHDNBwDUkM5c81FXyUH69esX/fr1i48//jjmz58f11133WbbNDQ0RENDQyXHAAC6kIrEx/z58yPLsthvv/1ixYoVcdFFF8XQoUPjjDPOqMTuAIAaUpFrPlpbW2PKlCkxdOjQ+MUvfhGjR4+O+fPnxw477FCJ3QEANaSi13x8G77nAwBqT1W/5wMAYFvEBwCQlPgAAJISHwBAUuIDAEhKfAAASYkPACAp8QEAJCU+AICkxAcAkJT4AACSEh8AQFLiAwBISnwAAEmJDwAgKfEBACQlPgCApMQHAJCU+AAAkhIfAEBS4gMASEp8AABJiQ8AICnxAQAkJT4AgKTEBwCQlPgAAJISHwBAUuIDAEhKfAAASYkPACAp8QEAJCU+AICkxAcAkJT4AACSEh8AQFLiAwBISnwAAEmJDwAgKfEBACQlPgCApMoeH21tbXHZZZfFkCFDom/fvrH33nvHtGnTIsuycu8KAKhBdeV+wunTp8fNN98ct99+ewwbNiwWL14cZ5xxRuTz+Tj//PPLvTsAoMaUPT6effbZOOGEE2LChAkREbHnnnvG3XffHS+88EK5dwUA1KCyf+wyatSoeOyxx+K1116LiIh//OMf8fTTT8f48eO3uH2pVIpisdhhAQC6r7K/83HJJZdEsViMoUOHRu/evaOtrS2uueaamDRp0ha3b2lpiauuuqrcYwAAXVTZ3/m49957484774y77rorlixZErfffntcf/31cfvtt29x+0svvTRaW1vbl9WrV5d7JACgC8llZf41lEKhEJdccklMmTKlfd3VV18dc+bMiX/961/bfXyxWIx8Ph+tra3R2NhYztEAgArpzPm77O98fPrpp9GrV8en7d27d2zcuLHcuwIAalDZr/k4/vjj45prrok99tgjhg0bFkuXLo0bbrghzjzzzHLvCgCoQWX/2GXdunVx2WWXxQMPPBAffPBBNDc3x8SJE+Pyyy+P+vr67T7exy4AUHs6c/4ue3x8V+IDAGpPVa/5AADYFvEBACQlPgCApMQHAJCU+AAAkhIfAEBS4gMASEp8AABJiQ8AICnxAQAkJT4AgKTEBwCQlPgAAJISHwBAUuIDAEhKfAAASYkPACAp8QEAJCU+AICkxAcAkJT4AACSEh8AQFLiAwBISnwAAEmJDwAgKfEBACQlPg
CApMQHAJCU+AAAkhIfAEBS4gMASEp8AABJiQ8AICnxAQAkJT4AgKTEBwCQlPgAAJISHwBAUuIDAEhKfAAASZU9Pvb
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import matplotlib.pyplot as plt\n",
"plt.plot(X0,y, '*k')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "4f15df79",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"Xvec = np.linspace(0,2.5,2)\n",
"Xnew = np.c_[np.ones((2, 1)), Xvec]\n",
"ypre = Xnew.dot(theta)\n",
"plt.plot(X0,y, '*k')\n",
"plt.plot(Xvec, ypre, '-r')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "a986cb4d",
"metadata": {},
"source": [
"# My Linear Regressor\n",
"\n",
"**Instructions:** Create your own function train (estimate) the model's parameter for a Linear Regressor implementing the *Normal Equation*. Test your model with the next generated data."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "5ca2686a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[<matplotlib.lines.Line2D at 0xffff60435af0>]"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAh8AAAGdCAYAAACyzRGfAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAwKUlEQVR4nO3df3RU5Z3H8U8SJLAsiaL8SieOQFKsVBCrslA9xhiILMuBPXtUOOjSdixZDz2WtbYLfyhw0A20nuq65Shxp8L6A5a2gt3WSpUOUBV/8GsXrKUJZYQRkN2uEIga25m7f9CMTDJJ5k7u3J/v1zlzNJM7M88zN8P9zvN8n+9TZBiGIQAAAJsUO90AAAAQLAQfAADAVgQfAADAVgQfAADAVgQfAADAVgQfAADAVgQfAADAVgQfAADAVv2cbkBnqVRKx44d0+DBg1VUVOR0cwAAQA4Mw9CZM2dUUVGh4uKexzZcF3wcO3ZMlZWVTjcDAADk4ejRowqFQj0e47rgY/DgwZLONb6srMzh1gAAgFy0traqsrIyfR3vieuCj46plrKyMoIPAAA8JpeUCRJOAQCArQg+AACArQg+AACArQg+AACArQg+AACArQg+AACArQg+AACArQg+AACArQg+AACArQg+AACArQg+AAD4s0QioVgspkQi4XRTfM108LFjxw7NnDlTFRUVKioq0ubNmzN+//zzz2vatGm6+OKLVVRUpH379lnUVAAACicajSocDqu2tlbhcFjRaNTpJvmW6eCjra1NEyZM0OrVq7v9/fXXX69Vq1b1uXEAANghkUhowYIFSqVSkqRUKqWGhgZGQArE9K6206dP1/Tp07v9/Z133ilJisfjeTcKAAA7NTc3pwOPDslkUi0tLQqFQg61yr9MBx9Wa29vV3t7e/rn1tZWB1sDAAii6upqFRcXZwQgJSUlqqqqcrBV/uV4wmljY6PKy8vTt8rKSqebBAAImFAopKamJpWUlEg6F3isWbOGUY8CcTz4WLJkiU6fPp2+HT161OkmAQACKBKJKB6PKxaLKR6PKxKJON0k33J82qW0tFSlpaVONwMAAIVCIUY7bOD4yAcAAAgW0yMfZ8+eVUtLS/rnw4cPa9++fRoyZIguvfRS/d///Z+OHDmiY8eOSZIOHjwoSRoxYoRGjBhhUbMBAIBXmR752LVrlyZOnKiJEydKku69915NnDhRDzzwgCTppz/9qSZOnKgZM2ZIkubMmaOJEyfqiSeesLDZAADAq4oMwzCcbsT5WltbVV5ertOnT6usrMzp5gAAgByYuX6T8wEAAGxF8AEAAGxF8AEAAGxF8AEAQIAkEgnFYjFHN80j+AAAICCi0ajC4bBqa2sVDocVjUYdaQerXQAACIBEIqFwONxl87x4PG5JVVdWuwAAgAzNzc0ZgYckJZPJjMKhdiH4AAAgAKqrq1VcnHnZLykpUVVVle1tIfgAACAAQqGQmpqaVFJSIulc4LFmzRpHNtIj5wMAgABJJBJqaWlRVVWVpYGHmeu36Y3lAACAd4VCIUdGO87HtAsAALAVwQcAALAVwQcAALAVwQcAALAVwQcAALAVwQcAALAVwQcAALAVwQcAALAVwQcAALAVwQcAALAVwQcAAHlIJBKKxWJKJBJON8VzCD4AADApGo0qHA6rtrZW4XBY0WjU6SZ5CrvaAgBgQiKRUDgcViqVSt9XUlKieDzu+IZtTjJz/WbkAwAAE5qbmzMCD0lKJpNqaWnJejzTM10RfAAAYEJ1dbWKizMvnyUlJaqqqupyLNMz2RF8AEBA8Y08P6FQSE1NTSopKZF0LvBYs2ZNlymXRCKhBQsWpEdJUqmUGhoaeL9F8AEAgcQ38r6JRCKKx+OKxWKKx+OKRCJdjjE7PWMVLwSVJJwCQMCQMGkPJ97naDSaHm0pLi5WU1NT1sCoEEg4BQB0y6lv5EGT6/SMVbw0zdPP6QYAAOzVkTDZ+Rt5toRJ9E
0kElF9fb1aWlpUVVVV0JGlnoJKt41oMfIBAAFj9zfyoAuFQqqpqSn4+2tmFY7TTAcfO3bs0MyZM1VRUaGioiJt3rw54/eGYeiBBx7QyJEjNXDgQNXV1am5udmq9gIALJBLwiS8xUtBpengo62tTRMmTNDq1auz/v673/2uHnvsMT3xxBN68803NWjQINXX1+uTTz7pc2MBANax6xs57OOVoLJPq12Kioq0adMmzZ49W9K5UY+Kigp961vf0n333SdJOn36tIYPH661a9dqzpw5vT4nq10AAPAex1a7HD58WCdOnFBdXV36vvLyck2aNEk7d+7M+pj29na1trZm3AAAgH9ZGnycOHFCkjR8+PCM+4cPH57+XWeNjY0qLy9P3yorK61sEgAAcBnHV7ssWbJEp0+fTt+OHj3qdJMAAEABWRp8jBgxQpL0wQcfZNz/wQcfpH/XWWlpqcrKyjJuAADAvywNPkaNGqURI0Zo69at6ftaW1v15ptvavLkyVa+FAAA8CjTFU7Pnj2bUYL38OHD2rdvn4YMGaJLL71UixYt0oMPPqjq6mqNGjVK999/vyoqKtIrYgAAQLCZDj527dqlm266Kf3zvffeK0maP3++1q5dq+985ztqa2vTggULdOrUKV1//fV66aWXNGDAAOtaDQAAPItdbQEAQJ+xqy0AIJASiYRisZgrd3LFZwg+AAC+EI1GFQ6HVVtbq3A4rGg06nST0A2mXQAAnpdIJBQOhzO2lC8pKVE8Hvfs3jWJRELNzc2qrq72RB+YdgEABEpzc3NG4CFJyWQyY3Wml/h9FIfgAwDgedXV1SouzryklZSUqKqqyqEW5S+RSGjBggXpYCqVSqmhocFXeSwEHwAAzwuFQmpqalJJSYmkc4HHmjVrPDFd0ZnfRnGyMV3nAwAAN4pEIqqvr1dLS4uqqqo8GXhIn43idM5f8eIoTncY+QAA+EYoFFJNTY1nAw+p+1EcSb5ZRkzwAQCAA3qqSRKJRBSPxxWLxRSPxyXJVwmoLLUFAMBm0Wg0nVRaXFyspqYmRSKRrMd6ZRkxS20BAH1GtdDCMLuaxY8JqAQfAJAnP1+c/V5nwklmgwk/LSPuQPABAHnw88U5CHUmnGQ2mPDTMuIOBB8AYJLfL85+HOZ3k3yCic4JqN3lh3gFdT4AwKSeLs5e/jbaIQh1JpyWT02SUCjki78viZEPADDNj3Pw5/PjML8b+aEmSb4IPgDApCBcnP02zO8Xfklyps4HAOQpkUh4vpQ3vMNMbRAnmLl+E3wAAOByXig0RpExAAB8xG8rkAg+AABwOb8lORN8AADgcn5LcibnAwAAj3BzkrOZ6zdFxgAA8Ai/FBpj2gUAANiK4AMA4Dl+KbYVVAQfAABP8fOOwkFBwikAwDO8UGwrqCgyBgDwJb8V2woqgg8AgGf4rdhWUBF8AAA8w2/FtoKKnA8AgOe4udhWUDme83HmzBktWrRI4XBYAwcO1JQpU/T2228X4qUAAAEUCoVUU1ND4OFRBQk+7rrrLr388st6+umntX//fk2bNk11dXV6//33C/FyAADAQyyfdvn44481ePBgvfDCC5oxY0b6/i996UuaPn26HnzwwR4fz7QLAKBDIpFQc3OzqqurGeVwOUenXf70pz8pmUxqwIABGfcPHDhQr776apfj29vb1dramnEDAIBiYv5lefAxePBgTZ48WStWrNCxY8eUTCb1zDPPaOfOnTp+/HiX4xsbG1VeXp6+VVZWWt0kAIDHJBIJLViwIF3TI5VKqaGhgXLqPlGQnI+nn35ahmHoc5/7nEpLS/XYY49p7ty5XdZmS9KSJUt0+vTp9O3o0aOFaBIAwEMoJuZv/QrxpGPGjNH27dvV1tam1tZWjRw5UrfffrtGjx7d5djS0lKVlpYWohkAAI/qKCbWuYw6xcT8oaBFxgYNGqSRI0fqww8/1JYtWzRr1qxCvhwAwCe8WEyMnXZzV5AiY1u2bJFhGBo7dqxaWl
r07W9/WwMGDNCvf/1rXXDBBT0+ltUuAIAOXikmFo1G0zkqxcXFampqUiQScbpZtjJz/S5I8LFx40YtWbJEiURCQ4Y
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"np.random.seed(42)\n",
"X0 = 2 * np.random.rand(100, 1)\n",
"X = np.c_[np.ones((100, 1)), X0] # add x0 = 1 to each instance\n",
"y = 4 + 3 * X0 + np.random.randn(100, 1)\n",
"plt.plot(X0, y, '.k')"
]
},
{
"cell_type": "markdown",
"id": "43a9426c",
"metadata": {},
"source": [
"Therefore, the model will become:\n",
"\n",
"$$\n",
" \\begin{bmatrix}\n",
" \\hat{y}^0 \\\\ \n",
" \\hat{y}^1\\\\\n",
" \\vdots \\\\\n",
" \\hat{y}^n\n",
" \\end{bmatrix}\n",
" =\n",
" \\begin{bmatrix}\n",
" 1 & x^0_1 \\\\\n",
" 1 & x^1_1 \\\\\n",
" \\vdots & \\vdots \\\\\n",
" 1 & x^n_1\n",
" \\end{bmatrix}\n",
" \\begin{bmatrix}\n",
" \\theta_0 \\\\\n",
" \\theta_1 \\\\\n",
" \\vdots \\\\\n",
" \\theta_n\n",
" \\end{bmatrix}\n",
"$$\n"
]
},
{
"cell_type": "markdown",
"id": "c07852f4",
"metadata": {},
"source": [
"## Using Sklearn to train the model\n",
"Python already includes a linear regression function within Scikit-Learn."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "17481af6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(array([4.21509616]), array([[0. , 2.77011339]]))"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sklearn.linear_model import LinearRegression\n",
"\n",
"lin_reg = LinearRegression()\n",
"lin_reg.fit(X, y)\n",
"lin_reg.intercept_, lin_reg.coef_"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "c5e85264",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[4.21509616],\n",
" [9.75532293]])"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X_new = np.array([[0], [2]])\n",
"X_new_b = np.c_[np.ones((2, 1)), X_new] #Se agrega x0=1 para cada instancia\n",
"y_Pred=lin_reg.predict(X_new_b)\n",
"y_Pred"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "20da996b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.legend.Legend at 0xffff453bbb50>"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAh8AAAGdCAYAAACyzRGfAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAABRIUlEQVR4nO3df5yM5f4/8Nfssj+wu36EXWat3bVtEkoiyrG2ZXMkOb6KI4ePwSYVJaJCftQSR0VOLBtKKZUfJ3WoNEtU8vugaJed7ORXiV27WMxc3z+udo6xv2Z275n7vmdez8djHplr7pn7uvfe7X7Pdb+v92UQQggQEREReUmA2h0gIiIi/8Lgg4iIiLyKwQcRERF5FYMPIiIi8ioGH0RERORVDD6IiIjIqxh8EBERkVcx+CAiIiKvqqF2B25kt9tx4sQJhIWFwWAwqN0dIiIicoEQAhcuXECTJk0QEFDx2Ibmgo8TJ04gOjpa7W4QERFRFeTl5cFoNFa4jeaCj7CwMACy8+Hh4Sr3hoiIiFxRUFCA6Ohox3W8IpoLPkputYSHhzP4ICIi0hlXUiaYcEpERERexeCDiIiIvIrBBxEREXmV5nI+XCGEwLVr12Cz2dTuClVBYGAgatSowanURER+SnfBx5UrV3Dy5ElcvHhR7a5QNdSqVQtRUVEICgpSuytERORlugo+7HY7cnNzERgYiCZNmiAoKIjfnnVGCIErV67gt99+Q25uLhISEiotRkNERL5FV8HHlStXYLfbER0djVq1aqndHaqi0NBQ1KxZE7/88guuXLmCkJAQtbtERERepMuvnPymrH88h0RE/otXACIiIvIqBh9ERETkVQw+VJaUlISxY8eq3Q0iIr9jtVphNpthtVrV7orfcTv42Lp1K3r37o0mTZrAYDBg3bp1Tq+vWbMGPXr0QIMGDWAwGLBv3z6FukpZWVkwGAw4f/682l0hItK1zMxMxMTEIDk5GTExMcjMzFS7S37F7eCjqKgIbdu2xcKFC8t9/d5778Xs2bOr3TlPY9RLROR/rFYrRo4cCbvdDkCWcUhLS+O1wIvcDj569uyJmTNnom/fvmW+PnjwYEyZMgUpKSnV7pwnqRH1FhUV4R//+Afq1KmDqKgo/POf/3R6/d1330X79u0RFhaGyMhI/P3vf8eZM2cAABaLBd26dQMA1KtXDwaDAUOHDgUAbNy4Effeey/q1q2LBg0a4IEHHsDRo0c9fjxERHqUnZ3tCDxK2Gw25OTkqNQj/6N6zkdxcTEKCgqcHp6mVtQ7fvx4bNmyBevXr8cXX3yBrKws7Nmzx/H61atXMWPGDOzfvx/r1q2DxWJxBBjR0dH45JNPAABHjhzByZMn8cYbbwCQQc0zzzyDXbt2YfPmzQgICEDfvn1L/XERERHKLG4YGBiIFi1aqNQj/6N6kbH09HRMmzbNq/usKOo1Go0e2WdhYSEyMzOxcuVK3HfffQCAFStWOO1v2LBhjn/HxcVh/vz5uOuuu1BYWIg6deqgfv36AIBGjRqhbt26jm379evntK+3334bDRs2xI8//ojbbrvNI8dDRKRXRqMRGRkZSEtLg81mQ2BgIBYvXuyx//9TaaqPfEyaNAn5+fmOR15ensf3qUbUe/ToUVy5cgUdO3Z0tNWvXx+JiYmO57t370bv3r3RrFkzhIWFoWvXrgCA48ePV/jZ2dnZGDhwIOLi4hAeHo7mzZu79D4iIn9lMplgsVhgNpthsVhgMpnU7pJfUT34CA4ORnh4uNPD00qi3sDAQADQRNRbVFSE1NRUhIeH47333sPOnTuxdu1aALKsfEV69+6NP/74A0uWLMGOHTuwY8cOl95HROTPjEYjkpKSOOKhAtVvu6jFZDIhNTUVOTk5aNGihcd/+eLj41GzZk3s2LEDzZo1AwCcO3cOP//8M7p27YrDhw/j7NmzmDVrFqKjowEAu3btcvqMkhVgbTabo+3s2bM4cuQIlixZgi
5dugAAtm3b5tFjISIiqg63g4/CwkKnjODc3Fzs27cP9evXR7NmzfDHH3/g+PHjOHHiBACZHAkAkZGRiIyMVKjbyjAajV6LeOvUqQOTyYTx48ejQYMGaNSoEV544QXH7Z9mzZohKCgICxYswGOPPYaDBw9ixowZTp8RExMDg8GADRs24K9//StCQ0NRr149NGjQABkZGYiKisLx48cxceJErxwTERFRlQg3mc1mAaDUY8iQIUIIIZYtW1bm61OnTnXp8/Pz8wUAkZ+fX+q1S5cuiR9//FFcunTJ3W5rwoULF8Sjjz4qatWqJRo3bixeffVV0bVrVzFmzBghhBDvv/++aN68uQgODhadOnUS//73vwUAsXfvXsdnTJ8+XURGRgqDweD4mX/55ZeiZcuWIjg4WLRp00ZkZWUJAGLt2rVeP0ZX6f1cEhGRs4qu3zcyCCGEGkFPeQoKChAREYH8/PxS+R+XL19Gbm4uYmNjuQy7zvFcEhH5loqu3zdSPeGUiIiI/AuDDyIiIvIqBh9ERETkVQw+iIiIfJCWF09l8EFERORj1Fg81R0MPoiIiHyIWounuoPBBxERkQ+paPFUrWDwQURE5EPUWDzVXQw+yCVDhw7FQw895PL2WVlZMBgMOH/+vMf6REREpWlx8dQbMfjwot9++w2jRo1Cs2bNEBwcjMjISKSmpmL79u0AgObNm+P11193bC+EwLPPPovw8HBkZWWp02kiItIdk8kEi8UCs9kMi8UCk8mkdpec+O2qtmro168frly5ghUrViAuLg6nT5/G5s2bcfbs2VLb2mw2jBgxAhs2bIDZbMadd96pQo+JiEivvLl4qrs48uEl58+fxzfffIPZs2ejW7duiImJQYcOHTBp0iQ8+OCDTtsWFxejf//++Oqrr/DNN9+UG3iU3Ap55ZVX0LhxY9StWxfTp0/HtWvXMH78eNSvXx9GoxHLli1zet+BAweQnJyM0NBQNGjQACNHjkRhYaHjdZvNhmeeeQZ169ZFgwYNMGHCBNy4BJDdbkd6ejpiY2MRGhqKtm3b4uOPP1bop0VERL5M/8GHEEBRkToPN9bkq1OnDurUqYN169ahuLi43O0KCwvRq1cv/Pjjj9i+fTsSExMr/Nyvv/4aJ06cwNatWzFv3jxMnToVDzzwAOrVq4cdO3bgsccec5piVVRUhNTUVNSrVw87d+7ERx99hK+++gpPPPGE4zP/+c9/Yvny5Xj77bexbds2/PHHH1i7dq3TftPT0/HOO+9g0aJFOHToEJ5++mk8+uij2LJli8s/EyIi8lMeXmHXbRUtyVvmMuyFhULIMMD7j8JCt47t448/FvXq1RMhISGic+fOYtKkSWL//v2O12NiYkRQUJBo0KCBOHPmTKWfN2TIEBETEyNsNpujLTExUXTp0sXx/Nq1a6J27dpi1apVQgghMjIyRL169UThdX3/7LPPREBAgDh16pQQQoioqCjx6quvOl6/evWqMBqNok+fPkIIIS5fvixq1aolvv32W6f+mEwmMXDgQCGEEGazWQAQ586dK7PvZZ5LIiLSrYqu3zfS/8iHjvTr1w8nTpzAv//9b9x///3IyspCu3btsHz5csc2PXr0QFFREV555RVH2/Hjxx0jJ3Xq1HF6rVWrVk5Tqho3bozWrVs7ngcGBqJBgwY4c+YMAOCnn35C27ZtUbt2bcc299xzD+x2O44cOYL8/HycPHkSHTt2dLxeo0YNtG/f3vE8JycHFy9eRPfu3Z369c477+Do0aPK/LCIiMhn6T/htFYt4Lp8Ba/v200hISHo3r07unfvjsmTJ2P48OGYOnUqhg4dCgC477778OSTT6JPnz6w2+1444030KRJE+zbt8/xGfXr13f8u2bNmk6fbzAYymy7seBMdZTkh3z22Wdo2rSp02vBwcGK7YeIiHyT/oMPgwG47lu83tx6661Yt26dU1uPHj3w6aef4sEHH4QQAvPnz1esOEzLli2xfPlyFBUVOU
Y/tm/fjoCAACQmJiIiIgJRUVHYsWMH/vKXvwAArl27ht27d6Ndu3aOPgcHB+P48ePo2rWrIv0iIiL/of/gQyfOnj2
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plt.plot(X0,y, '.k', label='data')\n",
"plt.plot(X_new, y_Pred, '-r', label='SK-model')\n",
"plt.legend()"
]
},
{
"cell_type": "markdown",
"id": "19ad47f6",
"metadata": {},
"source": [
"# Batch Gradient Descent Example for Linear Model\n",
"\n",
"$\\nabla MSE(\\theta)=\\frac{2}{m} X^T(X\\theta-y)$ \n",
"\n",
"$\\theta_N=\\theta-\\eta \\nabla MSE(\\theta)$"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "da8103f8",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAh8AAAGdCAYAAACyzRGfAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjYuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8o6BhiAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAr+klEQVR4nO3df3SU1Z3H8c9MMIFlSRB/ANmJ2SipthQprsKC2+NIo9FVKmd3K3jUsu1Ycjx2K8f+QP5A6rpdsOvR6h6OJTIVWn9QuhXtcVutxYmeKlUqsKttlyaWwUwpcranzQ+oWTdz9w82UyY/JjOT58d95nm/zpmjTJ7M3CfPJPf73Pu93xsxxhgBAAB4JOp3AwAAQLgQfAAAAE8RfAAAAE8RfAAAAE8RfAAAAE8RfAAAAE8RfAAAAE8RfAAAAE9N8rsBw2WzWR05ckTTpk1TJBLxuzkAAKAIxhj19fWpvr5e0WjhsQ3rgo8jR46ooaHB72YAAIAydHd3KxaLFTzGuuBj2rRpkk42vra21ufWAACAYvT29qqhoSHXjxdiXfAxNNVSW1tL8AEAQMAUkzJBwikAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPBUycHHyy+/rGXLlqm+vl6RSERPP/107mvvv/++1q5dq3nz5mnq1Kmqr6/XJz/5SR05csTJNgMAgAArOfg4fvy45s+fr82bN4/42okTJ7Rv3z6tX79e+/bt01NPPaWDBw/q4x//uCONBQAAxctkMkqlUspkMn43JU/EGGPK/uZIRLt27dLy5cvHPGbv3r1auHChDh8+rHPOOWfc1+zt7VVdXZ16enrY2wUAgDIlk0mtXr1a2WxW0WhU7e3tSiQSrr1fKf236zkfPT09ikQimj59+qhfHxgYUG9vb94DAACUL5PJ5AIPScpms2pra7NmBMTV4OO9997T2rVrdcMNN4wZBW3cuFF1dXW5R0NDg5tNAgCg4nV2duYCjyGDg4Pq6uryqUX5XAs+3n//fV1//fUyxujhhx8e87h169app6cn9+ju7narSQAAhEJzc7Oi0fwuvqqqSnPmzPGpRflcCT6GAo/Dhw/rhRdeKDj3U1NTo9ra2rwHAAAoXywWU3t7u6qqqiSdDDy2bNmiWCzmc8tOmuT0Cw4FHp2dnUqlUjrjjDOcfgsAADCORCKh1tZWdXV1ac6cOdYEHlIZwUd/f3/enNGhQ4d04MABzZgxQ7Nnz9bf/d3fad++fXr22Wc1ODioo0ePSpJmzJih6upq51oOAAAKisViVgUdQ0peatvR0aHLL798xPOrVq3Sl7/8ZTU1NY36falUSvF4fNzXZ6ktAADBU0r/XfLIRzweV6F4ZQJlQwAAQAiwtwsAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAAPAUwQcAoCJlMhmlUillMhm/m4JhCD4AABUnmUyqsbFRS5cuVWNjo5LJpN9NwikixhjjdyNO1dvbq7q6OvX09Ki2ttbv5gAAAiaTyaixsVHZbDb3XFVVldLptGKxmI8tq2yl9N+MfAAAKkpnZ2de4CFJg4OD6urq8q
lFGI7gAwBQUZqbmxWN5ndvVVVVmjNnjk8twnAEHwCAihKLxdTe3q6qqipJJwOPLVu2MOViEXI+AAAVKZPJqKurS3PmzCHw8EAp/fckj9oEAICnYrEYQYelmHYBAIQWtUD8QfABAAglaoH4h5wPAAiQTCajzs5ONTc3M6UwAdQCcR51PgCgAgX5Tt226Y0w1wKx4VoQfABAAGQyGa1evTrXYWazWbW1tVnTmRdiY9A0Vi2QY8eOBeJnWi5brgXBBwAEwFh36g8++KBPLSqOrUHT8Fog0WhU2WxWK1assCZAcppN14LgAwACYLQ7dUm6//77fe/IC7F5eiORSCidTmvnzp0yxmgoBdKWAMlpNl0Lgg8ACIBYLKY77rhjxPPZbNaKjvxUp+YU2F7qPBaL6cwzz9TwtRe2BEhOsulaEHwAQEDcfvvtikQiec/Z1JFLI3MKnn/+eetLndvUKbvJprLzJQcfL7/8spYtW6b6+npFIhE9/fTTeV83xuiuu+7S7NmzNWXKFLW0tKizs9Op9gJAaMViMT3yyCNWdB6jGSunoLW1Vel0WqlUSul0WolEwueW5rOpU3bb0FST39ei5PLqx48f1/z58/XpT39af/M3fzPi61/96lf10EMPafv27WpqatL69evV2tqqn//855o8ebIjjQaAsEokEmptbbVyz5JCOQXxeNyqtg5n88/VaTaUnZ9QkbFIJKJdu3Zp+fLlkk6OetTX1+vzn/+8vvCFL0iSenp6NHPmTG3btk0rV64c9zUpMgYAwUThrnDzrcjYoUOHdPToUbW0tOSeq6ur06JFi7Rnzx4n3woAYJkwTV9gYhzd1fbo0aOSpJkzZ+Y9P3PmzNzXhhsYGNDAwEDu3729vU42CQDgoXKnLygbHy6+r3bZuHGj6urqco+Ghga/mwQAmIBYLFZSjoctVTfhHUeDj1mzZkmS3n333bzn33333dzXhlu3bp16enpyj+7ubiebBACwmE1VN4PMhv1aSuFo8NHU1KRZs2Zp9+7dued6e3v12muvafHixaN+T01NjWpra/MeAIBwsKXqZtA671MFceSo5OCjv79fBw4c0IEDBySdTDI9cOCA3nnnHUUiEa1Zs0b/9E//pO9973t688039clPflL19fW5FTEAAAyxocBXEDvvIUEdOSo5+PjpT3+qBQsWaMGCBZKkO+64QwsWLNBdd90lSfrSl76kf/iHf9Dq1at1ySWXqL+/X8899xw1PgAAI/i9QiaonfcQW0aOSjWhOh9uoM4HAIRPJpPxpcBXKpXS0qVLR30+Ho971o5y2VRbxbc6HwAAlKPUFTJOsWHaZyL8HjkqF8EHACC0bO+8i0mEtWW/llIw7QIACD2/pn0KSSaTuXyUaDSq9vZ2qwOLUvpvgg8AACxjUy5Hscj5AAAgwIK6iqVYBB8A4KMgF7eCe4pNhA3q54fgAwB8EuTiVnBXMYmwQf78kPMBAD7IZDI655xzdOqfYNvn9OG9sRJhbcwJKaX/nuRRmwAAp3jwwQc1/N5vaE6f4ANDYrHYqJ+HQjkhQfj8MO0CAB7LZDK6//77RzwfjUYDU9wK/gp6cTSCDwDw2Gh3rdLJvbKCcNcK/9leHG085HwAgMdsnK9HMNlUHI06HwBgsaDftcIefu2JM1GMfACAT2y6ay1XJpNRZ2enmpubA3sOcAYjHwAQAEG9ax0S5DoTYWBzATKCDwBAyTKZTG7TM0nKZrNqa2uzsqMLI9sDQ4IPAEDJKn3vkSALQmBI8AEAKFnQ60xUsiAEhgQfAICSsWLHXkEIDAk+AABlSSQSSqfTSqVSSqfTSiQSfjcJCkZgyFJbAAAqkNdLudlYDgCAkBtrUzobMO0CAAA8RfABACia04WrbC6EBfcQfAAAiuJ04SrbC2HBPSScAgDG5fROvOzs6wyb9tZhbxcAgKOcLlwVhEJYtgvyyBHBBwBgXE4XrgpCISybBaGEeiEEHwCAcTlduCoIhbBsFvSRI3
I+AABFc7pwldeFsCqFjTkzFBkDALjC6cJVNhfCstnQyFFbW5sGBwcDN3LEyAcAAAFl08gRIx8AAFjErSWxQR05IuE
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"X = 2 * np.random.rand(100, 1)\n",
"Xb = np.c_[np.ones((100, 1)), X] # add x0 = 1 to each instance\n",
"y = 4 + 3 * X + np.random.randn(100, 1)\n",
"plt.plot(X, y, '.k')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "c93a4eb5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[-0.52563093],\n",
" [ 0.03240701]])"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"eta = 0.01 # learning rate\n",
"ni = 20 # iterations\n",
"m = 100 # dataset size\n",
"theta = np.random.randn(2,1) # random initialization\n",
"theta"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "19aaf38d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[4.24631965],\n",
" [2.93425959]])"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"for iteration in range(ni):\n",
" gradients = 2/m * Xb.T.dot(Xb.dot(theta) - y)\n",
" theta = theta - eta * gradients\n",
"theta"
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "1a8ec08d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
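"# Batch gradient descent: plot the training data once, then overlay the\n",
"# fitted line obtained after each of the ni iterations.\n",
"plt.plot(X, y, '.k')\n",
"Xnew = 2 * np.random.rand(100, 1)\n",
"Xbw = np.c_[np.ones((100, 1)), Xnew]  # add x0 = 1 to each instance\n",
"for iteration in range(ni):\n",
"    gradients = 2/m * Xb.T.dot(Xb.dot(theta) - y)  # gradient of the MSE\n",
"    theta = theta - eta * gradients  # step against the gradient\n",
"    ypre = Xbw.dot(theta)\n",
"    plt.plot(Xnew, ypre, '-r')"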
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
}
},
"nbformat": 4,
"nbformat_minor": 5
}