You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

411 lines
71 KiB
Plaintext

7 months ago
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# OLS Ordinary Least Squares\n",
"\n",
"The OLS general model $\\hat{y}$ is defined by: \n",
"\n",
"$$ \\hat{y} = \\theta_0+\\theta_1 x_1 $$\n",
"\n",
"Applying the partial derivatives with rescpect $\\theta_0$ and equaliting to zero:\n",
"\n",
"$$\\frac{\\partial SSR}{\\partial \\theta_0}=0 $$\n",
"\n",
"here SSR is defined as:\n",
"\n",
"$$ \\sum_{i=1}^n (y^i - \\hat{y}^i)^2 $$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Resulting in:\n",
"\n",
"$$ \\theta_0 = \\frac{\\sum_{i=1}^n y^i}{n} - \\frac{\\theta_1 \\sum_{i=1}^n x^i}{n}$$\n",
"\n",
"or \n",
"\n",
"$$ \\theta_0 = \\bar{y} -\\theta_1 \\bar{x} $$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In a similar way, the partial derivative of SSR with respect of $\\theta_1$ will result in: \n",
"\n",
"$$\\theta_1 = \\frac{\\sum_{i=1}^n x^i(y^i-\\bar{y}) }{\\sum_{i=1}^n x^i(x^i-\\bar{x})}$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Implementing OLS in Python"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 3.9654 , 4.50131579, 5.03723158, 5.57314737, 6.10906316,\n",
" 6.64497895, 7.18089474, 7.71681053, 8.25272632, 8.78864211,\n",
" 9.32455789, 9.86047368, 10.39638947, 10.93230526, 11.46822105,\n",
" 12.00413684, 12.54005263, 13.07596842, 13.61188421, 14.1478 ])"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import numpy as np\n",
"x = np.linspace(0,4,20)\n",
"theta0 = 3.9654\n",
"theta1 = 2.5456\n",
"y = theta0+theta1*x\n",
"y"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAh8AAAGdCAYAAACyzRGfAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8g+/7EAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAj5klEQVR4nO3df4wcZ33H8c/uXO6cBt8Rp8T2ac4TwywGnMQQQiInlWJv3Fqu5cZ/8CtyjQWubCrT2AQBtkQIUaCXVAgCxUp26RCn5UdKARvEr9QN67iQBGyfrzUUwg54zT0Ex6pEb20Hjmh3+sfVq659d7497z6zs/d+SfPHzT17833yZP18NM/sPqkoiiIBAABYko67AAAAMLsQPgAAgFWEDwAAYBXhAwAAWEX4AAAAVhE+AACAVYQPAABgFeEDAABY1RV3AeerVqt6/vnnNXfuXKVSqbjLAQAA0xBFkU6fPq3+/n6l01Pf22i78PH8889rYGAg7jIAAMAMjIyMyHXdKdu0XfiYO3eupPHie3t7Y64GAABMR7lc1sDAQG0en0rbhY9zSy29vb2EDwAAEmY6j0zwwCkAALCK8AEAAKwifAAAAKsIHwAAwCrCBwAAsIrwAQAArCJ8AAAAqwgfAADAKsIHAACwivABAACsInwAADCLGGNUKBRkjImtBsIHAACzRBAE8jxP2WxWnucpCIJY6khFURTFcuVJlMtl9fX1aXR0lI3lAABoEmOMPM9TtVqtnXMcR6VSSa7rXvLfb2T+5s4HAACzQLFYrAseklSpVBSGofVaGg4fBw8e1Lp169Tf369UKqV9+/ZN2vbd7363UqmUHnrooUsoEQAAXKpMJqN0un7adxxHvu9br6Xh8HH27FktW7ZMu3fvnrLd3r179eyzz6q/v3/GxQEAgOZwXVf5fF6O40gaDx65XK4pSy6N6mr0BWvWrNGaNWumbPPrX/9af/M3f6MnnnhCa9eunXFxAACgeTZv3qzVq1crDEP5vh9L8JBmED4uplqtauPGjXr/+9+vpUuXXrT92NiYxsbGaj+Xy+VmlwQAAP6P67qxhY5zmv7A6YMPPqiuri7ddddd02o/ODiovr6+2jEwMNDskgAAQBtpavg4cuSIPvWpT2nPnj1KpVLTes2uXbs0OjpaO0ZGRppZEgAAaDNNDR///u//rlOnTmnRokXq6upSV1eXTpw4ofe973265pprJnxNT0+Pent76w4AANC5mvrMx8aNG7Vq1aq6c6tXr9bGjRv1zne+s5mXAgAACdVw+Dhz5kzdF5IcP35cw8PDmjdvnhYtWqSrrrqqrv1ll12mBQsWaMmSJZdeLQAASLyGw8fhw4e1cuXK2s933323JGnTpk3as2dP0woDAACdqeHwsWLFCjWyHUypVGr0EgAAoIOxtwsAALCK8AEAAKwifAAAAKsIHwAAtBFjjAqFgowxcZfSMoQPAADaRBAE8jxP2WxWnucpCIK4S2qJVNTIR1csKJfL6uvr0+joKN92CgCYNYwx8jxP1Wq1ds5xHJVKpdg3gpuORuZv7nwAANAGisViXfCQpEqlUvfFnp2C8AEAQBvIZDJKp+unZcdx5Pt+TBW1DuEDAIA24Lqu8vm8HMeRNB48crlcIpZcGsUzHwAAtBFjjMIwlO/7iQoejczfTd3VFgAAXBrXdRMVOmaCZRcAAGAV4QMAAFhF+AAAAFYRPgAAgFWEDwAAYBXhAwAAWEX4AAAAVhE+AACAVYQPAABgFeEDAABYRfgAAKABxhgVCgUZY+IuJbEIHwAATFMQBPI8T9lsVp7nKQiCuEtKJHa1BQBgGowx8jxP1Wq1ds5xHJVKpY7fCG46Gpm/ufMBAMA0FIvFuuAhSZVKRWEYxlRRchE+AACYhkwmo3S6ftp0HEe+78dUUXIRPgAAmAbXdZXP5+U4jqTx4JHL5VhymQGe+QAAoAHGGIVhKN/3CR7/TyPzd5elmgAA6Aiu6xI6LhHLLgAAwCrCBwAAsIrwAQAArCJ8AAAAqwgfAADAKsIHAACwivABAACsInwAAACrCB8AAMAqwgcAALCK8AEA6DjGGBUKBRlj4i4FEyB8AAA6ShAE8jxP2WxWnucpCIK4S8J52NUWANAxjDHyPE/VarV2znEclUolNoNrsUbmb+58AAA6RrFYrAseklSpVBSGYUwVYSKEDwBAx8hkMkqn66c2x3Hk+35MFWEihA8AQMdwXVf5fF6O40gaDx65XI4llzbDMx8AgI5jjFEYhvJ9n+BhSSPzd5elmgAAsMZ1XUJHG2PZBQAAWEX4AAAAVhE+AACAVYQPAABgFeEDAABYRfgAAABWET4AAIBVhA8AAGBVw+Hj4MGDWrdunfr7+5VKpbRv377a71566SV98IMf1HXXXacrrrhC/f39esc73qHnn3++mTUDAIAEazh8nD17VsuWLdPu3bsv+N2LL76ooaEh3XPPPRoaGtLXvvY1Pffcc/qLv/iLphQLAOgMxhgVCgUZY+IuBTG4pL1dUqmU9u7dq/Xr10/a5tChQ7rpppt04sQJLVq06KJ/k71dAKCzBUGgLVu2qFqtKp1OK5/Pa/PmzXGXhUvUyPzd8mc+RkdHlUql9PKXv3zC34+NjalcLtcdAIDOZIypBQ9Jqlar2rp1K3dAZpmWho/f//73+uAHP6g777xz0hQ0ODiovr6+2jEwMNDKkgAAMSoWi7XgcU6lUlEYhjFVhDi0LHy89NJLeutb36ooivTwww9P2m7Xrl0aHR2tHSMjI60qCQAQs0wmo3S6fupxHEe+78dUEeLQkvBxLnicOHFC+/fvn3Ltp6enR729vXUHAKAzua6rfD4vx3EkjQePXC4n13Vjrgw2dTX7D54LHsViUYVCQVdddVWzLwEASLDNmzdr9erVCsNQvu8TPGahhsPHmTNn6tbmjh8/ruHhYc2bN08LFy7Um9/8Zg0NDemb3/ymKpWKTp48KUmaN2+euru7m1c5ACCxXNcldMxiDX/U9sCBA1q5cuUF5zdt2qSPfOQjWrx48YSvKxQKWrFixUX/Ph+1BQAgeRqZvxu+87FixQpNlVcu4WtDAADALMDeLgAAwCrCBwAAsIrwAQAArCJ8AAAAqwgfAADAKsIHAACwivABAACsInwAAACrCB8AgAsYY1QoFGSMibsUdCDCBwCgThAE8jxP2WxWnucpCIK4S0KHaXhvl1ZjbxcAiI8xRp7nqVqt1s45jqNSqcRGcJhSI/M3dz4AADXFYrEueEhSpVKp280cuFSEDwBATSaTUTpdPzU4jiPf92OqCJ2I8AEAqHFdV/l8Xo7jSBoPHrlcjiUXNBXPfAAALmCMURiG8n2f4IFpaWT+7rJUEwAgQVzXJXSgZVh2AQAAVhE+AACAVYQPAABgFeEDAABYRfgAAABWET4AAIBVhA8AAGAV4QMAAFhF+AAAAFYRPgAAgFWEDwBIIGOMCoWCjDFxlwI0jPABAAkTBIE8z1M2m5XneQqCIO6SgIawqy0AJIgxRp7nqVqt1s45jqNSqcRGcIhVI/M3dz4AIEGKxWJd8JCkSqWiMAxjqghoHOEDABIkk8kona7/p9txHPm+H1NFQOMIHwCQIK7rKp/Py3EcSePBI5fLseSCROGZDwBIIGOMwjCU7/sED7SFRubvLks1AQCayHVdQgcSi2UXAABgFeEDAABYRfgAAABWET4AAIBVhA8AAGAV4QMAAFhF+AAAAFYRPgAAgFWEDwAAYBXhAwBaxBijQqEgY0zcpQBthfABAC0QBIE8z1M2m5XneQqCIO6SgLbBxnIA0GTGGHmep2q1WjvnOI5KpRL7saBjNTJ/c+cDAJqsWCzWBQ9JqlQqCsMwpoqA9kL4AIAmy2QySqfr/3l1HEe+78dUEdBeCB8A0GSu6yqfz8txHEnjwSOXy7HkAvwfnvkAgBYxxigMQ/m+T/BAx2tk/u6yVBMAzDqu6xI6gAmw7AIAAKwifAAAAKsIHwAAwCrCBwAAsKrh8HHw4EGtW7dO/f39SqVS2rdvX93voyjShz/8YS1cuFCXX365Vq1apWKx2Kx6AQBAwjUcPs6ePatly5Z
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import matplotlib.pyplot as plt \n",
"plt.plot(x,y, '.k')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAh8AAAGdCAYAAACyzRGfAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8g+/7EAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAq20lEQVR4nO3df3DT933H8dfXKBZZF7shjjA+C0PbjDg/SrAMOcjukhBWjnEs7LZ16anUdN6RdnSEpQslvSW5HutouF7TrONCttI4WKR2txbWS0YzljiwNomKpLCRppeGBqtOKDHeMZuQVQT82R9UqmVkY0lffb/6Ss/HnU+29LX0+fJF6MXnx/tjGWOMAAAAHFLjdgMAAEB1IXwAAABHET4AAICjCB8AAMBRhA8AAOAowgcAAHAU4QMAADiK8AEAABzlc7sB442Ojur48eO64oorZFmW280BAABTYIzR6dOn1dTUpJqayfs2yi58HD9+XMFg0O1mAACAAgwMDKi5uXnSY8oufFxxxRWSLjS+rq7O5dYAAICpGBkZUTAYzHyOT6bswkd6qKWuro7wAQCAx0xlygQTTgEAgKMIHwAAwFGEDwAA4CjCBwAAcBThAwAAOIrwAQAAHEX4AAAAjiJ8AAAARxE+AACAowgfAAB4QCwW09KlSxWLxdxuStEIHwAAeMCuXbvU19en7u5ut5tStLLb2wUAAFyQTCY1NDQky7LU29srSerp6VFHR4eMMWpoaFBLS4vLrcwf4QMAgDI1Z86czPfpDdtOnjypUCiUud8Y43SzisawCwAAZSoSicjnu9BPkA4Z6Vufz6dIJOJa24pBzwcAAGUqHA6rtbU1q6cjLRqNqq2tzYVWFY+eDwAAPKCmpibr1su8fwYAAFSwQCCgxsZGhUIh7dixQ6FQSI2NjQoEAm43rWCWKbOZKiMjI6qvr9fw8LDq6urcbg4AAK5LpVKqra2VZVkyxujs2bPy+/1uNytLPp/fzPkAAKDMjQ0almWVXfDIF8MuAAB4kJcrnhI+AADwIC9XPGXYBQAAj6iUiqeEDwAAPKJSKp4y7AIAgEdUSsVTej4AAPCISql4Ss8HAAAe5OWKp95rMQAAVawSKp5S4RQAAI8px4qnVDgFAKCCeb3iKcMuAADAUYQPAADgKMIHAABwFOEDAAA4Ku/wcfDgQa1atUpNTU2yLEt79+6d8NjPfOYzsixLX//614toIgAAqCR5h48zZ85o/vz52r59+6TH7dmzRy+//LKampoKbhwAAKg8eS+1XbFihVasWDHpMW+//bb+8i//Us8++6xWrlxZcOMAAEDlsb3Ox+joqNasWaP77rtP119//SWPT6VSSqVSmZ9HRkbsbhIAACgjtk84ffjhh+Xz+bRhw4YpHb9161bV19dnvoLBoN1NAgAAZcTW8BGPx/Xoo4+qq6tLlmVN6Xfuv/9+DQ8PZ74GBgbsbBIAACgztoaP//zP/9Tg4KBmz54tn88nn8+nZDKpz3/+85ozZ07O3/H7/aqrq8v6AgDAbrFYTEuXLlUsFnO7KVXP1jkfa9as0bJly7LuW758udasWaNPf/rTdr4UAAB52bVrl/r6+tTd3a329na3m1PV8g4f7777ro4ePZr5+dixYzp8+LBmzJih2bNn66qrrso6/rLLLlNjY6PmzZtXfGsBAMhDMpnU0NCQLMtSb2+vJKmnp0cdHR0yxqihoUEtLS0ut7L65B0+YrGYbr/99szP9957rySpo6NDXV1dtjUMAIBijR3yT89FPHnypEKhUOZ+Y4zTzap6eYeP2267La8L1d/fn+9LAABgi0gkorVr1+rcuXOZz670rc/n4z/NLrG9zgcAAOUiHA6rtbU1q6cjLRqNqq2tzYVWgY3lAABVoaamJusW7uEKAAAqWiAQUGNjo0KhkHbs2KFQKKTGxkYFAgG3m1a1LFNmM21GRkZUX1+v4eFhan4AAGyRSqVUW1sry7JkjNHZs2fl9/vdblZFyefzmzkfAICKNzZoWJZF8HAZwy4AAMBRhA8AAOAowgcAAHAU4QMAgCpQThvrET4AAKgCYzfWcxvhAwCAMmRHT0UymVQ8HlcikchsrNfd3a2FCxequ7tbyWTSrubmhaW2AACUobE9Fe3t7QU9R66N9U6dOqVYLKZPfepTktzZWI+eDwAAykSunoqenh4lEgnF4/G8eyoikYh8vgv9DLlCxpYtWwp63mJR4RQAgDKR7p1If2+Mydym5fuxnUgkcm6sN16xcSCfz296PgAAKJDdK0hy9VSkb30+nyKRiC2vM1apnncyhA8AAApk9wqScDisaDSa87FoNKpwOJz3c6Y31rvuuutsfd5iMOEUAIA8JJNJDQ0NybKsrHkZHR0dMsaooaFBLS0tRb9OTU2NRkdHM7eFam5uVn9/v1599VW1t7dnhnGKfd5iED4AAMhDrhUkJ0+ezJpXUcz8iXRPRTAYVGdnp3bu3KmBgQEFAoGCn9Pv92vmzJm2P2+hmHAKAEAedu/erbVr1+rcuXMXPebz+dTV1VX0MEYqlVJtbW2ml+Ls2bO27MRbqueV8vv8pucDAIA8hMNhtba25lxBEo1G1dbWVvRrjA0ElmXZFhBK9bz5YsIpAAAFqqmpybrF1PCnBQBAntLzMkKhkHbs2KFQKKTGxkZX5k94EXM+AAAoQCnnT3gRcz4AACixcpk/4UUMuwAAAEcRPgAAgKMIHwAAwFGEDwAA4CjCBwAAcBThAwCAMWKxmJYuXapYLOZ2UyoW4QMAUNacDgO7du1SX1+furu7HXm9akSdDwBAWRsbBtrb20vyGslkUkNDQ7IsS729vZKknp4edXR0yBijhoYGtbS0lOS1qxEVTgEAZWdsGFixYoUGBwcVCAS0b9++koQBy7KyvjfGZG7TyuzjsuxQ4RQA4Glz5szJfJ8OBidPnszaSdbOMBCJRLR27VqdO3cu87zpW5/Pp66uLtteC8z5AACUoUgkIp/vwv+Pc4WBSCRi6+uFw2FFo9Gcj0WjUYXDYVtfr9oRPgAAZcfNMFBTU5N1C/vxJwsAKGvpEJAefnnttddK8jqBQECNjY0KhULasWOHQqGQGhsbFQgESvJ61YzwAQAoS+PDwNVXXy1J6uvrK8nrNTc3q7+/X9FoVHfffbei0aj6+/vV3NxckterZkw4BQCUTCwW06ZNm7Rt27a8l8k2NzfrwIEDGhkZyRoCefrpp5VIJEqy6sXv92e+tywr62fYh/ABACiZYmt0zJs3L/O9E6te4AyGXQAAtkomk4rH40okElkFuxKJhOLxuJLJ5JSfy+lVL3AGRcYAALayu2BXIpHI6ulIi8fjamtry7t9xQwFYWL5fH7T8wEAsFWpeivsWgLL3i3uY84HAMBW4XBYra2tOXsrotFo3r0V6VUvwWBQnZ2d2rlzpwYGBvJaAsveLeWF8AEAKJmamhqNjo5mbguRXgJbW1sry7K0bt06nT17Nq+VKE6Xa8fkGHYBANjO7oJdfr8/ExoKWQLLxNXywoRTAEBJpFKpTG+FMSbv3gq72T1xFdmYcAoAcF2xvRWlwt4t7uNPHgBQFdi7pXww7AIAqBrlNhRUSfL5/Ga1CwCgarB3S3lg2AUAADiK8AEAABxF+AAAAI4ifAAAAEflHT4OHjyoVatWqampSZZlae/evZnH3n//fX3hC1/QjTfeqA984ANqamrSpz71KR0/ftzONgMAAA/LO3ycOXNG8+fP1/bt2y967L333lMikdADDzygRCKh733ve3r99df1B3/wB7Y0FgAAeF9RdT4sy9KePXu0evXqCY85dOiQFi1apGQyqdmzZ1/yOanzAQCA95RVefXh4WFZlqUPfvCDpX4pAADgASUtMvarX/1KX/jCF/SJT3xiwhSUSqWUSqUyP4+MjJSySQAAwGUl6/l4//339fGPf1zGGD322GMTHrd161bV19dnvoLBYKmaBAAAykBJwkc6eCSTSe3fv3/SsZ/7779fw8PDma+BgYFSNAkAAJQJ24dd0sHjjTfeUF9fn6666qpJj/f7/dTWBwCgiuQdPt59910dPXo08/OxY8d0+PB
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"x = 4*np.random.rand(50, 1)\n",
"y = theta0 + theta1*x+0.5*np.random.randn(50, 1)\n",
"plt.plot(x,y, '*k')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Implementing with `for` \n",
"$$\\theta_1 = \\frac{\\sum_{i=1}^n x^i(y^i-\\bar{y}) }{\\sum_{i=1}^n x^i(x^i-\\bar{x})}$$"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[2.4717291]\n"
]
}
],
"source": [
"# for implementation for computing theta1:\n",
"xAve = x.mean()\n",
"yAve = y.mean()\n",
"num = 0\n",
"den = 0\n",
"for i in range(len(x)):\n",
" num = num + x[i]*(y[i]-yAve)\n",
" den = den + x[i]*(x[i]-xAve)\n",
"theta1Hat = num/den\n",
"print(theta1Hat)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[4.18459936]\n"
]
}
],
"source": [
"# for implementation for theta0:\n",
"# $$ \\theta_0 = \\bar{y} -\\theta_1 \\bar{x} $$\n",
"theta0Hat = yAve - theta1Hat*xAve\n",
"print(theta0Hat)\n",
"#real values are\n",
"#theta0 = 3.9654\n",
"#theta1 = 2.5456"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([2.27654582])"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"total = 0\n",
"for i in range(len(x)):\n",
" total = total + x[i]\n",
"total/len(x)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Implementing OLS by numpy methods"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2.4717291029649546\n"
]
}
],
"source": [
"# For theta1:\n",
"# $$\\theta_1 = \\frac{\\sum_{i=1}^n x^i(y^i-\\bar{y}) }{\\sum_{i=1}^n x^i(x^i-\\bar{x})}$$\n",
"num2 = np.sum(x*(y-y.mean()))\n",
"den2 = np.sum(x*(x-x.mean()))\n",
"theta1Hat2 = num2/den2\n",
"print(theta1Hat2)\n",
"\n",
"# Efficacy --> time\n"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"4.184599360470533"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"theta0Hat2 = yAve-theta1Hat2*xAve\n",
"theta0Hat2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Comparing Model and Data"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAh8AAAGdCAYAAACyzRGfAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8g+/7EAAAACXBIWXMAAA9hAAAPYQGoP6dpAABCwUlEQVR4nO3df3yT5b3/8Xea0iIIRTyWUlODW6tOZSCggNOBFe1XORydP7Z5EFB61upggG4IuE3kOAXcjhMUpXWRUmBTEWF7bFPmXMGBWmkrjk3U1pFCBoigowW0xeb+/nHRlpa0Tdr87uv5ePQRcuduct2ENh+u6/p8PjbLsiwBAACESUKkBwAAALoXgg8AABBWBB8AACCsCD4AAEBYEXwAAICwIvgAAABhRfABAADCiuADAACEVWKkB9Ca1+vV3r171adPH9lstkgPBwAA+MGyLNXW1io9PV0JCe3PbURd8LF3715lZGREehgAAKAT9uzZI4fD0e45URd89OnTR5IZfN++fSM8GgAA4I+amhplZGQ0fY63J+qCj8allr59+xJ8AAAQY/zZMsGGUwAAEFYEHwAAIKwIPgAAQFhF3Z4Pf1iWpS+//FINDQ2RHkrMsdvtSkxMJI0ZABAxMRd81NfXa9++fTp27FikhxKzevXqpYEDByopKSnSQwEAdEMxFXx4vV7t2rVLdrtd6enpSkpK4n/wAbAsS/X19frkk0+0a9cuZWVldVgIBgCAYIup4KO+vl5er1cZGRnq1atXpIcTk0477TT16NFD1dXVqq+vV8+ePSM9JABANxOT/+3lf+tdw98fACCS+BQCAABhRfABAADCiuAjQj777DMtWLBA+/bti/RQAAAxwuPxqKSkRB6PJ9JD6RKCjwiZOXOm3n77bd19992RHgoAIAa4XC45nU5lZ2fL6XTK5XJFekidRvARAX/4wx9UW1urP/zhD+rXr5/WrFkT6SEBAKKYx+NRXl6evF6vJFN6Ij8/P2ZnQGIq1TZejB8/XuPHj5ckFRUVRXYwAICoV1lZ2RR4NGpoaFBVVZUcDkeERtV5zHwAABDlfBWFtNvtyszMjNCIuobgAwCAKOdwOFRYWCi73S7JBB4FBQUxOeshdffgo6xMys42t2HgcDj01FNPtTj2xhtvqFevXqqurg7LGAAAsSk3N1dut1slJSVyu93Kzc2N9JA6rXvv+SgulkpKpFWrpBEjQv5yI0eO1LZt25ruW5alWbNm6Z577pHT6Qz56wMAYpvD4YjZ2Y6TxX7wYVlSIB1ud++WDh2SbDbpuefMsd/8Rvr2t81znXmmdM45/j1Xr17mefw0atQorVy5sun+qlWrtGfPHs2bN8//8QMAEONiP/g4dkw6/fSuPccnn0hXXBH49x05IvXu7ffpo0aN0ty5c3XkyBHZbDbdf//9+tnPfqbTuzp+AEC34/F4VFlZqaysrJibDeneez7CbPjw4UpISFBFRYUWL16ss846S3feeWekhwUAiDGxXnAs9mc+evUyMxCB2L7d90zHli3S0KGBvXYAevXqpcGDB2vdunV65pln9Mc//pEOswCAgLRVcCwnJydmZkBiP/iw2QJa+pAknXaauU1IkLze5tvTTgv8uQI0atQoPfHEE7rhhhs0duzYkL4WACD+xEPBse753+7UVCktTRo+XFq+3NympZnjITZkyBD16NFDP//5z0P+WgCA+BMPBce6Z/DhcEhut1RaKuXnm1u32xwPseeee07Tp0+PqX8kAIDoEQ8Fx2J/2aWzkpOb/2yztbwfZF6vV5988olcLpcqKyv129/+NmSvBQCIf7m5ucrJyVFVVZUyMzNjKvCQunPwEUavv/66srOzdcEFF2jdunXq27dvpIcEAIhxsVxwjOAjDMaOHXvK5iAAALqr7rnnAwAARAzBBwAACCuCDwAAupMwd3T3heADAIDu5OSO7hEScPDx+uuva8KECUpPT5fNZtOGDRvaPPeuu+6SzWbT448/3oUhAgCALqmulsrLpYoKafVqc+y558z98nLzeBgFnO1y9OhRDRkyRFOnTtVNN93U5nnr16/XW2+9pfT09C4NEAAAdNGgQace++QTU+G7kWWFbTgBBx/XXXedrrvuunbP+de//qUf/OAH2rhxo8aPH9/pwQEAgCB49FHpvvtaHmsMNhITpaKisA4n6HU+vF6vJk2apNmzZ+uiiy7q8Py6ujrV1dU13a+pqQn2kAAA6J4sS3rqKWn+/LbPKS2Vhg0L35gUgg2nixcvVmJiombMmOHX+QsXLlRKSkrTV0ZGRrCHBABA97N/vzR+vDR9uvT559LIkeZ4Y1O6hMjlnAT1lcvLy7VkyRIVFRXJZrP59T3z5s3T4cOHm7727NkTzCFFtbFjx2rWrFmRHgYAIN6sXy9dfLH08sumd9mSJdILL0Sso3trQV12+etf/6oDBw7onHPOaTrW0NCgH/7wh3r88cfldrtP+Z7k5GQlh7CpW7zYtGmTrrrqKn322Wfq169fpIcDAIhGtbXSzJnSihXm/tChJrulcRuE2y0lJZmGqnl5Un19SBurtiWowcekSZM0bty4FsdycnI0adIk3XnnncF8qaDweDyqrKxUVlZWzDbnAQAEJm5/97/xhjRpkvTPf5rg4r77pAULWgYXYezo3p6Al12OHDmi7du3a/v27ZKkXbt2afv27dq9e7fOPPNMXXzxxS2+evToobS0NJ1//vnBHnuXuFwuOZ1OZWdny+l0yuVyhfT1jh49qsmTJ+v000/XwIED9X//938tHl+1apVGjBihPn36KC0tTf/93/+tAwcOSJLcbreuuuoqSdIZZ5whm82mO+64Q5L0yiuv6IorrlC/fv105pln6j//8z/10UcfhfRaACBWhft3f1gcPy795CfSlVeawMPplDZtkhYtilhw0ZGAg4+ysjJdcskluuSSSyRJ9957ry655BI98MADQR9cqHg8HuXl5TV1mvV6vcrPz5fH4wnZa86ePVubN2/Wb3/7W/3pT3/Spk2bVFFR0fT48ePH9dBDD+ndd9/Vhg0b5Ha7mwKMjIwMrVu3TpL0wQcfaN++fVqyZIkkE9Tce++9Kisr02uvvaaEhAR961vfoosuALQSid/9Iff++9Lo0dLDD0ter5n5ePdd6ZvfjPTI2hXwssvYsWNlBVCIxNc+j0irrKw85cO5oaFBVVVVIZmCO3LkiFwul1avXq2rr75akrRy5coWrzV16tSmP3/lK1/R0qVLdemll+rIkSM6/fTT1b9/f0lSampqiz0fN998c4vXevbZZ3XWWWfpvffe08UXXxz0awGAWBXu3/0hZVnS009LP/qRyWQ54wypoEC69dZIj8wv3bK3S1ZWlhJapRjZ7XZlZmaG5PU++ugj1dfXa2RjmpOk/v37t1iKKi8v14QJE3TOOeeoT58+GjNmjCRp9+7d7T53ZWWlbrvtNn3lK19R3759NehEFbuOvg8Auptw/+4PmcYU2mnTTOBxzTXSjh0xE3hI3TT4cDgcKiwslN1ul2T+8RUUFEQs8j169KhycnLUt29frVmzRtu2bdP69eslSfX19e1+74QJE/Tpp5/qmWeeUWlpqUpLS/36PgDobqLtd3+n+EqhfeUV6eyzIz2ygAS9wmmsyM3NVU5OjqqqqpSZmRnSf3xf/epX1aNHD5WWljalIX/22Wf68MMPNWbMGL3//vs6dOiQFi1a1FRkraxVq+OkpCRJZoqw0aFDh/TBBx/omWee0ZVXXilJ2rJlS8iuAwBiXTh/9wdVRym0MabbBh+SiYLD8Q/v9NNPV25urmbPnq0zzzxTqamp+vGPf9w0/XfOOecoKSlJTzzxhO666y79/e9/10MPPdTiOZxOp2w2m37/+9/r+uuv12mnnaYzzjhDZ555pgoLCzVw4EDt3r1bc+fODfn1AEAsC9fv/k4rKzNpso8+Ko0Y4V8KbYzplssukfDzn/9cV155pSZMmKBx48bpiiuu0PAT3QTPOussFRUVae3atbrwwgu1aNEi/eIXv2jx/WeffbYWLFiguXPnasCAAZo+fboSEhL03HPPqby8XBdffLH
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"xNew = np.linspace(0,4,20)\n",
"yHat = theta0Hat + theta1Hat*xNew\n",
"plt.plot(xNew, yHat, '-*r', label=\"$\\hat{y}$\")\n",
"plt.plot(x,y,'.k', label=\"data\")\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Functions for data and OLS"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"def DataGen(xn: float,n: int, disp,theta0=3.9654,theta1=2.5456):\n",
" x = xn*np.random.rand(n, 1)\n",
" #theta0 = 3.9654\n",
" #theta1 = 2.5456\n",
" y = theta0+theta1*x+disp*np.random.randn(n,1)\n",
" return x,y"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [],
"source": [
"x,y = DataGen(9, 100, 1, 0,1)"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAiIAAAGdCAYAAAAvwBgXAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8g+/7EAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAoUElEQVR4nO3db4xcVf3H8c/OlG4bsrsRtC3NDENhJ0H5D62l1OiwNqwEiBiDYiAWnHQ3pEBrE7VVgQcWlgJqf7RQumQCGKiKMShigJA6gJUipYiByJ9ZYWBvmxY0ulOqLmRmfg/aHbq7M7NzZ++9596571cyD3q7M/d0BvZ85pzvOaetXC6XBQAAYEDEdAMAAEB4EUQAAIAxBBEAAGAMQQQAABhDEAEAAMYQRAAAgDEEEQAAYAxBBAAAGDPDdAPqKZVK2rt3rzo6OtTW1ma6OQAAoAHlclkHDhzQ/PnzFYnUH/PwdRDZu3ev4vG46WYAAIAmDA8PKxaL1f0ZXweRjo4OSYf+IZ2dnYZbAwAAGlEoFBSPxyv9eD2+DiJj0zGdnZ0EEQAAAqaRsgqKVQEAgDEEEQAAYAxBBAAAGEMQAQAAxhBEAACAMQQRAABgDEEEAAAYQxABAADGEEQAAIAxBBEAAGAMQQQAgICyLEvZbFaWZZluStMIIgAABFAmk1EikVBPT48SiYQymYzpJjWlrVwul003opZCoaCuri6NjIxw6B0AAIdZlqVEIqFSqVS5Fo1Glc/nFYvFDLbsEDv9NyMiAAAETC6XGxdCJKlYLGpoaMhQi5pHEAEAIGCSyaQikfFdeDQaVXd3t6EWNY8gAgBAwMRiMQ0ODioajUo6FEK2bt3qi2kZu6gRAQAgoCzL0tDQkLq7u30VQuz03zM8ahMAAHBYLBbzVQBpBlMzAADAGIIIAAAwhiACAACMIYgAAABjCCIAAMAYgggAADCGIAIAAIwhiAAAYJBlWcpms7Isy3RTjGg6iDz77LO65JJLNH/+fLW1tek3v/nNuL8vl8u68cYbddxxx2n27NlatmyZcrncdNsLAEDLyGQySiQS6unpUSKRUCaTMd0kzzUdRA4ePKgzzjhDd911V9W/v+2223TnnXfqnnvu0Z///GcdffTR6u3t1f/+97+mGwsAQNDUGvGwLEt9fX2VU3RLpZL6+/tDNzLSdBC58MILtX79en3lK1+Z9HflclkbN27UD3/4Q335y1/W6aefrp/97Gfau3fvpJETAABaVb0Rj1wuVwkhY4rFooaGhiSFZ8rGlRqRt99+W/v27dOyZcsq17q6urR48WLt3LnTjVsCAOArU414JJNJRSLju+FoNKru7u5QTdm4EkT27dsnSZo7d+6463Pnzq38XTWjo6MqFArjHgAABNFUIx6xWEyDg4OKRqOSDoWQrVu3SlKopmx8tWpmYGBAXV1dlUc8HjfdJAAAmlJvxGNMOp1WPp9XNptVPp9XOp2eMsC0GleCyLx58yRJ+/fvH3d9//79lb+rZt26dRoZGak8hoeH3WgeAACuqzXiEYvFJv1cKpWqXG8kwLQSV4LIggULNG/ePG3fvr1yrVAo6M9//rOWLFlS83nt7e3q7Owc9wAAIKiqjXhMpdEA0ypmNPvEDz74YNww0dtvv62XX35ZxxxzjI4//nitXr1a69evVzKZ1IIFC3TDDTdo/vz5uvTSS51oNwAAgRCLxWyHiHQ6rd7eXg0NDam7u7tlQ4g0jSDy4osv6vzzz6/8ec2aNZKk5cuX6/7779d3v/tdHTx4UH19ffr3v/+tz33uc3riiSc0a9as6bcaAIAW10yACaK2crlcNt2IWgqFgrq6ujQyMsI0DQAADrMsS7lcTslk0tHQY6f/9tWqGQAAMDUnNjvzy14lBBEAAALEiQDhp+3lCSIAAASEUwHCT3uVEEQAAAgIpwKEn/YqIYgAAOAQtw+qcypA+GmvEoIIAAAO8KL408kA0cxma25g+S4AANNkWZYSicS4aZNoNKp8Pu/KKINlWb7e7MxO/930hmYAAOCQerUbbgSFVtrsjKkZAACmyU/Fn0FDEAEAYJr8VPwZNNSIAADgEL/XbniFGhEAAAxopdoNrzA1AwAAjCGIAAAAYwgiAADAGIIIAAAwhiACAACMIYgAANAC3D5wzy0EEQAAAs6LA/fcwoZmAAAEmNcH7jXCTv/NiAgAAAFW78C9ICCIAAAQYEE/cI8gAgBAgAX9wD1qRAAAaAF+OnCPQ+8AADDMsizlcjklk0lPgkFQD9xjagYAAIcFeTmt15iaAQDAQX5cTus1lu8CAGBI0JfTeo0gAgBAFdW2TG9kG/WgL6f1GkEEAIAJqtV4NFr3EfTltF6jRgQAgCNUq/EYG+GwU/fhp+W0XmP5LgAATapW4zHxz9LHdR+1QkZQl9N6jakZAEBFUI+Sd1K1Go9IJELdh0sIIgAASex9MaZajcfg4CB1Hy6hRgQAwN4XVVSr8Qhz3Ycd1IgAAGypt/eF3Q7X663N3VKtxoO6D+cxNQMAcGzvC6Z3YBdBBADgyN4XlmWpr6+vMrJSKpXU398f6sJXTI2pGQCAJCmdTqu3t7fpGggnp3ca1SrTQGHm2ohIsVjUDTfcoAULFmj27Nk66aST9KMf/Ug+ro0FgNCLxWJKpVJNdepeb23ONFBrcC2IbNiwQVu2bNHmzZv12muvacOGDbrtttu0adMmt24JADDIy63NmQZqHa5NzTz33HP68pe/rIsuukiSdMIJJ+jnP/+5XnjhBbduCQAwbLrTO40yMQ0Ed7g2InLeeedp+/btevPNNyVJf/3rX7Vjxw5deOGFNZ8zOjqqQqEw7gEACJbpTO80qto0kCTt2rXLtXvCHa4FkbVr1+ryyy/XySefrKOOOkpnnXWWVq9erSuuuKLmcwYGBtTV1VV5xONxt5oHAAiwWCymDRs2TLq+bt06pmcCxrUg8vDDD+uhhx7Stm3b9NJLL+mBBx7QHXfcoQceeKDmc9atW6eRkZHKY3h42K3mAQA84tb5Neecc86ka2PTMwgO12pEvvOd71RGRSTptNNO0zvvvKOBgQEtX7686nPa29vV3t7uVpMAAB7LZDKVotJIJKLBwUGl02lHXntsembitvRhP4guaEuaXRsR+c9//lN1GVe1o5QBIMg4sbY6t1e2eLlKJyiCuKTZtSByySWX6Oabb9bvf/975fN5PfLII/rJT36ir3zlK27dEgA8F8Rf/F6pt7KlHjvBLp1OK5/PK5vNKp/POzbaEkRBXdLs2um7Bw4c0A033KBHHnlE7733nubPn69vfOMbuvHGGzVz5syGXoPTdwH4GSfW1tfM++PmVE6ry2az6unpqXo9lUp52hY7/bdrIyIdHR3auHGj3nnnHf33v//V3//+d61fv77hEAIAftfsN/6wsDt1EtRv9H7h9c62TuHQOwBoUlB/8XvJztQJwW56glozw6F3ANCksV/8/f39KhaLgfnF77VYLNbQe9LMKpigrRBxm1c72zqJEREAmAaKJZ0z8Rt9JBLRrbfeWreehELhybzY2dZJrhWrOoFiVQAIn9tvv13f+973VC6XaxasUijsb74oVgUAwC7LsrR27VqNfUeuVbBKPUnrIIgAAHyj0YBBoXDrIIgAAHyj0YAR1BUimIwgAgAwbmw3VUkNB4ywFAq3+hECBBEAaFCrdwimTFz9IqnhgBG0FSJ2hWFlEKtmAKABbD3uDla/1Bbk94ZVMwDgILYet6/R0SNWv9QWlveGIAIAUwhLh+AUO9MJrH6pLSzvDUEEAKYQlg7BCXZHj1j9UltY3huCCABMISwdghOaGT0Ky+qXZoThvaFYFQAaZFlWoA4TMyHIBZZwDsWqAOCCVl8q6gRGj2AXIyIAAMcxehRudvrvGR61CQAQIrFYzLEAYlmWcrmckskkoaYFMTUDAPCtMOwsGnZMzQBAi3Bq5MAvIxAUvgYXxaoAEDJOjRz4aQSCjeTCgRERAAg4OyMH9UY7/DYC4bf2oHGMiABAiDQ6cjDVaIepEYha59KwFDgcGBEBgICZOKrRyMiBUz/jtEZONWYpcPAwIgIALar
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plt.plot(x,y,'.k')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [],
"source": [
"def MyOLS(x,y):\n",
" # for implementation for computing theta1:\n",
" xAve = x.mean()\n",
" yAve = y.mean()\n",
" num = 0\n",
" den = 0\n",
" for i in range(len(x)):\n",
" num = num + x[i]*(y[i]-yAve)\n",
" den = den + x[i]*(x[i]-xAve)\n",
" theta1Hat = num/den\n",
" theta0Hat = yAve - theta1Hat*xAve\n",
" return theta0Hat, theta1Hat"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([1.12539439])"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"the0, the1 = MyOLS(x,y)\n",
"the1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# TODO - Students\n",
"- [ ] Efficacy --> time: For method Vs. Numpy"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.7"
}
},
"nbformat": 4,
"nbformat_minor": 2
}