# OLS: Ordinary Least Squares

For a single feature $x_1$, the OLS linear model $\hat{y}$ is defined by:

$$ \hat{y} = \theta_0+\theta_1 x_1 $$

Taking the partial derivative of SSR with respect to $\theta_0$ and setting it equal to zero:

$$\frac{\partial SSR}{\partial \theta_0}=0 $$

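where SSR, the sum of squared residuals, is defined as follows (the superscript $i$ indexes the $i$-th sample; it is not a power):

$$ SSR = \sum_{i=1}^n (y^i - \hat{y}^i)^2 $$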

Resulting in:

$$ \theta_0 = \frac{\sum_{i=1}^n y^i}{n} - \frac{\theta_1 \sum_{i=1}^n x^i}{n}$$

or 

$$ \theta_0 = \bar{y} -\theta_1 \bar{x} $$
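
For reference, the intermediate steps can be sketched by substituting $\hat{y}^i = \theta_0 + \theta_1 x^i$ into SSR and differentiating. This is the standard derivation written out, not an additional assumption of the model:

$$ \frac{\partial SSR}{\partial \theta_0} = -2\sum_{i=1}^n (y^i - \theta_0 - \theta_1 x^i) = 0 $$

$$ \sum_{i=1}^n y^i - n\theta_0 - \theta_1 \sum_{i=1}^n x^i = 0 $$

Dividing by $n$ and solving for $\theta_0$ gives the expression above, where $\bar{x}$ and $\bar{y}$ are the sample means of $x$ and $y$.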

In a similar way, setting the partial derivative of SSR with respect to $\theta_1$ equal to zero results in:

$$\theta_1 = \frac{\sum_{i=1}^n x^i(y^i-\bar{y}) }{\sum_{i=1}^n x^i(x^i-\bar{x})}$$
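
The steps behind this expression can be sketched in the same way, again using $\hat{y}^i = \theta_0 + \theta_1 x^i$:

$$ \frac{\partial SSR}{\partial \theta_1} = -2\sum_{i=1}^n x^i (y^i - \theta_0 - \theta_1 x^i) = 0 $$

Replacing $\theta_0$ with $\bar{y} - \theta_1 \bar{x}$:

$$ \sum_{i=1}^n x^i (y^i - \bar{y}) - \theta_1 \sum_{i=1}^n x^i (x^i - \bar{x}) = 0 $$

which rearranges to the formula above. Because $\sum_{i=1}^n (y^i - \bar{y}) = 0$ and $\sum_{i=1}^n (x^i - \bar{x}) = 0$, the same estimate can also be written in the centered form found in many textbooks:

$$ \theta_1 = \frac{\sum_{i=1}^n (x^i - \bar{x})(y^i - \bar{y})}{\sum_{i=1}^n (x^i - \bar{x})^2} $$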

# Implementing OLS in Python


```python
import numpy as np
x = np.linspace(0,4,20)
theta0 = 3.9654
theta1 = 2.5456
y = theta0+theta1*x
y
```


    array([ 3.9654    ,  4.50131579,  5.03723158,  5.57314737,  6.10906316,
            6.64497895,  7.18089474,  7.71681053,  8.25272632,  8.78864211,
            9.32455789,  9.86047368, 10.39638947, 10.93230526, 11.46822105,
           12.00413684, 12.54005263, 13.07596842, 13.61188421, 14.1478    ])


```python
import matplotlib.pyplot as plt 
plt.plot(x,y, '.k')
plt.show()
```


![png](main_files/main_5_0.png)
    

```python
# 50 random x values in [0, 4) and noisy observations of the true line
x = 4*np.random.rand(50, 1)
y = theta0 + theta1*x+0.5*np.random.randn(50, 1)
plt.plot(x,y, '*k')
plt.show()
```


![png](main_files/main_6_0.png)
    

## Implementing with a `for` loop
$$\theta_1 = \frac{\sum_{i=1}^n x^i(y^i-\bar{y}) }{\sum_{i=1}^n x^i(x^i-\bar{x})}$$


```python
# for-loop implementation for computing theta1:
xAve = x.mean()
yAve = y.mean()
num = 0
den = 0
for i in range(len(x)):
    num = num + x[i]*(y[i]-yAve)
    den = den + x[i]*(x[i]-xAve)
theta1Hat = num/den
print(theta1Hat)
```

    [2.4717291]


```python
# for-loop implementation for theta0:
# $$ \theta_0 = \bar{y} -\theta_1 \bar{x} $$
theta0Hat = yAve - theta1Hat*xAve
print(theta0Hat)
# the true values are:
# theta0 = 3.9654
# theta1 = 2.5456
```

    [4.18459936]


```python
# Mean of x computed with a for loop (should match x.mean())
total = 0
for i in range(len(x)):
    total = total + x[i]
total/len(x)
```


    array([2.27654582])


## Implementing OLS with NumPy methods


```python
# For theta1:
# $$\theta_1 = \frac{\sum_{i=1}^n x^i(y^i-\bar{y}) }{\sum_{i=1}^n x^i(x^i-\bar{x})}$$
num2 = np.sum(x*(y-y.mean()))
den2 = np.sum(x*(x-x.mean()))
theta1Hat2 = num2/den2
print(theta1Hat2)

# TODO: compare efficiency (run time) against the for-loop version

```

    2.4717291029649546


```python
theta0Hat2 = yAve-theta1Hat2*xAve
theta0Hat2
```


    4.184599360470533


# Comparing Model and Data


```python
xNew = np.linspace(0,4,20)
yHat = theta0Hat + theta1Hat*xNew
plt.plot(xNew, yHat, '-*r', label=r"$\hat{y}$")  # raw string avoids the invalid "\h" escape
plt.plot(x,y,'.k', label="data")
plt.legend()
plt.show()
```


![png](main_files/main_15_0.png)
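
Besides the visual comparison, the fit can be checked numerically with the SSR defined at the top of this README. The following is a minimal sketch, assuming a seeded generator and a hypothetical helper named `ssr` (neither appears in the cells above). It verifies that moving either parameter away from the OLS estimate does not lower SSR.

```python
import numpy as np

def ssr(theta0, theta1, x, y):
    """Sum of squared residuals for the line theta0 + theta1 * x."""
    return np.sum((y - (theta0 + theta1 * x)) ** 2)

# Assumption: seeded data with the same true parameters used in this README.
rng = np.random.default_rng(1)
x = 4 * rng.random(50)
y = 3.9654 + 2.5456 * x + 0.5 * rng.standard_normal(50)

# Closed-form OLS estimates.
theta1_hat = np.sum(x * (y - y.mean())) / np.sum(x * (x - x.mean()))
theta0_hat = y.mean() - theta1_hat * x.mean()

best = ssr(theta0_hat, theta1_hat, x, y)

# Nudging either parameter away from the OLS estimate should not lower SSR.
for d0, d1 in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]:
    assert ssr(theta0_hat + d0, theta1_hat + d1, x, y) >= best

print(best)
```

The same check applies to the true parameters: with noisy data, their SSR is at least as large as the SSR of the fitted line.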
    

# Functions for data and OLS


```python
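def DataGen(xn: float, n: int, disp: float, theta0=3.9654, theta1=2.5456):
    """Return n noisy samples (x, y) of the line theta0 + theta1*x, with x in [0, xn)."""
    x = xn*np.random.rand(n, 1)
    # disp scales the standard normal noise added to y
    y = theta0+theta1*x+disp*np.random.randn(n,1)
    return x,y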
```


```python
# 100 samples with x in [0, 9), noise scale 1, true theta0 = 0 and theta1 = 1
x,y = DataGen(9, 100, 1, 0,1)
```


```python
plt.plot(x,y,'.k')
plt.show()
```


![png](main_files/main_19_0.png)
    

```python
def MyOLS(x,y):
    """Return the OLS estimates (theta0Hat, theta1Hat), computed with a for loop."""
    xAve = x.mean()
    yAve = y.mean()
    num = 0
    den = 0
    for i in range(len(x)):
        num = num + x[i]*(y[i]-yAve)
        den = den + x[i]*(x[i]-xAve)
    theta1Hat = num/den
    theta0Hat = yAve - theta1Hat*xAve
    return theta0Hat, theta1Hat
```


```python
the0, the1 = MyOLS(x,y)
the1
```


    array([1.12539439])
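
With noisy data the estimate only approximates the true slope. A simple sanity check for an estimator like this is to run it on noise-free data (`disp = 0`), where the true parameters should be recovered up to floating-point error. The sketch below is self-contained, so it uses hypothetical stand-ins `data_gen` and `ols_fit` for the functions above, plus a `seed` argument that the original `DataGen` does not have.

```python
import numpy as np

def data_gen(xn, n, disp, theta0=3.9654, theta1=2.5456, seed=0):
    # Stand-in for DataGen; the seed argument is an addition for reproducibility.
    rng = np.random.default_rng(seed)
    x = xn * rng.random((n, 1))
    y = theta0 + theta1 * x + disp * rng.standard_normal((n, 1))
    return x, y

def ols_fit(x, y):
    # Vectorized stand-in for MyOLS; returns plain floats instead of arrays.
    theta1_hat = np.sum(x * (y - y.mean())) / np.sum(x * (x - x.mean()))
    theta0_hat = y.mean() - theta1_hat * x.mean()
    return float(theta0_hat), float(theta1_hat)

# disp = 0 removes the noise, so the points lie exactly on the true line.
x, y = data_gen(9, 100, 0.0)
t0, t1 = ols_fit(x, y)
print(t0, t1)
```

If the printed values differ noticeably from the defaults 3.9654 and 2.5456, the estimator has a bug rather than a noise problem.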


# TODO - Students
- [ ] Efficiency (run time): `for` loop vs. NumPy