Logistic-Regressor-From-Scr.../README.md

# Development of a Modular Python Library from Scratch for Automated ROI Segmentation in Thermal Images

# Module 2: Logistic Regressor From Scratch

Author: Sofia Samaniego Lopez

Institution: Universidad Autonoma de Baja California (UABC)

Advisor: Dr. Gerardo Marx Chavez Campos

This notebook presents **Module 2** of the library's development: the implementation of a **Logistic Regression Classifier from scratch**.

To ensure a deep understanding of the underlying mechanics, this module avoids high-level machine learning "black-box" libraries. Instead, it builds the optimization algorithm using fundamental mathematical operations via **NumPy**. It covers the definition of the Sigmoid activation function, the formulation of the Log-Loss (Cross-Entropy) cost function, and the iterative optimization of weights using Gradient Descent.

The classic Iris dataset is utilized to evaluate the model's capacity to estimate probabilities and establish a linear decision boundary for binary classification based on morphological features.

## 1. Environment Setup & Data Loading
Importing core libraries for data manipulation (`pandas`), mathematical operations (`numpy`), and visualization (`matplotlib`). The Iris dataset is loaded to extract the target variables.


```python
!pip3 install pandas
!pip3 install numpy
!pip3 install matplotlib

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
```

    Requirement already satisfied: pandas in c:\Users\sofia\Logistic-Regressor-From_Scratch\.venv\Lib\site-packages (3.0.3)
    Requirement already satisfied: numpy>=2.3.3 in c:\Users\sofia\Logistic-Regressor-From_Scratch\.venv\Lib\site-packages (from pandas) (2.5.0)
    Requirement already satisfied: python-dateutil>=2.8.2 in c:\Users\sofia\Logistic-Regressor-From_Scratch\.venv\Lib\site-packages (from pandas) (2.9.0.post0)
    Requirement already satisfied: tzdata in c:\Users\sofia\Logistic-Regressor-From_Scratch\.venv\Lib\site-packages (from pandas) (2026.2)
    Requirement already satisfied: six>=1.5 in c:\Users\sofia\Logistic-Regressor-From_Scratch\.venv\Lib\site-packages (from python-dateutil>=2.8.2->pandas) (1.17.0)
    Requirement already satisfied: numpy in c:\Users\sofia\Logistic-Regressor-From_Scratch\.venv\Lib\site-packages (2.5.0)
    Requirement already satisfied: matplotlib in c:\Users\sofia\Logistic-Regressor-From_Scratch\.venv\Lib\site-packages (3.11.0)
    Requirement already satisfied: contourpy>=1.0.1 in c:\Users\sofia\Logistic-Regressor-From_Scratch\.venv\Lib\site-packages (from matplotlib) (1.3.3)
    Requirement already satisfied: cycler>=0.10 in c:\Users\sofia\Logistic-Regressor-From_Scratch\.venv\Lib\site-packages (from matplotlib) (0.12.1)
    Requirement already satisfied: fonttools>=4.22.0 in c:\Users\sofia\Logistic-Regressor-From_Scratch\.venv\Lib\site-packages (from matplotlib) (4.63.0)
    Requirement already satisfied: kiwisolver>=1.3.1 in c:\Users\sofia\Logistic-Regressor-From_Scratch\.venv\Lib\site-packages (from matplotlib) (1.5.0)
    Requirement already satisfied: numpy>=1.25 in c:\Users\sofia\Logistic-Regressor-From_Scratch\.venv\Lib\site-packages (from matplotlib) (2.5.0)
    Requirement already satisfied: packaging>=20.0 in c:\Users\sofia\Logistic-Regressor-From_Scratch\.venv\Lib\site-packages (from matplotlib) (26.2)
    Requirement already satisfied: pillow>=9 in c:\Users\sofia\Logistic-Regressor-From_Scratch\.venv\Lib\site-packages (from matplotlib) (12.2.0)
    Requirement already satisfied: pyparsing>=3 in c:\Users\sofia\Logistic-Regressor-From_Scratch\.venv\Lib\site-packages (from matplotlib) (3.3.2)
    Requirement already satisfied: python-dateutil>=2.7 in c:\Users\sofia\Logistic-Regressor-From_Scratch\.venv\Lib\site-packages (from matplotlib) (2.9.0.post0)
    Requirement already satisfied: six>=1.5 in c:\Users\sofia\Logistic-Regressor-From_Scratch\.venv\Lib\site-packages (from python-dateutil>=2.7->matplotlib) (1.17.0)


```python
df = pd.read_csv('iris_basic.csv')
df
```


<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>sl</th>
      <th>sw</th>
      <th>pl</th>
      <th>pw</th>
      <th>target</th>
      <th>tNames</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>5.1</td>
      <td>3.5</td>
      <td>1.4</td>
      <td>0.2</td>
      <td>0</td>
      <td>setosa</td>
    </tr>
    <tr>
      <th>1</th>
      <td>4.9</td>
      <td>3.0</td>
      <td>1.4</td>
      <td>0.2</td>
      <td>0</td>
      <td>setosa</td>
    </tr>
    <tr>
      <th>2</th>
      <td>4.7</td>
      <td>3.2</td>
      <td>1.3</td>
      <td>0.2</td>
      <td>0</td>
      <td>setosa</td>
    </tr>
    <tr>
      <th>3</th>
      <td>4.6</td>
      <td>3.1</td>
      <td>1.5</td>
      <td>0.2</td>
      <td>0</td>
      <td>setosa</td>
    </tr>
    <tr>
      <th>4</th>
      <td>5.0</td>
      <td>3.6</td>
      <td>1.4</td>
      <td>0.2</td>
      <td>0</td>
      <td>setosa</td>
    </tr>
    <tr>
      <th>...</th>
      <td>...</td>
      <td>...</td>
      <td>...</td>
      <td>...</td>
      <td>...</td>
      <td>...</td>
    </tr>
    <tr>
      <th>145</th>
      <td>6.7</td>
      <td>3.0</td>
      <td>5.2</td>
      <td>2.3</td>
      <td>2</td>
      <td>virginica</td>
    </tr>
    <tr>
      <th>146</th>
      <td>6.3</td>
      <td>2.5</td>
      <td>5.0</td>
      <td>1.9</td>
      <td>2</td>
      <td>virginica</td>
    </tr>
    <tr>
      <th>147</th>
      <td>6.5</td>
      <td>3.0</td>
      <td>5.2</td>
      <td>2.0</td>
      <td>2</td>
      <td>virginica</td>
    </tr>
    <tr>
      <th>148</th>
      <td>6.2</td>
      <td>3.4</td>
      <td>5.4</td>
      <td>2.3</td>
      <td>2</td>
      <td>virginica</td>
    </tr>
    <tr>
      <th>149</th>
      <td>5.9</td>
      <td>3.0</td>
      <td>5.1</td>
      <td>1.8</td>
      <td>2</td>
      <td>virginica</td>
    </tr>
  </tbody>
</table>
<p>150 rows × 6 columns</p>
</div>


## 2. Binary Classification Setup & Data Visualization
Extracting the 'Petal Width' ($pw$) as the independent feature ($x$) and the target class ($y$).


```python
x = df['pw'].to_numpy().reshape(-1, 1)
y = df['target'].to_numpy().reshape(-1, 1)

# Convert target to binary: 1 if setosa (class 0), 0 otherwise
y = (y==0).astype(float)

```


```python
# Adding visual noise (Jitter) to observe point density
yJitter = y+np.random.uniform(-0.2,0.2,size=y.shape)
plt.plot(x,yJitter,'og', alpha=0.1)
plt.show()
```


![png](README_files/README_9_0.png)


**Understanding the Jitter Plot:**
In binary classification, true labels are strictly `0` or `1`. If plotted directly, data points overlap perfectly, masking the true density of the samples. By adding uniform random noise (*jitter*) to the y-axis, the points spread out vertically, allowing us to visually inspect the data distribution and density for both classes.

## 3. The Sigmoid Activation Function
The mathematical core of logistic regression. Linear regression outputs continuous values from $-\infty$ to $+\infty$. The Sigmoid function smoothly maps any real-valued number into a probability range bounded between $0$ and $1$.
Formula:
$$\sigma(z) = \frac{1}{1+e^{-z}}$$
*(Note: `np.clip` is used to bound extreme values and prevent overflow errors during exponential calculation).*


```python
def sigmoid(z):
    sig= 1/(1+np.exp(-z))
    return sig
```


```python
xNew = np.linspace(-5,5,100)
model = sigmoid(xNew)
plt.plot(xNew,model)
plt.plot(x,yJitter,'og', alpha=0.1)
plt.show()
```


![png](README_files/README_13_0.png)


This plot illustrates the model's core activation function alongside the empirical distribution of the Iris dataset.

* **Sigmoid Activation Curve:** The green line represents the non-linear transformation $\sigma(z) = \frac{1}{1+e^{-z}}$. This function maps input features into a probability space between $0$ and $1$, providing the mathematical foundation for the model's confidence levels.
* **Data Distribution (Jittered):** The green markers represent the actual feature values. As binary classes are constrained to $\{0, 1\}$, random vertical noise (jitter) is applied to the data points to prevent overlap, revealing the density and separation between the two classes.

By overlaying the Sigmoid curve on the jittered data, we can visually inspect how well the model's probability estimates align with the observed class clusters.


```python
def sigmoid(z):
    # Clip limits z to avoid exp overflow
    z = np.clip(z, -500, 500)
    sig= 1/(1+np.exp(-z))
    return sig
```

## 4. Cost Function: Log-Loss (Cross-Entropy)
This function calculates the error between the model's predicted probabilities ($p$) and the true binary labels ($y$).
Probabilities are clipped using a tiny epsilon ($\epsilon$) to prevent mathematical undefined errors (like $\log(0)$), which would break the algorithm.


```python
def logLoss(y, p, eps=1e-12):
    p = np.clip(p, eps, 1-eps)
    loss = -np.mean(y*np.log(p) + (1-y)*np.log(1-p))
    return loss
```

## 5. Model Training via Gradient Descent
Instead of solving an equation directly, the model learns iteratively.
1. **Initialization:** Random weights ($\theta_0$ for bias, $\theta_1$ for the feature) are generated.
2. **Forward Pass:** Predictions are computed using the dot product and the Sigmoid function.
3. **Gradient Calculation:** The error gradient is calculated across all samples.
4. **Update:** Weights are adjusted in the opposite direction of the gradient, scaled by the learning rate (`lr`).


```python
lr = 0.2
epochs = 1000

# Add a column of ones to X for the bias term (intercept)
X = np.column_stack([np.ones_like(x), x])
m = X.shape[0]

# Random weight initialization
theta = np.random.rand(2,1)
theta
```


    array([[0.83841703],
           [0.1412671 ]])


```python
# Training loop
for i in range(epochs):
   z = X @ theta
   h = sigmoid(z) # Predicted probability

   # Gradient computation and weight update
   grad = X.T @ (h-y) / m
   theta -= lr * grad

theta0, theta1 = theta[0,0], theta[1,0]
print(f"Optimized Bias (Theta 0): {theta0}")
print(f"Optimized Weight (Theta 1): {theta1}")
```

    Optimized Bias (Theta 0): 4.309159539504179
    Optimized Weight (Theta 1): -6.028218019470694


## 6. Inference and Decision Boundary Visualization
Functions to compute continuous probabilities and absolute binary classes based on a `0.5` threshold.


```python
def predictProba(x, theta0, theta1):
    x = np.array(x, float).reshape(-1)
    model = sigmoid(theta0 + theta1 * x)
    return model

def predict(x, theta0, theta1, thresh=0.5):
    model = (predictProba >= thresh).astype(int)
    # Returns 1 if probability >= threshold, else 0


```


```python
# Plotting the empirical data alongside the optimized Sigmoid curve
xNew = np.linspace(0,2.5,100)
p = predictProba(xNew, theta0, theta1)
plt.plot(xNew, p, ':g')
plt.plot(x,yJitter,'og', alpha=0.1)
plt.show()
```


![png](README_files/README_23_0.png)


**Understanding the Final Plot:**
The green dotted line represents the trained Sigmoid curve. It illustrates the model's probability estimation across different Petal Widths. Where the curve crosses the $0.5$ probability mark on the y-axis, the model sets its hard mathematical boundary, switching its classification verdict from Class 0 to Class 1.