Introduction to Perceptron & Generalized Linear Model

Perceptron

The Perceptron is one of the simplest types of artificial neural networks and serves as a fundamental building block for more complex neural network architectures. It is a binary classifier that determines whether an input, represented as a vector of numbers, belongs to a specific class or not. The perceptron makes its decision based on a linear combination of the input features.

Structure

A perceptron consists of:

  1. Input layer: A set of input neurons that receive features of the data.

  2. Weights: A set of weights associated with each input feature.

  3. Bias: A bias term that shifts the decision boundary.

  4. Activation Function: A function that outputs a binary decision based on the weighted sum of inputs.

Mathematical Representation

The output of a perceptron is given by:

$$y = \text{sign}(\mathbf{w} \cdot \mathbf{x} + b)$$

where:

  • $\mathbf{x}$ is the input vector.

  • $\mathbf{w}$ is the vector of weights.

  • $b$ is the bias.

  • $\text{sign}$ is the activation function that returns 1 if the argument is positive and -1 otherwise.
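The forward pass above can be sketched in a few lines of plain Python; the weights, inputs, and bias here are illustrative values, not taken from the text:

```python
# Minimal sketch of a perceptron's forward pass with hypothetical parameters.
def sign(z):
    # Activation: 1 if the argument is positive, -1 otherwise.
    return 1 if z > 0 else -1

def predict(w, x, b):
    # Weighted sum of inputs plus bias, passed through the sign activation.
    return sign(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Linear combination: 0.5*1.0 + (-0.5)*2.0 + 0.2 = -0.3, so the output is -1.
print(predict([0.5, -0.5], [1.0, 2.0], 0.2))  # -1
```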

Training

The perceptron learning algorithm adjusts the weights and bias based on the error between the predicted output and the actual output. The weights are updated using the rule:

$$\mathbf{w} \leftarrow \mathbf{w} + \eta (y - \hat{y}) \mathbf{x}, \qquad b \leftarrow b + \eta (y - \hat{y})$$

where:

  • $\eta$ is the learning rate.

  • $y$ is the actual label.

  • $\hat{y}$ is the predicted label.

  • $\mathbf{x}$ is the input vector.
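The update rule above can be sketched as a small training loop. The dataset here is the OR function with labels in {-1, +1} (an illustrative, linearly separable choice), and the learning rate and epoch count are arbitrary:

```python
# Sketch of the perceptron learning rule on a tiny linearly separable dataset.
def sign(z):
    return 1 if z > 0 else -1

def train(data, eta=0.1, epochs=20):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            y_hat = sign(w[0] * x[0] + w[1] * x[1] + b)
            # Update rule: w <- w + eta * (y - y_hat) * x,  b <- b + eta * (y - y_hat)
            w = [wi + eta * (y - y_hat) * xi for wi, xi in zip(w, x)]
            b += eta * (y - y_hat)
    return w, b

# OR function with labels -1 / +1.
data = [([0, 0], -1), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
w, b = train(data)
print(all(sign(w[0] * x[0] + w[1] * x[1] + b) == y for x, y in data))  # True
```

Note that when the prediction is correct, $y - \hat{y} = 0$ and nothing changes; updates happen only on misclassified points.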

Generalized Linear Model (GLM)

A Generalized Linear Model (GLM) is a flexible generalization of ordinary linear regression that allows for the dependent variable to have a distribution other than a normal distribution. GLMs are used for various types of regression models, including logistic regression, Poisson regression, and more.

Structure

GLMs consist of three main components:

  1. Linear Predictor: A linear combination of input features: $\eta = \mathbf{x} \cdot \boldsymbol{\beta}$.

  2. Link Function: A function that relates the linear predictor to the mean of the response variable: $g(\mu) = \eta$, where $\mu$ is the expected value of the response variable and $g$ is the link function.

  3. Distribution: The response variable's distribution from the exponential family (e.g., normal, binomial, Poisson).
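The three components can be traced through a single observation. This sketch uses logistic regression with hypothetical coefficients; in practice the inverse of the link function maps the linear predictor back to the mean:

```python
import math

# Illustrative GLM components for logistic regression (hypothetical values).
beta = [0.8, -0.4]   # coefficients
x = [1.0, 2.0]       # one observation's features

# 1. Linear predictor: eta = x . beta
eta = sum(xi * bi for xi, bi in zip(x, beta))

# 2. Link function: logit g(mu) = log(mu / (1 - mu)); its inverse (the
#    sigmoid) maps eta back to a mean.
mu = 1.0 / (1.0 + math.exp(-eta))

# 3. Distribution: mu is the mean of a binomial (Bernoulli) response, so 0 < mu < 1.
print(round(mu, 4))  # 0.5, since eta = 0.8 - 0.8 = 0
```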

Examples of GLMs

  • Linear Regression: Assumes a normal distribution for the response and uses the identity link function ($g(\mu) = \mu$).

  • Logistic Regression: Used for binary outcomes, assumes a binomial distribution, and uses the logit link function ($g(\mu) = \log\frac{\mu}{1 - \mu}$).

  • Poisson Regression: Used for count data, assumes a Poisson distribution, and uses the log link function ($g(\mu) = \log \mu$).
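The three link functions listed above are straightforward to write down directly:

```python
import math

# The link functions of the three example GLMs, applied to a mean mu.
def identity(mu):            # linear regression
    return mu

def logit(mu):               # logistic regression: log(mu / (1 - mu))
    return math.log(mu / (1.0 - mu))

def log_link(mu):            # Poisson regression: log(mu)
    return math.log(mu)

mu = 0.5
print(identity(mu), logit(mu), log_link(mu))  # 0.5, 0.0, and about -0.693
```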

Mathematical Representation

The general form of a GLM is:

$$g(\mathbb{E}[Y \mid \mathbf{X}]) = \mathbf{X} \cdot \boldsymbol{\beta}$$

where:

  • $Y$ is the response variable.

  • $\mathbf{X}$ is the vector of predictor variables.

  • $\boldsymbol{\beta}$ is the vector of coefficients.

Estimation

GLMs are typically estimated using Maximum Likelihood Estimation (MLE), where the likelihood of the observed data is maximized to find the best-fitting model parameters.
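As a rough illustration of MLE, the sketch below fits a one-feature Poisson regression (log link) by gradient ascent on the log-likelihood; the data, step size, and iteration count are all made up for the example. For the Poisson model, the gradient of the log-likelihood with respect to $\beta$ is $\sum_i (y_i - e^{\beta x_i})\, x_i$:

```python
import math

# Minimal MLE sketch: one-feature Poisson regression fit by gradient ascent.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1, 2, 6, 16]          # illustrative counts, roughly exp(0.9 * x)

beta = 0.0
for _ in range(2000):
    # Gradient of the Poisson log-likelihood: sum_i (y_i - exp(beta * x_i)) * x_i
    grad = sum((y - math.exp(beta * x)) * x for x, y in zip(xs, ys))
    beta += 1e-3 * grad     # small step toward higher likelihood

print(round(beta, 2))       # close to the rate used to generate the counts
```

Real software solves the same maximization far more robustly (e.g., via iteratively reweighted least squares), but the objective being maximized is this likelihood.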

Examples

Concrete examples of perceptrons and GLMs in practice can be characterized by the following attributes:

  • Industry: The sector in which the problem is situated.

  • Problem: The specific challenge or task to be addressed.

  • Objective: The goal or target of the modeling process.

  • Input Layer: Features used as input to the model.

  • Weights & Bias: Parameters used in the model, typically initialized and adjusted during training.

  • Activation Function: Function applied to the linear predictor's output, transforming it into the model's output.

  • Learning Rate: A parameter that controls the adjustment of weights during training (mainly relevant for models like the Perceptron and Logistic Regression).

  • Actual Label: The true value or class label.

  • Predicted Label: The output generated by the model.

  • Linear Predictor: The expression involving the input features, weights, and bias.

  • Link Function: Function that relates the linear predictor to the expected value of the response variable.

  • Distribution: The assumed statistical distribution of the response variable.

  • Model Type: The classification of the model being used.

  • Response Variable: The type of output variable the model predicts.