Introduction to Perceptron & Generalized Linear Model
Perceptron
The Perceptron is one of the simplest types of artificial neural networks and serves as a fundamental building block for more complex neural network architectures. It is a binary classifier that determines whether an input, represented as a vector of numbers, belongs to a specific class or not. The perceptron makes its decision based on a linear combination of the input features.
Structure
A perceptron consists of:
Input layer: A set of input neurons that receive features of the data.
Weights: A set of weights associated with each input feature.
Bias: A bias term that shifts the decision boundary.
Activation Function: A function that outputs a binary decision based on the weighted sum of inputs.
Mathematical Representation
The output of a perceptron is given by:
y = \text{sign}(\mathbf{w} \cdot \mathbf{x} + b)
where:
\mathbf{x} is the input vector.
\mathbf{w} is the vector of weights.
b is the bias.
\text{sign} is the activation function that returns 1 if the argument is positive and -1 otherwise.
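The forward pass described above can be sketched in a few lines; the weight and bias values here are hypothetical, chosen only to illustrate the computation:

```python
import numpy as np

def perceptron_predict(x, w, b):
    """Sign activation applied to the linear combination w.x + b."""
    return 1 if np.dot(w, x) + b > 0 else -1

# Hypothetical parameters for a 2-feature input
w = np.array([0.5, -0.3])
b = 0.1
print(perceptron_predict(np.array([1.0, 1.0]), w, b))  # 0.5 - 0.3 + 0.1 > 0, so 1
```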
Training
The perceptron learning algorithm adjusts the weights and bias based on the error between the predicted output and the actual output. The weights are updated using the rule:
\mathbf{w} \leftarrow \mathbf{w} + \eta (y - \hat{y}) \mathbf{x}
and the bias is updated analogously as b \leftarrow b + \eta (y - \hat{y}).
where:
\eta is the learning rate.
y is the actual label.
\hat{y} is the predicted label.
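A minimal sketch of this training loop, assuming labels in {-1, +1} and a small toy dataset invented for illustration:

```python
import numpy as np

def train_perceptron(X, y, eta=0.1, epochs=20):
    """Perceptron rule: on a mistake, w += eta*(y - y_hat)*x and b += eta*(y - y_hat)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            y_hat = 1 if np.dot(w, xi) + b > 0 else -1
            if y_hat != yi:
                w += eta * (yi - y_hat) * xi
                b += eta * (yi - y_hat)
    return w, b

# Toy linearly separable data (labels in {-1, +1})
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, b = train_perceptron(X, y)
```

If the data are linearly separable, this loop converges to a separating hyperplane; otherwise it cycles, which is one motivation for the probabilistic GLM framing that follows.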
Generalized Linear Model (GLM)
A Generalized Linear Model (GLM) is a flexible generalization of ordinary linear regression that allows for the dependent variable to have a distribution other than a normal distribution. GLMs are used for various types of regression models, including logistic regression, Poisson regression, and more.
Structure
GLMs consist of three main components:
Linear Predictor: A linear combination of input features, \eta = \mathbf{x} \cdot \boldsymbol{\beta}.
Link Function: A function that relates the linear predictor to the mean of the response variable, g(\mu) = \eta, where \mu is the expected value of the response variable and g is the link function.
Distribution: The response variable's distribution from the exponential family (e.g., normal, binomial, Poisson).
Examples of GLMs
Linear Regression: Assumes a normal distribution for the response and uses the identity link function, g(\mu) = \mu.
Logistic Regression: Used for binary outcomes, assumes a binomial distribution, and uses the logit link function, g(\mu) = \log(\frac{\mu}{1 - \mu}).
Poisson Regression: Used for count data, assumes a Poisson distribution, and uses the log link function, g(\mu) = \log(\mu).
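The three link functions above, and the inverse link that maps the linear predictor back to the mean, can be written directly; this is a sketch of the definitions rather than any particular library's API:

```python
import numpy as np

# Link functions g(mu) for the three GLMs listed above
identity = lambda mu: mu                   # linear regression
logit = lambda mu: np.log(mu / (1 - mu))   # logistic regression
log_link = lambda mu: np.log(mu)           # Poisson regression

# The inverse link recovers the mean mu from the linear predictor eta,
# e.g. the inverse of the logit is the sigmoid function
inv_logit = lambda eta: 1 / (1 + np.exp(-eta))
```

Note that each inverse link maps the unbounded linear predictor into the valid range of the mean: the sigmoid into (0, 1) for probabilities, the exponential into (0, ∞) for counts.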
Mathematical Representation
The general form of a GLM is:
g(\mathbb{E}[Y \mid \mathbf{X}]) = \mathbf{X} \cdot \boldsymbol{\beta}
where:
Y is the response variable.
\mathbf{X} is the vector of predictor variables.
\boldsymbol{\beta} is the vector of coefficients.
Estimation
GLMs are typically estimated using Maximum Likelihood Estimation (MLE), where the likelihood of the observed data is maximized to find the best-fitting model parameters.
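As a concrete illustration, MLE for logistic regression can be done by gradient ascent on the Bernoulli log-likelihood, whose gradient with respect to \boldsymbol{\beta} is X^T(y - \mu). This is a minimal sketch on invented toy data, not a production fitting routine (real libraries typically use iteratively reweighted least squares):

```python
import numpy as np

def fit_logistic_mle(X, y, lr=0.5, steps=2000):
    """Maximize the Bernoulli log-likelihood by gradient ascent.
    The gradient of the log-likelihood w.r.t. beta is X^T (y - mu)."""
    beta = np.zeros(X.shape[1])
    for _ in range(steps):
        mu = 1 / (1 + np.exp(-X @ beta))   # inverse logit link
        beta += lr * X.T @ (y - mu) / len(y)
    return beta

# Toy data: first column is an intercept, y in {0, 1}
X = np.array([[1, -2.0], [1, -1.0], [1, 1.0], [1, 2.0]])
y = np.array([0, 0, 1, 1])
beta = fit_logistic_mle(X, y)
```

The fitted coefficients define predicted probabilities via the inverse link, so observations with positive features should receive probabilities above one half here.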
EXAMPLES
Applied examples of these models can be described using the following fields:
Industry: The sector in which the problem is situated.
Problem: The specific challenge or task to be addressed.
Objective: The goal or target of the modeling process.
Input Layer: Features used as input to the model.
Weights & Bias: Parameters used in the model, typically initialized and adjusted during training.
Activation Function: Function applied to the linear predictor's output, transforming it into the model's output.
Learning Rate: A parameter that controls the adjustment of weights during training (mainly relevant for models like the Perceptron and Logistic Regression).
Actual Label: The true value or class label.
Predicted Label: The output generated by the model.
Linear Predictor: The expression involving the input features, weights, and bias.
Link Function: Function that relates the linear predictor to the expected value of the response variable.
Distribution: The assumed statistical distribution of the response variable.
Model Type: The classification of the model being used.
Response Variable: The type of output variable the model predicts.