Generative and Discriminative Models
Before comparing the two kinds of model, it helps to understand what a machine learning algorithm actually computes when it classifies data.
Classification is the process by which a system predicts the class label of a given data point. To classify, we calculate the conditional probability of each class label given the data point:
P(C|X)
where ‘C’ represents the class label and ‘X’ represents the given data point.
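There are two ways to obtain P(C|X). You can model it directly, typically by iterating over the data multiple times (to be discussed in a future post), or you can model P(C,X), the joint probability of ‘C’ and ‘X’, and then derive P(C|X) from it via Bayes’ theorem: P(C|X) = P(C,X) / P(X), where P(X) is P(C,X) summed over all class labels.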
Before moving to an example, let’s see what each of these probabilities means.
P(C,X) represents the probability that the data point ‘X’ and the class label ‘C’ occur together. This is why it is referred to as the joint probability.
P(C|X), read as “probability of C given X”, is the conditional probability of the class label ‘C’, given that the data point ‘X’ has been observed.
Let’s take an example to understand further:
Assume 5 points, each written as (X,Y): (0,1),(1,0),(0,0),(1,1),(0,0)

Joint probability P(X,Y):
         Y=0    Y=1
X=0      2/5    1/5
X=1      1/5    1/5

Conditional probability P(Y|X):
         Y=0    Y=1
X=0      2/3    1/3
X=1      1/2    1/2

In the first table we calculate P(X=0,Y=0), which is 2/5 since the point (0,0) occurs 2 out of 5 times. In the second table we calculate P(Y=0|X=0), which is the probability that Y is 0 when X is 0. It is 2/3, since Y is 0 in 2 out of the 3 points with X=0. The remaining values are calculated in the same way.
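The same numbers can be reproduced with a few lines of Python. This is only a minimal sketch: the names `points`, `joint` and `conditional` are made up for illustration, and each point is assumed to be an (X, Y) pair.

```python
from collections import Counter
from fractions import Fraction

# The five (X, Y) points from the example.
points = [(0, 1), (1, 0), (0, 0), (1, 1), (0, 0)]

n = len(points)
pair_counts = Counter(points)              # how often each (x, y) pair occurs
x_counts = Counter(x for x, _ in points)   # how often each value of x occurs

# Joint probability P(X=x, Y=y): count of the pair over all points.
joint = {pair: Fraction(k, n) for pair, k in pair_counts.items()}

# Conditional probability P(Y=y | X=x): count of the pair over the
# number of points that share the same x. Keys are (x, y).
conditional = {(x, y): Fraction(k, x_counts[x])
               for (x, y), k in pair_counts.items()}

print(joint[(0, 0)])        # 2/5
print(conditional[(0, 0)])  # 2/3
```

Using `Fraction` keeps the results as exact ratios, so they can be compared directly with the hand-calculated values.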
Now that you’re clear on what we can use to classify data, let’s see how.
Algorithms that model P(C,X) and then use it to calculate P(C|X) via Bayes’ rule are called generative algorithms. Well-known examples include Naive Bayes and Hidden Markov Models.
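To make the generative route concrete, here is a rough sketch that estimates P(C,X) by counting and then derives P(C|X) from it. It is not Naive Bayes itself, only the underlying idea for a single discrete feature. The weather data and the names `data`, `posterior` and `classify` are hypothetical.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical labelled data: each item is (x, c), a feature value and its class label.
data = [("sunny", "play"), ("sunny", "play"), ("rainy", "stay"),
        ("rainy", "play"), ("rainy", "stay"), ("sunny", "stay")]

n = len(data)
classes = sorted({c for _, c in data})

# Step 1: model the joint probability P(C=c, X=x) from counts.
joint = {(c, x): Fraction(k, n) for (x, c), k in Counter(data).items()}

def posterior(x):
    """Return P(C=c | X=x) for every class, derived from the joint."""
    # P(X=x) is the joint summed over all class labels.
    # Note: an x never seen in the data gives P(X=x) = 0 and raises ZeroDivisionError.
    p_x = sum(joint.get((c, x), Fraction(0)) for c in classes)
    return {c: joint.get((c, x), Fraction(0)) / p_x for c in classes}

def classify(x):
    """Step 2: pick the class label with the highest P(C | X=x)."""
    post = posterior(x)
    return max(post, key=post.get)

print(posterior("sunny"))
print(classify("rainy"))
```

The classifier never stores P(C|X) itself; every prediction is worked out from the joint table.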
Algorithms that model P(C|X) directly, by updating internal parameters, are called discriminative algorithms. Well-known examples include logistic regression, neural networks, and support vector machines.
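For contrast, here is a bare-bones sketch of the discriminative route: a one-feature logistic regression trained with plain gradient descent. The data, the learning rate and the number of passes are assumptions picked for illustration, not tuned values.

```python
import math

# Hypothetical 1-D training data: feature x and binary class label c.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
cs = [0, 0, 0, 1, 1, 1]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w, b = 0.0, 0.0        # internal parameters, updated during training
learning_rate = 0.1    # assumed value

for _ in range(5000):  # iterate over the data multiple times
    grad_w = grad_b = 0.0
    for x, c in zip(xs, cs):
        # Derivative of the log loss with respect to z = w*x + b.
        error = sigmoid(w * x + b) - c
        grad_w += error * x
        grad_b += error
    w -= learning_rate * grad_w / len(xs)
    b -= learning_rate * grad_b / len(xs)

def p_class1(x):
    """The model's estimate of P(C=1 | X=x); P(X) is never modelled."""
    return sigmoid(w * x + b)

print(p_class1(0.5), p_class1(4.5))
```

Here the parameters `w` and `b` are adjusted until the output matches the labels, and no joint probability is estimated at any point.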
To check whether a model is generative or discriminative, all you need to do is check whether it models P(C,X) or P(C|X).