
Perceptron Algorithm | Pegasos Algorithm | Support Vector Machine | Linear Classifier Among Dataset | Deep Learning | Machine Learning

PERCEPTRON ALGORITHM

The Perceptron Algorithm is one of the most widely used algorithms for classifying data. It often seems a little tough when you learn it for the first time.

In today's post, we'll try to understand it well, in brief and with a full explanation.

The Perceptron Algorithm is used whenever the given dataset can be classified into two parts, i.e. it has only two labels (one can think of them as +1 and -1).

There are three versions of the Perceptron Algorithm:

  1. Simple Perceptron
  2. Averaged Perceptron
  3. Pegasos Algorithm or Support Vector Machine
Relax! The terms sound dangerous, but they are really very easy. Let me explain.


1. Simple Perceptron:

The single target of the perceptron algorithm is to find a linear classifier (say, a linear equation) such that all positively labelled points lie on one side of it and all negatively labelled points lie on the other.

As we know, any linear equation can be written as

Y=θ.X + θ0

Using the perceptron, our aim is to find the values of the θ vector and θ0 for which the resulting line divides our dataset into two parts, one positively labelled and the other negatively labelled.
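To make the decision rule concrete, here is a minimal sketch in Python (the function name and the use of NumPy are my own choices, not from this post) of how a point gets its label once θ and θ0 are known:

import numpy as np

def predict(x, theta, theta_0):
    # Label a point by which side of the line theta.x + theta_0 = 0 it falls on
    return 1 if np.dot(theta, x) + theta_0 > 0 else -1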


  • It is also an iterative method. We'll start with θ = 0 (the zero vector) and θ0 = 0.
  • In each iteration, for every data point (xi, yi):
        • if yi·(θ·xi + θ0) ≤ 0, update:
          • θnew = θold + yi·xi
          • θ0new = θ0old + yi

The reason for the above updates can be derived using the stochastic gradient descent algorithm; we will derive them below. If you are not familiar with stochastic gradient descent, check the link below.
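As an illustration of these updates, the following is a rough sketch of the simple perceptron training loop (Python/NumPy; the function and variable names are my own, not taken from this post):

import numpy as np

def perceptron(X, y, T):
    # X: (n, d) array of feature vectors, y: array of +1/-1 labels, T: number of passes
    n, d = X.shape
    theta = np.zeros(d)
    theta_0 = 0.0
    for _ in range(T):
        for i in range(n):
            # Update only when the point is misclassified (or sits exactly on the boundary)
            if y[i] * (np.dot(theta, X[i]) + theta_0) <= 0:
                theta = theta + y[i] * X[i]
                theta_0 = theta_0 + y[i]
    return theta, theta_0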

2. Averaged Perceptron:

In the averaged perceptron, the function returns the average of all the values of θ and θ0 produced across the iterations, whereas the plain perceptron algorithm returns only the last updated values of θ and θ0.

See the sample implementation sketched below.
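Since the original code does not appear here, below is one possible sketch of an averaged perceptron (again Python/NumPy, names my own): it runs the same updates as the simple perceptron but returns the average of θ and θ0 taken over every inner step.

import numpy as np

def average_perceptron(X, y, T):
    # Same updates as the simple perceptron; the returned parameters are averaged over all steps
    n, d = X.shape
    theta = np.zeros(d)
    theta_0 = 0.0
    theta_sum = np.zeros(d)
    theta_0_sum = 0.0
    for _ in range(T):
        for i in range(n):
            if y[i] * (np.dot(theta, X[i]) + theta_0) <= 0:
                theta = theta + y[i] * X[i]
                theta_0 = theta_0 + y[i]
            # Accumulate after every step, not only after an update
            theta_sum += theta
            theta_0_sum += theta_0
    count = n * T
    return theta_sum / count, theta_0_sum / count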


 

3. Pegasos Algorithm/Support Vector Machine:

In the above two algorithms we took care of finding a linear classifier that classifies our dataset well, but what if many classifiers exist that can split the dataset into two parts? See the picture below.

In the above picture it can be seen quite clearly that all of the classifier lines classify our training set well.

If you think about it a little, you can say that the classifier whose boundary has the largest margin is the best fit.


In such a case we should choose the classifier that has a large margin (d) on both sides (the positively labelled side and the negatively labelled side).

Now, there are two things:
  1. Loss: the loss for misclassifying a data point
  2. Regularization: responsible for maintaining a large margin
Remember, we are concerned only about correct classification: if our classifier classifies any point wrongly, we give it some penalty.
Also notice that if a point is classified correctly and does not lie on the decision boundary, then:

label · Ypredicted > 0

Since we also want no point inside the margin boundary, we give a penalty when a point falls inside the margin too. So there will be no loss if:

yi·(θ·Xi + θ0) ≥ 1


Otherwise we give some penalty to our model. This is known as the hinge loss, and it is defined as:

H(z) = max{0, 1 − z},   where z = yi·(θ·Xi + θ0)

This means the loss will be zero if z ≥ 1, and 1 − z otherwise.
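As a tiny illustration, the hinge loss for a single point can be computed with a helper like this (the function name is hypothetical, not from this post):

import numpy as np

def hinge_loss(x, y, theta, theta_0):
    # Zero loss when the point is correctly classified outside the margin (z >= 1), else 1 - z
    z = y * (np.dot(theta, x) + theta_0)
    return max(0.0, 1.0 - z)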

So, our objective function will be:

J = Loss + Regularization

J(θ,θ0) = (1/n) Σ(i=1 to n) H(yi·(θ·Xi + θ0)) + (λ/2)||θ||²

Our target is to find the values of θ and θ0 for which the value of the objective function is least, i.e. we have to minimize the objective function. We'll do it using the stochastic gradient descent algorithm.

So, in each iteration we will consider only one point, calculate the cost only for that point, and update.

J(θ,θ0) = H(yi·(θ·Xi + θ0)) + (λ/2)||θ||²

Since ∂H(z)/∂θ = 0 if z > 1, and −∂z/∂θ otherwise, the iteration formulas from stochastic gradient descent are:

if z ≤ 1:

θnew = θold − η[−yi·xi + λ·θold] = θold + η·yi·xi − η·λ·θold
θ0new = θ0old + η·yi

else:

θnew = θold − η·λ·θold
θ0new = θ0old

We will make η decrease as the iterations go on.
Try to implement and understand the code below.
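Since the code itself is not shown here, the following is one possible Pegasos-style implementation of the updates above (Python/NumPy; the names and the η = 1/√t schedule are my own choices):

import numpy as np

def pegasos(X, y, T, lam):
    # Stochastic gradient descent on the hinge loss with L2 regularization
    # X: (n, d) array, y: +1/-1 labels, T: number of passes, lam: the lambda parameter
    n, d = X.shape
    theta = np.zeros(d)
    theta_0 = 0.0
    t = 0
    for _ in range(T):
        for i in range(n):
            t += 1
            eta = 1.0 / np.sqrt(t)  # step size shrinks as the iterations increase
            z = y[i] * (np.dot(theta, X[i]) + theta_0)
            if z <= 1:
                # Inside the margin or misclassified: hinge-loss gradient plus regularization
                theta = theta + eta * y[i] * X[i] - eta * lam * theta
                theta_0 = theta_0 + eta * y[i]
            else:
                # Correctly classified outside the margin: only the regularization term acts
                theta = theta - eta * lam * theta
    return theta, theta_0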

If we remove the regularization term and take η = 1, it becomes the simple Perceptron Algorithm.












