Kernel Function
In this post, We'll know what exactly is kernel function.Kernel Function is used to transform n - dimensional input to m- dimensional input, where m is much higher than n then find the dot product in higher dimensional efficiently.It also helps sometimes to do calculation easily in infinite dimensional space without going to infinite dimensions.
Mathematical definition:
K(x, y) = <f(x), f(y)>. Here K is the kernel function, x, y are n dimensional inputs. f is a map from n-dimension to m-dimension space. < x,y> denotes the dot product. usually m is much larger than n.
Intuition:
Normally calculating <f(x), f(y)> requires us to calculate f(x), f(y) first, and then do the dot product. These two computation steps can be quite expensive as they involve manipulations in m dimensional space, where m can be a large number. But after all the trouble of going to the high dimensional space, the result of the dot product is really a scalar: we come back to one-dimensional space again! Now, the question we have is: do we really need to go through all the trouble to get this one number? do we really have to go to the m-dimensional space? The answer is no, if you find a clever kernel.
Simple Example:
x = (x1, x2, x3); y = (y1, y2, y3). Then for the function f(x) = (x1x1, x1x2, x1x3, x2x1, x2x2, x2x3, x3x1, x3x2, x3x3), the kernel is K(x, y ) = (<x, y>)^2.
Let's plug in some numbers to make this more intuitive: suppose x = (1, 2, 3); y = (4, 5, 6). Then:
f(x) = (1, 2, 3, 2, 4, 6, 3, 6, 9)
f(y) = (16, 20, 24, 20, 25, 30, 24, 30, 36)
<f(x), f(y)> = 16 + 40 + 72 + 40 + 100+ 180 + 72 + 180 + 324 = 1024
f(x) = (1, 2, 3, 2, 4, 6, 3, 6, 9)
f(y) = (16, 20, 24, 20, 25, 30, 24, 30, 36)
<f(x), f(y)> = 16 + 40 + 72 + 40 + 100+ 180 + 72 + 180 + 324 = 1024
A lot of algebra. Mainly because f is a mapping from 3-dimensional to 9 dimensional space.
Now let us use the kernel instead:
K(x, y) = (4 + 10 + 18 ) ^2 = 32^2 = 1024
Same result, but this calculation is so much easier.
K(x, y) = (4 + 10 + 18 ) ^2 = 32^2 = 1024
Same result, but this calculation is so much easier.
Thanks to Lili Jiangs's Answer .
Use Of Kernel Function:
Kernel function is used to find a non linear classifier or non linear regression line.The Idea behind using kernel function is ' A Linear Classifier in higher dimension works as non linear classifier in lower dimension
Ex:
Suppose a 2D line satisfy all the points [x,y] if we transform it to [x^2,x,y] and it becomes a plane which satisfy all the points in 3D , at the same time it is a parabola in 2D, which is non-linear.
Comments
Post a Comment