Return to page

WIKI

Support Vector Machine

What is support vector machine?

Support vector machine (SVM) is a supervised machine learning algorithm that can be used in either classification or regression problems. A support vector machine algorithm’s main objective is to find a hyperplane, which splits data into two classes in an N-dimensional space by plotting data items as points. The N in an N-dimensional space represents the number of features that explicitly classify data points.

 

How do support vector machines work?

Support vector machines categorize text using sets of pre-labeled training data. SVM basics are best described using an example. 

Imagine there are two tags: squares and triangles, and the data has two features: x and y. Using the labeled training data, the points are plotted on a plane with classifiers that output data as either squares or triangles. The SVM places the data points on a two-dimensional hyperplane. There is a line called the decision boundary separating the tags, and on one side the output is squares and the other is triangles.

 

Advantages of support vector machines

The advantages of support vector machines are:

  • Works with classes that have a clear margin of separation.

  • Effective in spaces that are highly dimensional.

  • Useful when there is a higher number of dimensions than number of samples.

  • Memory efficient because it uses a subset of training points in the decision function (called support vectors).

     

Disadvantages of support vector machines

Disadvantages of support vector machines are:

  • Does not work well for large data sets.

  • Does not do well if the data set has significant noise such as target classes overlapping.

  • Will underperform if the number of features for each data point is greater than the number of training data samples.

  • Because the support vector classifier enters the data points above and below the decision boundary, there is no probabilistic explanation for the classification.

     

Conclusion

Support vector machines are effective supervised learning algorithms for classification, regression, and outlier detection. SVM works best with clear margins of separation, highly dimensional spaces, and is memory efficient.