Last updated on April 6, 2021, 3:13 p.m. by mayank
In the paper titled ‘Financial Data Mining based on Support Vector Machines and Ensemble Learning’, the authors Shi Lei, Ma Xinming, Xi Lei, and Hu Xiaohong classify financial data using support vector machines (SVMs) and ensemble learning. The datasets used are the German credit dataset and the Credit approval dataset. The authors give an overview of SVMs and of AdaBoost, the ensemble learning method they use, followed by the results of their experiments.
SVMs (Support Vector Machines):
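SVMs are based on structural risk minimization and find an optimal separating hyperplane between the given classes. For binary classification, the labels ‘y’ take values in {+1,-1}. The hyperplane equation is given by w*x+b=0, where ‘w’ is the vector normal to the hyperplane and ‘b’ is the offset. Every training point must satisfy the constraint yi(w*xi+b) >= 1. We maximize the margin, i.e. the distance 2/||w|| between the two parallel hyperplanes on which the support vectors lie, which is equivalent to minimizing 0.5*||w||^2. This is achieved with the help of Lagrange multipliers αi, giving the following dual problem in its standard hard-margin form:
$$\max_{\alpha} \; \sum_{i} \alpha_i - \frac{1}{2} \sum_{i} \sum_{j} \alpha_i \alpha_j y_i y_j \, (x_i \cdot x_j)$$
Subject to the two linear constraints:
$$\alpha_i \ge 0 \ \text{for all } i, \qquad \sum_{i} \alpha_i y_i = 0$$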
This is then solved by quadratic programming. For non-linear decision boundaries, kernels are used in place of the dot product between data points.
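To make the quantities above concrete, here is a minimal sketch on a hypothetical four-point toy set (not data from the paper). It assumes the standard hard-margin dual, and the multipliers in `alpha` are derived by hand for this toy set rather than produced by a quadratic programming solver. All function and variable names are illustrative.

```python
from math import sqrt

def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def dual_objective(alpha, X, y):
    """Standard hard-margin dual: sum(alpha) - 0.5 * sum_ij a_i a_j y_i y_j (x_i . x_j)."""
    n = len(X)
    quad = sum(alpha[i] * alpha[j] * y[i] * y[j] * dot(X[i], X[j])
               for i in range(n) for j in range(n))
    return sum(alpha) - 0.5 * quad

def recover_hyperplane(alpha, X, y):
    """Recover w = sum_i a_i y_i x_i, then b from any support vector (a_i > 0)."""
    dim = len(X[0])
    w = [sum(alpha[i] * y[i] * X[i][k] for i in range(len(X))) for k in range(dim)]
    s = next(i for i, a in enumerate(alpha) if a > 0)
    b = y[s] - dot(w, X[s])  # on a support vector, y_s * (w . x_s + b) = 1
    return w, b

# Hypothetical toy data: two positive and two negative points in 2-D.
X = [(2.0, 0.0), (3.0, 1.0), (0.0, 0.0), (-1.0, 1.0)]
y = [1, 1, -1, -1]
# Hand-derived multipliers (assumption): only (2, 0) and (0, 0) are support vectors.
alpha = [0.5, 0.0, 0.5, 0.0]

w, b = recover_hyperplane(alpha, X, y)
margins = [y[i] * (dot(w, X[i]) + b) for i in range(len(X))]  # each must be >= 1
margin_width = 2 / sqrt(dot(w, w))                            # the 2/||w|| quantity
primal_value = 0.5 * dot(w, w)                                # the 0.5*||w||^2 quantity
dual_value = dual_objective(alpha, X, y)
```

For this toy set the recovered hyperplane is w = (1, 0), b = -1, the margin width is 2, and the primal and dual objectives agree at 0.5, which is what one expects at the optimum. The multipliers also satisfy both constraints: they are non-negative and the label-weighted sum is zero.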
Ensemble Learning
Boosting is the ensemble learning approach used here. AdaBoost reweights the training data points after each round based on which points the current classifier misclassified: misclassified points gain weight and correctly classified points lose weight. In short, in each round AdaBoost calculates the weighted training error, sets the classifier weight alpha from that error, and updates the data-point weights. A brief outline of the algorithm is given below.
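The three steps named above (weighted error, alpha, weight update) can be sketched as follows. This is a minimal illustration of the standard discrete AdaBoost update, not necessarily the exact variant in the paper. As a simplifying assumption it uses one-dimensional decision stumps as the base learner instead of the paper's classifier, and the toy data and all function names are hypothetical.

```python
import math

def stump_predict(x, threshold, polarity):
    """Decision stump: predicts `polarity` when x >= threshold, else the opposite label."""
    return polarity if x >= threshold else -polarity

def best_stump(X, y, weights):
    """Return (weighted_error, threshold, polarity) of the stump with the lowest weighted error."""
    best = None
    for threshold in sorted(set(X)):
        for polarity in (1, -1):
            err = sum(w for x, t, w in zip(X, y, weights)
                      if stump_predict(x, threshold, polarity) != t)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def adaboost(X, y, rounds):
    """Train an ensemble of weighted stumps; labels must be +1 or -1."""
    n = len(X)
    weights = [1.0 / n] * n                      # start with uniform weights
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = best_stump(X, y, weights)
        err = max(err, 1e-10)                    # guard against division by zero
        alpha = 0.5 * math.log((1 - err) / err)  # classifier weight from weighted error
        ensemble.append((alpha, threshold, polarity))
        # Misclassified points are scaled up, correct points are scaled down.
        weights = [w * math.exp(-alpha * t * stump_predict(x, threshold, polarity))
                   for x, t, w in zip(X, y, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]   # renormalize to sum to 1
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, th, p) for a, th, p in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical toy data that no single stump can separate.
X_toy = [0, 1, 2, 3, 4, 5]
y_toy = [1, 1, -1, -1, 1, 1]
boosted = adaboost(X_toy, y_toy, rounds=3)
boosted_predictions = [predict(boosted, x) for x in X_toy]
```

On this toy set a single stump misclassifies two of the six points, while the three-round ensemble classifies all six correctly, which illustrates why reweighting toward the hard points helps.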
Experimentation and Results:
The accuracy is calculated as the number of correctly classified samples divided by the total number of samples.
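Assuming the usual definition of classification accuracy (correct predictions over total predictions), a minimal sketch of the computation is shown here. The labels in the example are made up for illustration and are not results from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical labels: 3 of the 4 predictions match.
example_accuracy = accuracy([1, -1, 1, 1], [1, -1, -1, 1])
```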
Important Sentences
maximum must be found.