Support Vector Regression(SVR)

The Support Vector Regression (SVR) is adopted from support Vector Machine (SVM) for the regression type data to predict the value. While dealing with real number data, the SVM changes its variant as regression.


                 𝑓(𝑥) = 𝒘𝑇𝜑(𝑥) + 𝑏               ---------------(1)

where

–𝑤 is the weight vector 𝑙 dimension

–𝜑(𝑥) is a function that maps 𝑥 to the feature space with 𝑙 dimensions

– 𝑏 is biased

–𝜀 as the distance between the hyperplane and the two boundary lines

The coefficients 𝑤 and 𝑏 are estimated by minimizing the risk function. Therefore, to maximize the margin 𝛿, a minimum ‖𝑤‖ is needed. Optimization of problem-solving as shown in function (3.2).

  𝑚in1/2 ‖𝑤‖² --------------------------(2)

with the condition, 𝑦𝑖 − 𝑤𝑇𝜑(𝑥𝑖) − 𝑏 ≤ 𝜀, 𝑓or 𝑖 = 1, … , 𝑙 and 𝑤𝑇𝜑(𝑥𝑖) − 𝑦𝑖 + 𝑏 ≤ 𝜀, 𝑓or 𝑖 = 1, … , 𝑙. Where 𝑦𝑖 is the actual value of 𝑖 period, and 𝜑(𝑥𝑖) is the estimated value of 𝑖 period.



𝑚in1/2 ‖𝑤‖² + ∁ ∑𝑖=1 (𝜉𝑖 + 𝜉𝑖) --------------------------(3.3)

with the condition

𝑦𝑖 − 𝑤𝑇𝜑(𝑥𝑖) − 𝑏 − 𝜉𝑖 ≤ 𝜀, 𝑓or 𝑖 = 1, … , 𝑙

 𝑤𝑇𝜑(𝑥𝑖) − 𝑦𝑖 + 𝑏- 𝜉𝑖≤ 𝜀,  𝑓or 𝑖 = 1, … , 𝑙

𝜉𝑖 , 𝜉𝑖  ≥ 0

The ‖𝑤‖² factor is called regulation. Minimizing ‖𝑤‖² will make a function as thin (flat) as possible so that it can control the functional capacity. All points outside the margin/limit 𝜀 will be penalized.


𝐶 > 0 constant determines how much the error deviation is from the tolerable limit 𝜀𝜀. The formula above is a Convex Linear Programming NLP Optimization Problem which functions to minimize the quadratic function to be converted into a constraint. This limitation can be solved by using the Lagrange Multiplier function. The process of deriving formulas is very long and complicated. After going through mathematical stages, a new equation is obtained with the function: 


𝑓(𝑥) = ∑li=1 (𝑎𝑖 − 𝑎𝑖 ). (𝑥𝑖. 𝑥) + 𝑏  ----(4)


Where xi is the support vector and 𝑥 is the test vector. The above functions can be used to solve linear problems. Whereas for non-linear problems the values of 𝑥𝑖, and 𝑥 are first transformed into a high-dimensional feature space by mapping the vectors 𝑥𝑖 and 𝑥 into the kernel function so that the final function becomes:

𝑓(𝑥) = ∑li=1 (𝑎𝑖 − 𝑎𝑖 ). k(𝑥𝑖. 𝑥) + 𝑏  ----(4)

The function k(𝑥𝑖. 𝑥) is the Kernel. The table 1 below shows the kernels used in the SVR calculation (Scholkopf & Smola, 2018)