Multiclass SVM

Crammer and Singer (2001) have extended the binary SVM classifier to classification problems with more than two classes. The training problem of the Crammer-Singer multiclass SVM can be expressed as a QP

\mbox{maximize}   &  -(1/2)  \mathbf{Tr}(U^TQU)  + \mathbf{Tr}(E^TU) \\
\mbox{subject to} & U \preceq \gamma E \\
                  & U \mathbf{1}_m = 0

with variable U \in \mathbf{R}^{N \times m} where N is the number of training examples and m the number of classes. The N \times N kernel matrix Q is given by Q_{ij} = K(x_i, x_j) where K is a kernel function and x_i^T is the i’th row of the N \times n data matrix X, and d is an N-vector with labels (i.e. d_i \in \{ 0,1,\ldots,m-1 \}).


A custom solver for the multiclass support vector machine training problem is available as a Python module mcsvm. The module implements the following function:

mcsvm(X, labels, gamma, kernel = 'linear', sigma = 1.0, degree = 1)

Solves the problem (1) using a custom KKT solver.

The input arguments are the data matrix X (with the N training examples x_i^T as rows), the label vector d, and the positive parameter \gamma.

Valid kernel functions are:

the linear kernel: K(x_i,x_j) = x_i^Tx_j
the polynomial kernel: K(x_i,x_j) = (x_i^Tx_j/\sigma)^d

The kernel parameters \sigma and d are specified using the input arguments sigma and degree, respectively.

Returns the function classifier(). If Y is M \times n then classifier(Y) returns a list with as its k’th element

\operatorname*{arg\,max}_{j=0,\ldots,m-1}  \sum_{i=1}^N U_{ij} K(x_i, y_k)

where y_k^T is row k of Y, x_i^T is row k of X, and U is the optimal solution of the QP (1).