[keywords]: matrix completion, side information, multi-label learning
Matrix completion: low rank, a limited number of observed entries
(1) s.t.Application: collaborative filtering, dimensionality reduction, multi-class learning, clustering, etc.
-
Side information:
- and with observed matrix ,
- e.g. textual description of items, attributes of users, correlation matrix
-
Assumption:
- the target matrix and the side information matrices share the same latent information, i.e. the column and row vectors in lie in the subspaces spanned by the column and row vectors in and (so the optimization problem is reduced to searching for an optimal matrix of size ).
-
Other Approaches and limitations:
- Optimization problem of MC: require to update the full matrix at each iteration
- Matrix Factorization (low rank): cannot guarantee perfect recovery of the target matrix
- With side information and based on graphical models by assuming the special distribution of latent factors: non-convex and no theoretical guarantee on matrix recovery
- MC with infinite dimensional side information: no guarantee of perfect recovery
- MC-b and MC-1: high computational cost
-
Techniques:
-
transductive incomple multi-label learning (aims to assign multiple labels to individual instances in a transductive learning setting)
(: feature vectors of instances : label correlation matrix)
(2) s.t. -
SVT(Singular Vector Thresholding) (approximate (2) by an unconstrained optimization problem)
(3) - Maxide: Supplementary
-
transductive incomple multi-label learning (aims to assign multiple labels to individual instances in a transductive learning setting)
Algorithm Maxide (Matrix Completion with Side Information)
- Initialization: , and stopping criterion
- while do
- #solved by SVT
- while do
- end while
- end while
-
Experiments (Matlab+C)
- Synthetic dataset
- Real dataset
-
Advantages:
- required entries: reduced to
- convex optimization problem, i.e. unique optimizer
- guarantee on perfect recovery
-
Disadvantages:
- no linear convergence rate