# Variety and Veracity of the Data in Matrix Completion

Beyond volume, variety and veracity are two important issues in modern data. In this talk we discuss these issues in the context of the matrix completion problem. First, we consider the problem of estimating a low-rank matrix when most of its entries are not observed and some of the observed entries are corrupted. The observations are noisy realizations of the sum of a low-rank matrix, which we wish to estimate, and a second matrix having a complementary sparse structure, such as elementwise or columnwise sparsity. We analyze a class of estimators obtained as solutions of a constrained convex optimization problem combining the nuclear norm penalty and a convex relaxation penalty for the sparsity constraint. In practical situations, data are often obtained from multiple sources, which results in a collection of matrices rather than a single one. In the second part, we consider the problem of collective matrix completion with multiple heterogeneous matrices, whose entries can be counts, binary, continuous, etc. We first investigate the setting where, for each source, the matrix entries are sampled from an exponential family distribution. Then we deal with the distribution-free setting. The estimation procedures are based on nuclear norm penalized estimators. We prove that the proposed estimators achieve fast rates of convergence in both settings.
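
As a rough sketch of the first part, the observation model and a penalized estimator of the kind described can be written as follows. The notation is assumed here for illustration and is not taken from the talk: $L_0$ denotes the low-rank matrix, $S_0$ the sparse corruption, $X_i$ the sampling matrices that select the observed entries, $\xi_i$ the noise, $\lambda_1, \lambda_2 > 0$ tuning parameters, $a$ a bound on the entries, and $\mathcal{R}$ a convex sparsity-inducing penalty (for example the $\ell_1$ norm for elementwise sparsity or the $\ell_{2,1}$ norm for columnwise sparsity).

$$
Y_i = \langle X_i,\, L_0 + S_0 \rangle + \xi_i, \qquad i = 1, \dots, N,
$$

$$
(\hat L, \hat S) \in \operatorname*{arg\,min}_{\|L\|_\infty \le a,\ \|S\|_\infty \le a}
\left\{ \frac{1}{N} \sum_{i=1}^{N} \bigl( Y_i - \langle X_i,\, L + S \rangle \bigr)^2
+ \lambda_1 \|L\|_* + \lambda_2 \mathcal{R}(S) \right\}.
$$

Here $\|\cdot\|_*$ is the nuclear norm, which serves as the convex surrogate for the rank of $L$, and $\|\cdot\|_\infty$ is the entrywise sup-norm used in the constraint.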

KLOPP, O., LOUNICI, K., TSYBAKOV, A. and ALAYA, M. (2018). Variety and Veracity of the Data in Matrix Completion. In: The 40th Conference on Stochastic Processes and their Applications. Gothenburg.

Keywords: #high-dimensional-prediction, #matrix-completion, #low-rank-matrix-estimation, #robust-estimation