Large-Scale Clustering by Matrix Factorization Models: A Non-convex Penalty Approach
字体大小： 【小】 【中】 【大】
题目：Large-Scale Clustering by Matrix Factorization Models: A Non-convex Penalty Approach
报告人： 张纵辉 教授 香港中文大学（深圳），深圳市大数据研究院
摘要：The non-negative matrix factorization (NMF) model with an additional orthogonality constraint on one of the factor matrices, called the orthogonal NMF (ONMF), has been found a promising clustering model and can outperform the classical K-means. However, solving the ONMF model is a challenging optimization problem because the coupling of the orthogonality and non-negativity constraints introduces a mixed combinatorial aspect into the problem due to the determination of the correct status of the variables (positive or zero). Most of the existing methods directly deal with the orthogonality constraint in its original form via various optimization techniques but are not scalable for large-scale problems. In this paper, we propose a new ONMF based clustering formulation that equivalently transforms the orthogonality constraint into a set of norm-based non-convex equality constraints. We then apply a non-convex penalty (NCP) approach to add them to the objective as penalty terms, leading to a problem that is efficiently solvable. One smooth penalty formulation and one non-smooth penalty formulation are respectively studied. We build theoretical conditions for the penalized problems to provide feasible stationary solutions to the ONMF based clustering problem, as well as proposing efficient algorithms for solving the penalized problems of the two NCP methods. Experimental results based on both synthetic and real datasets are presented to show that the proposed NCP methods are computationally time efficient, and either match or outperform the existing K-means and ONMF based methods in terms of the clustering performance.