PuSH - Publication Server of Helmholtz Zentrum München

Maurus, S.* ; Plant, C.

Ternary matrix factorization.

In: Proceedings (IEEE International Conference on Data Mining (ICDM), 14-17 December 2014, Shenzhen, China). Piscataway, NJ: IEEE, 2014. 400-409
Can we learn from the unknown? Logical data sets of the ternary kind are often found in information systems. They contain unknown as well as true/false values. An unknown value may represent a missing entry (lost or indeterminable) or something with meaning, like a "Don't Know" response in a questionnaire. In this paper we introduce an algorithm, superior in both effectiveness and efficiency, for reducing the dimensionality of logical data (and categorical data in general) in the context of a new data mining challenge: Ternary Matrix Factorization (TMF). For a ternary data matrix, TMF exploits ternary logic to produce a basis matrix (which holds the major patterns in the data) and a usage matrix (which maps patterns to original observations). Both matrices are interpretable, and their ternary matrix product approximates the original matrix. TMF has applications in 1) finding targeted structure in ternary data, 2) imputing values through pattern discovery in highly incomplete categorical data sets, and 3) solving instances of its encapsulated Binary Matrix Factorization (BMF) problem. Our elegant algorithm Faster (Fast Ternary Matrix Factorization) has linear run-time complexity with respect to the dimensions of the data set and is parameter-robust. Experiments on synthetic and real-world data sets show that we are able to efficiently and effectively outperform state-of-the-art techniques in all three TMF applications.
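To make the idea of a ternary matrix product concrete, the following is a minimal sketch and not the Faster algorithm itself. It assumes Kleene's strong three-valued logic, which the abstract does not specify, and an illustrative numeric encoding (false = 0.0, unknown = 0.5, true = 1.0) under which AND becomes `min` and OR becomes `max`. The names `ternary_product` and `mismatches` are hypothetical helpers introduced only for this example.

```python
# Assumed encoding of Kleene's strong three-valued logic:
# False -> 0.0, Unknown -> 0.5, True -> 1.0, so AND is min and OR is max.
F, U, T = 0.0, 0.5, 1.0


def ternary_product(usage, basis):
    """Ternary analogue of the Boolean matrix product (assumed definition).

    usage is an n x k matrix and basis is a k x m matrix, entries in {F, U, T}.
    Entry (i, j) of the result is OR over l of (usage[i][l] AND basis[l][j]).
    """
    k = len(basis)
    m = len(basis[0])
    return [
        [max(min(row[l], basis[l][j]) for l in range(k)) for j in range(m)]
        for row in usage
    ]


def mismatches(data, approx):
    """Count cells where the reconstruction differs from the data."""
    return sum(a != b for r1, r2 in zip(data, approx) for a, b in zip(r1, r2))


# Two basis patterns over three attributes; the second has an unknown entry.
basis = [[T, T, F],
         [F, U, T]]

# Three observations: the first uses pattern 0, the second pattern 1,
# the third uses both patterns.
usage = [[T, F],
         [F, T],
         [T, T]]

recon = ternary_product(usage, basis)
data = [[T, T, F],
        [F, U, T],
        [T, T, T]]
error = mismatches(data, recon)
```

In this toy case the product reproduces `data` exactly, so `error` is 0. A factorization method would search for `usage` and `basis` matrices that keep such an error small on real data; how Faster performs that search and which error measure it uses are not stated in the abstract.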
Publication type: Article: Conference contribution
ISBN: 978-1-4799-4303-6
Conference title: IEEE International Conference on Data Mining (ICDM)
Conference date: 14-17 December 2014
Conference location: Shenzhen, China
Conference volume: Proceedings
Source details: Pages: 400-409
Publisher: IEEE
Place of publication: Piscataway, NJ