Abstract
Data stream values are often associated with multiple aspects. For example, each value from an environmental sensor may have an associated type (e.g., temperature or humidity) as well as a location. Besides the timestamp, type and location are then two additional aspects. How should such streams be modeled? How can patterns be found simultaneously within and across the multiple aspects? How can this be done incrementally, in a streaming fashion? In this paper, all these problems are addressed through a general data model, tensor streams, and an effective algorithmic framework, window-based tensor analysis (WTA). Two variations of WTA, independent-window tensor analysis (IW) and moving-window tensor analysis (MW), are presented and evaluated extensively on real data sets. Finally, we illustrate one important application that uses WTA, Multi-Aspect Correlation Analysis (MACA), and demonstrate its effectiveness on an environmental monitoring application.
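The data model described above can be pictured with a small sketch. The abstract gives no implementation details, so everything here is an assumption made for illustration: the class name `TensorWindow`, the helper functions `shape` and `unfold`, the window size, and the sensor readings are all hypothetical. The sketch only shows how (type x location) snapshots arriving over time might be stacked into a 3-mode window tensor and unfolded along one aspect; it does not implement the IW or MW decompositions.

```python
from collections import deque
from itertools import product


class TensorWindow:
    """Keeps the most recent `size` snapshots of a tensor stream.

    Each snapshot is a (type x location) matrix given as nested lists;
    stacking the retained snapshots gives a 3-mode tensor
    (time x type x location). Hypothetical helper, not from the chapter.
    """

    def __init__(self, size):
        # A bounded deque drops the oldest snapshot automatically.
        self.snapshots = deque(maxlen=size)

    def push(self, snapshot):
        # Copy rows so later changes by the caller do not alter the window.
        self.snapshots.append([row[:] for row in snapshot])

    def is_full(self):
        return len(self.snapshots) == self.snapshots.maxlen

    def tensor(self):
        return [[row[:] for row in snap] for snap in self.snapshots]


def shape(tensor):
    """Return (time, type, location) sizes of a nested-list 3-mode tensor."""
    return (len(tensor), len(tensor[0]), len(tensor[0][0]))


def unfold(tensor, mode):
    """Unfold a 3-mode tensor into a matrix along `mode`.

    Row r holds every entry whose index along `mode` equals r; the
    remaining two indices are enumerated in their original order.
    """
    dims = shape(tensor)
    other = [m for m in range(3) if m != mode]
    rows = []
    for r in range(dims[mode]):
        row = []
        for a, b in product(range(dims[other[0]]), range(dims[other[1]])):
            idx = [0, 0, 0]
            idx[mode], idx[other[0]], idx[other[1]] = r, a, b
            row.append(tensor[idx[0]][idx[1]][idx[2]])
        rows.append(row)
    return rows


# Made-up readings: rows = (temperature, humidity), columns = 3 locations.
stream = [
    [[20.0, 21.0, 19.5], [0.40, 0.42, 0.45]],
    [[20.5, 21.5, 19.0], [0.41, 0.43, 0.47]],
    [[21.0, 22.0, 18.5], [0.39, 0.44, 0.48]],
]

window = TensorWindow(size=2)
for snapshot in stream:
    window.push(snapshot)

current = window.tensor()              # holds the last two snapshots only
by_type = unfold(current, mode=1)      # one row per measurement type
by_location = unfold(current, mode=2)  # one row per location
```

Unfolding along one aspect at a time is one common way to look for patterns within that aspect (for example, which locations behave alike), while the window tensor as a whole keeps the cross-aspect structure available.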
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Sun, J., Papadimitriou, S., Yu, P.S. (2007). Tensor Analysis on Multi-aspect Streams. In: Gama, J., Gaber, M.M. (eds) Learning from Data Streams. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-73679-4_11
DOI: https://doi.org/10.1007/3-540-73679-4_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73678-3
Online ISBN: 978-3-540-73679-0
eBook Packages: Computer Science (R0)