Abstract
Efficient data mining algorithms are crucial for effective knowledge discovery. We present the Multi-Stream Dependency Detection (msdd) data mining algorithm, which performs a systematic search for structure in multivariate time series of categorical data. Because msdd's search is systematic, both parallel and distributed versions are straightforward to implement. Distributing the search for structure over multiple processors or networked machines makes it feasible to mine very large databases or large numbers of databases. We present results showing that msdd efficiently finds complex structure in multivariate time series, and that the distributed version finds the same structure in approximately 1/n of the time required by non-distributed msdd, where n is the number of machines across which the search is distributed.
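To illustrate why a systematic search lends itself to distribution, the following is a minimal sketch, not the msdd implementation. It assumes the search space can be modeled as a set-enumeration tree over item indices, in which each node is expanded only with larger indices so that no candidate is generated twice. All function names here (`expand`, `search_subtree`, `partition_roots`, `distributed_search`) are hypothetical, and the workers are simulated sequentially rather than run on separate machines.

```python
def expand(node, n_items):
    """Children of a node in a set-enumeration tree.

    Only indices larger than the last one are appended, so every
    subset of items is generated exactly once (systematic search).
    """
    start = node[-1] + 1 if node else 0
    return [node + (i,) for i in range(start, n_items)]


def search_subtree(root, n_items, max_depth):
    """Depth-first search below `root`; returns every node visited."""
    visited = []
    stack = [root]
    while stack:
        node = stack.pop()
        visited.append(node)
        if len(node) < max_depth:
            stack.extend(expand(node, n_items))
    return visited


def partition_roots(n_items, n_workers):
    """Deal the depth-1 subtrees to workers round-robin (an assumed policy)."""
    roots = expand((), n_items)
    return [roots[w::n_workers] for w in range(n_workers)]


def distributed_search(n_items, max_depth, n_workers):
    """Simulate the workers one after another; each explores only its own subtrees."""
    results = []
    for roots in partition_roots(n_items, n_workers):
        found = []
        for root in roots:
            found.extend(search_subtree(root, n_items, max_depth))
        results.append(found)
    return results


# Serial search from the empty root versus the same space split over 2 workers.
serial = search_subtree((), 5, 3)
parts = distributed_search(5, 3, 2)
combined = [node for part in parts for node in part]

# The workers never overlap, and together they cover everything the
# serial search visits except the empty root itself.
assert len(set(combined)) == len(combined)
assert set(combined) | {()} == set(serial)
```

Because the subtrees are disjoint, the workers need no communication to avoid duplicated effort. Round-robin assignment does not balance the load in this sketch, since subtrees rooted at smaller indices are larger, so a speedup close to 1/n would depend on a more even division of work than is shown here.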
Keywords
- Association Rule
- Systematic Search
- Inductive Logic Programming
- Multivariate Time Series
- Data Mining Algorithm
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Oates, T., Schmill, M.D., Cohen, P.R. (1997). Parallel and distributed search for structure in multivariate time series. In: van Someren, M., Widmer, G. (eds) Machine Learning: ECML-97. ECML 1997. Lecture Notes in Computer Science, vol 1224. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-62858-4_84
DOI: https://doi.org/10.1007/3-540-62858-4_84
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-62858-3
Online ISBN: 978-3-540-68708-5
eBook Packages: Springer Book Archive