Early Prediction of Temporal Sequences Based on Information Transfer

Yang, Ning; Peng, Jian; Chen, Yu; Tang, Changjie

doi:10.1007/978-3-642-23535-1_46


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6897)

Included in the following conference series:

International Conference on Web-Age Information Management


Abstract

In recent years, early prediction for ongoing sequences has become increasingly valuable in a wide variety of time-critical applications that need to classify an ongoing sequence at an early stage. Early prediction raises two challenging issues: why an ongoing sequence is early predictable, and how to reasonably determine the parameter k_optimal, the minimum number of elements that must be observed before an accurate classification can be made. To address these issues, this paper investigates the kinetic regularity of information transfer in sequence data sets. As a result, a new concept, Accumulatively Transferred Information (ATI), and its kinetic model in early predictable sequences are proposed. The model shows that information transfer in early predictable sequences follows an Inverse Heavy-tail Distribution (IHD), and that most of the uncertainty of an early predictable sequence is eliminated by only a few of its leading elements, which is the intrinsic and theoretically sound basis for the feasibility of early prediction. Based on the kinetic model, a heuristic algorithm is proposed to learn the parameter k_optimal. Experiments conducted on real data sets validate the soundness and effectiveness of the proposed theory and algorithm.

Supported by the Basic Research Foundation for Central Universities under Grant No. 2010SCU11053.
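
The abstract does not give the definition of ATI or the heuristic used to learn k_optimal, so the following is only an illustrative sketch of the underlying idea, not the authors' method. It assumes that the information transferred by a prefix can be approximated by the empirical mutual information between the class label and the first k elements, and that a usable k is the smallest prefix length that removes a chosen fraction of the label uncertainty. The function names and the `fraction` threshold are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def prefix_information(sequences, labels, k):
    """Empirical mutual information I(label; first k elements), in bits.

    Assumption: this stands in for the accumulated information
    transferred by a length-k prefix; the paper's ATI may differ.
    """
    n = len(labels)
    h_label = entropy(Counter(labels).values())
    # Group the class labels by the length-k prefix they occur with.
    by_prefix = defaultdict(Counter)
    for seq, label in zip(sequences, labels):
        by_prefix[tuple(seq[:k])][label] += 1
    # Conditional entropy H(label | prefix), weighted by prefix frequency.
    h_cond = sum(
        sum(c.values()) / n * entropy(c.values()) for c in by_prefix.values()
    )
    return h_label - h_cond


def smallest_informative_k(sequences, labels, fraction=0.9):
    """Smallest k whose prefix removes `fraction` of the label entropy.

    Hypothetical stopping rule, used here only for illustration.
    """
    h_label = entropy(Counter(labels).values())
    max_len = max(len(s) for s in sequences)
    for k in range(1, max_len + 1):
        if prefix_information(sequences, labels, k) >= fraction * h_label:
            return k
    return max_len


if __name__ == "__main__":
    # Toy data: the class is decided by the second element.
    seqs = ["aaxy", "aayx", "abxy", "abyx"]
    labels = ["A", "A", "B", "B"]
    for k in range(1, 5):
        print(k, prefix_information(seqs, labels, k))
    print("chosen k:", smallest_informative_k(seqs, labels))
```

On small samples this empirical estimate is optimistic for long prefixes, because almost every long prefix is unique and therefore appears to determine the label exactly; a practical version would evaluate the estimate on held-out data or smooth the counts.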



Author information

Authors and Affiliations

College of Computer Science, Sichuan University, Chengdu, China
Ning Yang, Jian Peng, Yu Chen & Changjie Tang


Editor information

Editors and Affiliations

Microsoft Research Asia, 5 Danling Rd., Haidian District, 100190, Beijing, China
Haixun Wang
Computer School, Wuhan University, 16 Luojiashan Road, 430072, Hubei, China
Shijun Li
Graduate School of Information Science and Technology, Hokkaido University, Kita 14, Nishi 9, Kita-ku, 060-0814, Hokkaido, Sapporo, Japan
Satoshi Oyama
College of Information Science and Technology, Drexel University, 19104, Philadelphia, PA, USA
Xiaohua Hu
State Key Laboratory of Software Engineering, Wuhan University, 16 Luojiashan Road, 430072, Wuhan, Hubei, China
Tieyun Qian


About this paper

Cite this paper

Yang, N., Peng, J., Chen, Y., Tang, C. (2011). Early Prediction of Temporal Sequences Based on Information Transfer. In: Wang, H., Li, S., Oyama, S., Hu, X., Qian, T. (eds) Web-Age Information Management. WAIM 2011. Lecture Notes in Computer Science, vol 6897. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23535-1_46


DOI: https://doi.org/10.1007/978-3-642-23535-1_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23534-4
Online ISBN: 978-3-642-23535-1
eBook Packages: Computer Science, Computer Science (R0)
