Abstract
Given a large collection of time-evolving online user activities, such as Google Search queries for keywords of various categories (celebrities, events, diseases, etc.), consisting of \(d\) keywords/activities for \(l\) countries/locations over a duration of \(n\), how can we find patterns and rules? For example, assume that we have the online search volume for “Harry Potter”, “Barack Obama”, and “Amazon” for 232 countries/territories from 2004 to 2015, including external shocks, sudden changes in search volume, and more. How can we capture the nonlinear evolution of local activities and forecast future patterns? In this paper, we present \(\varDelta \)-SPOT, a unifying analytical nonlinear model for analyzing large-scale web search data that is sense-making, automatic, scalable, and parameter-free. \(\varDelta \)-SPOT can also forecast long-range future dynamics of the keywords/queries. Extensive experiments on the Google Search, Twitter, and MemeTracker datasets show that our method outperforms other effective nonlinear mining methods in accuracy for both fitting and forecasting.
Notes
- 1. Here, the parameter values are \(\beta =5.014\times 10^{-1}\), \(\delta =4.675\times 10^{-1}\), \(\gamma =5.211\times 10^{-1}\), \(\eta _{0}=1.605\times 10^{-1}\), \(t_{\eta }=343\) (the growth effect starts from time-tick 343).
- 2. Here, \(\log ^*\) is the universal code length for integers, defined as \(\log ^*(x) \approx \log _2(x) + \log _2\log _2(x)+\dots \), where only the positive terms are included [21].
- 3. We used \(4\times 8\) bits in our setting.
- 4. Here, \(\mu \) and \(\sigma ^2\) are the mean and variance of the distance between the original and estimated values, and they need \(2c_{F}\) bits, but we can eliminate them because they are constant values and independent of our modeling.
- 5.
- 6.
- 7.
- 8.
Meme\(\#3\): “yes we can yes we can”
Meme\(\#16\): “joe satriani is a great musician but he did not write or have any influence on the song viva la vida we respectfully ask him to accept our assurances of this and wish him well with all future endeavours”.
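The universal code length \(\log ^*\) in note 2 can be illustrated with a short computation. The following is a minimal Python sketch, assuming the input is a positive integer and summing only the positive iterated-logarithm terms as the note describes; the function name `log_star` is hypothetical and not taken from the paper, and any additive normalizing constant is omitted.

```python
import math


def log_star(x: int) -> float:
    """Approximate universal code length (in bits) for a positive integer.

    Sums log2(x) + log2(log2(x)) + ... and stops as soon as a term is
    no longer positive. This is an illustrative sketch, not the paper's code.
    """
    if x < 1:
        raise ValueError("log_star is defined here for integers x >= 1")
    total = 0.0
    term = math.log2(x)
    while term > 0:
        total += term
        # Next iterated logarithm; becomes <= 0 once term <= 1.
        term = math.log2(term)
    return total


if __name__ == "__main__":
    for value in (1, 2, 16, 232):
        print(value, log_star(value))
```

For example, for \(x=16\) the positive terms are \(4\), \(2\), and \(1\), so the sketch returns 7 bits.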
References
[1] H. Choi and H. R. Varian. Predicting the present with Google Trends. The Economic Record, 88(s1):2–9, 2012.
[2] J. Ginsberg, M. Mohebbi, R. Patel, L. Brammer, M. Smolinski, and L. Brilliant. Detecting influenza epidemics using search engine query data. Nature, 457:1012–1014, 2009.
[3] S. Goel, J. Hofman, S. Lahaie, D. Pennock, and D. Watts. Predicting consumer behavior with web search. PNAS, 2010.
[4] D. Gruhl, R. Guha, R. Kumar, J. Novak, and A. Tomkins. The predictive power of online chatter. In KDD, pages 78–87, 2005.
[5] A. Jain, E. Y. Chang, and Y.-F. Wang. Adaptive stream resource management using Kalman filters. In SIGMOD, pages 11–22, 2004.
[6] K. Levenberg. A method for the solution of certain non-linear problems in least squares. Quarterly of Applied Mathematics, II(2):164–168, 1944.
[7] L. Li, C.-J. M. Liang, J. Liu, S. Nath, A. Terzis, and C. Faloutsos. ThermoCast: A cyber-physical forecasting model for data centers. In KDD, 2011.
[8] L. Li, J. McCann, N. Pollard, and C. Faloutsos. DynaMMo: Mining and summarization of coevolving sequences with missing values. In KDD, 2009.
[9] L. Li, B. A. Prakash, and C. Faloutsos. Parsimonious linear fingerprinting for time series. PVLDB, 3(1):385–396, 2010.
[10] A. M. D. Livera, R. J. Hyndman, and R. D. Snyder. Forecasting time series with complex seasonal patterns using exponential smoothing. Journal of the American Statistical Association, 106(496):1513–1527, 2011.
[11] Y. Matsubara, Y. Sakurai, and C. Faloutsos. AutoPlait: Automatic mining of co-evolving time sequences. In SIGMOD, pages 193–204, 2014.
[12] Y. Matsubara, Y. Sakurai, and C. Faloutsos. The web as a jungle: Non-linear dynamical systems for co-evolving online activities. In WWW, 2015.
[13] Y. Matsubara, Y. Sakurai, and C. Faloutsos. Non-linear mining of competing local activities. In WWW, 2016.
[14] Y. Matsubara, Y. Sakurai, C. Faloutsos, T. Iwata, and M. Yoshikawa. Fast mining and forecasting of complex time-stamped events. In KDD, pages 271–279, 2012.
[15] Y. Matsubara, Y. Sakurai, B. A. Prakash, L. Li, and C. Faloutsos. Rise and fall patterns of information diffusion: Model and implications. In KDD, pages 6–14, 2012.
[16] Y. Matsubara, Y. Sakurai, W. G. van Panhuis, and C. Faloutsos. FUNNEL: Automatic mining of spatially coevolving epidemics. In KDD, pages 105–114, 2014.
[17] S. Papadimitriou, A. Brockwell, and C. Faloutsos. Adaptive, hands-off stream mining. In VLDB, pages 560–571, 2003.
[18] B. A. Prakash, A. Beutel, R. Rosenfeld, and C. Faloutsos. Winner takes all: Competing viruses or ideas on fair-play networks. In WWW, pages 1037–1046, 2012.
[19] T. Preis, H. S. Moat, and H. E. Stanley. Quantifying trading behavior in financial markets using Google Trends. Sci. Rep., 3, 04 2013.
[20] T. Rakthanmanon, B. J. L. Campana, A. Mueen, G. E. A. P. A. Batista, M. B. Westover, Q. Zhu, J. Zakaria, and E. J. Keogh. Searching and mining trillions of time series subsequences under dynamic time warping. In KDD, pages 262–270, 2012.
[21] J. Rissanen. A universal prior for integers and estimation by minimum description length. Ann. of Statist., 11(2):416–431, 1983.
[22] L. Stone, R. Olinky, and A. Huppert. Seasonal dynamics of recurrent epidemics. Nature, 446:533–536, 2007.
[23] Y. Tao, C. Faloutsos, D. Papadias, and B. Liu. Prediction and indexing of moving objects with unknown motion patterns. In SIGMOD, pages 611–622, 2004.
Acknowledgements
This work was partially supported by JSPS KAKENHI Grant-in-Aid for Scientific Research Numbers JP15H02705, JP17H04681, and JP16K12430, PRESTO JST, the MIC/SCOPE #162110003, and the ICT infrastructure establishment for clinical and medical research from the Japan Agency for Medical Research and Development (AMED).
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Do, T.M., Matsubara, Y., Sakurai, Y. (2019). Nonlinear Time-series Mining of Social Influence. In: Lee, W., Leung, C. (eds) Big Data Applications and Services 2017. BIGDAS 2017. Advances in Intelligent Systems and Computing, vol 770. Springer, Singapore. https://doi.org/10.1007/978-981-13-0695-2_16
DOI: https://doi.org/10.1007/978-981-13-0695-2_16
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-0694-5
Online ISBN: 978-981-13-0695-2
eBook Packages: Intelligent Technologies and Robotics (R0)