Shape Identification in Temporal Data Sets

Gregory, Machon; Shneiderman, Ben

doi:10.1007/978-1-4471-2804-5_17


Abstract

Shapes are a concise way to describe the behavior of temporal variables. Some commonly used shapes are spikes, sinks, rises, and drops. A spike describes a set of variable values that rapidly increase, then immediately and rapidly decrease. The variable may be the value of a stock or a person’s blood sugar level. Shapes are abstract. Details such as the height of a spike or its rate of increase are lost in the abstraction. These hidden details make it difficult to define shapes and to compare one with another. For example, what attributes of a spike determine its “spikiness”? The ability to define and compare shapes is important because it allows shapes to be identified and ranked according to an attribute of interest. Work has been done in the area of shape identification through pattern matching and other data mining techniques, but ideas combining the identification and comparison of shapes have received less attention. This paper fills the gap by presenting a set of shapes and the attributes by which they can be identified, compared, and ranked. Neither the set of shapes nor the set of attributes presented in this paper is exhaustive, but together they provide an example of how a shape’s attributes can be used for identification and comparison. The intention of this paper is not to replace any particular mathematical method of identifying a particular behavior, but to provide a toolset for knowledge discovery and an intuitive method of data mining for novices. Spikes, sinks, rises, drops, lines, plateaus, valleys, and gaps are the shapes presented in this paper. Several attributes are defined for each shape. These attributes are the basis for constructing definitions that allow the shapes to be identified and ranked. The second contribution is an information visualization tool, TimeSearcher: Shape Search Edition (SSE), which allows users to explore data sets using the identification and ranking ideas in this paper.
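The idea of identifying a shape and then ranking its instances by an attribute can be illustrated with a minimal Python sketch. It is not the chapter's method or the TimeSearcher SSE implementation. It assumes the simplest possible spike definition (a point strictly higher than both of its neighbors), and the names `Spike`, `find_spikes`, `rank_spikes`, and the attributes `rise`, `fall`, and `height` are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Spike:
    """One three-point spike; all attribute names are illustrative assumptions."""
    index: int    # position of the peak in the series
    rise: float   # increase from the previous point up to the peak
    fall: float   # decrease from the peak down to the next point

    @property
    def height(self) -> float:
        # Assumed notion of "spikiness": the smaller of the rise and the fall.
        return min(self.rise, self.fall)


def find_spikes(values: Sequence[float], min_height: float = 0.0) -> List[Spike]:
    """Identify points that are strictly higher than both neighbors."""
    spikes = []
    for i in range(1, len(values) - 1):
        rise = values[i] - values[i - 1]
        fall = values[i] - values[i + 1]
        if rise > 0 and fall > 0 and min(rise, fall) >= min_height:
            spikes.append(Spike(i, rise, fall))
    return spikes


def rank_spikes(spikes: List[Spike], attribute: str = "height") -> List[Spike]:
    """Rank spikes by the chosen attribute, largest value first."""
    return sorted(spikes, key=lambda s: getattr(s, attribute), reverse=True)


if __name__ == "__main__":
    series = [1, 2, 9, 2, 3, 5, 4, 4, 12, 1]
    for spike in rank_spikes(find_spikes(series), attribute="height"):
        print(spike.index, spike.rise, spike.fall, spike.height)
```

Changing the `attribute` argument changes the ranking criterion without changing the identification step, which is the separation the abstract describes. A realistic definition would likely consider spikes that span more than three points and the rate of change over time.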



Author information

Authors and Affiliations

University of Maryland, College Park, MD, 20742, USA
Machon Gregory
Dept of Computer Science & Human and Computer Interaction Lab, University of Maryland, College Park, MD, 20742, USA
Ben Shneiderman


Corresponding author

Correspondence to Machon Gregory.

Editor information

Editors and Affiliations

School of Interactive Arts & Technology, Simon Fraser University, 102 Ave 250-13450, Surrey, V3T 0A3, British Columbia, Canada
John Dill
School of Computing, Informatics & Media, Centre for Visual Computing, University of Bradford, Richmond Road, Bradford, BD7 1DP, United Kingdom
Rae Earnshaw
The Boeing Company, South Trenton Street, Seattle, 98124, Washington, USA
David Kasik
Bournemouth Media School, National Centre for Computer Animation, Bournemouth University, Talbot Campus, Poole, BH12 5BB, Dorset, United Kingdom
John Vince
Pacific Northwest National Laboratory, Battelle Boulevard 902, Richland, 99352, Washington, USA
Pak Chung Wong


About this chapter

Cite this chapter

Gregory, M., Shneiderman, B. (2012). Shape Identification in Temporal Data Sets. In: Dill, J., Earnshaw, R., Kasik, D., Vince, J., Wong, P. (eds) Expanding the Frontiers of Visual Analytics and Visualization. Springer, London. https://doi.org/10.1007/978-1-4471-2804-5_17


DOI: https://doi.org/10.1007/978-1-4471-2804-5_17
Publisher Name: Springer, London
Print ISBN: 978-1-4471-2803-8
Online ISBN: 978-1-4471-2804-5
eBook Packages: Computer Science, Computer Science (R0)
