Employing Inductive Databases in Concrete Applications

Meo, Rosa; Lanzi, Pier Luca; Matera, Maristella; Careggio, Danilo; Esposito, Roberto

doi:10.1007/11615576_14

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3848)

Abstract

In this paper we present the application of the inductive database approach to two practical analytical case studies: Web usage mining in Web logs and the analysis of financial data. For the Web domain, we considered enriched XML Web logs, which we call conceptual logs, produced by specific Web applications. These applications were built using a conceptual model, WebML, and its accompanying CASE tool, WebRatio. The conceptual logs integrate the usual information about user requests with metadata concerning the Web site structure. For the financial domain, we considered the Dow Jones stock exchange index and studied its component stocks from 1997 to 2002 using so-called technical analysis. Technical analysis consists of identifying the relevant (graphical) patterns that occur in the plot of a stock quote as it evolves over time, often at different time granularities. The plots highlight the correlations between distinctive variables of the stock quotes, such as the quote trend, the percentage variation and the volume of stocks exchanged. In particular, we adopted candle-sticks, a figurative pattern that condenses into a single diagram the evolution of a stock quote during a trading day. Practitioners of technical analysis have frequently used candle-sticks to predict the trend of stock quotes in the market.
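As an illustration of the candle-stick idea, the following minimal Python sketch derives the usual components of a daily candle (body, upper shadow, lower shadow) from the open, high, low and close prices of one trading day. The class and function names, the labels and the `doji_ratio` threshold are assumptions made for this example only; the abstract does not specify which candle-stick typology the authors used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DailyQuote:
    """Prices of one stock over one trading day (hypothetical structure)."""
    open: float
    high: float
    low: float
    close: float


def candlestick(q: DailyQuote) -> dict:
    """Summarise a trading day as body size, shadows and direction."""
    body_top = max(q.open, q.close)
    body_bottom = min(q.open, q.close)
    if q.close > q.open:
        direction = "up"
    elif q.close < q.open:
        direction = "down"
    else:
        direction = "flat"
    return {
        "direction": direction,
        "body": body_top - body_bottom,
        "upper_shadow": q.high - body_top,
        "lower_shadow": body_bottom - q.low,
    }


def candle_label(q: DailyQuote, doji_ratio: float = 0.1) -> str:
    """Assign a coarse label; the threshold is an illustrative assumption."""
    candle = candlestick(q)
    day_range = q.high - q.low
    # A very small body relative to the day's range is commonly called a doji.
    if candle["body"] <= doji_ratio * day_range:
        return "doji"
    return "rising" if candle["direction"] == "up" else "falling"
```

Labelling every stock and day in this way turns a price series into a sequence of categorical symbols, which is the kind of input that pattern-mining queries can work on.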

We then apply a data mining language, MINE RULE, to these data in order to identify different types of patterns. For Web data, the most important patterns include recurrent navigation paths, the page contents most frequently visited, and anomalies such as intrusion attempts or harmful usage of resources. For the financial domain, we searched for sets of stocks that frequently exhibited a positive daily variation on the same days, so as to form a collection of stocks for building customers’ portfolios; for the candle-sticks frequently associated with certain stocks; and finally for the most similar stocks, meaning those that mostly presented the same type of candle-stick on the same dates, that is, the same behaviour over time.
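The first financial pattern, sets of stocks that frequently rise on the same days, can be sketched in plain Python as a frequent-itemset count in which each trading day is a transaction and the stocks that rose on that day are its items. This is not MINE RULE syntax and not the authors' implementation; the function name, the input layout, the ticker names and the `max_size` cut-off are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_rising_sets(daily_changes, min_support, max_size=3):
    """Return groups of stocks that rose together on enough days.

    daily_changes maps a date to {ticker: percentage change} (assumed layout).
    The result maps each group (a sorted tuple of tickers) to its support,
    i.e. the fraction of days on which every stock in the group rose.
    """
    n_days = len(daily_changes)
    if n_days == 0:
        return {}
    counts = Counter()
    for changes in daily_changes.values():
        # The "transaction" for this day: every stock with a positive change.
        rising = sorted(t for t, pct in changes.items() if pct > 0)
        for size in range(2, max_size + 1):
            for group in combinations(rising, size):
                counts[group] += 1
    return {
        group: count / n_days
        for group, count in counts.items()
        if count / n_days >= min_support
    }


# Hypothetical data: three made-up tickers over four trading days.
example = {
    "day1": {"AAA": 1.0, "BBB": 0.5, "CCC": -0.2},
    "day2": {"AAA": 0.3, "BBB": 0.1, "CCC": 0.4},
    "day3": {"AAA": -1.0, "BBB": 0.2, "CCC": 0.1},
    "day4": {"AAA": 0.2, "BBB": 0.6, "CCC": -0.3},
}
result = frequent_rising_sets(example, min_support=0.5)
```

With this toy input, `result` contains `("AAA", "BBB")` with support 0.75 and `("BBB", "CCC")` with support 0.5. Exhaustive enumeration is only practical for small groups; a declarative mining language lets the same request be stated as a query with a support threshold instead of hand-written loops.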

The purpose of this paper is to show that exploiting the nuggets of information embedded in the data, together with the specialised mining constructs provided by the query languages, enables rapid customization of the mining procedures according to the users’ needs. Based on our experience, we also claim that the use of queries in advanced languages, as opposed to ad-hoc heuristics, eases the specification and discovery of a large spectrum of patterns.

Author information

Authors and Affiliations

Dipartimento di Informatica, Università di Torino, corso Svizzera 185, I-10149, Torino, Italy
Rosa Meo, Danilo Careggio & Roberto Esposito
Dipartimento di Elettronica e Informazione, Politecnico di Milano, Piazza Leonardo da Vinci, 32, I-20133, Milano, Italy
Pier Luca Lanzi & Maristella Matera

Editor information

Editors and Affiliations

INSA-Lyon, LIRIS CNRS UMR5205, F-69621, Villeurbanne, France
Jean-François Boulicaut
Department of Computer Science, Katholieke Universiteit Leuven, Celestijnenlaan 200A, 3001, Heverlee, Belgium
Luc De Raedt
HIIT, Helsinki University of Technology and University of Helsinki, Finland
Heikki Mannila

About this paper

Cite this paper

Meo, R., Lanzi, P.L., Matera, M., Careggio, D., Esposito, R. (2006). Employing Inductive Databases in Concrete Applications. In: Boulicaut, JF., De Raedt, L., Mannila, H. (eds) Constraint-Based Mining and Inductive Databases. Lecture Notes in Computer Science, vol 3848. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11615576_14

DOI: https://doi.org/10.1007/11615576_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31331-1
Online ISBN: 978-3-540-31351-9
eBook Packages: Computer Science, Computer Science (R0)
