Abstract
In this paper we present the application of the inductive database approach to two practical analytical case studies: Web usage mining on Web logs and the analysis of financial data. For the Web domain, we considered enriched XML Web logs, which we call conceptual logs, produced by specific Web applications. These applications were built using a conceptual model, namely WebML, and its accompanying CASE tool, WebRatio. The conceptual logs integrate the usual information about user requests with meta-data concerning the Web site structure. For the financial domain, we considered the Dow Jones stock exchange index and studied its component stocks from 1997 to 2002 using so-called technical analysis. Technical analysis consists in identifying the relevant (graphical) patterns that occur in the plot of a stock quote over time, often at different time granularities. The plots highlight correlations between distinctive variables of the stock quotes, such as the quote trend, the percentage variation and the volume of stocks exchanged. In particular we adopted candle-sticks, a figurative pattern that condenses into a single diagram the evolution of a stock quote over one trading day. In technical analysis, practitioners have frequently used candle-sticks to predict the trend of stock quotes in the market.
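To make the candle-stick notion concrete, the following minimal Python sketch represents one trading day by its opening, highest, lowest and closing quotes and assigns it a coarse label. The class name `Candle`, the function `classify`, the three labels and the `doji_ratio` threshold are all assumptions introduced for illustration; they are not the candle-stick typology used in the chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candle:
    """One trading day of a stock, summarised by four quotes (hypothetical type)."""
    open: float
    high: float
    low: float
    close: float

    @property
    def body(self) -> float:
        # Size of the "real body": distance between opening and closing quote.
        return abs(self.close - self.open)

    @property
    def upper_shadow(self) -> float:
        # Distance from the top of the body to the day's highest quote.
        return self.high - max(self.open, self.close)

    @property
    def lower_shadow(self) -> float:
        # Distance from the day's lowest quote to the bottom of the body.
        return min(self.open, self.close) - self.low


def classify(candle: Candle, doji_ratio: float = 0.1) -> str:
    """Return a coarse label; the labels and threshold are illustrative assumptions."""
    day_range = candle.high - candle.low
    # A very small body relative to the day's range is labelled "doji".
    if day_range == 0 or candle.body <= doji_ratio * day_range:
        return "doji"
    # Otherwise label by direction: close above open is "white", below is "black".
    return "white" if candle.close > candle.open else "black"
```

With such a label attached to each (stock, date) pair, the daily quotes become categorical data that a pattern-mining query can work on.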
We then apply a data mining language, namely MINE RULE, to these data in order to identify different types of patterns. For Web data, the most important patterns include recurrent navigation paths, the most frequently visited page contents, and anomalies such as intrusion attempts or harmful usage of resources. For the financial domain, we searched for sets of stocks that frequently exhibited a positive daily exchange on the same days, so as to form a collection of quotes for building customers' portfolios; for the candle-sticks frequently associated with certain stocks; and finally for the most similar stocks, meaning those that mostly presented the same type of candle-stick on the same dates, that is, the same behaviour over time.
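The search for stocks that frequently rise on the same days can be pictured as frequent-itemset counting, with each trading day treated as a group and each rising stock as an item. The sketch below is a plain Python stand-in under that assumption, not MINE RULE itself; the function name `frequent_rising_sets`, its parameters and the input layout (date mapped to per-ticker percentage change) are hypothetical, and the enumeration is brute force, suitable only for small examples.

```python
from collections import Counter
from itertools import combinations


def frequent_rising_sets(daily_changes, min_support, max_size=3):
    """Find sets of stocks that rose on the same day in at least
    `min_support` (a fraction between 0 and 1) of the observed days.

    daily_changes: dict mapping a date to a dict {ticker: percentage change}.
    Returns a dict mapping each qualifying tuple of tickers to its support.
    """
    n_days = len(daily_changes)
    if n_days == 0:
        return {}

    counts = Counter()
    for changes in daily_changes.values():
        # Items of this group: the stocks with a positive change on that day.
        rising = sorted(t for t, pct in changes.items() if pct > 0)
        # Count every candidate set of 2..max_size co-rising stocks.
        for size in range(2, max_size + 1):
            for combo in combinations(rising, size):
                counts[combo] += 1

    # Keep only the sets whose relative frequency reaches the threshold.
    return {
        combo: count / n_days
        for combo, count in counts.items()
        if count / n_days >= min_support
    }


# Usage with made-up tickers and percentage changes.
example_days = {
    "day1": {"AAA": 1.0, "BBB": 0.5, "CCC": -0.2},
    "day2": {"AAA": 0.3, "BBB": 0.1, "CCC": 0.4},
    "day3": {"AAA": -0.1, "BBB": 0.2, "CCC": 0.6},
    "day4": {"AAA": 0.7, "BBB": 0.9, "CCC": -0.3},
}
example_result = frequent_rising_sets(example_days, min_support=0.5)
```

In a declarative mining language the grouping attribute, the item attribute and the support threshold would be stated in the query rather than coded by hand, which is the kind of customization the chapter argues for.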
The purpose of this paper is to show that exploiting the nuggets of information embedded in the data, together with the specialised mining constructs provided by the query languages, enables the rapid customization of the mining procedures according to the users' needs. Based on our experience, we also claim that the use of queries in advanced languages, as opposed to ad-hoc heuristics, eases the specification and the discovery of a large spectrum of patterns.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Meo, R., Lanzi, P.L., Matera, M., Careggio, D., Esposito, R. (2006). Employing Inductive Databases in Concrete Applications. In: Boulicaut, J.-F., De Raedt, L., Mannila, H. (eds) Constraint-Based Mining and Inductive Databases. Lecture Notes in Computer Science, vol 3848. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11615576_14
DOI: https://doi.org/10.1007/11615576_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31331-1
Online ISBN: 978-3-540-31351-9
eBook Packages: Computer Science (R0)