Abstract
Deriving association rules at high speed from very large databases is an important problem. However, typical fast mining algorithms applied to text databases tend to derive meaningless rules, such as rules involving stopwords, and many researchers therefore try to remove these noisy rules with various filters. In our research, we have improved the association algorithm and developed information navigation systems for text data with a visual interface; we have also applied a dictionary to remove noisy keywords from the derived association rules. To remove noisy keywords automatically, we propose an algorithm based on the true positive rate and the false positive rate used in ROC analysis. Moreover, to remove stopwords automatically from raw association rules, we introduce several threshold values from ROC analysis into our proposed mining algorithm. We evaluate the performance of the proposed mining algorithms on a bibliographic database.
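To make the idea of ROC-based keyword filtering concrete, the following is a minimal sketch and not the algorithm of the chapter itself, whose details are not given in this abstract. It assumes that each candidate keyword comes with counts of relevant and non-relevant documents that contain it, and it drops keywords whose false positive rate is high or whose true positive rate is not clearly above the false positive rate. The names `KeywordStats`, `roc_rates`, and `filter_keywords`, and the thresholds `max_fpr` and `min_ratio`, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class KeywordStats:
    """Document counts for one candidate keyword (hypothetical structure)."""
    keyword: str
    tp: int  # relevant documents that contain the keyword
    fp: int  # non-relevant documents that contain the keyword


def roc_rates(stats, n_relevant, n_nonrelevant):
    """Return (true positive rate, false positive rate) for a keyword."""
    tpr = stats.tp / n_relevant if n_relevant else 0.0
    fpr = stats.fp / n_nonrelevant if n_nonrelevant else 0.0
    return tpr, fpr


def filter_keywords(candidates, n_relevant, n_nonrelevant,
                    max_fpr=0.5, min_ratio=1.5):
    """Keep keywords that discriminate relevant from non-relevant documents.

    Assumed rule (illustrative only): drop a keyword if its FPR exceeds
    max_fpr, or if its TPR is less than min_ratio times its FPR, i.e. its
    ROC point lies close to or below the diagonal.
    """
    kept = []
    for stats in candidates:
        tpr, fpr = roc_rates(stats, n_relevant, n_nonrelevant)
        if fpr > max_fpr:
            continue  # appears almost everywhere: behaves like a stopword
        if tpr < min_ratio * fpr:
            continue  # barely better than chance at separating the classes
        kept.append(stats.keyword)
    return kept


# Made-up counts for 100 relevant and 900 non-relevant documents.
sample = [
    KeywordStats("the", tp=100, fp=890),
    KeywordStats("method", tp=60, fp=400),
    KeywordStats("roc", tp=40, fp=18),
]
surviving = filter_keywords(sample, n_relevant=100, n_nonrelevant=900)
```

With these made-up counts, "the" is rejected because its false positive rate is close to 1, "method" is rejected because its true positive rate is not sufficiently above its false positive rate, and only "roc" survives. In practice the threshold values would have to be tuned on the target collection.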
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Kawano, H., Kawahara, M. (2002). Extended Association Algorithm Based on ROC Analysis for Visual Information Navigator. In: Arikawa, S., Shinohara, A. (eds) Progress in Discovery Science. Lecture Notes in Computer Science, vol 2281. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45884-0_49
DOI: https://doi.org/10.1007/3-540-45884-0_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43338-5
Online ISBN: 978-3-540-45884-5
eBook Packages: Springer Book Archive