Building a Digital Library of Web News
We introduce a new information system for organizing a Digital Library of news articles found on the Web, with automatic topic classification. We present our strategies for dealing with the different update frequencies of news Web sites, the classification methodology, the data model for storing news articles, measurements of the retrieved data, and finally the results of classifying this type of information.