Experiment with a Hierarchical Text Categorization Method on WIPO Patent Collections

Applied Research in Uncertainty Modeling and Analysis

Part of the book series: International Series in Intelligent Technologies (ISIT, volume 20)

4. Conclusion

We presented HITEC, an automated text classifier, and its application to categorizing the English and German patent collections of WIPO under the IPC taxonomy. The IPC covers all areas of technology and is currently used by the industrial property offices of many countries. Patent classification is indispensable for retrieving patent documents in the search for prior art. Such retrieval is crucial to patent-issuing authorities, potential inventors, research and development units, and others concerned with the application or development of technology. An efficient automated patent classifier is a crucial component of an automated classification assistance system for categorizing patent applications in the IPC, which is a main aim at WIPO (Fall et al., 2002). HITEC can be a prominent candidate for this purpose.

References

  • Aas, K. and Eikvil, L. (1999). Text categorisation: A survey. Report NR 941, Norwegian Computing Center.

  • Baker, L. D. and McCallum, A. K. (1998). Distributional clustering of words for text classification. In Proc. of the 21st Annual Int. ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’98), pages 96–103, Melbourne, Australia.

  • Chakrabarti, S., Dom, B., Agrawal, R., and Raghavan, P. (1998). Scalable feature selection, classification and signature generation for organizing large text databases into hierarchical topic taxonomies. The VLDB Journal, 7(3):163–178.

  • Fall, C. J., Törcsvári, A., Benzineb, K., and Karetka, G. (2003a). Automated categorization in the international patent classification. ACM SIGIR Forum, 37(1):10–25.

  • Fall, C. J., Törcsvári, A., Fievét, P., and Karetka, G. (2003b). Additional readme information for WIPO-de autocategorization data set. http://www.wipo.int/ibis/datasets/wipo-de-readme.html.

  • Fall, C. J., Törcsvári, A., and Karetka, G. (2002). Readme information for WIPO-alpha autocategorization training set. http://www.wipo.int/ibis/datasets/wipo-alpha-readme.html.

  • Koller, D. and Sahami, M. (1997). Hierarchically classifying documents using very few words. In International Conference on Machine Learning, volume 14, San Mateo, CA. Morgan-Kaufmann.

  • McCallum, A., Rosenfeld, R., Mitchell, T., and Ng, A. (1998). Improving text classification by shrinkage in a hierarchy of classes. In Proc. of ICML-98. http://www-2.cs.cmu.edu/~mccallum/papers/hier-icml98.ps.gz.

  • Salton, G. and McGill, M. J. (1983). An Introduction to Modern Information Retrieval. McGraw-Hill.

  • Sebastiani, F. (2002). Machine learning in automated text categorization. ACM Computing Surveys, 34(1):1–47.

  • Tikk, D. and Biró, G. (2003). Experiments with multilabel text classifier on the Reuters collection. In International Conference on Computational Cybernetics (ICCC03), pages 33–38, Siófok, Hungary.

  • Tikk, D., Yang, J. D., and Bang, S. L. (2003). Hierarchical text categorization using fuzzy relational thesaurus. Kybernetika, 39(5):583–600.

  • van Rijsbergen, C. J. (1979). Information Retrieval. Butterworths, London, 2nd edition. http://www.dcs.gla.ac.uk/Keith.

  • Weiss, S. M., Apte, C., Damerau, F. J., Johnson, D. E., Oles, F. J., Goetz, T., and Hampp, T. (1999). Maximizing text-mining performance. IEEE Intelligent Systems, 14(4):2–8.

  • Wibowo, W. and Williams, H. E. (2002). Simple and accurate feature selection for hierarchical categorisation. In Proc. of the 2002 ACM Symposium on Document Engineering, pages 111–118, McLean, Virginia, USA.

  • Wiener, E., Pedersen, J. O., and Weigend, A. S. (1993). A neural network approach to topic spotting. In Proc. of the 4th Annual Symposium on Document Analysis and Information Retrieval, pages 22–34.

  • Yang, Y. (1999). An evaluation of statistical approaches to text categorization. Information Retrieval, 1(1–2):69–90. http://citeseer.nj.nec.com/yang97evaluation.html.

Copyright information

© 2005 Springer Science+Business Media, Inc.

Cite this chapter

Tikk, D., Biró, G., Yang, J.D. (2005). Experiment with a Hierarchical Text Categorization Method on WIPO Patent Collections. In: Attoh-Okine, N.O., Ayyub, B.M. (eds) Applied Research in Uncertainty Modeling and Analysis. International Series in Intelligent Technologies, vol 20. Springer, Boston, MA. https://doi.org/10.1007/0-387-23550-7_13
