A Fuzzy Semisupervised Clustering Method: Application to the Classification of Scientific Publications

  • Conference paper
Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU 2014)

Abstract

This paper introduces a new method of fuzzy semisupervised hierarchical clustering based on fuzzy instance-level constraints. It defines fuzzy must-link and fuzzy cannot-link constraints and uses them to find the optimum α-cut of a dendrogram. The method is applied to the problem of classifying scientific publications in web digital libraries, and it is tested on real data from that problem against classical methods and crisp semisupervised hierarchical clustering.
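The abstract does not state how the constraints score a cut, so the following Python sketch only illustrates the general idea and should not be read as the authors' algorithm. It assumes the dendrogram is represented by the max-min transitive closure of a fuzzy similarity matrix, that each constraint is a hypothetical triple `(i, j, degree)` with `degree` in [0, 1], and that a cut is scored by the summed degrees of the constraints it satisfies. The function names and the toy data are made up for illustration.

```python
def transitive_closure(sim):
    """Max-min transitive closure of a reflexive, symmetric fuzzy relation."""
    n = len(sim)
    r = [row[:] for row in sim]
    # Floyd-Warshall-style pass: the strength of a path is its weakest link.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                r[i][j] = max(r[i][j], min(r[i][k], r[k][j]))
    return r


def alpha_cut_labels(closure, alpha):
    """Crisp partition induced by the alpha-cut of a fuzzy similarity relation."""
    n = len(closure)
    labels = [-1] * n
    next_label = 0
    for i in range(n):
        if labels[i] == -1:
            # The alpha-cut is an equivalence relation, so row i gives i's class.
            for j in range(n):
                if closure[i][j] >= alpha:
                    labels[j] = next_label
            next_label += 1
    return labels


def constraint_score(labels, must_link, cannot_link):
    """ASSUMED scoring rule: sum the degrees of the satisfied constraints."""
    score = 0.0
    for i, j, degree in must_link:
        if labels[i] == labels[j]:
            score += degree
    for i, j, degree in cannot_link:
        if labels[i] != labels[j]:
            score += degree
    return score


def best_alpha(sim, must_link, cannot_link):
    """Try every distinct level of the dendrogram and keep the best-scoring cut."""
    closure = transitive_closure(sim)
    # Distinct closure values are the only levels where the partition changes.
    candidates = sorted({v for row in closure for v in row}, reverse=True)
    # On ties, max() keeps the first candidate, i.e. the highest alpha.
    alpha = max(
        candidates,
        key=lambda a: constraint_score(
            alpha_cut_labels(closure, a), must_link, cannot_link
        ),
    )
    return alpha, alpha_cut_labels(closure, alpha)


# Toy data (hypothetical): four publications and three fuzzy constraints.
sim = [
    [1.0, 0.9, 0.3, 0.2],
    [0.9, 1.0, 0.4, 0.2],
    [0.3, 0.4, 1.0, 0.8],
    [0.2, 0.2, 0.8, 1.0],
]
must_link = [(0, 1, 0.9), (2, 3, 0.6)]
cannot_link = [(1, 2, 0.8)]

alpha, labels = best_alpha(sim, must_link, cannot_link)
print(alpha, labels)
```

On this toy input the cut at α = 0.8 satisfies all three constraints and groups the publications as {0, 1} and {2, 3}. A crisp variant would correspond to fixing every degree at 1.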

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Diaz-Valenzuela, I., Martin-Bautista, M.J., Vila, M.A. (2014). A Fuzzy Semisupervised Clustering Method: Application to the Classification of Scientific Publications. In: Laurent, A., Strauss, O., Bouchon-Meunier, B., Yager, R.R. (eds) Information Processing and Management of Uncertainty in Knowledge-Based Systems. IPMU 2014. Communications in Computer and Information Science, vol 442. Springer, Cham. https://doi.org/10.1007/978-3-319-08795-5_19

  • DOI: https://doi.org/10.1007/978-3-319-08795-5_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-08794-8

  • Online ISBN: 978-3-319-08795-5

  • eBook Packages: Computer Science, Computer Science (R0)