A Fuzzy Semisupervised Clustering Method: Application to the Classification of Scientific Publications

Diaz-Valenzuela, Irene; Martin-Bautista, Maria J.; Vila, Maria-Amparo

doi:10.1007/978-3-319-08795-5_19

Part of the book series: Communications in Computer and Information Science (CCIS, volume 442)

Included in the following conference series:

International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems

Abstract

This paper introduces a new method of fuzzy semisupervised hierarchical clustering based on fuzzy instance-level constraints. It defines fuzzy must-link and fuzzy cannot-link constraints and uses them to find the optimum α-cut of a dendrogram. The method is applied to the problem of classifying scientific publications in web digital libraries, and it is tested on real data from that problem against classical methods and crisp semisupervised hierarchical clustering.
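
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of choosing an α-cut with the help of fuzzy constraints, not the authors' method. It assumes that an α-cut groups items whose similarity is at least α (connected components, which matches single linkage), that each constraint is a triple (i, j, degree) with degree in [0, 1], and that a cut is scored by adding the degree of every satisfied constraint and subtracting the degree of every violated one. The function names, the scoring rule, and the toy data are all hypothetical.

```python
from itertools import combinations


def alpha_cut_partition(sim, alpha):
    """Label items by connected components of the graph sim[i][j] >= alpha."""
    n = len(sim)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if sim[i][j] >= alpha:
            parent[find(i)] = find(j)
    return [find(i) for i in range(n)]


def constraint_score(labels, must_link, cannot_link):
    """Assumed scoring rule: +degree if a constraint holds, -degree if not."""
    score = 0.0
    for i, j, degree in must_link:
        score += degree if labels[i] == labels[j] else -degree
    for i, j, degree in cannot_link:
        score += degree if labels[i] != labels[j] else -degree
    return score


def best_alpha_cut(sim, must_link, cannot_link):
    """Try every distinct similarity level as alpha and keep the best score."""
    n = len(sim)
    levels = sorted({sim[i][j] for i, j in combinations(range(n), 2)} | {1.0})
    best = None
    for alpha in levels:
        labels = alpha_cut_partition(sim, alpha)
        score = constraint_score(labels, must_link, cannot_link)
        if best is None or score > best[1]:
            best = (alpha, score, labels)
    return best


# Toy data (hypothetical): four publications and their pairwise similarities.
sim = [
    [1.0, 0.9, 0.4, 0.2],
    [0.9, 1.0, 0.5, 0.2],
    [0.4, 0.5, 1.0, 0.8],
    [0.2, 0.2, 0.8, 1.0],
]
must_link = [(0, 1, 0.9), (2, 3, 0.7)]   # (item, item, degree of belief)
cannot_link = [(1, 2, 0.6)]

alpha, score, labels = best_alpha_cut(sim, must_link, cannot_link)
print(alpha, labels)
```

On this toy input the cut at α = 0.8 satisfies all three constraints, separating items {0, 1} from items {2, 3}. A crisp variant would correspond to fixing every degree at 1.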

Author information

Authors and Affiliations

Department of Computer Sciences and Artificial Intelligence / CITIC-UGR, University of Granada, Spain
Irene Diaz-Valenzuela, Maria J. Martin-Bautista & Maria-Amparo Vila

Editor information

Editors and Affiliations

University Montpellier 2, LIRMM - CNRS UMR 5506, 161, Rue Ada, 34392, Montpellier Cedex 5, France
Anne Laurent
LIRMM, UMR CNRS/Universite Montpellier II, 161 rue Ada, 34392, Montpellier cedex 5, France
Olivier Strauss
LIP6, UPMC Univ. Paris 06, CNRS UMR 7606, F-75005, Paris, France
Bernadette Bouchon-Meunier
Dept. of Information Systems, Iona College, 710 North Ave, 10801, New Rochelle, NY, USA
Ronald R. Yager

About this paper

Cite this paper

Diaz-Valenzuela, I., Martin-Bautista, M.J., Vila, MA. (2014). A Fuzzy Semisupervised Clustering Method: Application to the Classification of Scientific Publications. In: Laurent, A., Strauss, O., Bouchon-Meunier, B., Yager, R.R. (eds) Information Processing and Management of Uncertainty in Knowledge-Based Systems. IPMU 2014. Communications in Computer and Information Science, vol 442. Springer, Cham. https://doi.org/10.1007/978-3-319-08795-5_19

DOI: https://doi.org/10.1007/978-3-319-08795-5_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08794-8
Online ISBN: 978-3-319-08795-5
eBook Packages: Computer Science; Computer Science (R0)
