Author Name Disambiguation in Heterogeneous Academic Networks

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 11817))

Abstract

In the real world, it is inevitable that some people share a name, and ambiguous author names make academic works harder to retrieve. Existing author name disambiguation methods generally rely on feature engineering or on the graph topology of academic networks (e.g., collaboration relationships). However, features may be costly to obtain because of data availability or privacy constraints, and simple relational data cannot capture the rich semantics underlying heterogeneous academic graphs. In this paper, we therefore study author name disambiguation in the setting of heterogeneous information networks and propose a novel method based on network representation learning. First, we extract the heterogeneous information networks and meta-path channels based on the selected meta-paths. Second, we propose two meta-path based proximities to measure the neighboring and structural similarities between nodes. Third, the embeddings of the various types of nodes are sampled and jointly updated according to the extracted meta-path channels. Finally, the disambiguation task is completed by applying an effective clustering method to the generated paper-related vector space. Experimental results on the well-known AMiner dataset show that the proposed method achieves better results than state-of-the-art author name disambiguation methods.
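The final step described above, clustering papers in a learned vector space, can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm is used, so average-link agglomerative clustering with a cosine-similarity stopping threshold is an assumption here. The names `cluster_papers`, `PAPER_VECTORS`, and `threshold`, as well as the toy vectors, are hypothetical and not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def average_link(c1, c2, vectors):
    """Mean pairwise cosine similarity between two clusters of paper ids."""
    sims = [cosine(vectors[i], vectors[j]) for i in c1 for j in c2]
    return sum(sims) / len(sims)

def cluster_papers(vectors, threshold=0.8):
    """Merge the most similar clusters until no pair reaches `threshold`.

    Assumption: this stands in for the unspecified clustering method.
    Each resulting cluster is read as one real-world author.
    """
    clusters = [[pid] for pid in sorted(vectors)]
    while len(clusters) > 1:
        best, pair = -2.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = average_link(clusters[i], clusters[j], vectors)
                if s > best:
                    best, pair = s, (i, j)
        if best < threshold:
            break
        i, j = pair
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical 2-d embeddings of four papers that carry the same author name.
PAPER_VECTORS = {
    "p1": [1.0, 0.1], "p2": [0.9, 0.2],
    "p3": [0.1, 1.0], "p4": [0.0, 0.9],
}
clusters = cluster_papers(PAPER_VECTORS)
print(clusters)
```

With these toy vectors the papers split into two groups, which would correspond to two distinct people sharing one name. In practice the threshold would need tuning, since the number of real authors behind a name is not known in advance.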

Notes

  1. The concept of a meta-path channel is similar to that of a color channel in image processing. For example, an RGB picture can be viewed as a combination of three color channels, i.e., red, green, and blue. Similarly, a heterogeneous information network can be considered a combination of meta-path channels, each of which contains multiple meta-path instances with respect to a particular meta-path.

  2. https://www.biendata.com/competition/scholar2018.
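The channel analogy in Note 1 can be sketched as a mapping from each meta-path to the list of its instances. This is only an illustration under stated assumptions: the toy edges, the single-letter node types (P for paper, A for author, V for venue), the meta-paths "PAP" and "PVP", and the helper names are all hypothetical and not taken from the paper.

```python
from collections import defaultdict

# Hypothetical toy network: undirected edges between papers (P),
# authors (A), and venues (V). The first letter of a node id is its type.
EDGES = [
    ("P1", "A1"), ("P2", "A1"), ("P2", "A2"), ("P3", "A2"),
    ("P1", "V1"), ("P3", "V1"),
]

def node_type(node):
    return node[0]

def build_adjacency(edges):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def meta_path_instances(adj, meta_path):
    """Enumerate node sequences whose types spell out `meta_path`, e.g. "PAP"."""
    paths = [[n] for n in sorted(adj) if node_type(n) == meta_path[0]]
    for wanted in meta_path[1:]:
        paths = [p + [n] for p in paths for n in sorted(adj[p[-1]])
                 if node_type(n) == wanted and n not in p]
    return [tuple(p) for p in paths]

def extract_channels(edges, meta_paths):
    """One channel per meta-path, holding all of its instances."""
    adj = build_adjacency(edges)
    return {mp: meta_path_instances(adj, mp) for mp in meta_paths}

channels = extract_channels(EDGES, ["PAP", "PVP"])
for meta_path, instances in channels.items():
    print(meta_path, instances)
```

Just as an RGB image is recovered by stacking its three color channels, the toy network here is described by the union of its "PAP" (shared author) and "PVP" (shared venue) channels.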

Acknowledgements

This research is funded by the National Natural Science Foundation of China under grants No. 61802440 and No. 61702553. It is also supported by the Fundamental Research Funds for the Central Universities, ZUEL: 2722019JCT037, and the Opening Project of State Key Laboratory of Digital Publishing Technology.

Corresponding author

Correspondence to Xiao Ma.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Ma, X., Wang, R., Zhang, Y. (2019). Author Name Disambiguation in Heterogeneous Academic Networks. In: Ni, W., Wang, X., Song, W., Li, Y. (eds) Web Information Systems and Applications. WISA 2019. Lecture Notes in Computer Science, vol 11817. Springer, Cham. https://doi.org/10.1007/978-3-030-30952-7_15

  • DOI: https://doi.org/10.1007/978-3-030-30952-7_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30951-0

  • Online ISBN: 978-3-030-30952-7

  • eBook Packages: Computer Science, Computer Science (R0)
