Language Resources and Evaluation, Volume 47, Issue 4, pp 919–944

The ACL anthology network corpus

  • Dragomir R. Radev
  • Pradeep Muthukrishnan
  • Vahed Qazvinian
  • Amjad Abu-Jbara
Original Paper

Abstract

We introduce the ACL Anthology Network (AAN), a comprehensive, manually curated networked database of citations, collaborations, and summaries in the field of Computational Linguistics. We also present statistics about the network, including the most cited authors and the most central collaborators, as well as network statistics for the paper citation, author citation, and author collaboration networks.

Keywords

ACL Anthology Network, Bibliometrics, Scientometrics, Citation analysis, Citation summaries

Copyright information

© Springer Science+Business Media Dordrecht 2013

Authors and Affiliations

  • Dragomir R. Radev (1)
  • Pradeep Muthukrishnan (1)
  • Vahed Qazvinian (1)
  • Amjad Abu-Jbara (1)

  1. Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, USA
