Estimating Number of Citations Using Author Reputation

  • Conference paper
String Processing and Information Retrieval (SPIRE 2007)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4726)

Abstract

We study the problem of predicting the popularity of items in a dynamic environment in which authors continuously post new items and provide feedback on existing items. Solutions to this problem can be applied to predicting the popularity of blog posts, ranking photographs in a photo-sharing system, or predicting the citations of a scientific article, using author information and monitoring the items of interest for a short period of time after their creation. As a case study, we show how to estimate the number of citations for an academic paper using information about past articles written by the paper's author(s). If we use only the citation information over a short period of time, we obtain a predicted value that has a correlation of r = 0.57 with the actual value. This is our baseline prediction. Our best-performing system improves on that prediction by adding features extracted from the past publishing history of the paper's authors, increasing the correlation between the actual and the predicted values to r = 0.81.
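The evaluation described in the abstract, comparing a baseline that sees only early citations against a model that also sees author-history features and scoring both by the correlation r between predicted and actual citation counts, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the feature names `early` and `author_avg` are hypothetical, and ordinary least squares stands in for whatever learning method the paper actually uses. The numbers it produces are unrelated to the reported r = 0.57 and r = 0.81.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def fit_ols(rows, targets):
    """Least-squares weights (intercept first) via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * t for x, t in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(weights, rows):
    """Apply fitted weights (intercept first) to each feature row."""
    return [weights[0] + sum(w * v for w, v in zip(weights[1:], r)) for r in rows]


# Synthetic papers (assumption): final citations depend on both the early
# citation count and the authors' average citations on past papers.
rng = random.Random(0)
papers = []
for _ in range(400):
    author_avg = rng.uniform(0, 30)  # hypothetical author-reputation feature
    early = max(0.0, 0.1 * author_avg + rng.gauss(2, 2))  # short-window citations
    final = 3 * early + author_avg + rng.gauss(0, 5)  # long-term citations
    papers.append((early, author_avg, final))

train, test = papers[:300], papers[300:]
y_train = [p[2] for p in train]
y_test = [p[2] for p in test]

# Baseline: early citations only.
w_base = fit_ols([(p[0],) for p in train], y_train)
r_base = pearson_r(predict(w_base, [(p[0],) for p in test]), y_test)

# Augmented: early citations plus the author-history feature.
w_full = fit_ols([(p[0], p[1]) for p in train], y_train)
r_full = pearson_r(predict(w_full, [(p[0], p[1]) for p in test]), y_test)

print(f"baseline r = {r_base:.2f}, with author feature r = {r_full:.2f}")
```

Scoring on a held-out split keeps the comparison honest: a richer model always fits its training data at least as well, so only the test-set correlation indicates whether the extra author feature carries real predictive signal.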

Editor information

Nivio Ziviani, Ricardo Baeza-Yates

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Castillo, C., Donato, D., Gionis, A. (2007). Estimating Number of Citations Using Author Reputation. In: Ziviani, N., Baeza-Yates, R. (eds) String Processing and Information Retrieval. SPIRE 2007. Lecture Notes in Computer Science, vol 4726. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75530-2_10

  • DOI: https://doi.org/10.1007/978-3-540-75530-2_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-75529-6

  • Online ISBN: 978-3-540-75530-2

  • eBook Packages: Computer Science, Computer Science (R0)
