Data Analysis of Correlation Between Project Popularity and Code Change Frequency

  • Dabeeruddin Syed
  • Jadran Sessa
  • Andreas Henschel
  • Davor Svetinovic
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9950)


GitHub is a source code management platform with social networking features that help increase the popularity of a project. GitHub features such as watch, star, fork, and pull requests help make a project popular among developers, in addition to enabling them to work on the code together. In this work, we study the relation between project popularity and the continual code changes made to a GitHub project. The correlation is computed using metrics such as the number of watchers, the number of pull requests, and the number of commits. We correlate the time series of code change frequency with the time series of project popularity. We find that projects with at least 1500 watchers each month show a strong positive correlation between project popularity and the frequency of code changes. We also find that the number of pull requests is 73.2% more important to the popularity of a project than the number of watchers.
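As an illustration of correlating two monthly time series, the following minimal sketch computes a Pearson correlation coefficient between commit counts and watcher counts. The abstract does not state which correlation measure the authors used, so Pearson is an assumption here; the function name `pearson` and the sample numbers are hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical monthly counts for a single project (illustrative only).
monthly_commits = [120, 135, 150, 160, 180, 210]
monthly_watchers = [1500, 1580, 1700, 1760, 1900, 2150]

r = pearson(monthly_commits, monthly_watchers)
print(f"r = {r:.3f}")
```

A value of r close to +1 would indicate that months with more commits also tend to have more watchers; it does not by itself show which quantity drives the other.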


Data analytics, Mining software repositories, Open-source development



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Dabeeruddin Syed (1)
  • Jadran Sessa (1)
  • Andreas Henschel (1)
  • Davor Svetinovic (1, corresponding author)
  1. Department of Electrical Engineering and Computer Science, Masdar Institute of Science and Technology, Abu Dhabi, UAE
