Temporal Correlation Analysis of Programming Language Popularity
Abstract
Based on the premise that programming languages interact with one another while their popularities change over time, we describe a technique for extracting latent features from the popularities of programming languages. We constructed a matrix in which each column consisted of a time series of partial correlation coefficients between the popularities of different languages. For the analysis, we used non-negative matrix factorization (NMF) to factorize the matrix into a matrix of temporal modes and a matrix of mixture components. We found that the matrix was optimally factorized with three temporal modes and that the factorization results were largely independent of the factorization algorithm. Consistent with NMF learning a part-based representation of the matrix, the temporal modes were sparse, and each illustrated a different pattern of correlation strength over time. By analyzing the NMF results, we show that the most popular languages, Java, C, and C++, become more correlated as time passes and that the recent similar trends in the popularities of Java and C can be explained by the positive correlation between the two at a later stage in time. These and other characteristics of the popularity explained by NMF may provide clues to understanding the evolution of the popularity of programming languages.
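The factorization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a non-negative matrix V whose rows are time points and whose columns are time series (synthetic random data stands in for the partial correlation coefficients, and the abstract does not say how negative coefficients are handled), and it uses the multiplicative update rules of Lee and Seung for the Frobenius-norm objective. The function name `nmf_multiplicative`, the matrix sizes, and the iteration count are hypothetical choices made for illustration; only the rank of three comes from the abstract.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate non-negative V (m x n) as W @ H with W, H >= 0.

    Uses Lee-Seung multiplicative updates for the squared
    Frobenius error. Hypothetical helper, for illustration only.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps  # temporal modes (time x rank)
    H = rng.random((rank, n)) + eps  # mixture components (rank x series)
    for _ in range(n_iter):
        # Each update rescales entries by a non-negative ratio,
        # so W and H stay non-negative throughout.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic stand-in data (assumption): 60 time points, 20 series,
# generated from 3 non-negative modes so that rank 3 is appropriate.
data_rng = np.random.default_rng(1)
n_times, n_series, rank = 60, 20, 3
V = data_rng.random((n_times, rank)) @ data_rng.random((rank, n_series))

W, H = nmf_multiplicative(V, rank)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(W.shape, H.shape, rel_error)
```

In this layout each column of W plays the role of a temporal mode and each column of H gives the weights with which the modes combine to reproduce one time series. Choosing the number of modes, which the abstract reports as three, would require a separate model-selection step that this sketch does not cover.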
Keywords
Programming language, Popularity, Non-negative matrix factorization, Partial correlation coefficient, Temporal correlation
Notes
Acknowledgments
This work was supported by a research grant from the Kongju National University in 2018.