Identifying the landscape of Alzheimer’s disease research with network and content analysis
Alzheimer’s disease (AD) is a degenerative brain disease whose cause is difficult to diagnose accurately. As the number of AD patients has increased, researchers have sought to understand the disease and develop treatments through approaches such as medical experiments and literature analysis. In the area of literature analysis, several traditional studies have analyzed the literature at the macro level, by author, journal, and institution. Analyzing the literature at both the macro and micro levels, however, allows for a better understanding of the AD research field. In this study we therefore adopt a more comprehensive approach to the AD literature, consisting of productivity analysis (year, journal/proceeding, author, and Medical Subject Heading terms), network analysis (co-occurrence frequency, centrality, and community), and content analysis. To this end, we collect metadata for 96,081 articles retrieved from PubMed. Specifically, we map the semantic relationships between the UMLS concepts in the AD literature and then perform a concept graph-based network analysis using five centrality measures. We also analyze time-series topical trends using the Dirichlet multinomial regression topic modeling technique. The results indicate that 2013 is the most productive year and that the Journal of Alzheimer’s Disease is the most productive journal. Among the core biological entities and relationships found in the AD-related PubMed literature, the relationship with glycogen storage disease is the most frequently mentioned. In addition, we analyze 16 main topics of the AD literature and find a noticeable increasing trend in the topic of transgenic mouse.
Keywords: Alzheimer’s disease (AD); Bibliometrics; Document representation; Concept graph; Topic modeling
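The network analysis described in the abstract (co-occurrence frequency and centrality over a concept graph) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the toy documents, the concept strings, and the function names `cooccurrence_counts` and `degree_centrality` are hypothetical, and degree centrality is used only as one familiar example, since the abstract does not name the five measures applied.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count how often each unordered concept pair appears in the same document."""
    counts = Counter()
    for concepts in docs:
        # Deduplicate and sort so each pair has one canonical (a, b) ordering.
        for a, b in combinations(sorted(set(concepts)), 2):
            counts[(a, b)] += 1
    return counts


def degree_centrality(counts):
    """Normalized degree: distinct neighbors of a node divided by (n - 1)."""
    neighbors = {}
    for a, b in counts:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical toy input: each inner list stands for the concepts
# extracted from one article (assumed, not taken from the study's data).
docs = [
    ["amyloid beta", "tau protein", "hippocampus"],
    ["amyloid beta", "tau protein"],
    ["amyloid beta", "APOE"],
]

counts = cooccurrence_counts(docs)
centrality = degree_centrality(counts)

# List concept pairs from most to least frequently co-occurring.
for pair, freq in counts.most_common():
    print(pair, freq)
```

In this sketch an edge exists whenever two concepts appear in the same document, and the edge weight is the number of such documents. A study of this kind would typically replace the toy lists with concepts mapped from article metadata.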
This work was supported by the Bio-Synergy Research Project (2013M3A9C4078138) of the Ministry of Science, ICT and Future Planning through the National Research Foundation.