
Classifying and ranking topic terms based on a novel approach: role differentiation of author keywords

Scientometrics

Abstract

In traditional bibliometric analysis, author keywords (AKs) play a critical role in areas such as information querying, co-word analysis, and the capture of topic terms. In past decades, most relevant studies have focused on weighting methods for AKs in order to find specialized or discriminating terms for a topic; very few have addressed the role differentiation of AKs within a specific topic or in the context of a topic query. Furthermore, both traditional co-word analysis and the latest semantic modeling methods still face challenges in accurately classifying and ranking the keywords or terms for a specific research topic. As a complement to prior research, this article proposes a novel analytical framework based on the role differentiation of AKs and the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS). A case study on additive manufacturing is conducted to verify the proposed framework.
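For readers unfamiliar with TOPSIS, the following is a minimal sketch of the generic textbook procedure (vector normalization, weighting, distances to the ideal and anti-ideal points, closeness score). It is not the framework proposed in the article: the function name `topsis`, the criteria, the weights, and the toy values are all assumptions made for illustration, and the article's role-differentiation step is not modeled here.

```python
import math


def topsis(matrix, weights, benefit):
    """Generic TOPSIS closeness scores (illustrative sketch only).

    matrix  : list of rows, one row per alternative (e.g. a candidate term),
              one column per criterion (hypothetical criteria).
    weights : one weight per criterion.
    benefit : one bool per criterion; True if larger values are better.
    Returns a list of closeness scores in [0, 1]; higher is better.
    """
    n_crit = len(weights)

    # Step 1: vector-normalize each column, then apply the criterion weight.
    norms = [
        math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
        for j in range(n_crit)
    ]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_crit)]
        for row in matrix
    ]

    # Step 2: ideal and anti-ideal points, respecting benefit/cost direction.
    columns = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]

    # Step 3: Euclidean distances and relative closeness to the ideal point.
    scores = []
    for row in weighted:
        d_pos = math.sqrt(sum((x - y) ** 2 for x, y in zip(row, ideal)))
        d_neg = math.sqrt(sum((x - y) ** 2 for x, y in zip(row, anti)))
        total = d_pos + d_neg
        scores.append(d_neg / total if total else 0.0)
    return scores


if __name__ == "__main__":
    # Toy, made-up values: three placeholder terms scored on two
    # hypothetical benefit criteria with equal weights.
    terms = ["term_a", "term_b", "term_c"]
    scores = topsis([[3, 3], [1, 1], [2, 2]], [0.5, 0.5], [True, True])
    for term, score in sorted(zip(terms, scores), key=lambda p: -p[1]):
        print(term, round(score, 3))
```

Sorting the alternatives by descending closeness score gives the ranking; which criteria and weights are appropriate for author keywords is a modeling decision that this sketch leaves open.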


Notes

  1. The programming tool is Microsoft Visual Studio Community 2015 (C#).


Acknowledgements

The authors acknowledge and appreciate all of the experts who were involved in the email survey. This material is based on work supported by the National Natural Science Foundation of China (No. 71673088), the Foundation of Guangdong Soft Science (No. 2017A070706003), and the Foundation of China Scholarship Council (No. 201606155066).

Author information

Corresponding author

Correspondence to Munan Li.


About this article


Cite this article

Li, M. Classifying and ranking topic terms based on a novel approach: role differentiation of author keywords. Scientometrics 116, 77–100 (2018). https://doi.org/10.1007/s11192-018-2741-7


  • DOI: https://doi.org/10.1007/s11192-018-2741-7
