A bibliometric approach to finding fields that co-evolved with information technology
- 143 Downloads
Among the declining industries, for example music industry, some have been revived by information technology (IT). At the same time, in academic fields, some have expected co-evolutions between IT and other fields to cause the resurgence of either field. In this research, the clustering of citation networks with 14,438 academic papers resulted in the identification of 28 academic fields in the areas “Computer Science” or “Information Science and Library Science.” Co-evolutions between these 28 fields and citing fields to the 28 fields were evaluated by an investigation of contents; a methodology to search co-evolutions was also proposed. This paper proposes that pairs of academic fields (with both high correlation and high dissimilarity) co-evolve, and some co-evolving pairs of academic fields were found. This research contributes to the discovery of the co-evolution between academic fields.
KeywordsCo-evolution Academic landscape Big data Network analysis Citation analysis Clustering Unsupervised classification Cosine similarity Horizon scanning Exploratory scanning
With the advancement of information technology (IT), new industries have been developed, and old industries have been replaced. For example, the music industry has shifted from physical music recordings to downloading or streaming; this was due to IT advancements. In the music industry, sales of old media have declined. Furthermore, users have begun to participate in new attitudes toward music activities such as streaming music and crowdsourcing by using IT; sales related to live music have started to rise again after declining (IFPI 2017; Naveed et al. 2017). In the music industry, the music rankings like Billboard began in America at the end of the 19th century, and in 1990 it became the ranking of sales based on the number of plays (Broven 2009). Also, in 1979, Sony in Japan launched Walkman with the concept of “carrying music and feel free to enjoy.” At that time, it was music mobile with cassette tape, but now products evolved into digital products such as iPod (Haire 2009).
The evolution of IT also has revived the gaming, printing, and movie industries through new types of digital platforms that have enabled the delivery of content to customers through various pieces of digital equipment. For instance, Ojala (2016) demonstrated how the co-evolution of IT and the gaming industry has enabled new ways of bringing video games to the market—developing digital gaming platforms that radically change traditional business models in the industry. In the field of robotics, which is related to IT, robot manipulation of electronic engineering was mainstream in the beginning of the 21st century. However, in BigDog (Boston Dynamics 2018) and surgery robot (Tadano and Kawashima 2006), traditional mechanical engineering technologies (such as hydraulic pressure and air pressure) are being used to improve small robot manipulation. In an interview about soft robots, which one of the authors attended in 2015 (Center for Research and Development Strategy 2016), a story of experience that traditional technologies overcame the issue of small robot manipulations was obtained. In these ways, the resurgence of a declining industry can be observed due to the co-evolution of IT with that industry.
Therefore, by discovering co-evolution and resurgence of science, this research intends to contribute to industrial redevelopment. The purpose of this research is to establish a methodology for finding academic fields that have co-evolved with other fields by the means of bibliometric methodology. In this research, information technology is selected as the comprehensively explored scientific field. Although previous research (Naveed et al. 2017) showed a resurgence of the music industry by co-evolution in well-known pairs of music and IT on the basis of revenue data, this research aims to find unknown field combinations with co-evolutions and resurgences.
Regarding bibliometric methodologies, Garfield (1955) began the use of citation analysis (such as journal impact factors) to evaluate research in the fields of bibliometrics and scientometrics. The Web of Science provided by Clarivate Analytics took over the Science Citation Index (SCI), which was produced by the Institute for Scientific Information (ISI); it provides most major indicators such as times cited and impact factor among researchers. The number of scientific publications in each field is increasing, and experts need to roughly grasp their fields. Conventionally, experts have struggled to read all scientific publications to grasp their fields, but now it becomes a difficult way. Therefore, bibliometrics and scientometrics have provided methodologies for overviewing academic fields and identifying remarkable research and researchers (Börner et al. 2003; Leydesdorff and Rafols 2009; Rafols et al. 2010; Iwami et al. 2014). Some functions are provided as tools for performing complex analyses easily with web interfaces (Kajikawa et al. 2007; Innovation Policy Research Center 2013). These methodologies and tools have been developed to contribute to evidence-based policy, foresight-based policy, and horizon scanning, which are desired by governments (U.S. Department of Health and Human Services 2016; European Commission 2016; European Commission 2017).
In a previous study, which applied bibliometrics to co-evolution, Murray (2002) explored the co-evolution of science and technology by using the citation analysis of patents and academic papers in tissue engineering. She showed that co-evolution and spillovers can be measured by the cross-citations of papers in patents, simultaneous publication and patenting by companies, and co-publishing across academic and business institutions. Kimmerle et al. (2010) visualized the co-evolution of information in Wikipedia with link analysis. Co-evolution was identified by using the co-occurrence of IPC (International Patent Classification) codes and keywords in a patent on aerospace; this was done by van der Pol and Rameshkoumar (2018). In the way of using similarity of sentences, the methodologies to match social issues and scientific solutions for solving social issues (Ittipanuvat et al. 2014), to match fields of a domain and strong fields of a country for increasing country’s strengths (Iwami et al. 2015b), and to match cutting-edge fields and all scientific fields for developing new collaborations (Iwami et al. 2015a).
While citations, co-occurrences of keywords, collaborations between authors or institutions by co-authoring are available for the measure of co-evolution, this research focuses on co-evolution between sciences (which are industrial seeds) with citations. To identify resurgence, the rise and fall of the field time-series chart was used.
In the recent years, one mission of bibliometrics or scientometrics becomes to provide evidence for decision making about science, technology, research, development, and innovation management (U.S. Department of Health and Human Services 2016; European Commission 2016; European Commission 2017). This research tries to expand the utilization of bibliometric or scientometric approaches. Traditionally, experts such as researchers or funders have sought for investment targets by the means of qualitative investigation like interview, workshop and literature review. Meanwhile, it is difficult for experts to catch up with everything even in their specialty because a large amount of knowledge has been stored every year. Thus, experts need to be supported by the computational methodologies in order to overview their specialty and capture the cutting-edge science and technology. Because the principles and guidelines for evidence-based policy-making have been enacted (European Commission 2002) and data science has been required in technology and innovation management, the demand for evidence for policy and management is high.
This research explores co-evolved fields with IT fields in a half-open attitude, which means that the analysis does not restrict the number of fields on one side of the pairs of co-evolved fields though the fields on the other side is fixed in the IT fields. A previous study (Naveed et al. 2017) has limited the fields to those within the author’s knowledge; this limitation is rational as it removes noise and completes analyses. However, finding unexpected pairs is an advantage of big data analysis, and unexpected pairs will be able to further stimulate industrial development. This research takes the risk of not finding good co-evolved pairs, but exploratory-seeking good co-evolved pairs in all fields is worth a challenge.
The rest of this article is organized as follows: “Methodology” section describes the steps of this research and the technological factors. “Results and discussion” section indicates the results and provides a discussion. “Conclusion” section summarizes this research.
Fields of the top 1% of academic papers in “Computer Science” or “Information Science and Library Science” were identified by the means of clustering citation analysis.
Fields of the academic papers citing the top 1% of academic papers were identified through citation analysis.
Co-evolutions between citing fields and cited fields were evaluated based on dissimilarity and correlation.
Types of citation
In order to build a network of the top 1% academic papers, direct citations, co-citations and bibliographic couplings were used. The reason for using these three types of citations simultaneously was to compensate for the number of direct citations of only 1% of academic papers.
However, in order to build networks of CAs, direct citation and co-citation were used. The types of citations were selected to complete the analyses in a reasonable computation time, incorporating as many citations as possible, depending on the server capacity. The number of CAs became too large to be computed as a result of the actual collection of CAs’ bibliographic data. For example, when a small size of CAs (including around 10,000 papers) was computed actually to know the scale, the number of edges according to direct citation and co-citation was about five times greater than the number of direct citations; the number of edges according to the three types of citations was about 100 times greater than the number of direct citations and co-citations. The gap of 100 times the number of edges led to three times the computation time and one-fifth of the number of clusters. Thus, the usage of bibliographic coupling (based on papers citing CAs) was discarded, and direct citation and co-citation were used for CAs, as shown in Table 1 in the “Appendix”. Since there have been no previous cases in which over 300,000 records were analyzed with three types of citations, the CAs were analyzed without bibliographic coupling to avoid a server crash and irrational time consumptions.
The citation network was divided into clusters by using the academic landscape system (Kajikawa et al. 2007; Innovation Policy Research Center 2013; Kajikawa et al. 2014) with an unsupervised classification, the Newman method (Newman 2004). The system was built based on the methodology of Kajikawa et al. (2007), and it was implemented on a website (Innovation Policy Research Center 2013). The utilization of citations has the merit that the issue of homonyms and synonyms is solved in the form of citations, which use the collective knowledge of experts.
After that, the topics of the clusters are manually identified with keywords, titles, abstracts, and journal names. The merit of clustering of citation networks is that analysts do not need inclusive knowledge of the field’s topics. Therefore, classification can be executed before analysts understand the contents of large amounts of data. The merit of the Newman method (Newman 2004) is that the number of fields is decided automatically by computation in order to maximize modularity; Newman defined “modularity” as an indicator that occur at about 0.3 or more in communities, for example citation network. In the actual utilization of results after clustering (since several minor clusters with less than 10 papers are generated), the tool’s users determine the threshold of the cluster nodes for the adopted clusters—depending on the purpose.
The scores of similarities were relative values in the dataset because word diversity in each field was different. The author’s source code of similarity was provided in https://github.com/shinoiwami/a_JYU2017_06/blob/master/110_keyword_similarity.py.
Pairs of (x, y) are composed of number of cited papers and citing papers in the same year.
High dissimilarity and strong correlation: co-evolution
Low dissimilarity and strong correlation: the same fields between citing fields and cited fields
Weak correlation: non-co-evolution.
Results and discussion
Academic fields in computer science and information technology
The IT fields on the network band do not necessarily include the word “internet,” but general-purpose technologies related to local area network (LAN) and wide area network (WAN) have appeared. The recent IT fields shown in Fig. 7 have been used for business—even in applied science and fundamental theory, such as “genome analysis” (McCarthy 2012; Google 2014), “fuzzy control” (Chiou et al. 2009), “motion capture” (Vlasic et al. 2007), “face recognition” (Abate et al. 2007), and “classification, machine learning.” The IT fields related to “optimization” and “fuzzy” appear twice before and after 2005.
Case 1: fuzzy set
In Fig. 7, if the IT field “fuzzy set” had subsequent IT fields of “rough set” and “fuzzy control,” it was expected to have resurgence in addition to co-evolutions with citing fields. According to Fig. 8a, the cited IT field had two peaks (in 1996 and 2016), and the latter peak followed the increases of CA3, CA4, and CA6. CA2 started to grow with the cited IT field.
As shown in Fig. 8b, CA1–CA10 are divided into two groups by dissimilarity: fields related to decision-making (CA1–CA3, CA5–CA7) and medical fields (CA8 and CA9). The former group had a lower dissimilarity (0.531–0.764) than the latter group (1.000), so the former group contained the successors of the cited IT field. The latter addressed blood vessels in medicine and the applications of the cited IT field. CA10 was unique, belonging to computer science and mathematics, but having high dissimilarity (1.000).
Thus, CA3, CA5, CA6, and CA8 highly co-evolved with the IT field; in particular, a co-evolution between the cited IT field and CA8 fits the background of heterogeneous co-evolutions like publications about fuzzy set applied to images of vessel tracking and segmentation (Grisan et al. 2004; Ayala et al. 2005). Additionally, CA3, CA5, and CA6 contributed to the resurgence of the cited IT field.
Case 2: simulation optimization
The IT field of “simulation optimization” has latent co-evolutions—though it has low correlations with CAs. At step 1 of the methodology, the “simulation optimization” has a high potential for resurgence; this is because its average published year is 1993.0, and the “optimization algorithm” appears in the latter half of the 2010s. Both “simulation optimization” and “optimization algorithm” include the word “optimization.” Figure 9a illustrates trends in the cited IT field of “simulation optimization” and the citing fields as CAs. The peak in the 2000s co-occurred with the trend in CA6 of “model order reduction,” and the peak in 1990 matched the trend in CA9 “electron nucleus” (Huang et al. 1990).
Case 3: channel, antenna
The IT field of “channel, antenna” co-evolved with other fields. According to Fig. 10a, “channel, antenna” had peaks in 2003 and 2005, and CA8, CA10 and CA11 seem like to associate with IT field. As shown in Fig. 10b, CA1 “space time” and CA2 “multiple-input multiple-output” were direct successors because of the lowest dissimilarity (0.413 and 0.407) among CAs; in other words, they have the highest similarity and were almost the same fields of the cited IT field “channel, antenna.” Consequently, the co-evolving CAs with high dissimilarity and high correlation were CA7, CA8, CA9, CA10 and CA11, which belonged to the field of IT.
Observing the history of antenna, Yagi antenna was invented in 1926 (Yagi and Uda 1926; Wikipedia 2018), and it has reduced following the end of analog broadcast. The parabolic antenna is for receiving radio waves from the satellite regardless of analog broadcasting and digital broadcasting (Tuli et al. 2006). Personal Handy-phone System (PHS) has provided to the average people since 1990s, and many countries finished the service in 2010s (NTT DoCoMo Inc. 2014). The whip antenna of PHS was outside of the device, and its form was taken over to a mobile phone. For these wireless devices, low power was an important factor. The research in CA8 (low power) increased since 1999, and it reached the peak in 2005–2006. The 3rd Generation (3G) service enabled videophone in 2001 (Charny 2001), while number of publications in CA10 (video) has a peak in 2004. Smartphones with an invisible antenna started to be sold in 2004, and iPhone since 2008 made smartphones a kind of daily necessities. Multipliers have been studied as shown in CA11 of Fig. 10a, and the efficiency of the circuit including antenna was improved. In this way, tracing the history of the devices involved in the antenna confirms that CA8, CA10 and CA11 co-evolved for the resurgence of the antenna.
A proposed methodology to judge co-evolutions
In the case of CA4 in the “Case 1: fuzzy set” section, CA1 in the “Case 2: simulation optimization” section, and CA1 and CA2 in the “Case 3: channel, antenna” section, the CAs that have strong correlations with their citing IT fields mean the same fields with the citing fields—but not the co-evolved fields. Some cases in this research had successful pairs of co-evolution, but most of the pairs were not co-evolved. In Fig. 11, clusters until the average year 2002 (as shown in Table 1 in the “Appendix”) are plotted, and the mapping indicates relations between dissimilarity and correlation. However, CAs of C1-1, C1-2, and C2-1 were excluded due to memory shortage and the failure of accesses to the Web of Science.
Channel, antenna ↔ low power
Channel, antenna ↔ video
Channel, antenna ↔ multiplier
Fuzzy set ↔ cognitive map
Fuzzy set ↔ metric space
Fuzzy set ↔ vessel tracking and segmentation
Fluid analysis, fluid simulation ↔ quantum molecular dynamics.
This research extracted pairs of co-evolving academic fields by using a large amount of bibliographic data related to information technology. After the 28 IT fields in “Computer Science” and “Information Science and Library Science” were identified with a citation analysis for the top 1% of academic papers, academic papers citing 28 IT fields were retrieved. The dissimilarity and correlation between the citing fields and cited fields were computed. Some pairs of IT fields and citing fields were scrutinized, and there is an issue that pairs with high correlation included both co-evolutions and the same fields between the citing fields and cited fields. Thus, this research regarded pairs with both high dissimilarity and high correlation to be co-evolving pairs. Furthermore, using the time series charts, the background of the co-evolving academic field contributing to resurgence in IT field was confirmed by literature survey.
The findings from the results were as follows: (1) The IT fields “fuzzy set,” “simulation optimization,” and “channel, antenna” experienced a resurgence, and their co-evolved academic fields were identified from their trend charts. (2) Some pairs, such as “fuzzy set” and “vessel tracking and segmentation,” with high correlation and high dissimilarity were extracted as co-evolutions between different fields.
The contributions of this research were as follows: (1) to provide evidence of co-evolution and resurgence in the past and (2) to propose a methodology to find co-evolution. To know unexpected collaboration between heterogeneous fields, it is better that co-evolving fields are combinations of IT field and other than IT field. In fact, there was a phenomenon that co-evolution by insiders and an outsider of IT field support the resurgence in the field of “channel, antenna.” At the resurgence of “fuzzy set,” “vessel tracking and segmentation” became one of co-evolved fields. It is found that co-evolutions contributing to resurgence were identified.
For the precautions and limitations when the methodology was used, (1) more pairs of co-evolution were found when the lower threshold of correlation and dissimilarity was set. (2) Although Fig. 11 was overviewed in multiple fields, thresholds should be designedly set in individual fields as shown in the “Case 2: simulation optimization” section. (3) Causalities were guaranteed with the usage of citation—even if their rise and fall in time-series trends did not occur simultaneously. (4) The gap of timelines to find co-evolution was not considered in this research.
This research provided a methodology that used correlation, dissimilarity, and citation to find co-evolutions, but it still required checking time-series trends to find resurgence. Future works should address a methodology to find past resurgence and forecast resurgence candidates. In addition, in order to use co-evolution effectively, the durability or scale of co-evolutions had better to be measured. The discovery of co-evolution and resurgence will provide evidence for reviving industries through IT.
Open access funding provided by University of Jyväskylä (JYU). The research leading to these results has received funding from the Strategic Research Council at the Academy of Finland under Grant Agreement No 293446—Platform Value Now: Value capturing in fast-emerging platform ecosystems. The server resources are supported by the research project of Prof. Ichiro Sakata, Prof. Junichiro Mori’s laboratory in the University of Tokyo, and Prof. Yuya Kajikawa in Tokyo Institute of Technology.
- Boston Dynamics. (2018). BigDog. https://www.bostondynamics.com/bigdog. Retrieved April 7, 2018.
- Broven, J. (2009). Record makers and breakers: Voices of the independent rock “n” roll pioneers. University of Illinois Press. https://books.google.com/books?id=uFrYsNvmrvQC&pg=PA187. Retrieved August 19, 2018.
- Center for Research and Development Strategy (CRDS), Japan Science and Technology Agency (JST). (2016). Innovations of basic technology for robots through integration of nanotechnology, information technology, and mechanics—Toward smart and human-friendly robots (in Japanese). http://www.jst.go.jp/crds/pdf/2015/SP/CRDS-FY2015-SP-03.pdf. Retrieved April 26, 2018.
- Charny, B. (2001). World’s first 3G phone network goes live. ZDNet. https://www.zdnet.com/article/worlds-first-3g-phone-network-goes-live/. Retrieved September 4, 2018.
- Chiou, C. B., Chiou, C. H., Chu, C. M., & Lin, S. L. (2009). The application of fuzzy control on energy saving for multi-unit room air-conditioners. Applied Thermal Engineering,29(2–3), 310–316. https://doi.org/10.1016/j.applthermaleng.2008.02.031.CrossRefGoogle Scholar
- European Commission (EC). (2002). Communication from the Commission on the collection and use of expertise by the Commission: Principles and guidelines. https://publications.europa.eu/en/publication-detail/-/publication/969f5240-ec81-4998-911e-6831d9318919. Retrieved April 30, 2018.
- European Commission (EC). (2016). Foresight and horizon scanning—JRC science hub. https://ec.europa.eu/jrc/en/research/crosscutting-activities/foresight. Retrieved November 11, 2017.
- European Commission (EC). (2017). Foresight—Research and innovation. https://ec.europa.eu/research/foresight/index.cfm. Retrieved November 11, 2017.
- Fujita, K., Kajikawa, Y., Mori, J., & Sakata, I. (2012). Detecting research fronts using different types of combinational citation. In 17th International conference on science and technology indicators (STI 2012). Montreal. http://2012.sticonference.org/Proceedings/vol1/Fujita_Detecting_273.pdf.
- Google. (2014). Google genomics. https://cloud.google.com/genomics/. Retrieved November 22, 2017.
- Grisan, E., Pesce, A., Giani, A., Foracchia, M., & Ruggeri, A. (2004). A new tracking system for the robust extraction of retinal vessel structure. In The 26th annual international conference of the IEEE engineering in medicine and biology society (Vol. 3, pp. 1620–1623). IEEE. https://doi.org/10.1109/iembs.2004.1403491.
- Haire, M. (2009). The walkman. TIME. http://content.time.com/time/nation/article/0,8599,1907884,00.html. Retrieved August 16, 2018.
- IFPI. (2017). Global music report 2017: Annual state of the industry. http://www.ifpi.org/downloads/GMR2017.pdf. Retrieved June 28, 2018.
- Innovation Policy Research Center (IPRC). (2013). Academic landscape system. http://academic-landscape.com/. Retrieved November 11, 2017.
- Iwami, S., Harada, D., & Kajikawa, Y. (2015a). A study on the extraction of existing science and technology candidates to apply for the cutting-edge academic fields (in Japanese). In The Japan society for science policy and research management, the 30th annual meeting. Tokyo.Google Scholar
- Iwami, S., Kogo, K., Tacoa, F., Mori, J., Kajikawa, Y., & Sakata, I. (2015b). Cross-disciplinary methodology for detection of collaborative and competitive candidates. In The 24th international conference for management of technology (IAMOT 2015). Cape Town. http://iamot2015.com/2015proceedings/documents/P013.pdf.
- NTT DoCoMo. (2014). PHS (personal handyphone system) (In Japanese). http://history-s.nttdocomo.co.jp/list_phs.html. Retrieved August 18, 2018.
- Shibata, N., Kajikawa, Y., Takeda, Y., & Matsushima, K. (2008). Detecting emerging research fronts based on topological measures in citation networks of scientific publications. Technovation,28(11), 758–775. https://doi.org/10.1016/j.technovation.2008.03.009.CrossRefGoogle Scholar
- Shibata, N., Kajikawa, Y., Takeda, Y., Sakata, I., & Matsushima, K. (2011). Detecting emerging research fronts in regenerative medicine by the citation network analysis of scientific publications. Technological Forecasting and Social Change,78(2), 274–282. https://doi.org/10.1016/j.techfore.2010.07.006.CrossRefGoogle Scholar
- Tadano, K., & Kawashima, K. (2006). Development of 4-DOFs forceps with force sensing using pneumatic servo system. In Proceedings 2006 IEEE international conference on robotics and automation, 2006. ICRA 2006 (pp. 2250–2255). IEEE. https://doi.org/10.1109/robot.2006.1642038.
- Tuli, T. S., Orr, N. G., Robert, D., & Zee, E. (2006). Low cost ground station design for nanosatellite missions. In North American space symposium. North American Space Symposium. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.564.3644.
- U.S. Department of Health and Human Services (HHS). (2016). STAR METRICS. https://www.starmetrics.nih.gov/. Retrieved November 11, 2017.
- Wikipedia. (2018). History of smart antennas. In Wikipedia. https://en.wikipedia.org/wiki/History_of_smart_antennas. Retrieved August 18, 2018.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.