
1 Introduction

Kuhn [8] defines a scientific community as a group of people who study a specific subject and share a common paradigm, that is, a set of methods, values, language, principles and concepts. This shared viewpoint allows the members of the community to pose problems and to accept the respective solutions. Scientific literature plays a twofold role: it provides novices with the stabilized facts widely accepted by the community, and it communicates the newer theories under development by the experts [3]. The current paradigm is formalized, disseminated and evolved through the literature.

As part of his work on Social Studies of Science and Technology, Latour [9] proposes, as a methodological principle, that the study of science should not be restricted to the analysis of its final products; instead, it should start from the discussions that lead to the inception and refinement of such outcomes. By observing science while it is being produced, it is possible to witness the involved entities acting upon one another, which makes it possible to understand how the resultant of the combined interests and possibilities shapes the path towards scientific discoveries.

The International Conference on Informatics and Semiotics in Organizations (ICISO) is the most traditional event that periodically brings together the community devoted to research on Organizational Semiotics; it reached its 17th edition in 2016. It is also part of a series of international conferences aiming to study the role and effects of information and communication in organizational contexts.

In this work, our goal is to provide an understanding of the current structure of the Organizational Semiotics community as expressed through the papers published in the last two editions of ICISO. We draw upon methods of Social Studies of Science and Technology to grasp the broader social context in which the more technical and methodical work of researchers is immersed. We expect to contribute to the community's self-knowledge about how it is arranged, in a way that is alternative and complementary to the commonly available scientometrics.

This paper is organized as follows: Sect. 2 presents the theoretical framework chosen to support the scrutiny of the structure of associations that compose ICISO. Sect. 3 details the data sources and the methodological steps applied, and Sect. 4 presents and analyzes the results of that procedure. In Sect. 5 we discuss the findings, and we conclude in Sect. 6.

2 Theoretical Background

The study of science-making from the viewpoint of sociology is the root of Actor-Network Theory (ANT), a theoretical and methodological framework that understands social phenomena as the result of entities, or actors, capable of influencing the behavior of others; these actors are tied together by relationships of mutual benefit, constituting a network. Moreover, ANT claims that the role of actor in social phenomena is not restricted to humans, but can also be played by non-humans.

One of the non-human participants in the collaborative construction of scientific facts is the scientific literature. Using another piece of literature in a paper may help build the adopted theoretical background, show related results for the sake of positive or negative comparison, manifest affiliation to a certain group or trend of opinion, or express the acceptance of a previous result by taking it for granted and using it for further research. As a claim made by one publication is accepted by others in the form of citations, it progressively gains the status of scientific fact.

Only when a fact becomes tacit knowledge is the citation no longer required. To give an example, Latour [9] asks “Who refers to Lavoisier’s paper when writing the formula H2O for water?” The stabilized fact of water being composed of hydrogen and oxygen becomes incorporated into tacit knowledge, with no mark of having been produced by anyone [11].

When a scientific publication P refers to another publication R, the authors of P benefit from the previous results and statements provided by R, since much of the content of R might be, at least partially, accepted by the target audience. On the other hand, the more R is cited, the more its authors benefit from the acknowledgement of the relevance of their work and from the spreading of R’s content among P’s readers. This relationship, shown in Fig. 1, provides mutual benefit for the human actors and is mediated by the non-human entities P and R. Even if the authors of P and R do not know each other, they become associated.

Fig. 1. Relationship of mutual benefit established between authors of a paper P referencing another publication R. Solid arrows denote reference, dashed ones the provided benefit

By tracing and analyzing these networks of relationships, one can understand how a scientific community is arranged, what the sources of its adopted paradigm are, how convergent its interests are, and whether there is any pattern over time. Unlike other methods of bibliometric analysis, such as co-authorship [6] or co-citation [15] graphs, which focus only on the persons or only on the publications, the ANT approach does not privilege participants of any specific nature. It allows highlighting the main scientists and the important bibliography contributing to the subject of a conference, as well as the whole structure that emerges from their relationships.
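As an illustration of the kind of structure being traced, the following minimal sketch builds an undirected actor-network in which authors, published papers and references are all vertices. The record layout (`title`, `authors`, `references`), the node tuples and the function name are hypothetical and are not taken from the tool used in this work.

```python
from collections import defaultdict


def build_actor_network(papers):
    """Return an undirected adjacency map linking authors, papers and references.

    Each node is a (kind, label) tuple; this layout is an assumption of the sketch.
    """
    graph = defaultdict(set)

    def link(a, b):
        # Associations are mutual, so every edge is stored in both directions.
        graph[a].add(b)
        graph[b].add(a)

    for paper in papers:
        paper_node = ("paper", paper["title"])
        for name in paper["authors"]:
            link(("author", name), paper_node)
        for ref in paper["references"]:
            ref_node = ("reference", ref["title"])
            link(paper_node, ref_node)
            for name in ref["authors"]:
                link(("author", name), ref_node)
    return dict(graph)


# Toy data mirroring Fig. 1: paper P cites reference R.
papers = [
    {
        "title": "P",
        "authors": ["A. Author"],
        "references": [{"title": "R", "authors": ["B. Writer"]}],
    },
]
network = build_actor_network(papers)
```

In this toy network the authors of P and R are never linked directly; they become associated only through the non-human vertices P and R.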

3 Procedures of Data Gathering and Summarization

Considering the editions of the ICISO conference as samples of the Organizational Semiotics community, we chose to apply ANT’s point of view to its bibliographic data freely available on the internet. Springer provides the proceedings of the 2014 and 2015 editions of ICISO, along with the list of references for each published paper. These data were fed into a software tool originally intended to support systematic bibliographic reviews [13], which proved suited to our purpose: it manages the publication list and the relations to references and authors, and it generates the visualizations.

Not all references could be used. For instance, web pages are not considered references since their authors (the human actors) are not defined. Some papers have duplicate references, which were counted only once. Some similar author names were disambiguated using an automatic algorithm [10]. Whenever available, the DOI was used to retrieve additional information about the references and to resolve duplications due to typos or divergences in credited titles.

Once the automatic procedure was finished, we carried out a manual inspection of the results and performed minor corrections by hand when necessary. These included splitting nodes that represented distinct persons with the same name abbreviation, and merging references with misspelled titles, using additional data sources and search engines when needed. Not all nodes required manual inspection: possible splitting was checked only for nodes representing authors with degree >1 (177 of 3599, or about 5%); similarly, candidates for merging were screened using a Levenshtein distance <3 prior to the manual procedures. The final dataset is summarized in Table 1.
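The screening of merge candidates can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's actual implementation: the function names are hypothetical, and the case-insensitive comparison is an assumption that the text does not confirm. Only the threshold (distance below 3) comes from the procedure described above.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def merge_candidates(titles, max_distance=2):
    """List pairs of titles whose distance is below 3, for later manual review."""
    # Assumption of this sketch: titles are compared case-insensitively.
    normalized = [title.casefold().strip() for title in titles]
    pairs = []
    for i in range(len(titles)):
        for j in range(i + 1, len(titles)):
            if levenshtein(normalized[i], normalized[j]) <= max_distance:
                pairs.append((titles[i], titles[j]))
    return pairs


titles = [
    "Review of web service discovery technology",
    "Review of Web service discovery tecnology",  # illustrative misspelling
    "The FRISCO Report",
]
candidates = merge_candidates(titles)
```

The pairs returned are only candidates; as in the procedure above, the decision to merge remains a manual one.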

Table 1. Summary of data retrieved for each conference edition

The visualization of the graph of bibliographic references is based on the proposal of Prado and Baranauskas [12]. They applied the betweenness centrality measure [2] to calculate the size of the vertices, representing the importance that each vertex has in keeping the network together. They also used a spring forces layout algorithm [5] to obtain an automatic arrangement of vertices, which they argue is well aligned with ANT’s view of a social group as a balance between the forces generated by authors through associations, bringing closer those they are interested in while keeping them apart from the others. In comparison to the most commonly used graphs of co-authorship, which contain only human actors, this representation provides a better view of the conference edition as a single social phenomenon, since the number of connected components is significantly smaller.
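To make the measure concrete, the sketch below computes unnormalized betweenness centrality for an unweighted, undirected graph using Brandes' algorithm, a standard method for this measure. It is an illustrative stand-in, not necessarily the algorithm of [2] or the one used by the tool; the adjacency-map input and the vertex labels are assumptions.

```python
from collections import deque


def betweenness_centrality(graph):
    """Unnormalized betweenness for an unweighted, undirected adjacency map."""
    centrality = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search counting shortest paths from the source.
        order = []
        predecessors = {v: [] for v in graph}
        paths = {v: 0 for v in graph}
        paths[source] = 1
        distance = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in distance:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    paths[w] += paths[v]
                    predecessors[w].append(v)
        # Accumulate dependencies from the farthest vertices back to the source.
        dependency = {v: 0.0 for v in graph}
        while order:
            w = order.pop()
            for v in predecessors[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each unordered pair of endpoints was counted twice in an undirected graph.
    return {v: value / 2 for v, value in centrality.items()}


# A chain author A - paper P - reference R - author B, as in Fig. 1.
graph = {
    "author A": {"paper P"},
    "paper P": {"author A", "reference R"},
    "reference R": {"paper P", "author B"},
    "author B": {"reference R"},
}
scores = betweenness_centrality(graph)
```

In this chain the two publications lie on every shortest path between the remaining vertices, so they receive the highest scores, while the two authors at the ends score zero.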

The final representation of the graph is exemplified in Fig. 2: papers published in the conference are represented as blue square vertices and their authors as blue circles, while the references and their authors are red squares and circles, respectively. If an author of a paper has also authored a reference, the first condition prevails. The size of each vertex is proportional to the logarithm of its betweenness centrality. We can use it to identify the relevant actors for each network and to understand the main contributors to the shared theoretical background. When authors and publications are marked as relevant in more than a single conference edition, we can suspect that their importance is not due to the specific topic of that particular edition, but is instead part of the community’s adopted paradigm. Besides the full graphs, smaller versions were obtained by applying an intensive graph pruning algorithm [7]. This succinct preview reveals the main actors that compose the backbone of the graph, before we drill down into the details of the whole network.

Fig. 2. Adopted representation for the actor-network graph (Color figure online)
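The logarithmic sizing of vertices can be sketched as below. The text does not state how vertices with zero betweenness are handled, so this sketch assumes `log(1 + b)` plus a minimum size; the function name and the `base` and `scale` values are hypothetical.

```python
import math


def vertex_size(betweenness, base=4.0, scale=6.0):
    """Map a betweenness score to a drawing size on a logarithmic scale.

    Assumption: log1p is used so that a score of zero yields the minimum size
    instead of an undefined logarithm.
    """
    return base + scale * math.log1p(betweenness)


sizes = {score: vertex_size(score) for score in (0.0, 10.0, 1000.0)}
```

A logarithmic scale keeps a few highly central vertices from dwarfing all the others in the picture.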

Buchdid and Baranauskas [4] analyze conferences using word clouds [14] extracted from the titles of papers published in conference editions. Word clouds provided visual clues to the essence of the studied conferences at a glance. We believe that this high-level representation complements the lower level structural analysis of ANT and both together provide a better understanding of the work of the social group.

To support the study of the particularities of each conference edition, we retrieved their proposed topics of interest. ICISO 2014 called for papers about “Service Science and Knowledge Innovation”, a young discipline that has attracted great attention from academia and industry because of the increasing need for a scientific approach to guide the study of services. ICISO 2015 tackled “Information and Knowledge Management in Complex Systems”, such as large-scale projects, networks of networks, and dynamic and evolving enterprises.

4 Analysis of Results

Using the titles of the papers published in each conference edition, we built the word clouds shown in Fig. 3. The 2014 edition is well aligned with that edition's theme, as the words “service”, “based” and “approach” are the most frequent, followed by “method” and “architecture”. For 2015, however, the topics become more generic: “information” is the most highlighted word, followed by “semiotics”, “systems” and, as in the previous year, “based” and “approach”. This aligns the community's production more closely with its roots in semiotics and information systems.

Fig. 3. Word clouds generated from the titles of papers published in ICISO 2014 and 2015 – above and below, respectively
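The word frequencies behind a word cloud such as those in Fig. 3 can be obtained with a simple count over the titles. The sketch below is an assumption about the general technique, not the procedure of [14]: the stop-word list and the tokenization rule are hypothetical, and the two sample titles are the ones quoted in Sect. 4.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the one used for Fig. 3 is not given in the text.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def title_word_frequencies(titles):
    """Count lower-cased words across titles, skipping stop words."""
    counts = Counter()
    for title in titles:
        for word in re.findall(r"[a-z]+", title.lower()):
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts


titles = [
    "Hierarchical Clustering Based Web Service Discovery",
    "A Semiotic Approach to Investigate Quality Issues of Open Big Data Ecosystems",
]
frequencies = title_word_frequencies(titles)
```

In a word cloud, the font size of each word is then drawn in proportion to its count.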

From the graph data structures, some metrics were obtained, as shown in Table 2. Not all published papers share a path with the others, which creates isolated groups of vertices, that is, distinct connected components [1]. Papers published in 2014 are aggregated in 11 components, while the 2015 edition produced 10. However, there is always a major connected component gluing together most of the vertices. In both years, this component comprises more than half of the whole graph, showing that the convergence of the community persists along the 2 years, despite the different themes and scales of each conference edition.
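The component counts and the share of the major component can be computed with a breadth-first traversal, as in the following minimal sketch. The adjacency map and vertex labels are toy data, not the ICISO graphs, and the function name is hypothetical.

```python
from collections import deque


def connected_components(graph):
    """Return the connected components of an undirected graph, largest first."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in seen:
                    seen.add(w)
                    component.add(w)
                    queue.append(w)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Toy graph with one larger component and one isolated pair of vertices.
graph = {
    "a": {"b"}, "b": {"a", "c"}, "c": {"b"},
    "x": {"y"}, "y": {"x"},
}
components = connected_components(graph)
largest_share = len(components[0]) / len(graph)
```

Here `largest_share` plays the role of the "more than half of the whole graph" observation made for the major component.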

Table 2. General graph measures for each conference edition

Figure 4 shows the simplified versions of the actor-networks, composed only of the main actors and their associations for each conference edition, including published papers, references and authors with higher betweenness. Papers published at each edition and well aligned with its theme are included, e.g., “Hierarchical Clustering Based Web Service Discovery” for 2014 and “A Semiotic Approach to Investigate Quality Issues of Open Big Data Ecosystems” for 2015; however, our analysis will focus on the references they draw upon to make their claims, as the basis of the community.

Fig. 4. Simplified graphs showing main actors for ICISO 2014 and 2015 – above and below, respectively

Considering the complete dataset, given the greater number of published papers in 2014, it is not feasible to print the whole graph in a single picture; therefore, we cropped it to show the central portion of the main connected component, producing Fig. 5. For the 2015 edition, having less than half of the features, it was possible to build Fig. 6 depicting the complete graph. Some of the actors cited in the text are pointed out.

Fig. 5. Detail of the main connected component of ICISO 2014 actor-network. The references and authors with higher betweenness centrality are labeled

Fig. 6. Complete graph of associations for ICISO 2015. Some smaller components were repositioned for better fitting on the page

In both years, the most relevant reference author is “Ronald K. Stamper” and the most relevant reference is “A Framework of information system concepts - The FRISCO Report”. The nodes representing them are well positioned in the two generated layouts, at the center of the main connected component. Both constitute a path between two large blocks of published papers and their related references. Another reference worth noting is “Semiotics in Information Systems Engineering”, which is placed in the middle of dense clusters of papers in the two graphs. This led the analysis to its author, “Kecheng Liu”, who provided other frequently used references and authored some of the published papers in each conference edition.

Regarding well-positioned references and authors that appear in a single conference edition, we can cite “Review of web service discovery technology” by “Liu J. et al.” for 2014, and “Measuring and Comparing Effectiveness of Data Quality Techniques” by “John Mylopoulos et al.”. Their titles suggest that they are aligned with the respective conference themes, which supports the relative relevance calculated for them within the subject of that particular edition.

5 Discussion of the Outcomes

There seems to be a recurring pattern of associations between the published papers and their references across the studied conference editions, as sketched in Fig. 7. Stamper and his publications connect distant clusters of papers, acting as a broader source of concepts. On the other hand, “Semiotics in Information Systems Engineering” appears inside a cluster of papers with several inner connections. This may be extrapolated as the current general pattern that creates a solid structural basis for the O.S. community.

Fig. 7. General pattern found in both conference editions

During the manual inspection of data, some limitations of the tool used were detected. The person “Taylor A”, pointed out as author of two references used in 2015, is in fact the abbreviated name of two distinct researchers: Alex Taylor, author of “On the naturalness of touchless”, and Alva Taylor, author of “Superman or the Fantastic Four?”. A new node was created for each of them, and the original was removed. This kind of intervention did not significantly reshape the graph as a whole, but it created one of the minor connected components. In another situation, the identity of the author “Liu J.” could not be checked: he is credited as “Jiajun Liu” at IEEE for authoring “Determination of Activities of Daily Living of independent living older people using environmentally placed sensors”, but the authorship of “Review of Web service discovery technology” could not be verified. If this node were split, its relevance would be recalculated to a lower value.

For a brief comparison with a purely quantitative bibliometric analysis, we ranked the publications and authors by the number of citations they received among the conference papers. The most cited publications are “Semiotics in Information Systems Engineering” (12 citations) and “Information in Business and Administrative Systems” (4 citations). Compared with ANT’s results, both also have a relatively high betweenness, particularly for the 2015 edition, which makes them indubitably important for the community. However, the second one is cited only by papers of the same two groups of researchers, which makes it more localized within the community. The authors most often cited are Kecheng Liu (37 citations), followed by John Krogstie (23 citations), both reputed scientists in the field of Information Systems. However, the citations for the latter come mostly from a single paper published at ICISO 2015; therefore, his publications cannot be seen as common ground for the community. There is a contrast between the good results obtained by “Ronald K. Stamper” and “The FRISCO report” using ANT and their lower citation count; we interpret this as a sign that the ideas contained therein are becoming tacit knowledge among the members of the O.S. community.

The major weakness of the method used is its dependence on the availability of data, as a relevant subset of the publications’ bibliographical references is required to provide a representative outcome. The unavailability of the full proceedings of a conference edition hinders a longer-term analysis.
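The purely quantitative ranking used for this comparison can be sketched as a plain count of citations per reference and per reference author. The record layout and the sample entries below are hypothetical toy data, not the ICISO dataset; counting a repeated reference once per citing paper follows the rule stated in Sect. 3.

```python
from collections import Counter


def citation_counts(papers):
    """Count how many papers cite each reference and each reference author."""
    by_reference = Counter()
    by_author = Counter()
    for paper in papers:
        seen_titles = set()
        for ref in paper["references"]:
            # A reference repeated within one paper is counted only once.
            if ref["title"] in seen_titles:
                continue
            seen_titles.add(ref["title"])
            by_reference[ref["title"]] += 1
            for name in ref["authors"]:
                by_author[name] += 1
    return by_reference, by_author


# Toy records (hypothetical), including one duplicated reference.
papers = [
    {"title": "P1", "references": [
        {"title": "R1", "authors": ["Author X"]},
        {"title": "R1", "authors": ["Author X"]},
        {"title": "R2", "authors": ["Author X", "Author Y"]},
    ]},
    {"title": "P2", "references": [
        {"title": "R1", "authors": ["Author X"]},
    ]},
]
by_reference, by_author = citation_counts(papers)
```

Unlike betweenness, these counts say nothing about where a reference sits in the network, which is why a reference cited mostly by a single paper or group can rank high here without being common ground.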

6 Conclusion

Scientific communities evolve around a set of concepts and assumptions, drawn from distinct bibliographic references and authors. Some of these become tacit knowledge among the researchers and practitioners, and therefore are not always captured by the most traditional scientometric methods. In this paper, we applied an alternative approach, as proposed by ANT, to analyze the relationships between researchers within a scientific community as mediated by scientific literature. Based on the ICISO 2014 and 2015 proceedings available online, we were able to construct graphs representing such associations, and to apply algorithms to build visualizations that work as an X-ray of the core structures that keep this community standing.

Our analysis highlights the foundational work of Ronald Stamper towards defining a common ground for the Organizational Semiotics researchers through the “FRISCO Report” and other papers. In addition, the book “Semiotics in Information Systems Engineering” appears as a central source of tools and methods enabling many other related scientific projects, while its author Kecheng Liu remains an active member of the community. These results corroborate ANT’s choice of keeping humans and non-humans together and tracing their relationships for the study of scientific communities.