We employ data envelopment analysis on a series of experiments performed at Fermilab, one of the major high-energy physics laboratories in the world, in order to test their efficiency (as measured by publication and citation rates) with respect to variations in team size, number of teams per experiment, and completion time. We present and analyze the results, focusing in particular on the inherent connections between quantitative team composition and diversity, and discuss them in relation to other factors contributing to scientific production in a wider sense. Our results concur with the results of other studies across the sciences showing that smaller research teams are more productive, and with the conjecture of a curvilinear dependence between team size and efficiency.
Results of this study show that biology labs should ideally have between ten and fifteen members.
Moreover, a data-driven analysis allows both general and particular conclusions to be extracted, whereas, as we will see in more detail shortly, this is not the case with data-independent, i.e., hypothesis-driven analyses.
Kitcher (1990) writes: “I claim, simply, that we sometimes want to maintain cognitive diversity even in instances where it would be reasonable for all to agree that one of two theories was inferior to its rival, and we may be grateful to the stubborn minority who continue to advocate problematic ideas.” Both Kitcher and Strevens use examples from the history of science to underline the positive epistemic effect of diversity in the methods used for tackling scientific problems. They have investigated the best way of assigning credit within the scientific community in order to achieve an optimal division of labor.
They point to shortcomings in Weisberg and Muldoon’s approach in order to undermine their attempts to provide epistemic reasons for a division of cognitive labor. Using an epistemic landscape model, they show that in some cases homogeneous populations may be more successful than heterogeneous ones. Furthermore, they raise a more general objection to Weisberg and Muldoon’s particular model: the need to base simulations on assumptions about the specific nature of the epistemic landscape is problematic, because specific features of the epistemic landscape may be beyond our knowledge. They did not argue that there is no epistemic reason for cognitive diversity, but rather claimed that Weisberg and Muldoon did not succeed in establishing one.
He investigates the following issue: imagine that there are two competing methods, M1 and M2, for tackling a scientific problem. It may happen that initial experiments favor M1 even though M2 is the correct method. In this situation a consensus can be reached too fast: scientists who believe that M2 is the correct method may abandon further research once they become aware of their colleagues’ experimental results, and thus the wrong methodology may become consensual. Zollman uses computer simulation to explore whether there is a correlation between the structure of a communication network and convergence towards the right hypothesis. The structure of a communication network represents the connections between agents, that is, which agents share information with one another.
In other words, the nodes of a graph representing a scientific network can represent individuals, teams, or laboratories.
Furthermore, one can use this information as a predictor of project outcomes, with the help of machine-learning algorithms (e.g., neural networks, support vector machines, and logistic regression).
Each linear programming problem, the so-called primal problem, can be converted into a dual problem that provides an upper bound on the optimal value of the primal problem.
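In standard form, the relationship can be sketched as follows (a textbook formulation with generic data A, b, c, not anything specific to our dataset):

```latex
% Primal (maximization) LP and its dual (minimization), generic data A, b, c.
\begin{align*}
\text{Primal:}\quad & \max_{x} \; c^{\top}x
  && \text{s.t. } Ax \le b,\; x \ge 0,\\
\text{Dual:}\quad & \min_{y} \; b^{\top}y
  && \text{s.t. } A^{\top}y \ge c,\; y \ge 0.
\end{align*}
```

Weak duality gives $c^{\top}x \le b^{\top}y$ for any feasible pair $(x, y)$, so every dual-feasible $y$ yields an upper bound on the primal optimum; by strong duality the two optimal values coincide whenever either problem has a finite optimum. In DEA, the dual (envelopment) form of the efficiency LP is the one whose solution identifies each unit’s reference set of efficient peers.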
Our choice of data was based on historical records, the HEP database we discuss in the following subsection, and interviews with the proposers of experiments in Fermilab, a former head of its Research and Development Department (during the period when some of the experiments we analyzed were performed), as well as several physicists currently and formerly involved in various experiments performed at Fermilab and CERN.
It would be interesting to include the costs as well as more fine-grained distribution of labor within a project, such as the number of researchers within each team. However, these data were not publicly available.
See Hoddeson et al. (2008, esp. section 7) on the constitution and labor distribution among the teams in Fermilab.
This is why researchers who occasionally replaced others, or who were added to the project at later stages, are indicated on a secondary list of collaborators in Fermilab’s archives, not in the proposal section of the archive. This secondary list also contains the names of researchers who worked on, and published from, the results after the experiment was finalized, but who were not necessarily directly involved in performing it. Collaborations of this sort are undertaken for the purpose of producing papers based on the results of the experiment and do not necessarily overlap with the collaboration that designs and performs the experiment, which is what interests us here. We discuss the relationship between the primary collaborations and these wider collaborations in the “Analysis” section.
The categories are a property of the database we used. Please see the next section for an explanation of this particular scaling.
The fact that an experiment is associated with a number of prominent papers does not necessarily indicate that it was a breakthrough event; it could instead be part of a larger research trend. Given the citation pattern, however, it was certainly successful in addressing a physical phenomenon within a prominent trend, one that continued to be prominent after the experiment was performed. In other words, a high citation record indicates, minimally, that the goal and the performance of the experiment successfully became part of the trend.
Some experiments are related closely in terms of their content, but they are not strings of experiments where essentially the same team applies for the next phase of the same experiment.
The value of the data from Fermilab is undeniable since, as we have pointed out, it is one of the biggest and most successful physics laboratories in the world. In more practical terms, since the number of individual experiments and laboratories in high-energy physics is very small relative to, e.g., biology, a laboratory such as Fermilab was one of the natural options, both because of the number of experiments performed within it over the years and because of the high quality of the records of experiments in its electronic archives.
Here is an example record for a Fermilab experiment (E-104): http://inspirehep.net/record/1110215. If you click on the "HEP articles associated with FNAL-E-0104" link, it will take you to the list of all the papers associated with the experiment, which should give you an overview of the results of the experiment. There are usually links to the full text of the papers, and the earliest item in that list is usually the experiment proposal.
Invenio is an open-source software package that provides a framework and tools for building an autonomous digital library server. For more details see: http://invenio-software.org/.
For the full list of INSPIRE content sources see: https://inspirehep.net/info/general/content-sources.
For the description of citation metrics available in Invenio see: http://inspirehep.net/help/citation-metrics#citesummary.
As an example of the citation summary of an experiment see: http://inspirehep.net/search?ln=en&ln=en&p=693__e%3AGNO&of=hcs.
Some of the available data concerning individual team members are either incomplete or unsuitable for use in DEA in the way that we utilized the data on the factors we tested. Data on the financial resources of the projects would also have been beneficial, but they are not available for the individual experiments we included in our analysis.
The full list of considered Fermilab projects with their details can be found at the following address: http://ccd.fnal.gov/techpubs/fermilab-reports-proposal.html.
Sometimes an experiment falls into a grey area in terms of its goals, combining, e.g., the introduction of new techniques with their wide application across the physical sciences; this is typically not the case, however. Also, an experiment can sometimes aim at a particular goal, e.g. the testing of a new technique, but result in a major discovery of a new particle (e.g. the discovery of the J/ψ particle). This happens very rarely, however, as most experiments give results within their set goals.
As we pointed out, the indicated team members worked alongside support staff and technicians, so the actual team size was larger. But again, this does not affect our analysis, since we only need data on the differences between the primary members of teams.
Changing the relative weight of the publications (i.e. the weight of a large number of less significant publications compared to the weight of a smaller number of more significant publications) would not change this significantly.
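To illustrate the point, here is a minimal sketch (with invented numbers rather than our Fermilab data) of how ratio-based efficiency scores respond when the weight on more significant publications is increased; a full DEA model would instead let each unit choose its own most favourable weights via a linear program:

```python
# Hypothetical illustration (not the actual Fermilab data): efficiency of
# decision-making units (experiments) with one input (team size) and two
# outputs (major and minor publications), under a fixed output weighting.

def ratio_scores(units, w_major, w_minor):
    """Weighted-output/input ratios, normalised so the best unit scores 1.0."""
    raw = [(w_major * major + w_minor * minor) / size
           for size, major, minor in units]
    best = max(raw)
    return [r / best for r in raw]

# (team size, major publications, minor publications) -- made-up numbers
units = [(10, 4, 12), (25, 6, 30), (40, 7, 20)]

# Increase the weight on major publications and watch the scores.
for w_major, w_minor in [(1.0, 1.0), (3.0, 1.0), (5.0, 1.0)]:
    scores = ratio_scores(units, w_major, w_minor)
    print([round(s, 2) for s in scores])
```

In this toy example the smallest team remains the most efficient unit and the ranking is unchanged under every weighting, which is the sense in which reweighting publications does not significantly alter the results.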
A future study could also look at the size of each sub-team of the master team in the experiment, along with the overall number of researchers, and thus determine the optimal size of each team within the experiment.
One could, generally speaking, focus on the performance of individual academics instead of teams (see e.g. Horta and Lacy 2011).
This is in fact the main reason why Fermilab management, as well as management in other HEP laboratories, insist that very specialized work done by graduate students in the lab can be turned into doctoral dissertations at their home universities (Hoddeson et al. 2008).
We considered as senior researchers those scientists who had obtained their PhD degrees by the time the project was proposed, as well as researchers for whom we could not find the year in which they received their PhD degree but who had been academically active for at least 12 years before the project was proposed and had at least 24 publications in the high-energy physics database <http://inspirehep.net/?ln=en> before the project in question was proposed. Publications in the High-Energy Physics database include articles, books, conference proceedings, and project proposals. Specifically, for 75% of senior researchers we established that they had a PhD when the project was proposed, while the seniority of the remaining 25% was determined indirectly. This choice was made due to data limitations and the fact that, at the time the experiments were conducted, degrees other than the PhD, or related research statuses, were awarded in some countries.
As junior researchers we characterised those for whom we conclusively established that they had neither a PhD degree at the time the project was proposed nor a publication record prior to the experiment.
Decision trees are tree-structured models that map observations to predicted outcomes; in decision analysis they are used to identify the best strategy for reaching a fixed goal.
Logistic regression is a statistical model used for predicting the probability of a binary outcome from one or more predictor variables.
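For concreteness, a minimal sketch of such a model, fitted by batch gradient descent; the data below are invented for illustration and are not drawn from the Fermilab records:

```python
import math

# Illustrative logistic regression: estimate the probability of a binary
# outcome -- here, whether a project is labelled "highly cited" (1) or
# not (0) -- from a single predictor, team size. Data are synthetic.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit(xs, ys, lr=0.01, steps=20000):
    """Fit intercept b0 and slope b1 by batch gradient descent on log-loss."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = sum(sigmoid(b0 + b1 * x) - y for x, y in zip(xs, ys)) / n
        g1 = sum((sigmoid(b0 + b1 * x) - y) * x for x, y in zip(xs, ys)) / n
        b0 -= lr * g0
        b1 -= lr * g1
    return b0, b1

# Synthetic data; team size is rescaled (tens of members) for stable training.
team_size  = [0.5, 0.8, 1.0, 2.0, 3.0, 4.0]   # i.e. 5 to 40 members
high_cited = [1, 1, 1, 0, 0, 0]

b0, b1 = fit(team_size, high_cited)
p_small = sigmoid(b0 + b1 * 0.7)   # predicted probability, 7-member team
p_large = sigmoid(b0 + b1 * 3.5)   # predicted probability, 35-member team
```

Since the invented labels favour small teams, the fitted slope is negative and the model assigns a higher probability of high citation to the 7-member team than to the 35-member one; a real application would of course use the actual project outcomes and more predictors.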
Abbasi, A., Hossain, L., Uddin, S., & Rasmussen, K. J. (2011). Evolutionary dynamics of scientific collaboration networks: Multi-levels and cross-time analysis. Scientometrics, 89(2), 687–710.
Agasisti, T., & Johnes, G. (2015). Efficiency, costs, rankings and heterogeneity: The case of US higher education. Studies in Higher Education, 40(1), 60–82.
Agrell, A., & Gustafson, R. (1996). Innovation and creativity in work groups. In M. A. West (Ed.), Handbook of work group psychology (pp. 317–344). Chichester: Wiley.
Alexander, J. M., Himmelreich, J., & Thompson, C. (2015). Epistemic landscapes, optimal search, and the division of cognitive labor. Philosophy of Science, 82(3), 424–453.
Andrews, F. M. (Ed.). (1979). Scientific productivity: The effectiveness of research groups in six countries. Cambridge: Cambridge University Press.
Bantel, K. A., & Jackson, S. E. (1989). Top management and innovations in banking: Does the demography of the top team make a difference? Strategic Management Journal, 10, 107–124.
Ben-Gal, I. (2005). Outlier detection. In O. Maimon & L. Rockach (Eds.), Data mining and knowledge discovery handbook: A complete guide for practitioners and researchers (pp. 131–146). Kluwer Academic Publishers/Springer.
Boisot, M. (2011). Collisions and collaboration: The organization of learning in the ATLAS experiment at the LHC. Oxford: Oxford University Press.
Bonaccorsi, A., & Daraio, C. (2005). Exploring size and agglomeration effects on public research productivity. Scientometrics, 63(1), 87–120.
Brinkman, P. T., & Leslie, L. L. (1986). Economies of scale in higher education: Sixty years of research. Review of Higher Education, 10(1), 1–28.
Campion, M. A., Medsker, G. J., & Higgs, A. C. (1993). Relations between work group characteristics and effectiveness: Implications for designing effective work groups. Personnel psychology, 46(4), 823–847.
Carayol, N., & Matt, M. (2004). Does research organization influence academic production? Laboratory level evidence from a large European university. Research Policy, 33(8), 1081–1102.
Carayol, N., & Matt, M. (2006). Individual and collective determinants of academic scientists’ productivity. Information Economics and Policy, 18(1), 55–72.
Carillo, M. R., Papagni, E., & Sapio, A. (2013). Do collaborations enhance the high-quality output of scientific institutions? Evidence from the Italian Research Assessment Exercise. The Journal of Socio-Economics, 47, 25–36.
Charnes, A., Cooper, W. W., & Rhodes, E. (1978). Measuring the efficiency of decision-making units. European Journal of Operational Research, 2(6), 429–444.
Cook, I., Grange, S., & Eyre-Walker, A. (2015). Research groups: How big should they be? PeerJ, 3, e989. doi:10.7717/peerj.989.
Cooper, W. W., Seiford, L. M., & Zhu, J. (2011). Handbook on data envelopment analysis (Vol. 164). New York: Springer.
Cvetkoska, V. (2011). Data envelopment analysis approach and its application in information and communication technologies. In M. Salampasis & A. Matopoulos (Eds.), Proceedings of the international conference on information and communication technologies for sustainable agri-production and environment (HAICTA 2011), Skiathos, pp. 421–430.
Dokas, I., Giokas, D., & Tsamis, A. (2014). Liquidity efficiency in the Greek listed firms: A financial ratio based on data envelopment analysis.
DEA window analysis approach for measuring the efficiency of Serbian banks based on panel data. Management (1820–0222), (65).
Emrouznejad, A., Banker, R., Lopes, A. L. M., & de Almeida, M. R. (2014). Data envelopment analysis in the public sector. Socio-Economic Planning Sciences, 48(1), 2–3.
Fowler, J. H., & Aksnes, D. W. (2007). Does self-citation pay? Scientometrics, 72(3), 427–437.
Franklin, A. (1990). Experiment, right or wrong. Cambridge: Cambridge University Press.
Galison, P., & Hevly, B. W. (1992). Big science: The growth of large-scale research. Stanford: Stanford University Press.
Garfield, E., Sher, I. H., & Torpie, R. J. (1964). The use of citation data in writing the history of science. Philadelphia: The Institute for Scientific Information.
Greenberg, D. S. (1999). The politics of pure science. Chicago: University of Chicago Press.
Hackman, J. R., & Vidmar, N. (1970). Effects of size and task type on group performance and member reactions. Sociometry, 33, 37–54.
He, F., Xu, X., Chen, R., & Zhang, N. (2015). Sensitivity and stability analysis in DEA with bounded uncertainty. Optimization Letters, 10(4), 1–16.
Heilbron, J. L., & Seidel, R. W. (1989). Lawrence and his laboratory: A history of the Lawrence Berkeley laboratory (Vol. 1). Berkeley: University of California Press.
Heinze, T., Shapira, P., Rogers, J. D., & Senker, J. M. (2009). Organizational and institutional influences on creativity in scientific research. Research Policy, 38(4), 610–623.
Herman, A., Krige, J., Mersits, U., & Pestre, D. (1987). History of CERN, vol. 1. Launching the European Organization for Nuclear Research. Amsterdam/New York: North-Holland Physics Pub.
Hoddeson, L. (1997). The rise of the standard model: A history of particle physics from 1964 to 1979. Cambridge: Cambridge University Press.
Hoddeson, L., Kolb, A. W., & Westfall, C. (2008). Fermilab: Physics, the frontier, and megascience. Chicago: University of Chicago Press.
Horta, H., & Lacy, T. A. (2011). How does size matter for science? Exploring the effects of research unit size on academics’ scientific productivity and information exchange behaviors. Science and Public Policy, 38(6), 449–460.
Jackson, S. E. (1996). The consequences of diversity in multidisciplinary work teams. In M. A. West (Ed.), Handbook of work group psychology (pp. 53–76). Chichester: Wiley.
Katz, R. (1982). The effects of group longevity on project communication and performance. Administrative Science Quarterly, 27, 81–104.
Kimberly, J. R. (1981). Managerial innovation. In P. C. Nystrom & W. H. Starbuck (Eds.), Handbook of organizational design: Adapting organizations to their environments (pp. 4–104). Oxford: Oxford University Press.
Kitcher, P. (1990). The division of cognitive labor. Journal of Philosophy, 87(1), 5–22.
Kitcher, P. (1993). The advancement of science. New York: Oxford University Press.
Koetter, M., & Meesters, A. (2013). Effects of specification choices on efficiency in DEA and SFA. In Efficiency and productivity growth: Modelling in the financial services industry, pp. 215–236.
Kozlowski, S. W. J., & Bell, B. S. (2003). Work groups and teams in organizations. In W. C. Borman, D. R. Ilgen, & R. J. Klimoski (Eds.), Handbook of psychology: Industrial and organizational psychology (Vol. 12, pp. 333–375). Hoboken, NJ: Wiley.
Kozlowski, S. W. J., & Hulls, B. M. (1986). Joint moderation of the relation between task complexity and job performance for engineers. Journal of Applied Psychology, 71, 196–202.
Kragh, H. (2002). Quantum generations: A history of physics in the twentieth century. Princeton: Princeton University Press.
Krige, J. (1993). Some socio-historical aspects of multinational collaborations in high-energy physics at CERN between 1975 and 1985. In Denationalizing science (pp. 233–262). Springer, Netherlands.
MacRoberts, M. H., & MacRoberts, B. R. (1989). Problems of citation analysis: A critical review. Journal of the American Society for Information Science, 40, 342–349.
Martin, B. R., & Irvine, J. (1984). CERN: Past performance and future prospects: I. CERN’s position in world high-energy physics. Research Policy, 13(4), 183–210.
Martin, B. R., & Irvine, J. (1985). Basic research in the East and West: A comparison of the scientific performance of high-energy physics accelerators. Social Studies of Science, 15(2), 293–341.
Martz, W. B., Vogel, D. R., & Nunamaker, J. F. (1992). Electronic meeting systems: Results from the field. Decision Support Systems, 8(2), 141–158.
Milojević, S. (2014). Principles of scientific research team formation and evolution. Proceedings of the National Academy of Sciences, 111(11), 3984–3989.
Nieva, V. F., Fleishman, E. A., & Reick, A. (1985). Team dimensions: Their identity, their measurement, and their relationships (Research Note 85–12). Washington, DC: U. S. Army, Research Institute for the Behavioral and Social Sciences.
Olsen, D., & Simmons, A. (1996). The research versus teaching debate: Untangling the relationships. New Directions for Institutional Research, 1996(90), 31–39.
Olson, B. J., Parayitam, S., & Bao, Y. (2007). Strategic decision making: The effects of cognitive diversity, conflict, and trust on decision outcomes. Journal of Management, 33(2), 196–222.
Page, S. E. (2007). Making the difference: Applying a logic of diversity. The Academy of Management Perspectives, 21(4), 6–20.
Page, S. E. (2011). Diversity and Complexity. Princeton: Princeton University Press.
Perovic, S. (2011). Missing experimental challenges to the Standard Model of particle physics. Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics, 42(1), 32–42.
Poulton, B. C. (1995). Effective multidisciplinary teamwork in primary health care. Unpublished doctoral thesis, Institute of Work Psychology, University of Sheffield, Sheffield, England.
Qurashi, M. (1991). Publication-rate and size of two prolific research groups in departments of inorganic chemistry at Dacca University (1944–1965) and Zoology at Karachi University (1966–84). Scientometrics, 20(1), 79–92.
Scharf, A. (1989). How to change seven rowdy people. Industrial Management, 31, 20–22.
Strevens, M. (2003). The role of the priority rule in science. Journal of Philosophy, 100(2), 55–79.
Torrisi, B. (2014). A multidimensional approach to academic productivity. Scientometrics, 99(3), 755–783.
Valentin, F., Norm, M. T., & Alkaersig, L. (2016). Orientations and outcome of interdisciplinary research: The case of research behavior in translational medical science. Scientometrics, 106(1), 67–90.
van der Wal, R., Fischer, A., Marquiss, M., Redpath, S., & Wanless, S. (2009). Is bigger necessarily better for environmental research? Scientometrics, 78(2), 317–322.
Von Tunzelmann, N., Ranga, M., Martin, B., & Geuna, A. (2003). The effects of size on research performance: A SPRU review. Brighton: SPRU.
Wang, J., Thijs, B., & Glanzel, W. (2015). Interdisciplinarity and impact: Distinct effects of variety, balance, and disparity. PLoS One, 10(5), e0127298.
Weinberg, S. (2012). The crisis of big science. The New York Review of Books, 59(8). www.nybooks.com/articles/2012/05/10/crisis-big-science/. Accessed 10 May.
Weisberg, M., & Muldoon, R. (2009). Epistemic landscapes and the division of cognitive labor. Philosophy of Science, 76(2), 225–252.
West, M., & Anderson, N. (1996). Innovation in top management teams. Journal of Applied Psychology, 81(6), 680–693.
Westfall, C. (1997). Science policy and the social structure of big laboratories, 1964–1979. In Hoddeson 1997.
Wu, C. F. J. (1986). Jackknife, bootstrap and other resampling methods in regression analysis. The Annals of Statistics, 14, 1261–1295.
Zollman, K. J. (2007). The communication structure of epistemic communities. Philosophy of Science, 74(5), 574–587.
Zollman, K. J. (2010). The epistemic benefit of transient diversity. Erkenntnis, 72(1), 17–35.
This work was supported in part by the Project “Dynamic Systems in Nature and Society: philosophical and empirical aspects” (#179041) financed by the Ministry of Education, Science, and Technological Development of Serbia. The work of the third author was supported by the FWF project W1255-N23. We would like to thank the Fermilab History & Archives Project, Fermilab’s Information Resources Department and the Fermilab Program Planning Office for providing us with the necessary data and explanations about the INSPIRE-HEP website. In particular we would like to thank Heath O’Connell, Adrienne W. Kolb, Valerie Higgins and Roy Rubinstein for their assistance. We would also like to thank Lillian Hoddeson for putting us in contact with Fermilab staff, and Milan Ćirković for his important initial suggestions. Finally, we thank the two anonymous reviewers for their outstanding effort.
Perović, S., Radovanović, S., Sikimić, V. et al. Optimal research team composition: data envelopment analysis of Fermilab experiments. Scientometrics 108, 83–111 (2016). https://doi.org/10.1007/s11192-016-1947-9
- Social epistemology of science
- Team size
- Team diversity
- Data envelopment analysis
- High energy physics