Measuring Progress in Robotics: Benchmarking and the ‘Measure-Target Confusion’

  • Vincent C. Müller
Chapter
Part of the Cognitive Systems Monographs book series (COSMOS, volume 36)

Abstract

While it is often said that, in order to qualify as a true science, robotics should aspire to reproducible and measurable results that allow benchmarking, I argue that a focus on benchmarking will be a hindrance to progress. Several academic disciplines have been led into pursuing only reproducible and measurable ‘scientific’ results; robotics should be careful not to fall into that trap. Results that can be benchmarked must be specific and context-dependent, but robotics targets whole complex systems independently of a specific context, so working towards progress on the technical measure risks missing that target. It would constitute aiming for the measure rather than the target: what I call ‘measure-target confusion’. The role of benchmarking in robotics shows that the more general problem of measuring progress towards more intelligent machines will not be solved by technical benchmarks; we need a balanced approach with technical benchmarks, real-life testing and qualitative judgment.

Notes

Acknowledgements

I am grateful to Fabio Bonsignorio and other members of the GEMSig, esp. Alan Winfield, for sustaining this discussion. Thanks to Barna Ivantovic for comments. I am grateful to Nick Bostrom for conversations about intelligence testing and measurement.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. IDEA Centre, University of Leeds, Leeds, UK
