Empirical Software Engineering, Volume 24, Issue 2, pp 537–561

Successes, challenges, and rethinking – an industrial investigation on crowdsourced mobile application testing

  • Ruizhi Gao
  • Yabin Wang
  • Yang Feng
  • Zhenyu Chen
  • W. Eric Wong
Experience Report


Crowdsourcing, a compound contraction of crowd and outsourcing, is a new paradigm for utilizing the power of crowds of people to facilitate large-scale tasks that are costly or time-consuming with traditional methods. This paradigm offers mobile application companies the possibility of outsourcing their testing activities to crowdsourced testers (crowdtesters) who have various testing facilities and environments, as well as different levels of skill and expertise. With this so-called Crowdsourced Mobile Application Testing (CMAT), some of the well-recognized issues in testing mobile applications, such as the multitude of mobile devices, fragmentation of device models, variety of OS versions, and diversity of testing scenarios, could be mitigated. However, how effective is CMAT in practice? What challenges and issues arise when applying CMAT? How can these issues and challenges be overcome and CMAT be improved? Although CMAT has attracted attention from both academia and industry, these questions have not been addressed or researched in depth on the basis of a large-scale, real-life industrial study. Since June 2015, we have worked with Mooctest, Inc., a CMAT intermediary, on testing five real-life Android applications using their CMAT platform, Kikbug. Throughout the process, we collected 1013 bug reports from 258 crowdtesters and found 247 bugs in total. This paper presents our industrial study in detail and analyzes the successes and challenges of applying CMAT.


  • Crowdsourcing
  • Crowdsourced mobile application testing
  • Android applications



This work was supported in part by the National Natural Science Foundation of China (Grant No. 61690201).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computer Science, University of Texas at Dallas, Richardson, USA
  2. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
  3. Department of Informatics, University of California, Irvine, USA
