Towards Collaborative Data Analysis with Diverse Crowds – A Design Science Approach

  • Michael Feldman
  • Cristian Anastasiu
  • Abraham Bernstein
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10844)


Recent years have witnessed a growing shortage of data experts capable of analyzing the omnipresent data and producing meaningful insights. Furthermore, some data scientists report that data preprocessing takes up to 80% of total project time. This paper proposes a method for collaborative data analysis that involves a crowd without data analysis expertise. Orchestrated by an expert, a team of novices conducts the data analysis by iteratively refining results until the analysis is successfully completed. To evaluate the proposed method, we implemented a tool that supports collaborative data analysis for teams with mixed levels of expertise. Our evaluation demonstrates that, with proper guidance, data analysis tasks, especially preprocessing, can be distributed to and successfully accomplished by non-experts. Following the design science approach, iterative development also revealed important features for the collaboration tool, such as support for dynamic development, code deliberation, and a project journal. As such, we pave the way for building tools that can leverage the crowd to address the shortage of data analysts.


Collaborative data analysis · Crowdsourcing · Design science



This work was supported by the Swiss National Science Foundation under contract number 14341.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Michael Feldman (1)
  • Cristian Anastasiu (1)
  • Abraham Bernstein (1)
  1. Department of Informatics, University of Zurich, Zurich, Switzerland
