Creating user stereotypes for persona development from qualitative data through semi-automatic subspace clustering

Personas are models of users that incorporate motivations, wishes, and objectives. These models are employed in user-centred design to help design better user experiences and have recently been employed in adaptive systems to help tailor the personalized user experience. Designing with personas involves producing descriptions of fictitious users, which are often based on data from real users. Most data-driven persona development performed today is based on qualitative data from a limited set of interviewees, which is transformed into personas using labour-intensive manual techniques. In this study, we propose a method that models user stereotypes to automate part of the persona creation process and that addresses the drawbacks of existing semi-automated methods for persona development. The description of the method is accompanied by an empirical comparison with a manual technique and a semi-automated alternative (multiple correspondence analysis). The results of the comparison show that manual techniques differ between human persona designers, leading to different results. The proposed algorithm provided similar results, depending on its parameter input, but was more rigorous and finds optimal clusters while lowering the labour associated with finding the clusters in the dataset. The output of the method also represents the largest variances in the dataset identified by the multiple correspondence analysis.
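To make the general idea of subspace clustering on coded qualitative data concrete, the following is a minimal sketch and not the algorithm proposed in the article. It assumes each interviewee has been reduced to one categorical code per theme, and it exhaustively searches attribute subsets for groups of interviewees who share the same codes on that subset. The function name `subspace_clusters`, its parameters, and the sample data are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def subspace_clusters(rows, min_members=2, min_dims=2):
    """Find groups of rows that agree on every attribute of some attribute subset.

    Illustrative brute force only (assumed, not the article's method): the number
    of subsets grows exponentially, so this suits only a handful of themes.
    """
    n_attrs = len(rows[0])
    found = []
    for k in range(min_dims, n_attrs + 1):
        for dims in combinations(range(n_attrs), k):
            # Bucket interviewees by their codes on the selected themes.
            groups = defaultdict(list)
            for idx, row in enumerate(rows):
                groups[tuple(row[d] for d in dims)].append(idx)
            for values, members in groups.items():
                if len(members) >= min_members:
                    found.append((dims, values, members))
    return found


# Hypothetical coded interview data: one row per interviewee, one column per theme.
interviews = [
    ("cooks_daily", "eats_alone", "low_appetite"),
    ("cooks_daily", "eats_alone", "good_appetite"),
    ("meal_service", "eats_with_family", "good_appetite"),
    ("cooks_daily", "eats_alone", "low_appetite"),
]

clusters = subspace_clusters(interviews)
for dims, values, members in clusters:
    print(dims, values, members)
```

Each reported tuple names the themes that define the subspace, the shared codes, and the interviewees in the group. A group such as interviewees 0, 1 and 3 agreeing on the first two themes is the kind of candidate stereotype a designer could then inspect by hand.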




This study was part of the ELDORADO project “Preventing malnourishment and promoting well-being among the older adults at home through personalized cost-effective food and meal supply” supported by a grant (4105-00009B) from the Innovation Fund Denmark.

Author information

Correspondence to Dannie Korsgaard.

Cite this article

Korsgaard, D., Bjørner, T., Sørensen, P.K. et al. Creating user stereotypes for persona development from qualitative data through semi-automatic subspace clustering. User Model User-Adap Inter (2020) doi:10.1007/s11257-019-09252-5


Keywords

  • Ethnography
  • Persona
  • Mixed method
  • Subspace clustering
  • Older adults