Data Science for Urban Sustainability: Data Mining and Data-Analytic Thinking in the Next Wave of City Analytics

  • Simon Elias Bibri
Part of The Urban Book Series (UBS)


As a research direction, big data analytics has recently attracted scholars and scientists from diverse disciplines, as well as practitioners from a variety of professional fields, given its prominence in various urban domains, especially urban design and planning, transportation engineering, mobility, energy, public health, and socioeconomic forecasting. There has been much enthusiasm about the possibilities that the data deluge and its new sources create for operating, managing, and planning cities so as to improve their contribution to the goals of sustainable development, as a result of thinking about and understanding sustainability problems in a data-analytic fashion. This data deluge is increasingly enriching and reshaping our experience of how such cities can be advanced. Big data analytics offers many new opportunities for well-informed decision-making and enhanced insights into how quickly and how best to improve urban sustainability. This unprecedented shift has been brought about by data science, an interdisciplinary field involving the scientific systems, processes, and methods used to extract useful knowledge from data in structured or unstructured form. Data mining and knowledge discovery in databases are by far the most widely used processes for extracting useful knowledge from colossal datasets for enhanced decision-making and insights, predominantly in relation to business intelligence.
However, in city-related academic and scientific research, it is argued that “small data” studies (questionnaire surveys, focus groups, case studies, participatory observations, audits, interviews, content analyses, and ethnographies) are associated with high cost, infrequent periodicity, quick obsolescence, reflexivity, incompleteness, and inaccuracy. That is, to provide additional depth and insight into urban phenomena, they capture a relatively limited sample of data that are tightly focused, time and space specific, restricted in scope and scale, and relatively expensive to generate and analyze. Accordingly, much of our knowledge of urban sustainability has been gleaned from studies characterized by data scarcity. The potential of big data lies in transforming the knowledge of smart sustainable cities through a data deluge that seeks to provide a much more sophisticated, wider-scale, finer-grained, and real-time understanding and control of various aspects of urbanity. Therefore, this chapter aims to synthesize, illustrate, and discuss a systematic framework for urban (sustainability) analytics based on the Cross-Industry Standard Process for Data Mining (CRISP-DM) in response to the emerging wave of city analytics in the context of smart sustainable cities. This framework, which can be tested and used in empirical applications in the city domain, has innovative potential to advance urban analytics by providing a novel way of thinking data-analytically about urban sustainability problems. It offers insights into how to conduct “big data” studies in the field of urban sustainability. The intent is to enable well-informed, knowledge-driven decision-making and enhanced insights across diverse urban domains with regard to operations, functions, strategies, designs, practices, and policies for increasing the contribution of smart sustainable cities to the goals of sustainable development.
This chapter can serve to bring together city analysts, data scientists, urban planners and scholars, and ICT experts on common ground in their endeavor to transform and advance the knowledge of smart sustainable cities in terms of sustainability.


Smart sustainable cities; Big data analytics; Data mining; Knowledge discovery; Techniques and algorithms; Urban analytics; Urban sustainability; Databases; Big data studies; Decision-making



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computer and Information Science, Department of Urban Design and Planning, Norwegian University of Science and Technology, Trondheim, Norway
