1 Introduction

The development of human-centered smart cities provides a new opportunity to acquire real-time spatial data and directs new attention to human emotions. Urban emotion, which is both an important aspect of the smart city and a human-centered approach, has emerged rapidly in recent years. It integrates disciplines such as spatial planning, geographic information systems, computational linguistics, and sensor technology with real-world data [1, 2]. It aims to understand how people’s feelings are affected by urban or geographical factors [2], and it serves several functions: it can act as decision support or as an evaluation criterion in the ongoing urban planning process, it can yield more reliable results than existing urban analysis approaches [3], and it can help manage citizens’ perceptions.

In the current era of rapid development, big data is no longer an unfamiliar term, and sensor devices are becoming smaller, cheaper and more powerful. Ubiquitous sensors make humans machine-readable [4]. Human activity, appearance and emotional states will be recognized, recorded and processed [3]. This vast amount of human data should be put to use. In this paper, a microscale urban emotion measurement method is proposed. The method applies artificial intelligence recognition technology to a large number of photos collected by external sensors, and the recognized data can then be visualized. Although the types of emotional data in this method are fixed, the method can be applied to projects at different spatial scales according to practical needs. In the future, real-time data visualization can be realized in combination with big data.

2 Related Work

There are two types of urban emotion research projects: user-oriented and spatial-oriented. User-oriented projects target users, whereas spatial-oriented projects target geographical space. Each type of project involves two kinds of data: spatial data and emotional data.

Spatial data mostly refers to geographic location data detected by sensors. These data are usually measured and recorded for visualization and mapping. Emotional data mostly refers to basic human emotions and related states, such as happiness [5, 7], anxiety [8], perceptions [10, 11], and stress [16]. These data are either detected by sensors or reported by users, and there are several measurement modes: GPS-based tagging, extraction from social media, ground-truthing, psychophysiological monitoring [12, 13], questionnaires, etc. Each item of emotional data is matched with geographical location data.

In addition, the emotional data can be classified or derived according to different criteria before calculation. For instance, the emotional data in Choudhury’s Majitar project [1] are divided into six parts according to facilities: educational, entertainment, health, industrialization, shopping mart, and transportation. The World Happiness Report [5] used the Gallup World Poll questionnaire, which covers 14 fields, such as economy, education, government, safety, health, and work.

2.1 Spatial-Oriented

According to the size of the location, spatial-oriented projects can be divided into macroscale, mesoscale and microscale. Macroscale spatial-oriented projects are projects at the global or national level, such as the World Happiness Report [5, 6] and the Personal Wellbeing Three Year Dataset Maps [8], which recorded emotional data for the United Kingdom. Mesoscale spatial-oriented projects are projects at the city or community level, such as Christian Nold’s Emotion Map [9] and Choudhury’s Majitar project [1]. Microscale spatial-oriented projects are projects at the street level.

Macroscale.

This type of project studies emotional data in a large-scale urban space, such as at the global and national levels.

World Happiness Reports [5].

The World Happiness Report, a landmark global happiness survey, has quantified and ranked the subjective sense of happiness and well-being in more than 150 countries. The data used for the country rankings came from the Gallup World Poll questionnaire, and six variables were considered: GDP per capita, social support, healthy life expectancy, freedom to make life choices, generosity, and perceptions of corruption. These six key variables were used to explain the variation of happiness across countries. The reports also give a population-weighted average score, called the happiness score, which was tracked over time and compared across countries. All these data are mapped and visualized [6].
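
As an illustration of what a population-weighted average means, the following minimal Python sketch weights each country's score by its population. The function name and the numbers are hypothetical and chosen only for the example; they are not figures from the report, and the report's own computation may differ in detail.

```python
def population_weighted_mean(scores, populations):
    """Return the mean of `scores` weighted by `populations`."""
    if len(scores) != len(populations):
        raise ValueError("scores and populations must have the same length")
    total_population = sum(populations)
    if total_population <= 0:
        raise ValueError("total population must be positive")
    weighted_sum = sum(s * p for s, p in zip(scores, populations))
    return weighted_sum / total_population


# Hypothetical values for illustration only, not data from the report.
example_scores = [7.0, 5.0, 4.0]       # average score per country
example_populations = [10, 30, 60]     # population in millions
regional_score = population_weighted_mean(example_scores, example_populations)
```

With these made-up inputs the most populous country dominates the result, which is why a weighted score can differ noticeably from a simple average of country scores.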

Personal Well-Being Interactive Maps [8].

This is a study by the Office for National Statistics (ONS) on local happiness and the geographies of subjective well-being in the United Kingdom from 2012 to 2016. Data on four measures (life satisfaction, happiness, worthwhile and anxiety) were collected using the Annual Population Survey (APS) and summarized as weighted averages. The interactive maps visualize these weighted averages across the United Kingdom [17].

Mesoscale.

This type of project studies emotional data in a medium-scale urban space, such as at the city and community levels.

Christian Nold’s Emotion Map [9].

The maps are the outcomes of the ‘Bio Mapping’ project series, which since 2004 has involved thousands of participants in over 16 countries in exploring the political, social and cultural implications of visualizing biometric data and emotions. The Bio Mapping device, a portable and wearable tool, combines a simple biometric sensor measuring Galvanic Skin Response with a GPS device. The bio-sensor, which is based on a lie detector, measures changes in sweat level on the assumption that those changes indicate emotional intensity. The GPS part records the geographical location. The data were visualized and mapped with geographical mapping software at the city level, as in the Stockport Emotion Map, East Paris Emotion Map, San Francisco Emotion Map, and Greenwich Emotion Map [9].
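
To make the pairing of biometric readings with positions concrete, the sketch below attaches to each skin-response sample the GPS fix closest in time. It is a minimal illustration under assumed data layouts (timestamped tuples); the function names, field names and sample values are hypothetical and do not describe the actual Bio Mapping software.

```python
from bisect import bisect_left


def nearest_fix(gps_fixes, timestamp):
    """Return the GPS fix whose timestamp is closest to `timestamp`.

    `gps_fixes` is a list of (time_s, lat, lon) tuples sorted by time.
    """
    if not gps_fixes:
        raise ValueError("gps_fixes must not be empty")
    times = [fix[0] for fix in gps_fixes]
    i = bisect_left(times, timestamp)
    # Only the fixes just before and just after the timestamp can be nearest.
    candidates = gps_fixes[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda fix: abs(fix[0] - timestamp))


def geotag(gsr_samples, gps_fixes):
    """Attach the nearest position to every (time_s, gsr_value) sample."""
    tagged = []
    for time_s, value in gsr_samples:
        _, lat, lon = nearest_fix(gps_fixes, time_s)
        tagged.append({"time_s": time_s, "gsr": value, "lat": lat, "lon": lon})
    return tagged


# Made-up readings for illustration only.
fixes = [(0, 53.4080, -2.1490), (10, 53.4085, -2.1495), (20, 53.4090, -2.1500)]
samples = [(2, 0.31), (9, 0.58), (16, 0.44)]
tagged = geotag(samples, fixes)
```

Each tagged record can then be plotted as a point whose height or color encodes the skin-response value.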

K-Means Algorithm [1] and K-Nearest Algorithm [15].

The emotional data of Majitar, a small village in East Sikkim in the Indian state of Sikkim, were measured and determined in two separate projects. K-means is used in the project based on unsupervised learning, and k-nearest is used in the project based on supervised learning. The emotional data were collected through questionnaires and divided according to six facilities: education, entertainment, health, industrialization, shopping mart and transportation. The results were presented in six ANOVA tables, one for each facility.
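
The difference between the two learning settings can be illustrated with a small example. The sketch below shows the supervised case: a k-nearest-neighbor classifier that labels a new questionnaire response by majority vote among the most similar labeled responses. The features, labels and scores are hypothetical and are not taken from the cited projects; k-means, by contrast, would group the same responses without using any labels.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs.
    """
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical questionnaire scores on a 1-5 scale for two facilities,
# given as (education, transportation) with an overall label.
training = [
    ((5, 4), "satisfied"),
    ((4, 5), "satisfied"),
    ((4, 4), "satisfied"),
    ((1, 2), "dissatisfied"),
    ((2, 1), "dissatisfied"),
    ((2, 2), "dissatisfied"),
]
prediction = knn_predict(training, (4, 3))
```

Here the three nearest responses to the query all carry the same label, so the vote is unanimous; with real questionnaire data the choice of k and of the distance measure would need to be validated.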

Microscale.

This type of project studies emotional data in a small-scale urban space, such as at the street level.

Human Sensory Assessment Methods [10].

Eastern Harbor Promenade, the main promenade of Alexandria, is a stretch of about 4.0 km along the waterfront. It can be divided into the inner side and the outer side of the path. The surrounding factors that influence people’s perceptions can be identified from data obtained by measuring the stress reactions of local and foreign participants as they walk along the two pathways. The GPS tracker automatically synchronizes the geo-data when a stress reaction is identified. The identified stress points are stored as geographic data and visualized in a stress hotspot heat map. This study not only proposed that different cultural backgrounds might affect participants’ perception of different urban spaces, but also presented a method to investigate the relationship between stress and environment.
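
One simple way to turn geotagged stress points into hotspot data is to count the points that fall into each cell of a regular latitude/longitude grid. The following sketch does this under assumed inputs; the cell size, function name and coordinates are hypothetical, and the cited study may have used a different aggregation.

```python
from collections import Counter
from math import floor


def hotspot_counts(points, cell_deg=0.001):
    """Count stress points per grid cell of `cell_deg` degrees.

    `points` is an iterable of (lat, lon) pairs; each key of the result is
    the integer (row, column) index of a grid cell.
    """
    counts = Counter()
    for lat, lon in points:
        cell = (floor(lat / cell_deg), floor(lon / cell_deg))
        counts[cell] += 1
    return counts


# Made-up stress points for illustration only.
stress_points = [
    (31.2004, 29.9003),
    (31.2006, 29.9008),
    (31.2002, 29.9001),
    (31.2053, 29.9102),
]
counts = hotspot_counts(stress_points)
hottest_cell, hottest_count = counts.most_common(1)[0]
```

The per-cell counts can be passed to any plotting tool as heat-map intensities, with the cell size controlling how coarse the hotspots appear.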

2.2 User-Oriented

There are many user-oriented projects, which mainly study the emotions generated by users in different urban spaces, especially public spaces. This type of project focuses on user emotion and perception rather than on urban space. For instance, the EmoCycling project [16] focuses on traffic safety and detects subjective safety by measuring the negative arousal of a cyclist riding in the city. The RADAR SENSING app [14] adopts the tagging method [12], and the People as Sensors app [3] uses the ground-truthing method [12], to measure and record basic human emotions in the urban context; these emotional data are then mapped. The Walking & Talking method [18] probed the relationship between people and places through walking interviews and identified emotions in situ using the Plutchik Emotion Wheel.

All the above cases focus on the individual’s subjective emotion and perception of urban space. Most of the emotional data here refer to basic human emotions, such as relaxation, happiness, sadness, anger, and stress. These projects are generally small in scale and focus directly on the emotions of users, whether reported subjectively by the users themselves or detected by external sensors. Subsequently, the emotional and geographic data were recorded and visualized for analysis.

3 Microscale Measurement Method

A user-oriented project studies users’ emotions on a given topic. Macroscale and mesoscale spatial-oriented projects do not directly concern the personal emotions of users or citizens. Microscale projects study the street; because of this small scale, data from a large number of streets must be collected and studied before an effective comparison is possible. Therefore, this paper provides a method for microscale spatial-oriented projects to collect and study large quantities of emotional data from different streets. The ‘micro’ in this microscale spatial-oriented method refers not only to the small space of the street but also to the fact that the emotional side of the method focuses directly on the personal emotions of citizens. The method introduces a way to collect a large amount of emotional data in a street and use the data for visualization. Once data from a large number of streets have been collected through ubiquitous sensors in the future, mesoscale or macroscale projects can gradually be formed according to the number of streets.

3.1 Preparation

This paper proposes a microscale spatial-oriented measurement method of urban emotion based on two artificial intelligence technologies, namely emotion recognition and object recognition. Five streets were selected from Beijing hutongs for this project. In terms of emotion, seven emotions commonly used in current emotion recognition technologies are selected. These emotional data will be compared with the element data obtained from object recognition in the hutongs.

Hutong Selection.

The word hutong comes from the Mongolian word “gudum”. Hutongs are narrow streets that record the history of Beijing and are well regarded for their local customs, practices and numerous cultural attractions. Old Beijing was divided by four walls into the outer city, the inner city, the imperial city and the Forbidden City. There are many famous hutong areas in the inner city and the outer city. Those in the inner city include the Beiluo area, Nanluo area, and Shichahai area; those in the outer city include the Dashilan area, South hall, and Temple of Heaven.

The Shichahai hutong area and the Dashilan hutong area were chosen from the inner city and the outer city respectively. From the Shichahai hutong area, Yangmeizhu Xiejie (496 m), Yingtao Xiejie (579.4 m) and Tieshu Xiejie (551 m) were selected. From the Dashilan hutong area, Yangfang Hutong (470 m) and Luoer Hutong (285 m) were selected.

The hutongs selected for this project have been neither fully developed nor completely abandoned. They contain new and old residential houses and new and old stores, and they show the style and appearance of hutongs both before and after the Beijing Hutong Renovation Project. In addition, the length of a hutong affects the number of photos, so hutongs of similar length are preferable.

Emotion Group and Element Group.

The project began by taking a large number of photos and identifying two groups of data: the emotion group and the element group. The emotion group obtains emotional data by recognizing people’s expressions in the photos; the element group identifies the hutong elements in the photos through object recognition technology based on the Beijing Hutong ontology (see Fig. 1).

Fig. 1. Hutong Beijing ontology

Emotion Group.

The emotional data were drawn from seven emotions commonly used in current emotion recognition technologies: happiness, sadness, anger, neutral, fear, surprise and disgust. Besides, age and gender can also be identified. Each photo identifies only one person.
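
If the recognition service returns one score per emotion for a detected face, the recognized emotion can be taken as the highest-scoring one. The sketch below assumes such a score dictionary; this response format, the function name and the scores are hypothetical, since the output of the commercial API is not specified here.

```python
# The seven emotion categories named in the text.
EMOTIONS = (
    "happiness", "sadness", "anger", "neutral", "fear", "surprise", "disgust",
)


def dominant_emotion(scores):
    """Return the highest-scoring known emotion, or None if there is none.

    `scores` maps emotion names to numeric scores (assumed format).
    """
    known = {name: scores[name] for name in EMOTIONS if name in scores}
    if not known:
        return None
    return max(known, key=known.get)


# Made-up scores for a single face, for illustration only.
sample_scores = {
    "happiness": 0.05, "sadness": 0.20, "anger": 0.02, "neutral": 0.65,
    "fear": 0.01, "surprise": 0.04, "disgust": 0.03,
}
label = dominant_emotion(sample_scores)
```

Keys outside the seven categories are ignored, so a response with no usable scores yields `None` and can be discarded as invalid.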

Element Group.

As shown in Fig. 1, the elements in Beijing Hutong photos are classified in detail according to dynamic and constant characteristics. The constant category includes five subcategories: sky, green (plant), building, facility, and road. The dynamic category includes two subcategories: people and vehicle. Each subcategory contains many elements. Several elements are selected as key objects for object recognition. One of the key elements is the tuktuk (see Fig. 2), a tricycle that is a common means of transportation in Beijing hutongs.
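
The two-level classification can be represented as a nested mapping from category to subcategory to element. The sketch below uses the categories and subcategories named in the text; the placement of individual elements other than the tuktuk is an assumption made for illustration, and the element lists are deliberately incomplete.

```python
# Categories and subcategories follow the text; the element lists are
# partial and their placement (except tuktuk) is assumed for illustration.
HUTONG_ONTOLOGY = {
    "constant": {
        "sky": [],
        "green": ["private green"],
        "building": ["small gatehouse"],
        "facility": ["drying rack"],
        "road": [],
    },
    "dynamic": {
        "people": [],
        "vehicle": ["tuktuk"],
    },
}


def locate(element):
    """Return (category, subcategory) for `element`, or None if unknown."""
    for category, subcategories in HUTONG_ONTOLOGY.items():
        for subcategory, elements in subcategories.items():
            if element in elements:
                return category, subcategory
    return None


tuktuk_position = locate("tuktuk")
```

A lookup of this kind lets each label returned by object recognition be filed under its subcategory before counting.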

Fig. 2. Object recognition of photo ‘Yangmeizhu Xiejie 17’

3.2 Proposed Approach

This approach is divided into three steps: photo collection, AI recognition, and data visualization. Two groups of data, emotion and element, are recognized from the collected photos and finally visualized.

  • Step 1: Data Collection. In this phase, a GoPro was used to collect hutong images. The GoPro was set to shoot every five seconds and was attached to a bracket. The photographer held the bracket and walked slowly back and forth through each hutong. After the five selected hutongs had been covered in this way, a total of 750 photos had been taken. After the overexposed and unoccupied photos were deleted, a total of 503 valid photos remained.

    Two sets of data were recognized from the 503 photos using two recognition techniques. Each photo can yield a set of emotional data and a set of element data. Unrecognizable photos and invalid data sets were deleted. The photos and valid data of each hutong are summarized in Table 1.

    Table 1. Data collection and the number of valid data.
  • Step 2: Data Recognition. After the photo collection phase, two kinds of recognition (see Figs. 2 and 3) are carried out to obtain the data. The recognition is performed by calling a commercial API. One set of data can be recognized from each valid photo (see Table 2). The recognized data are then sorted into a table, giving a total of 1771 valid sets of emotion data and 377 valid sets of element data (see Table 1).

    Fig. 3. Emotion recognition of photo ‘LuoErHuTong 19’

    Table 2. An example of one valid data set from emotion recognition.
  • Step 3: Data Visualization. The 1771 valid sets of emotion data and 377 valid sets of element data are drawn into four visualization maps with visualization tools such as RAW Graphs. The visualized maps are not map-based because the geographic information differs little between the streets’ data.
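
The hand-off from recognition to visualization in the steps above can be sketched as a small aggregation that counts recognized emotions per hutong and serializes the counts as CSV text, a format that visualization tools such as RAW Graphs can load. The record layout, function names and sample records are hypothetical and serve only to illustrate the idea.

```python
import csv
import io
from collections import Counter


def summarize(records):
    """Count records per (hutong, emotion) pair."""
    return Counter((r["hutong"], r["emotion"]) for r in records)


def to_csv(counts):
    """Serialize the counts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["hutong", "emotion", "count"])
    for (hutong, emotion), n in sorted(counts.items()):
        writer.writerow([hutong, emotion, n])
    return buffer.getvalue()


# Made-up recognition results for illustration only.
records = [
    {"hutong": "Yangfang Hutong", "emotion": "neutral"},
    {"hutong": "Yangfang Hutong", "emotion": "neutral"},
    {"hutong": "Yangfang Hutong", "emotion": "sadness"},
    {"hutong": "Luoer Hutong", "emotion": "happiness"},
]
csv_text = to_csv(summarize(records))
```

Keeping the table in this long format (one row per hutong and emotion) makes it straightforward to choose the hutong as the grouping dimension and the count as the size dimension in the visualization tool.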

4 The Visualization Maps

The data visualization maps include Classification of the Elements in Hutongs (Fig. 4), Gender-based Emotional Information of Hutong Residents (Fig. 5), The Key Elements in Hutongs (Fig. 6), and Emotional Information of Hutong Residents (Fig. 7).

Fig. 4. Classification of the elements in Hutongs

Fig. 5. Gender-based emotional information of Hutong residents

Fig. 6. The key elements in Hutongs

Fig. 7. Emotional information of Hutong residents

The Classification of the Elements in Hutongs (Fig. 4) categorizes the element group data by hutong, by subcategory of the hutong ontology, and by time. Yangfang Hutong has the most valid data. Each hutong contains roughly the same proportions of subcategories: transportation (vehicles) accounts for the largest share, followed by people, then buildings, greenery and facilities.

The Gender-based Emotional Information of Hutong Residents (Fig. 5) shows the basic emotion, age and gender information of citizens in the different hutongs. This visualized map conveys two important messages. First, the ratio of males to females is approximately 7:3; that is, more males than females were recognized. Second, the most common of the seven emotions was neutral, followed by sadness and happiness.
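
Summaries of this kind (a gender ratio and a ranking of emotions by frequency) can be derived from the recognized records with simple tallies. The sketch below uses ten made-up records arranged to reproduce a 7:3 split; the record layout and function names are hypothetical, and the values are not the study's data.

```python
from collections import Counter


def gender_share(records):
    """Return the fraction of records for each gender."""
    counts = Counter(r["gender"] for r in records)
    total = sum(counts.values())
    return {gender: n / total for gender, n in counts.items()}


def emotion_ranking(records):
    """Return emotions ordered from most to least frequent."""
    counts = Counter(r["emotion"] for r in records)
    return [emotion for emotion, _ in counts.most_common()]


# Ten made-up records for illustration only.
records = (
    [{"gender": "male", "emotion": "neutral"}] * 4
    + [{"gender": "male", "emotion": "sadness"}] * 2
    + [{"gender": "male", "emotion": "happiness"}] * 1
    + [{"gender": "female", "emotion": "neutral"}] * 1
    + [{"gender": "female", "emotion": "sadness"}] * 1
    + [{"gender": "female", "emotion": "happiness"}] * 1
)
shares = gender_share(records)
ranking = emotion_ranking(records)
```

Grouping the same tallies by hutong would give the per-street breakdown that the figure displays.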

The Key Elements in Hutongs (Fig. 6) shows the key elements: tuktuk (blue), private green (green), small gatehouse (purple), and drying rack (orange). The tuktuk was chosen because it is the main means of transportation in the hutongs. The small gatehouse is a symbol of ancient Chinese architecture. The drying rack, or clothes hanger, was chosen because the lack of space in the hutongs means it can only be placed on the roof, outside the door, or in other public space. Private greening in the narrow hutongs reflects how elderly men spend their time and pursue a green life there. The colors represent the objects identified, while the number of blocks represents the number of objects identified. The blocks are hierarchical. First, the artboard is divided into 5 large blocks, one for each hutong; each large block is then divided into middle blocks according to the number of valid photos of that hutong. Finally, each middle block is divided into small blocks according to the number of objects identified in the photo, and the small blocks are filled with color.
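
The three-level block hierarchy described above (hutong, then photo, then identified object) corresponds to a nested data structure. The sketch below builds such a structure from a flat list of detections, using the element colors given in the text; the detection tuples, photo numbers and function name are hypothetical.

```python
# Colors of the four key elements as given in the text.
ELEMENT_COLORS = {
    "tuktuk": "blue",
    "private green": "green",
    "small gatehouse": "purple",
    "drying rack": "orange",
}


def build_blocks(detections):
    """Nest detections as hutong -> photo -> list of colored small blocks.

    `detections` is an iterable of (hutong, photo_id, element) tuples.
    Elements that are not key elements add no small block, but the photo
    still gets its own (possibly empty) middle block.
    """
    blocks = {}
    for hutong, photo_id, element in detections:
        photo_block = blocks.setdefault(hutong, {}).setdefault(photo_id, [])
        color = ELEMENT_COLORS.get(element)
        if color is not None:
            photo_block.append(color)
    return blocks


# Made-up detections for illustration only.
detections = [
    ("Yangmeizhu Xiejie", 17, "tuktuk"),
    ("Yangmeizhu Xiejie", 17, "drying rack"),
    ("Yangmeizhu Xiejie", 18, "private green"),
    ("Luoer Hutong", 19, "small gatehouse"),
    ("Luoer Hutong", 19, "bicycle"),
]
blocks = build_blocks(detections)
```

A nested structure like this maps directly onto a treemap-style layout, where each nesting level becomes one level of subdivision on the artboard.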

The Emotional Information of Hutong Residents (Fig. 7) shows the sadness (blue), neutral (gray), disgust (violet), anger (red), surprise (orange), fear (black), and happiness (green) of citizens in the five hutongs. The blocks are hierarchical. First, the artboard is divided into 5 large blocks, one for each hutong; each large block is then divided into small blocks according to the values of the emotional data of each photo, and the small blocks are filled with color. In this visualized map, gray neutral clearly forms the majority, followed by blue sadness.

5 Discussion

Comparing the visualized maps makes it possible to explore the relationship between urban emotions and urban facilities. For instance, The Key Elements in Hutongs (Fig. 6) and The Emotional Information of Hutong Residents (Fig. 7) can be analyzed together. In Fig. 6, the numerous blocks indicate that numerous objects were identified, while in Fig. 7 gray neutral forms the majority. Both results came from the same set of photos. In the half-developed hutongs there are numerous hutong elements, which reflect the current completeness of the hutongs’ facilities, equipment, tools and environment. Although these are not necessarily excellent or advanced, they certainly meet the basic needs of residents. Even so, the emotions of hutong residents are mostly neutral or sad. The most interesting finding was that the development of the city or hutong has failed to bring happiness to citizens.

6 Future Work

As a spatial-oriented project, this study draws its data from half-developed hutongs. Several years from now, when the hutong reconstruction project has been completed, the residents’ emotional information can be collected again and compared with the present data to obtain feedback from citizens. In the long term, the results can be used for decision support in urban planning and in the evaluation system of ongoing processes. In the sensor-rich cities of the future, a large amount of data can be built up by accumulating the recognized data. These data can be used as reference, feedback or evaluation criteria in urban planning, for example in street reconstruction and policy implementation.