1 Introduction

Today’s society faces complex ‘wicked problems’, such as migration, poverty, and climate change, for which no single optimal solution exists [1, 2]. To address such problems, governments aim to realize public sector innovation that gears them towards becoming platforms of open governance, making optimal use of information and communication technologies (ICTs) to create public value [1]. Increasingly, ICTs are used not only for improving the daily operations of government, but also for enhancing the process of policy making [2]. Policies address societal problems by formulating and implementing laws, rules and guidelines, and policy making is the process of creating and monitoring these policies. It is often conceptualized as a policy cycle, consisting of several phases, such as agenda setting, policy formulation, decision-making, implementation and evaluation [3]. ICTs may be used to support and enhance different phases of the policy cycle and enable experimentation [1, 2].

Data-driven policy making uses ICTs to capture the benefits of new data sources [4, 5], and to support collaboration with relevant stakeholders and citizens [2, 6, 7]. It builds on the notion of evidence-based policy making [see, for instance, 8, 9]. In the literature on evidence-based policy making three types of evidence are considered relevant: “systematic (‘scientific’) research, program management experience (‘practice’), and political judgement” [9, p. 1]. Data-driven policy making acknowledges the importance of these types of evidence, but can be distinguished from evidence-based policy making, since it is mainly concerned with the inclusion of big and open data sources into policy making as well as with co-creation of policy by involving citizens. Data-driven policy making is not only expected to result in better policies, but also aims to create legitimacy [10]. Involvement of citizens in a data-driven policy making process is especially important since public data and statistics are increasingly met by citizens’ distrust [11].

To improve collaboration and involve citizens, public administrations around the world have set up Policy Labs that allow for experimentation and facilitate the involvement of relevant stakeholders [12, 13]. They thus address the need for experimentation and design-thinking to deal with wicked policy issues [1, 2]. Therefore, in this paper we develop a Policy Lab approach for data-driven policy making. First, based on the literature on public sector innovation, we identify innovations in the use of data for policy making and co-creation of policy. Second, we map these innovations to different phases of the policy cycle. Third, we develop an approach that can be used to guide data-driven policy making in a Policy Lab setting. The next section presents the theoretical background of public sector innovation. Section 3 discusses data-driven policy making and identifies innovations. Subsequently, Sect. 4 presents the development of the Policy Lab approach, followed by a discussion and recommendations for further research in Sect. 5. Finally, Sect. 6 presents the conclusion.

2 Public Sector Innovation

Public sector innovation holds that “[p]ublic policy and services need to become more open and innovative as well as being efficient and effective” [1, p. 2], making optimal use of ICTs [1]. As such, it encompasses a myriad of aspects. Gil-Garcia, Zhang and Puron-Cid [14] refer to as many as fourteen aspects of smartness in government, including evidence-based, technology savviness, openness, citizen engagement, and innovation. According to Millard [1], public sector innovation means that public administrations operate as a platform [15, 16] and use ICT to collaborate across organizational borders [17] and to involve citizens and other relevant stakeholders [6, 7, 18, 19] with the purpose of creating public value [20,21,22]. Over the past decades, ICTs have had a great impact on service delivery [23], opened up public datasets [24] and increased citizens’ participation [25]. The use of ICTs for policy making can thus be seen as a next step in public sector innovation [2].

The use of ICTs benefits policy making in two ways. The first is the use of new data sources, such as (real-time) sensor data, either physical (e.g. traffic monitoring [2, 4]) or virtual (e.g. social media data [2, 6]). “Data-driven decisions and intensive use of data, through ubiquitous sensing, advanced metering and integrated applications enable governments to make more informed decisions and improve the effectiveness of public policies and programs” [14, p. 527]. The second is co-creation of policies, which requires governments to collaborate across organizational borders and with citizens and businesses [1, 6, 7, 16]. “Co-creation is understood as the active flow and exchange of ideas, information, components and products across society (academia, government, business, civil society and citizens) which allows for a better understanding of participation, engagement and empowerment in policy development” [1, p. 5].

Besides the deployment of ICTs to use new data sources and enable co-creation of policies, public sector innovation is concerned with the ability of public administrations to experiment, using innovative approaches such as gaming, simulation, and installing sensors for do-it-yourself measurements, and to deploy ‘design-thinking’ [1, 2, 14]. To do so, many public administrations have set up Policy Labs [12, 13]. “Policy Labs are emerging structures that construct public policies in an innovative, design-oriented fashion, in particular by engaging citizens and companies working within the public sector” [13, p. 2]. Policy Labs exist in all shapes and sizes and on different levels of government (national, regional and municipal) [13]. The majority of Policy Labs do not focus on a specific type of policy or on a specific phase of the policy cycle, but employ a design and experimentation based approach to policy making [13]. As such, Policy Labs can be considered a specific instance of Living Labs, which aim to “support public open innovation processes” [26, p. 90]. While Living Labs are concerned with the involvement of private sector organizations as well as citizens in public open innovation processes in general [26], Policy Labs focus specifically on the involvement of citizens (and other stakeholders) in the policy making process.

3 Innovations in Data-Driven Policy Making

Data-driven policy making thus aims to use new data sources such as (real-time) sensor data and new techniques for processing these data, and to realize co-creation of policies, involving citizens and other relevant stakeholders. However, realizing data-driven policy making is complex: many challenges exist, related both to the capturing, integration and re-use of data [4, 5] and to the involvement of citizens and other stakeholders in policy making [2, 6, 7]. This section identifies innovations in data-driven policy making based on the literature.

3.1 Use of New Data Sources in Policy Making

The use of new data sources holds great promise: it is expected to offer organizations greater operational efficiency and effectiveness, and lead to the development of new products, services and business models [27,28,29]. In the context of governments, “we are faced with a deluge of data that, when combined with new technologies and analysis techniques, has the potential to inform decision and policy making in unprecedented ways” [4, p. 10]. Big data is often defined as “vast datasets that cannot be analyzed using conventional software and analytic tools” [4, p. 2]. Since many ‘big data’ sources can nowadays be stored on a USB stick, the important characteristics of big data in the context of public administration are not so much the processing power they require as their variety and interoperability, given the different data sources and formats involved [4]. The use of (sensor) data in policy making encompasses three steps: capturing data, integrating data from different sources, and applying these data [30]. Table 1 summarizes the main opportunities, challenges and innovations per step.

Table 1. Opportunities, challenges and innovations of new data sources for policy making.

Table 1 shows that public administrations increasingly see opportunities for the use of new data sources, mainly (real-time) sensor data [2, 14]. These data can be physical, such as roadside monitoring, but also virtual, such as social media data. A study from 2015 finds that governments mainly make use of two types of data for data-driven policy making: “public datasets (administrative (open) data and statistics about populations, economic indicators, education, etc.) that typically contain descriptive statistics, which are now used on a larger scale, used more intensively, and linked [… and …] social media, sensors and mobile phones that are […] analyzed with novel methods such as sentiment analysis, location mapping or advanced social network analysis” [31, p. 3]. Main issues are whether the data are of sufficient quality [4, 5, 18], and whether they are reliable and secure [4, 5, 17, 18]. Otherwise, they may undermine the policy making process [4]. Innovations in capturing data are crowdsourcing [6], and nowcasting, which is the capturing of search engine data [32].

Regarding integration of data, to make successful use of big and open data in organizational processes, cross-boundary information integration (between government agencies, and between not-for-profit organizations, private firms and the public sector) is necessary [14, 17]. The integration of data is becoming more important: linking these data sources with data sources that are traditionally used for policy making, such as statistics, surveys and organizational databases, is becoming the norm [31, 33]. However, many challenges exist: interoperability of data and lack of standardization, architectures and portals [4, 5, 17]. Another issue is legacy systems, which may negatively influence this linking [4, 5]. Poel et al. [31] conclude that privately held data are currently of less relevance, as they are still hardly shared. Opportunities for data integration include sentiment analysis, location mapping, and social network analysis [4, 14, 31].
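As an illustration of the linking step described above, the following minimal Python sketch joins aggregated sensor readings to official statistics through a shared region code. It is not taken from the cited studies: the region codes, the field names (`population`, `avg_traffic`, `traffic_per_1000`) and the function `link_sources` are hypothetical, and the sketch assumes that both sources already use the same identifier, which in practice is exactly the interoperability problem noted above.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical real-time sensor readings: (region code, vehicles per hour).
sensor_readings = [
    ("R01", 120), ("R01", 180), ("R02", 60), ("R03", 90), ("R03", 110),
]

# Hypothetical official statistics keyed by the same region code.
official_stats = {
    "R01": {"population": 50000},
    "R02": {"population": 20000},
    "R04": {"population": 10000},  # region without sensor coverage
}


def link_sources(readings, stats):
    """Aggregate sensor readings per region and join them to official statistics.

    Returns (linked, unmatched): linked maps region -> combined record,
    unmatched lists region codes that appear in only one of the two sources.
    """
    by_region = defaultdict(list)
    for region, value in readings:
        by_region[region].append(value)

    linked = {}
    # Only regions present in both sources can be linked.
    for region in sorted(by_region.keys() & stats.keys()):
        avg = mean(by_region[region])
        population = stats[region]["population"]
        linked[region] = {
            "avg_traffic": avg,
            "population": population,
            "traffic_per_1000": avg * 1000 / population,
        }
    # Regions in exactly one source signal a coverage or standardization gap.
    unmatched = sorted(by_region.keys() ^ stats.keys())
    return linked, unmatched


linked, unmatched = link_sources(sensor_readings, official_stats)
print(linked)
print("Not linked:", unmatched)
```

Reporting the unmatched regions alongside the linked records makes gaps in coverage or in shared identifiers visible rather than silently dropping them.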

The third step in the use of new data sources is application and sense-making. While social media analysis and network analysis can be seen as forms of data integration that can be used to support policy making, we consider the use of visualization tools and computer simulations to be applications of data to the actual process of policy making [19]. However, “[a]mong the initiatives examined, there is little use of advanced analytics or visualization techniques” [31, p. 4]. Another opportunity is to realize greater accountability [14]. The most innovative use of new datasets likely takes place in the hidden spheres of fighting crime and terrorism [31].

3.2 Co-creation of Policy

Another essential element of smartness in government is co-creation of policy, as ICT allows for collaboration not only with other organizations (public or private), but also with citizens [1, 2, 6, 14]. Co-creation is the exchange of ideas and information between relevant actors, such as governments, businesses, civil society and citizens, that leads to the development of policies [1, 6]. Involvement of citizens in policy making is especially important since public data and statistics are increasingly met by citizens’ distrust [11]. This involvement can take different forms, depending on its level [2]: it may range from merely informing public administrations, for example by tapping discussion fora, opinion polls and social media [2, 6, 19], to participating in decision making and in policy implementation. Table 2, which is based on Janssen and Helbig [2], summarizes the main innovations and challenges to co-creation of policies.

Table 2. Opportunities, challenges and innovations in co-creation of policies.

The most basic form of citizen involvement is informing and signaling, meaning that citizens’ information is used for identifying problems and setting the agenda [2]. The main challenge at this level is to make sure that different groups of citizens are represented, without excluding relevant groups [6]. Examples of this challenge can be found in the literature on using social media data during disasters and disease outbreaks. While nowcasting using search engine data can be an accurate methodology for predicting flu outbreaks, it proved to be much less accurate for predicting Ebola, since internet access is still scarce in the areas where the main outbreak occurred [31]. Furthermore, the stability of social media is a challenge for its use in signaling problems [6]. Innovations in using citizens’ ideas include crowdsourcing [2, 6], online petitions [2], and participatory sensing [19].
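The representativeness concern raised above can be made concrete with a simple check that compares each group's share in a citizen-generated data set with its share in the population. The Python sketch below is illustrative only: the age groups, the counts, the census shares, the function `underrepresented_groups` and the `tolerance` threshold of 0.5 are all assumptions and are not drawn from the cited literature.

```python
def underrepresented_groups(sample_counts, population_shares, tolerance=0.5):
    """Return groups whose sample share is below tolerance * population share.

    sample_counts: group -> number of observations in the collected data.
    population_shares: group -> share of that group in the population (0..1).
    """
    total = sum(sample_counts.values())
    flagged = []
    for group, pop_share in population_shares.items():
        # A group missing from the sample counts as zero observations.
        sample_share = sample_counts.get(group, 0) / total if total else 0.0
        if sample_share < tolerance * pop_share:
            flagged.append(group)
    return sorted(flagged)


# Hypothetical example: social media posts per age group vs. census shares.
posts_by_group = {"18-34": 700, "35-64": 280, "65+": 20}
census_shares = {"18-34": 0.30, "35-64": 0.45, "65+": 0.25}

flagged = underrepresented_groups(posts_by_group, census_shares)
print("Under-represented groups:", flagged)
```

A flagged group does not invalidate the signal, but it indicates where the data should be complemented with other sources before it is used for agenda setting.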

The inclusion of citizens’ opinions in decision making refers to a higher level of involvement. This means that citizens are involved in the evaluation of policy options [2, 6]. The most elaborate form of this is the organization of a referendum, but using social media or other online tools, this could be done more efficiently and effectively [1, 2]. Important challenges are to ensure that both citizens’ skills and motivation [2, 16] and civil servants’ skills and culture [6] are sufficient. Innovations in involving citizens in the choice between different policy options and in decision making are computer simulations and serious games [2, 19], and cross-platform social media analysis [6].

The third level of involvement is implementation of policies, which can be seen as the most immersive level of co-creation. Opportunities for co-creation include collaboration between public administrations, private companies and citizens in policy implementation [2], policy evaluation [19], and transparency and accountability [6]. Challenges include privacy and security [2, 6] and accuracy [6]. Innovations in this level of involvement include camera surveillance, the use of smart phone data and sensors [2], and allowing for agile implementation, delivering faster and better innovations because of regular and short-cycle interactions [19].

4 The Policy Lab Approach

In the previous section we identified opportunities, challenges, and innovations based on literature of new technologies and co-creation in policy making. This section aims to present a coherent Policy Lab approach to data-driven policy making based on the innovations in these fields. Since the framework is to be used for policy making, we mapped these innovations to phases of the policy cycle [3]. Inspired by Janssen and Helbig [2], we distinguish three phases: predictive and problem definition, design and experimentation, and evaluation and implementation. Table 3 elaborates innovations and impact per phase of the Policy Lab approach, and identifies challenges.

Table 3. Innovations, impact and challenges of data-driven policy making.

In the first phase of policy making, predictive and problem definition, (real-time) sensor data is used, comprising physical sensor data such as roadside traffic data, and virtual data such as social media data. Furthermore, innovative approaches such as crowdsourcing and nowcasting are also used to predict and identify problems. This leads to the availability of (real-time) information that allows more precise predictions than those that are merely expert based. However, experts are still important to provide context information to the trends spotted in the data. The main challenges are the availability, quality, reliability and security of the data, as well as the representativeness of the data, which should include the viewpoints of different groups of citizens without excluding relevant groups. In a 2015 study on the use of data for policy making, over half of the cases identified concerned this first phase of policy making [31].

The second phase of policy making, design and experimentation, should ensure collaboration between government, private organizations, and citizens in the decision making process and the choice between policy options. This requires the use of more advanced analytical approaches such as sentiment analysis, location mapping, social network analysis, visualization techniques, computer simulation and serious games to allow for the involvement of other stakeholders in decision making. A major challenge for the integration of different data sources, the performance of more advanced analyses, and the involvement of citizens is setting up an infrastructure that allows for interoperability and integration of data [17]. Standards, architectures and portals can be instruments for this. Traditionally, governments more often involve citizens after this phase, in the implementation, rather than in the process of decision making. This is reflected in the lower number of best practices in this phase [31].

Evaluation and implementation, the third phase of policy making, allows for joint policy implementation and co-creation of services by government, businesses and citizens. An advantage of new data sets and technologies is that they enable an agile approach [15] that allows for short cycles of decision making and implementation. The involvement of relevant stakeholders in the implementation and ongoing monitoring of policy creates public value [20,21,22]. More insight and collaboration may result in greater transparency and accountability, but also in more surveillance. Accuracy of data and data models and ensuring privacy and security are major challenges. Furthermore, co-creation of policy requires specific skills and motivation of citizens as well as specific skills and culture of the government agency [2, 16]. While in traditional e-participation citizens are involved in policy implementation, actual co-creation involving citizens in the production of services is less often found in practice [7].

These innovations are challenging. In practice, most governments do use new technologies and data sets for policy making, but they use them to enrich traditional statistical data rather than to achieve co-creation [31]. Therefore, besides allowing for experimentation with policy making, new methodologies need to be developed that are able to make use of these new data sources and technologies. Using a design science approach [34], we developed the Policy Lab approach that can be used to guide innovations in data-driven policy making, allowing for experimentation with new policies and developing new data-driven methodologies at the same time. To validate this approach, we held five internal workshops with experts over the course of 2016. Furthermore, throughout this process we consulted academic and governmental stakeholders: four representatives of three academic institutions and six representatives of the national and local levels of government were involved. The Policy Lab approach is graphically presented in Fig. 1.

Fig. 1. The Policy Lab approach for data-driven policy making.

The conceptualization of the Policy Lab approach presented in Fig. 1 consists of two circles. The inner circle represents the policy making process, consisting of several phases, such as agenda setting, policy formulation, decision-making, implementation and evaluation. The outer circle of the Policy Lab approach focuses on the development of data-driven methodologies and co-creation. This approach allows the two circles to mutually influence each other: policy experiments can be used to develop and test new methodologies that, in turn, can be used for developing and evaluating policies.

5 Discussion

Based on the literature review, we found that most applications of new data sources, such as (real-time) sensor data, link them to traditional statistics, and that few innovative methodologies are used for policy making [31]. Similarly, “utilizing such [social] channels for policy making purposes does not constitute an established approach yet” [19]. This means that, first of all, instruments and methodologies for the use of these new data sets in traditional statistical and econometric methodologies should be developed [35]. Furthermore, for governments to become used to these methodologies, they could use the ‘design-thinking approach’ of a Policy Lab, which allows for experimentation. This means that the Policy Lab approach effectively has three pillars: using new technologies and data sources for policy making, enabling co-creation, and allowing for experimentation.

The use of new datasets in traditional statistical or econometric studies is widely regarded to have a large potential for policy making. Traditional data sources are often text based or have a strong qualitative character rather than a numerical or machine generated form. Newer data sources are often human generated (social media) data or machine generated sensor data. This can also be seen as the main distinction between data-driven and evidence-based policy. Using these newer data sources means that new methodologies need to deal not only with the size of these new data sets, but also with the variety of data, which may range from traditional statistics, to (real-time) sensor data, to human generated text based social media data, to images, video streams or geo-data. Statisticians and econometricians aiming to deal with these new (big) data sets need to learn ways to incorporate them into their traditional methodologies [35].

Fundamentally, there are no contradictions between big data and traditional econometric approaches, but the two have developed independently. For example, big data sets enable out-of-sample prediction methods, which are often not possible in traditional econometrics because data sets are not large enough [35]. Furthermore, when using big data sets it makes more sense to focus on model uncertainty than on sampling uncertainty, which is often examined in traditional econometrics. Finally, machine learning techniques such as decision tree learning may give a better picture than logistic regression [35]. Traditional statistics, in turn, provide useful methods to help variable selection in big data models, such as stepwise regression, penalized regression, and Bayesian techniques (including time series analysis) [35].
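To make the notion of out-of-sample prediction concrete, the following minimal Python sketch fits a model on one part of a data set and measures its error on a held-out part. It is a sketch under stated assumptions rather than the method of [35]: it uses an ordinary least-squares line instead of a machine learning technique, and the data, the 15/5 split and the function names `fit_line` and `mse` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mse(xs, ys, slope, intercept):
    """Mean squared prediction error of the fitted line on (xs, ys)."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: y is roughly 2x + 1 with a small deterministic disturbance.
xs = list(range(20))
ys = [2 * x + 1 + (0.5 if x % 2 == 0 else -0.5) for x in xs]

# Estimate on the first 15 observations, hold out the last 5 for evaluation.
train_x, test_x = xs[:15], xs[15:]
train_y, test_y = ys[:15], ys[15:]

slope, intercept = fit_line(train_x, train_y)
in_sample = mse(train_x, train_y, slope, intercept)
out_of_sample = mse(test_x, test_y, slope, intercept)

print(f"slope={slope:.3f}, intercept={intercept:.3f}")
print(f"in-sample MSE={in_sample:.4f}, out-of-sample MSE={out_of_sample:.4f}")
```

The held-out error is the quantity of interest for prediction. With small data sets, setting observations aside leaves too few for estimation, which is why larger data sets make this kind of evaluation feasible.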

However, while these new methodologies could benefit from the incorporation of big data and linking them with traditional methodologies, traditional policy models are far from obsolete. Big data mainly concerns the discovery of correlations, while policy models present causations that have been developed based on practical experience [36]. Causation hypotheses can ultimately be confirmed using controlled or natural experiments, and, thus, cannot be replaced with big data analyses alone. The degree to which the outcomes of such combinations of big data and statistical models can be explained, thus, represents a major issue. Therefore, the involvement of citizens and experimentation become paramount. This is even more the case in this ‘post-factual’ era, in which citizens are critical of official statistics and data [11]. A Policy Lab setting can be used for controlled experimentation allowing people to ‘buy into’ data, statistical methods and data-driven policies.

Similar to the challenges that Living Labs face, the Policy Lab approach, as a specific instance of a Living Lab, presents the risk of becoming primarily focused on the implementation of an open innovation approach, rather than on achieving specific results [26]. While involvement of new data sources and citizens in the policy making process are important objectives, the primary aim should be to improve policy making. If this is not achieved, the result may be a limited application of data-driven policies outside of the Policy Lab environment. This also means, as is the case for Living Labs, that scaling and sustainability are major challenges [26].

Further research should thus focus on the development of these new methodologies that allow for combination of new data sources with traditional statistical data and the combination of big data methodologies with econometrics. Furthermore, experiments with policy development that address wicked problems should be carried out both to involve citizens and increase legitimacy of these policies and to capture the benefits of these new approaches for policy makers. This means that the Policy Lab approach should be validated and expanded based on these experiments. Finally, the issue of scalability and sustainability should be further explored to capture the benefits of data-driven policy making outside of the Policy Lab setting.

6 Conclusion

New data sources and ICTs have great potential for improving policy making. However, data-driven policy initiatives are scarce, and the existing initiatives often link (real-time) sensor data to traditional statistical analyses. Therefore, using a design science approach, this paper develops a Policy Lab approach. Based on the literature, we identified innovations in the use of new data sources and in co-creation of policies. The involvement of citizens will likely become more important for the legitimacy of statistics, data and policies. Subsequently, we mapped these innovations to the different phases of the policy cycle. Based on this overview, the Policy Lab approach draws on three aspects: using new data sources, co-creation, and experimentation with policy making focusing on real-life wicked problems. The experiments can be used to develop data-driven policies as well as to develop new data-driven methodologies. Further research should focus on the development of methodologies for incorporating big data analyses into traditional statistical analysis and on experimentation with policy issues, thereby validating the Policy Lab approach.