1 Introduction

One of the main purposes of governments opening data is to create transparency [1, 2]. Transparency is often viewed as a condition sine qua non for democracy [3–5]. Since the mid-nineties, politicians and governments have created portals to publish government information, including budgeting and statistical data [4]. Some governments have created Application Programming Interfaces (APIs) to enable data streaming. Recently, the world has witnessed a boom in the use of Big Data analytics techniques for several reasons [6], e.g. the large quantity of data that can be collected from millions of users’ devices (smartphones, social networks on the Internet), processed by enterprises or opened by governments [7], and the fast evolution of hardware to collect, store, process and analyze all these data in a short period of time [8]. In this paper we refer to this combination as Big and Open Linked Data (BOLD).

All these efforts are aimed at creating transparency, although their actual contribution to the creation of transparency can be challenged [9, 10]. Merely opening data without providing any user interface might result in an inability to use the data, whereas including a predefined user interface and visualization might result in a biased picture. The more data is opened, the higher the information overload. Disclosing open data does not by itself result in transparency; only the actual use of the open data can result in transparency.

A lack of transparency is usually caused by information asymmetry between the government (agent) and the citizens (principal) [11]. Information asymmetry refers to the situation in which one party has more information than the other [12]. In our situation this occurs because the government has more information than its constituents. Information asymmetry prevents stakeholders from having the same insight into governmental issues, from being able to control public policies and from participating in public management. The opening of data should overcome the information asymmetry to some extent and provide better insight into the inner workings of the government.

Although transparency is an intuitively appealing concept, there is no uniform view on what constitutes transparency. Some authors even define transparency as a magical concept [13, 14], whereas others use the term as synonymous with accountability [15–17] and/or openness [9, 18]. In this research we contribute to this discourse by investigating the aspects of transparency and the dimensions for opening data to enable transparency. This paper is structured as follows. First we present the research approach, which uses literature review and interviews as the primary research instruments. This is followed by the literature background, which results in the conceptualization of transparency. Next a Big and Open Linked Data Framework is presented to identify the dimensions and sub-dimensions that influence transparency. This results in two dimensions, data disclosure and data usage, and their sub-dimensions. Finally conclusions are drawn.

2 Research Approach

Our aim is to gain insight into the factors influencing and the conditions for creating transparency. First, a literature review of the transparency concept was conducted by searching on the terms “transparency + government”, “transparency + accountability + government” and “transparency + openness + government”. Papers published in the top e-government journals were reviewed, including Government Information Quarterly (GIQ), International Journal of E-Government Research (IJEGR), Transformational Government, People, Processes and Policy (TGPPP), E-Government, an International Journal (EGIJ), International Journal of Public Administration in the Digital Age (IJPADA) and Information Polity (IP). The review was limited to the first 50 papers as presented by Google Scholar and Scopus. The main reason for this limitation is that after the fifth page of results the publications started to repeat, and the remaining works were not considered influential due to their limited number of citations. Another limitation was that some books were not accessible online; those publications were excluded. A total of 200 publications were found, of which 85 were considered relevant for our research after scanning the papers, but only 54 were selected as sources and references.

Results containing the keywords only in the list of references were excluded. The literature review revealed two main streams: transparency as synonymous with accountability [15–17] and as synonymous with openness [9, 18]. Next, the content of these papers was analyzed for factors affecting and conditions necessary for data disclosure and data usage. The literature review resulted in a transparency framework comprising the dimensions that influence data disclosure and data usage, which were identified as the main flow mechanism for achieving transparency.

3 Views on Transparency

Our literature review revealed that the concept of transparency is complex due to its ambiguity and its various uses in the literature and in practice [3]. Transparency has been used as a magic concept by governments to improve efficiency [13], as synonymous with accountability [15–17] and as synonymous with openness [9, 18]. In practice the concept of transparency is often misunderstood and more talked about than practiced [19].

In our view transparency is aimed at overcoming the information asymmetry between the government and the public. Transparency refers to the ability of the public to understand the various aspects of government. It is about the ability to see the inner workings of the government. This means that who makes decisions, how they are made and what evidence is used are all visible. For this purpose data about the functioning of the government should be released. From this puzzle, two main categories of transparency were identified in the literature review: transparency as synonymous with accountability and transparency as synonymous with openness.

3.1 Transparency as Synonymous with Accountability

Accountability normally involves a relationship between two or more parties, in which one party is responsible for performance against pre-established or planned objectives, taking into consideration public principles such as the effective and efficient use of resources to realize the intended objective. Accountability implies answerability for one’s actions or inactions and the responsibility for their consequences [20]. Accountability also means taking responsibility for decisions. Elections are a case of accountability in government, in which people can judge the past actions of politicians after they have managed the state for some period of time [21]. Accountability concerns the comparison of objectives with the realized performance and deviations [16].

Some actions, which are often considered part of transparency, are in fact actions necessary for accountability and for holding politicians and public officials accountable. Many open data releases merely publicize data, or the published data cannot be used for accountability because its characteristics are not suitable, for example because its quality is too low or it provides insight into only one aspect. Transparency as accountability is also used to identify when it is possible to enable anti-corruption in government [22, 23]. From the practice perspective, Transparency International [24] has directed its transparency objective toward the anti-corruption goal, identifying who does not perform suitably within public management. In its index the term transparency is used to advocate its objective, anti-corruption. Nevertheless, in our view transparency is the way to enable anti-corruption tools. In conclusion, information obtained through government transparency is the raw material that enables accountability or anti-corruption tools. Accountability uses data/information [25] and does something with the information that publicity created [16]. Yet accountability does not need complete transparency. Some activities might be hidden, as they are not needed for being accountable. Only the surrogates necessary to hold one accountable are published. No knowledge of the inner workings of the public system needs to be published. Those parts remain hidden and are not transparent. From this we conclude that although accountability and transparency overlap, they are also distinct.

3.2 Transparency as Synonymous with Openness

After being sworn in as President of the United States of America, Barack Obama issued the Memorandum of Open Government with the aim of “creating an unprecedented level of openness in Government” [26]. One of the underlying goals was to create an open and transparent government. Openness does not automatically result in increased transparency in governments; however, it influenced the creation of open government data portals and legislation such as freedom of information acts. For example, a box can be open, but it still might not be transparent, and you cannot see inside it. On the other hand, openness might be a necessary condition for transparency. If the system is closed there cannot be any transparency. From this we conclude that openness and transparency are overlapping but distinct concepts.

Scientific literature points out that openness is also close to open government initiatives [27, 28]. As a result, scholars and practitioners face the task of defining openness, both operationally and theoretically, as an Open Government concept. For governments neither the word nor the concept is at an initial stage; likewise, for the military/civil use of nuclear weapons/energy [29, 30] and for finance [28], openness is not at an initial phase either. Both fields can offer inspiration on how processes can be transparent and open at the same time without exposing core business or private information. This is basically what citizens need in order to exercise accountability and take part in the politics of governance without important parts of the political game inside government being revealed.

4 Towards a Transparency Framework

4.1 Basic Framework

Basically, BOLD has three steps. The first is collecting data from internal databases, spreadsheets, document files, sensors spread over a city or social networks on the Internet. The second is the storage of data, which requires advanced and unique data storage and management [31]. The third is analysis and visualization technologies [6]. In the opening of data there are two important stakeholders: the publishers and the users of open data. Publishers and users are often unaware of each other’s needs and encounter different challenges and barriers [32]. The data publishers’ main activity is the disclosure of data, which is necessary before data can be used. Yet only releasing data does not result in any transparency. Only the actual use of data results in transparency. Both steps are influenced by a large number of factors. In conclusion, transparency can only be created when both data disclosure and data usage happen. Fig. 1 presents the basic transparency framework in the form of these two essential activities. Hereafter we delve into the details of the factors and conditions impacting data disclosure and usage.

Fig. 1. Basic transparency framework
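
The two activities of the basic framework can be read as a simple conjunction: a dataset contributes to transparency only if it is both disclosed and actually used. The following minimal Python sketch illustrates this reading. The class `Dataset`, its fields and the function `contributes_to_transparency` are hypothetical names introduced here for illustration only, and the example datasets are made up.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical record of what has happened to one government dataset."""
    name: str
    disclosed: bool  # the publisher has released the data
    used: bool       # some user has actually worked with the data


def contributes_to_transparency(dataset: Dataset) -> bool:
    # Disclosure is a precondition, but only actual use yields transparency.
    return dataset.disclosed and dataset.used


# Made-up examples, not taken from the literature reviewed.
budget = Dataset("budget figures", disclosed=True, used=False)
tenders = Dataset("public tenders", disclosed=True, used=True)

print(contributes_to_transparency(budget))   # False
print(contributes_to_transparency(tenders))  # True
```

The sketch deliberately ignores the many factors that influence each activity; it only encodes the claim that neither disclosure nor usage alone is sufficient.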

4.2 BOLD Transparency Framework

From the basic framework, it was possible to identify that each dimension could be described in more depth through sub-dimensions. Fig. 2 shows these sub-dimensions, and each dimension identified is described in detail in Sects. 4.3 and 4.4.

Fig. 2. The BOLD Transparency Framework

4.3 Data Disclosure Category

Being transparent requires the disclosure of data. The disclosure of data is a condition and a first principle for creating transparency. Yet simply putting data online is not sufficient. Metadata about information quality and the way information is disclosed influence the actual transparency. Four dimensions were identified, as follows, and are summarized in Fig. 2.

A. Type of Data Disclosure.

The literature review [33] identified that disclosure of data and information in government occurs through proactive dissemination by the government, release of requested materials by the government, public meetings and leaks from whistleblowers. The disclosure of data gives rise to two “types of data disclosure”, the first sub-dimension of the dimension “data disclosure”: (i) formal and (ii) informal. Furthermore, different channels were found to disclose sometimes the same kind of data and sometimes data in different formats and under different conditions.

B. Type of Channel for Data Disclosure.

In the dimension “types of channels to disclose data” the following types of channels were identified: (i) transparency portals [34], (ii) freedom of information access (FOIA) using all kinds of channels [35], (iii) open government data portals [36], (iv) governmental portals, (v) dashboards advertising services, (vi) outdoor boards showing expenditure data at public works, (vii) public financial statements (in newspapers, on paper posted on boards at City Hall, or on Internet portals), (viii) call centers providing information and (ix) call centers providing access to and requests for public services [37]. These channels can use different kinds of technologies; for some of them the technologies are made explicit, as with public financial statements, which are published in newspapers, on paper posted on boards at City Hall and on the Internet via portals.

C. Type of Technology Used to Disclose Data.

The third dimension, “type of technology used to disclose data”, is part of the “data disclosure” dimension. The technologies identified in the literature are: (i) political discourse and civil servant responses, (ii) print (paper, newspapers, outdoor boards at public works and services, signs, etc.), (iii) electronic data formats (static web portals, downloadable files, etc.) [38] and (iv) real-time electronically accessible data (direct database access and APIs) [39].

D. Type of Characteristic of Data Disclosed.

This leads to the fourth dimension, “type of characteristic of disclosed data”, which comprises factors and conditions of the data, such as “quality of data”, taking into consideration the type of disclosure, the type of channel, the technology used and the characteristics identified in the literature. The types identified are not discussed in depth here; they will be addressed in a subsequent publication that elaborates the dimensions and sub-dimensions found here. They are: (i) data accuracy [40], (ii) data timeliness [41], (iii) data accessibility [42], (iv) data completeness [43], (v) data security [44], (vi) data trustworthiness, (vii) free data [26], (viii) data documentation, (ix) data permanence and history, (x) primary data [41], (xi) metadata and interlinked data [45], (xii) non-proprietary and non-discriminatory data [41, 46], (xiii) license-free data [26, 41], (xiv) machine-processable data [41, 46], (xv) simple portal language [47] and (xvi) open data policy and license [26, 41].

4.4 Data Usage Category

Only publishing data does not create transparency. The second step in the transparency framework flow is data usage, in which the public uses data to devise solutions that serve their interests, using the technology they consider best. Following this principle, the dimension “Data Usage” has four sub-dimensions, with factors and conditions as follows and summarized in Fig. 2.

A. Type of Actor Interested in Data Usage.

Any person or organization needs to use data for some specific interest, running a particular business model and using a chosen technology to collect, process and analyze the data. The scientific and practical literature [48, 49] identified five actors that have been using transparent data, whatever the types already described in the Data Disclosure category and its dimensions: (i) Academics, (ii) Enterprises, (iii) Governments, (iv) Journalists and (v) Organized Civil Society.

B. Type of Interest in Data Usage.

For the actors that use disclosed data, it is necessary to understand their interest in using the data. The scientific and practical literature [37, 49, 50] identified four types of interest in the data usage dimension: (i) Service Delivery, (ii) Accountability, (iii) Advocacy and (iv) Participation.

C. Type of Data Usage Business Model.

A business model is necessary to sustain data usage. The scientific and practical literature [37, 49, 50] identified six types of data usage business model: (i) Big Data Analysis in governments, (ii) Governmental portals and procedures for participation, (iii) Governmental portals for social control (accountability), (iv) Organized Civil Society portals and procedures for participation, (v) Organized Civil Society data visualization portals that mediate the relationship between governments and civil society and (vi) Private applications to improve service delivery, funded by advertising or supported by civil society organizations.

D. Type of Technology Used by Actors.

Given that an actor has an interest in and a business model for using the data, the last sub-dimension is technology. The scientific and practical literature [7, 51] identified six types of technology that can be used: (i) Computer programming languages, (ii) Data Visualization, (iii) Geocoding and Mapping, (iv) Network Analysis, (v) Business Intelligence and (vi) Data Mining.

4.5 A Framework with the Dimension of Transparency

From the BOLD Transparency Framework, it is possible to derive Table 1, which summarizes the dimensions and sub-dimensions of the BOLD framework.

Table 1. BOLD framework summary: dimensions, sub-dimensions, and types of specifications and characteristics
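
The same summary can also be expressed as a nested mapping from dimension to sub-dimension to types. The sketch below is assembled from the lists in Sects. 4.3 and 4.4 and is one possible encoding, not a definitive one. The names `BOLD_FRAMEWORK`, `sub_dimensions` and `locate` are hypothetical, and the type labels are abbreviated paraphrases of the items in the text.

```python
# Illustrative encoding only: identifiers are hypothetical and the labels
# are shortened paraphrases of the lists in Sects. 4.3 and 4.4.
BOLD_FRAMEWORK = {
    "data disclosure": {
        "type of data disclosure": ["formal", "informal"],
        "type of channel": [
            "transparency portals",
            "freedom of information access",
            "open government data portals",
            "governmental portals",
            "dashboards advertising services",
            "outdoor boards at public works",
            "public financial statements",
            "call centers providing information",
            "call centers providing access to public services",
        ],
        "type of technology used to disclose data": [
            "political discourse and civil servant responses",
            "print",
            "electronic data formats",
            "real-time electronically accessible data",
        ],
        "type of characteristic of disclosed data": [
            "accuracy", "timeliness", "accessibility", "completeness",
            "security", "trustworthiness", "free", "documentation",
            "permanence and history", "primary",
            "metadata and interlinking",
            "non-proprietary and non-discriminatory", "license-free",
            "machine processability", "simple portal language",
            "open data policy and license",
        ],
    },
    "data usage": {
        "type of actor": [
            "academics", "enterprises", "governments", "journalists",
            "organized civil society",
        ],
        "type of interest": [
            "service delivery", "accountability", "advocacy",
            "participation",
        ],
        "type of business model": [
            "big data analysis in governments",
            "governmental portals and procedures for participation",
            "governmental portals for social control (accountability)",
            "civil society portals and procedures for participation",
            "civil society data visualization portals",
            "private applications to improve service delivery",
        ],
        "type of technology used by actors": [
            "computer programming languages", "data visualization",
            "geocoding and mapping", "network analysis",
            "business intelligence", "data mining",
        ],
    },
}


def sub_dimensions(dimension):
    """List the sub-dimensions of one dimension, in the order of the text."""
    return list(BOLD_FRAMEWORK[dimension])


def locate(type_name):
    """Return every (dimension, sub-dimension) pair that lists type_name."""
    return [
        (dimension, sub_dimension)
        for dimension, subs in BOLD_FRAMEWORK.items()
        for sub_dimension, types in subs.items()
        if type_name in types
    ]


print(sub_dimensions("data usage"))
print(locate("accountability"))
```

A nested mapping is chosen here because it mirrors the dimension, sub-dimension and type hierarchy directly; a flat table with one row per type would serve equally well for filtering or weighting the dimensions later.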

5 Conclusions

Transparency is a multi-faceted concept, and stakeholders give different meanings to it. An important contribution is the identification of two transparency concepts that are often used in the literature. One concept is accountability and the other is openness. Both overlap with transparency but are distinct concepts. We define transparency as the level of insight into the functioning of the government. For this purpose data should be disclosed.

Although open data disclosure is a condition for transparency, only the actual use of the data can result in transparency. Hence a framework in which open data disclosure and use result in transparency was proposed, and its dimensions and sub-dimensions were identified.

The identified dimensions help to understand what influences the synonymous types of the transparency concept. We recommend combining the information quality literature with the framework presented in this paper. Furthermore, we suggest filtering the dimensions identified in this paper and determining the magnitude of their influence on transparency. The next research paper will provide a case study based on the BOLD Transparency Framework to refine the dimensions and sub-dimensions.