Innovation and Big Data

van den Herik, H. Jaap

doi:10.1007/978-3-319-45378-1_3

H. Jaap van den Herik¹⁴

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 9842))

Included in the following conference series:

IFIP International Conference on Computer Information Systems and Industrial Management

2252 Accesses

Abstract

Innovation is an essential issue for companies and educational institutes. We have seen that well-known companies came into trouble since they did not innovate in time. They kept their well-established way of operating and adhered to their old-fashioned business models. The main topic of this contribution is to address the question: to what extent can the availability of Big Data support the innovation attempts of companies and educational institutes?

We start by mentioning five examples of innovation and relate them to data science and big data. We describe basic concepts and developments. We then formulate six possible classes of obstacles and take into account the use of sensitive data. Finally, we arrive at two conclusions and three recommendations.

You have full access to this open access chapter, Download conference paper PDF

The Creation and Disruption of Innovation? Key Developments in Innovation as Concept, Theory, Research and Practice

Big Data in the Innovation Process - A Bibliometric Analysis

Innovation in Times of Big Data and AI: Introducing the Data-Driven Innovation (DDI) Framework

Keywords

1 Introduction

The current technological development is in a state of flux. New applications are followed by even newer ones. Many scientists see this concatenation of applications as a disruptive chain. For a proper understanding we mention five new developments (or applications of well-known concepts): (1) autonomous cars, (2) blockchains, (3) crowd sourced online dispute resolutions, (4) Airbnb, and (5) Uber Taxi. They can be seen as new paradigms that result from the combination of new and old technologies.

The question to what extent big data does support the innovation can be answered by investigating the string of activities that takes place in the study of data science. Data science is a new discipline in the academic world and in particular in the research world. It originates from computer science (informatics) and statistics. Big Data has mitigated the use of samples as is exploited by statisticians. This implies that for a proper handling of the available data no longer the well-known statistical sample techniques are sufficient, but that a new statistical approach has to be developed, i.e., reasoning in a Bayesian network.

In data science we nowadays distinguish seven phases of activities. They are: (1) collecting data, (2) cleaning data, (3) interpreting data, (4) analyzing data, (5) visualization of data, (6) narrative science, and (7) the emergence of new paradigms. The last phase has been instantiated by the five examples at the beginning of this section.

Below we will discuss the following topics and their obstacles: Real big data (Sect. 2); Definitions (Sect. 3); Small data (Sect. 4); Turn around management (Sect. 5); Criminal behavior (Sect. 6); Conclusions (Sect. 7), and Safeguards and recommendations (Sect. 8).

2 Real Big Data

There are only two research areas where we face real big data, viz. particle physics (e.g., at Cern) and astrophysics (e.g., at Lofar). Two other areas where the storage of data is extremely large are DNA-research and Geo-research (in general). For social sciences we see enormous amounts of data stored in Wikileaks, on Dating sites, and in the Panama papers. Here we meet our first class of obstacles. They consist of transport problems, complexity problems, and reputation damage:

Obstacles (Class 1)
Big data	Obstacles
CERN LOFAR	1. Transport problems (accessibility)
DNA GEO	2. Complexity problems (unrelated data)
Wikileaks Dating sites Panama papers	3. Reputation damage (side-effects/non anticipated)

3 Definitions

Big Data is a kind of container concept. Almost all disciplines have their own definition of Big Data. In this contribution we provide two definitions: a technical one and a business one. The technical definition is by White (2012).

Technical Definition: “Big Data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand databases management tools or traditional data processing applications.”

The challenges capture

1.
curation,
2.
storage,
3.
search,
4.
sharing,
5.
transfer,
6.
analysis,
7.
visualization,
8.
interpretation,
9.
real-time (van Eijk 2013).

The second class of obstacles is defined by the characteristics of the collection of Big Data at hand. They are called the five V’s:

Obstacles (Class 2)

The five V’s are:

1.
Volume
2.
Velocity
3.
Variety
4.
Veracity
5.
Value

The business definition of Big Data has been formulated by Phillips (2015).

Business Definition: “Big Data focuses on value creation, viz. to obtain operational benefits from data, and will then exploit the benefits for performance improvements.” Two important capabilities are:

(1)
Companies learn faster on their business trend,
(2)
Companies can rely faster on the new trends than their competitors.

The third class of obstacles is defined by legal requirements and competitive behavior:

Obstacles (Class 3)

The intention to make infringements on the legal requirements with reference to the commercial competition.

Examples come from the auto industry, among them Volkswagen.

4 Small Data

Next to big data there is small data. They constitute techniques which enables effective use of Big Data. A telling example is cookies. Originally, cookies were meant to bring relevant input to places where they are in a better environment, i.e., to be of better use in their “ecosystem”.

Nowadays cookies are used for: personalization, profiling, advertisements, recording of behavior and for “selling purposes”. The fourth class of obstacles is defined by privacy and security issues:

Obstacles (Class 4)

The main obstacles for small data are

Privacy
Security

5 Turn Around Management

Big data can be used for monitoring the life cycle of companies. We assume that the life cycle traverses through the following six stages: (1) Foundation of the company, (2) Establishing a healthy company, (3) Strategic crisis, (4) Earning crisis, (5) Liquidity crisis, and (6) Bankruptcy (cf. Adriaanse 2015). We will see the following development.

A proper recording of data within the company may serve as monitoring system and even as alert system for unexpected developments. The availability of big data will make the prediction very accurate.

The fifth class of obstacles is defined by privacy and ethical considerations as well as by commercial competition and legal opportunities:

Obstacles (Class 5)

The main obstacles for use of big data in Turn around management.

Privacy
Ethical considerations
Commercial competition
Legal opportunities

6 Criminal Behavior

An important case showing criminal behavior is the marathon bombing in Boston. We illustrate this marathon bombing by three pictures. In April 2014 an unexpected bomb exploded at the end of the Boston Marathon (see picture 1). There were sensor images and recorded pictures. The crowd was so big that no criminals involved in the attack could be identified. The police quarantined the city and attempted to find the criminals in their computer databases. They could not identify them. Finally, one brother was killed by police force (Dzjochar Tsjarnajev) in a gunfight and one day later Tamerlan Tsjarnajev was arrested albeit by a coincidence through a blood trace observed by an alert citizen.

If the bomb had been inspected accurately, it could have been established that its origin was Chechnya or a neighbor area, which would have refined the number of candidate criminals considerably. Once the criminals were caught, the recorded images afterwards produced evidence for what has happened and where the guilty persons stayed during the explosion.

To investigate such criminal behavior with modern technological means we see that a combination of Big Data, High Performance Computing, and Deep Learning is most effective. These three components form the basis of narrative science. It means that an intelligent program constructs the story and points to the criminals. This approach is also known under the name story telling. For lawyers it is called argumentation theory. The sixth class of obstacles is defined by privacy, public safety, technicalities of narrative science, and sensitive data:

Obstacles (Class 6)

Privacy
Public safety
Narrative science
Sensitive data

The investigations of criminal behavior have an important obstacle class 6 to take into account, viz. sensitive data (cf. Van Eijk 2015)

Sensitive Data

Obviously, some data are really personal, i.e., data revealing

(1)
racial or ethnic origin,
(2)
political opinions,
(3)
religious or philosophical beliefs,
(4)
trade-union membership, and
(5)
the processing of data concerning health or sex life

The general rule for companies is that the processing of sensitive data is prohibited without explicit consent

(Directive 95/46/EC, article 8).

The “scientific” development of “disruptive” technology, the Road to Deep Learning, is:

The road to deep learning
Artificial intelligence	1950–1990
Machine learning	1990–2000
Adaptivity	2000–2005
Dimension reduction	2005–2010
Deep learning	2010–2015
Big data and HPC	2012–2017
New statistics	2014–2019

7 Conclusions

From the above line of reasoning on the obstacles that occur when Big Data research is involved we may conclude the following:

Conclusion 1: For a legitimate commercial interest, a legal foundation is a necessary requirement. Such a requirement should constitute a careful balance between

the legitimate commercial interest,
the fundamental rights and liberties of the persons who possess the data (or to whom the data can be attributed). Keep here in mind the sensitive data.

Conclusion 2: For all who are working in the area of Technological Innovation and Social Innovation the Extrapolation as shown in below is relevant (cf. Van den Herik and Dimov 2011a, b).

8 Safeguards and Recommendations

To keep the delicate balance mentioned above, two safeguards are possible.

To diminish attention and research efforts for Big Data and Deep Learning.
To increase attention and research efforts for Big Data and Deep Learning.

From these two safeguards our preference goes to the second safeguard, provided that we are allowed to introduce three specific safeguards.

These are our recommendations.

Increase research on AI systems for Big Data and Deep Learning with emphasis on moral constraints.
Increase research on AI systems for Big Data and Deep Learning with emphasis on the prevention of AI systems to be hacked.
Establish (a) a committee of Data Authorities and (b) an ethical committee.

References

Adriaanse, J.: Ethical challenges in turnaround management. Lecture at the Workshop Leadership Challenges with Big Data. Turning Data into Business, Erasmus University Rotterdam, 30 June 2015
Google Scholar
Phillips, J.: Pass summit 2015 – Microsoft foundation session – Business Intelligence (2015)
Google Scholar
Van den Herik, H.J., Dimov, D.: Towards crowdsourced online dispute resolution. In: Rohrmann, C.A., Sampaion Junior, R.B. (eds.) Revista da Faculdade de Direito Milton Campos, vol. 22, pp. 141–162. Del Rey, Belo Horizonte (2011a). ISSN 1415-0778
Google Scholar
Van den Herik, H.J., Dimov, D.: Towards crowdsourced online dispute resolution. In: Kierkegaard, S.M., Kierkegaard, P. (eds.) Law Across Nations: Governance, Policy & Statutes, pp. 244–256. International Association of IT lawyers (IAITL), Denmark (2011b)
Google Scholar
Van Eijk, R.: Towards a delicate balance in strategic innovations – update on privacy legislation. Lecture at the Workshop Leadership Challenges with Big Data. Turning Data into Business, Erasmus University Rotterdam, 30 June 2015
Google Scholar
White, T.: Hadoop: The Definitive Guide. Storage and Analysis at Internet Scale, 3rd edn., p. 688. O’Reilly Media/Yahoo Press, Sebastopol (2012)
Google Scholar
Van Eijk, R.: Social innovation & Big Data. [Presentation]. Amersfoort: IT Innovation Day (2013)
Google Scholar

Download references

Acknowledgements

This insight overview is the result of my cooperation with many colleagues. The following colleagues are singled out for their contribution to this article. I would like to thank Rob van Eijk, Daniel Dimov, Joost Kok, Aske Plaat, and Joke Hellemons. Moreover, I am grateful to Wladyslaw Homenda from Vilnius for the stimulating discussions we have had and for the invitation to communicate on Innovation.

Author information

Authors and Affiliations

Leiden Centre of Data Science, Leiden University, Leiden, The Netherlands
H. Jaap van den Herik

Authors

H. Jaap van den Herik
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to H. Jaap van den Herik .

Editor information

Editors and Affiliations

Bialystok University of Technology , Bialystok, Poland
Khalid Saeed
University of Bialystok , Vilnius, Lithuania
Władysław Homenda

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License (http://creativecommons.org/licenses/by-nc/2.5/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Reprints and permissions

Copyright information

About this paper

Cite this paper

van den Herik, H.J. (2016). Innovation and Big Data. In: Saeed, K., Homenda, W. (eds) Computer Information Systems and Industrial Management. CISIM 2016. Lecture Notes in Computer Science(), vol 9842. Springer, Cham. https://doi.org/10.1007/978-3-319-45378-1_3

Download citation

DOI: https://doi.org/10.1007/978-3-319-45378-1_3
Published: 09 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45377-4
Online ISBN: 978-3-319-45378-1
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Societies and partnerships

The International Federation for Information Processing (opens in a new tab)

Innovation and Big Data

Abstract

Similar content being viewed by others

The Creation and Disruption of Innovation? Key Developments in Innovation as Concept, Theory, Research and Practice

Big Data in the Innovation Process - A Bibliometric Analysis

Innovation in Times of Big Data and AI: Introducing the Data-Driven Innovation (DDI) Framework

Keywords

1 Introduction

2 Real Big Data

3 Definitions

4 Small Data

5 Turn Around Management

6 Criminal Behavior

7 Conclusions

8 Safeguards and Recommendations

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Societies and partnerships

Navigation

Innovation and Big Data

Abstract

Similar content being viewed by others

The Creation and Disruption of Innovation? Key Developments in Innovation as Concept, Theory, Research and Practice

Big Data in the Innovation Process - A Bibliometric Analysis

Innovation in Times of Big Data and AI: Introducing the Data-Driven Innovation (DDI) Framework

Keywords

1 Introduction

2 Real Big Data

3 Definitions

4 Small Data

5 Turn Around Management

6 Criminal Behavior

7 Conclusions

8 Safeguards and Recommendations

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Societies and partnerships

Search

Navigation