Big Data: Opportunities, Challenges and Solutions

Gorodetsky, Vladimir

doi:10.1007/978-3-319-13206-8_1

Vladimir Gorodetsky

Part of the book series: Communications in Computer and Information Science (CCIS, volume 469)

Included in the following conference series:

International Conference on Information and Communication Technologies in Education, Research, and Industrial Applications


Abstract

Problems related to the Big Data phenomenon are currently among the top 10 hottest topics in information and communication technology. The Big Data phenomenon refers to the data explosion observed today. At present, the term is widely used by researchers and practitioners in many application domains. Big Data analysis can provide many new opportunities, in many respects motivating and stimulating the industrial and commercial take-up of novel emerging technologies. An in-depth analysis of publications on Big Data processing and analytics shows that most of them discuss “new opportunities” and “new challenges”. However, very few papers present solutions for predictive analytics that go beyond the limits of OLAP-like processing models and technologies. The goal of this paper is to outline in more detail not only the nature of the opportunities and particular challenges but also some original solutions that address them.

Notes

1. Hereinafter, Big Data refers to the corresponding problem domain, whereas big data refers to big data samples.
2. This application is used only to explain the essence of the big data analysis tasks that arise in social networks.
3. Unfortunately, many industrial practitioners still believe that unlimited computing resources can cope with any big data-related problem.
4. See [6] for impressive graphical illustrations of the error accumulation and spurious correlation effects.
5. At this step, the testing procedure has to be applied only to the data subset assigned the label \( \bar{\omega}_{k} \).
6. In practice, causal analysis based on a Bayesian network model can be used for data dimensionalities of no more than 20.

References

Aliferis, C.F., Statnikov, A., Tsamardinos, I., Mani, S., Koutsoukos, X.D.: Local causal and Markov blanket induction for causal discovery and feature selection for classification. Part I: Algorithms and empirical evaluation. J. Mach. Learn. Res. 11, 171–234 (2010)
Bedini, I., Nguyen, B.: Automatic Ontology Generation: State of the Art. http://bivan.free.fr/Janus/Docs/Automatic_Ontology_Generation_State_of_Art.pdf
Big Data: A New World of Opportunities. NESSI White Paper, December 2012. http://www.nessi-europe.com/Files/Private/NESSI_WhitePaper_BigData.pdf
Bizer, C., Heath, T., Berners-Lee, T.: Linked data – the story so far. Int. J. Semant. Web Inf. Syst. 5(3), 1–22 (2009)
Condorcet’s Theorem. http://en.wikipedia.org/wiki/Condorcet’s_jury_theorem
Fan, J., Han, F., Liu, H.: Challenges of Big Data Analysis. Princeton University, Johns Hopkins University (2013). http://arxiv.org/pdf/1308.1479.pdf
Fan, J., Guo, S., Hao, N.: Variance estimation using refitted cross-validation in ultrahigh dimensional regression. J. Roy. Stat. Soc. Ser. B (Stat. Methodol.) 74(1), 37–65 (2012)
Fan, J., Fan, Y.: High dimensional classification using features annealed independence rules. Ann. Stat. 36(6), 2605–2637 (2008)
Gorodetsky, V., Samoylov, V., Tushkanova, O.: Agent-based customer profile learning in 3G recommending systems. In: Proceedings of the 9th International Workshop on Agent and Data Mining Interaction (ADMI-2014), associated with the International Conference on Autonomous Agents and Multi-agent Systems (AAMAS-2014), Paris (2014)
Gorodetsky, V., Samoylov, V., Serebryakov, S.: Context–Driven Data and Information Fusion. In: Proceedings of International Conference on Information Fusion (Fusion 2012), pp. 1830–1837, Singapore (2012)
Gorodetsky, V., Samoylov, V., Serebryakov, S.: Ontology–based context–dependent personalization technology. In: Proceedings of WI/IAT/ACM International Conference, Associated Workshop “Web Personalization and Recommender Systems”, Toronto (2010)
Gorodetsky, V., Serebryakov, S.: Methods and algorithms of collective recognition. Autom. Remote Control. 69(11), 1821–1851 (2008)
Hall, P., Pittelkow, Y., Ghosh, M.: Theoretical measures of relative performance of classifiers for high dimensional data with small sample sizes. J. Roy. Stat. Soc. Ser. B (Stat. Methodol.) 70(1), 159–173 (2008)
IBM Big Data Success Stories. http://public.dhe.ibm.com/software/data/sw-library/big-data/ibm-big-data-success.pdf
InfoSphere BigInsights Enterprise Edition. http://www-03.ibm.com/software/products/ru/infobigienteedit/
IBM Business Analytics for Big Data – Overview. http://www-01.ibm.com/software/analytics/solutions/big-data/
InfoSphere Streams Technical Overview – Use Cases Big Data. http://www.slideshare.net/IBMInfoSphereUGFR/infosphere-streams-technical-overview-use-cases-big-data-jerome-chailloux
Kuncheva, L., Whitaker, C.: Measures of diversity in classifier ensembles. Mach. Learn. 51, 181–207 (2003)
Li, J., Le, T.D., Liu, L., Liu, J., Jin, Z., Sun, B.: Mining causal association rules. In: Proceedings of International ICDM-2013 Workshop on Causal Discovery, Dallas, USA (2013)
NineSigma REQUEST #69987. https://www.ninesights.com/docs/DOC-8380
Pearl, J.: Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann, San Francisco (1991)
Pearl, J., Verma, T.S.: A theory of inferred causation. In: Proceedings of the Second International Conference on the Principles of Knowledge Representation and Reasoning, pp. 441–452 (1991)
Silverstein, C., Brin, S., Motwani, R.: Scalable techniques for mining causal structures. In: Proceedings of 24th VLDB Conference, New York, USA, pp. 594–605 (1998)
Skormin, V.A., Gorodetski, V.I., Popyack, L.J.: Data mining technology for failure prognostic of avionics. IEEE Trans. Aerosp. Electron. Syst. 38(2), 388–403 (2002)

Acknowledgment

This research is supported by Project No. 1.12 of the Research Program “Information Technologies and Methods for Complex System Analysis”, supervised by the Nano- and Information Technology Branch of the Russian Academy of Sciences.

Author information

Authors and Affiliations

St. Petersburg Institute for Informatics and Automation, Russian Academy of Sciences, 39, 14-th Liniya, 199178, St. Petersburg, Russia
Vladimir Gorodetsky

Corresponding author

Correspondence to Vladimir Gorodetsky.

Editor information

Editors and Affiliations

Zaporizhzhya National University, Zaporizhzhya, Ukraine
Vadim Ermolayev
Institute of Applied Informatics, Alpen-Adria-Universität Klagenfurt, Klagenfurt, Austria
Heinrich C. Mayr
Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
Mykola Nikitchenko
Kherson State University, Kherson, Ukraine
Aleksander Spivakovsky
V.N. Karazin Kharkiv National University, Kharkov, Ukraine
Grygoriy Zholtkevych

About this paper

Cite this paper

Gorodetsky, V. (2014). Big Data: Opportunities, Challenges and Solutions. In: Ermolayev, V., Mayr, H., Nikitchenko, M., Spivakovsky, A., Zholtkevych, G. (eds) Information and Communication Technologies in Education, Research, and Industrial Applications. ICTERI 2014. Communications in Computer and Information Science, vol 469. Springer, Cham. https://doi.org/10.1007/978-3-319-13206-8_1

DOI: https://doi.org/10.1007/978-3-319-13206-8_1
Published: 28 November 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13205-1
Online ISBN: 978-3-319-13206-8
eBook Packages: Computer Science, Computer Science (R0)
