Abstract
Data cleaning is an important step in data acquisition, storage, management, and analytics, and it is still undergoing rapid development. In the Big data era it is regarded as an especially challenging task, because the volume and variety of data are growing exponentially in most applications. This paper focuses on providing an accurate data extraction system built on the two aspects of data cleaning, i.e., error detection methods and data repairing algorithms. To achieve accurate data extraction and improve data quality, the paper proposes a hybrid of Cuckoo Search Optimization and the Gravitational Search Algorithm (CSO-GSA), which detects errors in the data received from the source file and repairs the data before delivering it. Experiments on the MATLAB platform show that the proposed approach reduces the time needed for error detection and correction in huge data sets while keeping error detection accuracy acceptable.
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Rama Satish, K.V., Kavya, N.P. (2018). Hybrid Optimization in Big Data: Error Detection and Data Repairing by Big Data Cleaning Using CSO-GSA. In: Nagabhushan, T., Aradhya, V.N.M., Jagadeesh, P., Shukla, S., M.L., C. (eds) Cognitive Computing and Information Processing. CCIP 2017. Communications in Computer and Information Science, vol 801. Springer, Singapore. https://doi.org/10.1007/978-981-10-9059-2_24
DOI: https://doi.org/10.1007/978-981-10-9059-2_24
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-9058-5
Online ISBN: 978-981-10-9059-2
eBook Packages: Computer Science (R0)