Hybrid Optimization in Big Data: Error Detection and Data Repairing by Big Data Cleaning Using CSO-GSA
Data cleaning is an important step in data acquisition, data storage, data management and data analytics, and it is still undergoing rapid development. In the Big data era, cleaning data is considered an especially challenging task because of the exponential growth in the volume and variety of data in most applications. This paper aims to provide an accurate data extraction system that covers two aspects of data cleaning, namely error detection methods and data repairing algorithms. To achieve accurate data extraction and improve data quality, the paper proposes a hybrid of Cuckoo Search Optimization and the Gravitational Search Algorithm (CSO-GSA), which detects errors in the data received from the source file and repairs the data before delivering it. Experiments on the MATLAB platform show that the proposed approach reduces the time needed for error detection and correction in huge data sets while retaining acceptable error detection accuracy.
Keywords: Big data; Data cleaning; Data repairing; Error detection; Data quality; Hybrid Cuckoo Search Optimization; Gravitational Search Algorithm