Smartphone Information Extraction and Integration from Web

  • Supranee Khamsom
  • Wachirawut Thamviset
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1149)


We present a solution to a problem in data integration, the process of consolidating similar information from multiple sources into a unified form. The same concept may be stored under different names in two different databases, even though both names are consistent and meaningful for that concept. This naming conflict must be resolved to keep the integrated data consistent and to reduce data errors. We extracted the specifications of mobile phones and smartphones from several websites and created JSON middleware that maps synonymous specification names to a single standardized term. Schema matching plays an important role in combining different sources of information: it finds meaningful correspondences between the components of two schemas. The matched data are then integrated into a new database that covers more mobile phones and smartphones while reducing the duplication present in the original databases obtained by website data extraction. We applied the proposed method to a mobile phone data integration problem involving two languages, Thai and English, and the results demonstrate its efficiency in actual use.
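
The synonym mapping and duplicate-reducing integration described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the JSON layout, the field names (`brand`, `model`, `display_size`), the Thai and English variants, the sample records, and the choice of (brand, model) as the duplicate key are all assumptions made for illustration.

```python
import json

# Hypothetical synonym map: each standard field lists the names that
# different source websites might use for it (English and Thai variants).
SYNONYMS_JSON = """
{
  "brand": ["brand", "manufacturer", "ยี่ห้อ"],
  "model": ["model", "model name", "รุ่น"],
  "display_size": ["display size", "screen size", "ขนาดหน้าจอ"]
}
"""


def build_lookup(synonyms):
    """Invert {standard: [variants]} into {normalized variant: standard}."""
    lookup = {}
    for standard, variants in synonyms.items():
        for variant in variants:
            lookup[variant.strip().lower()] = standard
    return lookup


def standardize(record, lookup):
    """Rename known fields to their standard names; keep unknown fields as-is."""
    return {lookup.get(k.strip().lower(), k): v for k, v in record.items()}


def integrate(sources, lookup):
    """Merge records from all sources, keyed by (brand, model) to avoid duplicates."""
    merged = {}
    for records in sources:
        for record in records:
            std = standardize(record, lookup)
            key = (
                str(std.get("brand", "")).strip().lower(),
                str(std.get("model", "")).strip().lower(),
            )
            # Earlier values win; later sources only fill in missing fields.
            entry = merged.setdefault(key, {})
            for field, value in std.items():
                entry.setdefault(field, value)
    return list(merged.values())


lookup = build_lookup(json.loads(SYNONYMS_JSON))

# Made-up records standing in for data extracted from two websites.
site_a = [{"Manufacturer": "Acme", "Model Name": "X1", "Screen Size": "6.1 in"}]
site_b = [
    {"ยี่ห้อ": "Acme", "รุ่น": "X1", "ram": "8 GB"},
    {"ยี่ห้อ": "Acme", "รุ่น": "X2"},
]

phones = integrate([site_a, site_b], lookup)
```

Here the two descriptions of the hypothetical "Acme X1" collapse into one record that carries fields from both sites, while "X2" is added as a new phone. A real system would likely need a more robust matching key than an exact brand and model comparison.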


Data integration, Middleware, Schema matching



This research was partially supported by the Department of Computer Science, Faculty of Science, Khon Kaen University, Khon Kaen, Thailand.



Copyright information

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Department of Computer Science, Faculty of Science, Khon Kaen University, Khon Kaen, Thailand
