Exploring Data Analytics of Data Variety

  • Tiago Cruz
  • Jorge Oliveira e Sá
  • José Luís Pereira
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 746)

Abstract

The Internet gives managers of organizations access to large amounts of data, and these data come in different formats: structured, semi-structured and unstructured. This is known as data variety. On the Internet, data variety derives partly from social networks, but not only from them: machines can also share information among themselves and with people. The objective of this paper is to understand how to retrieve information from the analysis of data with variety. An experiment was carried out on a dataset with two distinct data types: images of cars and comments on cars. Data analysis techniques were applied, namely Natural Language Processing to identify patterns, and Sentimental and Emotional Analysis. An image recognition technique was used to associate each car model with a category. OLAP cubes were then created and visualized through dashboards. The paper concludes that it is possible to extract relevant information, such as which cars people like more or less.
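
The pipeline outlined in the abstract (score comments, attach a category to each car model, then aggregate) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the word lists, the sample comments, the model names and the model-to-category mapping are all hypothetical, and the simple word-count scorer merely stands in for the sentiment analysis step, just as the fixed mapping stands in for the image recognition output.

```python
from collections import defaultdict

# Hypothetical toy lexicon; the paper does not publish its sentiment resources.
POSITIVE = {"love", "great", "comfortable", "reliable"}
NEGATIVE = {"hate", "noisy", "expensive", "unreliable"}


def sentiment_score(comment: str) -> int:
    """Return the number of positive words minus the number of negative words."""
    words = [w.strip(".,!?").lower() for w in comment.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def roll_up(records, model_to_category):
    """Average sentiment per (category, model), in the spirit of an OLAP roll-up."""
    totals = defaultdict(lambda: [0, 0])  # key -> [score sum, comment count]
    for model, comment in records:
        key = (model_to_category[model], model)
        totals[key][0] += sentiment_score(comment)
        totals[key][1] += 1
    return {key: score_sum / count for key, (score_sum, count) in totals.items()}


# Hypothetical sample data; the categories stand in for image recognition output.
MODEL_TO_CATEGORY = {"Model A": "SUV", "Model B": "Sedan"}
RECORDS = [
    ("Model A", "I love this car, very comfortable!"),
    ("Model A", "Great but expensive."),
    ("Model B", "Noisy and unreliable."),
]

cube = roll_up(RECORDS, MODEL_TO_CATEGORY)
for (category, model), average in sorted(cube.items()):
    print(category, model, average)
```

Keying the aggregate on a (category, model) pair keeps both dimensions available, so a dashboard could slice by category alone or drill down to individual models.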

Keywords

Sentimental and emotional analysis, Machine learning, Data analysis techniques

Notes

Acknowledgments

This work has been supported by COMPETE: POCI-01-0145-FEDER-007043 and FCT - Fundação para a Ciência e Tecnologia within the Project Scope: UID/CEC/00319/2013.

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Tiago Cruz (1)
  • Jorge Oliveira e Sá (1)
  • José Luís Pereira (1)
  1. ALGORITMI Research Center, University of Minho, Guimarães, Portugal
