Abstract
The Web is a vast source of semi-structured datasets that are readily available to support the construction of new knowledge. Information visualization techniques have proven suitable for helping users analyze and understand large amounts of data. However, visualizing semi-structured data obtained from the Web is not straightforward: the data require proper treatment before information visualization techniques can be applied. In this work, we present a visualization pipeline that describes the fundamental operations required for visualizing semi-structured data over the Web. We employ Web Scraping and Web Augmentation techniques to support interactive visualizations and to solve tasks without changing the context of use of the data. Our approach is supported by a framework that includes scraping, augmenting, and visualization tools, and it has been applied to different kinds of websites to demonstrate its validity and feasibility. Our ultimate goal is to expand the limits of our technology to improve user interaction with websites and to create new experiences for a better understanding of large datasets.
Notes
- 6. https://www.alexa.com/topsites/countries/AR Dec. 18th, 2018 at 22:00 h UTC-3.
- 8. AlVis prototype is publicly available https://github.com/gbosetti/alvis.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Bosetti, G., Firmenich, S., Winckler, M., Rossi, G., Fandos, U.C., Egyed-Zsigmond, E. (2019). An End-User Pipeline for Scraping and Visualizing Semi-Structured Data over the Web. In: Bakaev, M., Frasincar, F., Ko, IY. (eds) Web Engineering. ICWE 2019. Lecture Notes in Computer Science, vol 11496. Springer, Cham. https://doi.org/10.1007/978-3-030-19274-7_17
DOI: https://doi.org/10.1007/978-3-030-19274-7_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-19273-0
Online ISBN: 978-3-030-19274-7
eBook Packages: Computer Science, Computer Science (R0)