Enriching Digital Libraries with Crowdsensed Data

Twitter Monitor and the SoBigData Ecosystem

  • Conference paper
Digital Libraries: Supporting Open Science (IRCDL 2019)

Abstract

SoBigData is a Research Infrastructure (RI) aiming to provide an integrated ecosystem for ethic-sensitive scientific discoveries and advanced applications of social data mining. A key milestone of the project focuses on data, methods and results sharing, in order to ensure the reproducibility, review and re-use of scientific works. For this reason, the Digital Library paradigm is implemented within the RI, providing users with virtual environments where datasets, methods and results can be collected, maintained, managed and preserved, granting full documentation, access and the possibility to re-use.

In this paper, we describe the results of our effort to integrate the Twitter Monitor, a tool for gathering messages from the Twitter Online Social Network, into the SoBigData RI. The Twitter Monitor provides a simple user interface that enables researchers and stakeholders without programming skills to seamlessly (i) select relevant messages out of the huge Twitter stream by means of language, keyword, user tracking and geographical filters, (ii) store the data in the user's personal Workspace, and (iii) publish them in the SoBigData Resource Catalogue, which implements all the aforementioned Digital Library features.
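To make the four filter types concrete, the following is a minimal sketch of how language, keyword, user tracking and geographical criteria could be applied to a single tweet record. It is not the Twitter Monitor's implementation: the `StreamFilter` class and its field names are hypothetical, the tweet is assumed to be a dictionary with `lang`, `text`, `user.id_str` and GeoJSON-style `coordinates` fields, and the way the criteria are combined (language as a restriction, the other three as alternatives) is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StreamFilter:
    """Hypothetical filter specification; names are illustrative only."""
    languages: set = field(default_factory=set)   # e.g. {"it", "en"}
    keywords: set = field(default_factory=set)    # lowercase terms
    user_ids: set = field(default_factory=set)    # author ids as strings
    bbox: Optional[tuple] = None                  # (west, south, east, north)

    def matches(self, tweet: dict) -> bool:
        # Assumption: language restricts the result of the other criteria.
        if self.languages and tweet.get("lang") not in self.languages:
            return False
        criteria = []
        if self.keywords:
            text = tweet.get("text", "").lower()
            criteria.append(any(k in text for k in self.keywords))
        if self.user_ids:
            author = (tweet.get("user") or {}).get("id_str")
            criteria.append(author in self.user_ids)
        if self.bbox:
            criteria.append(self._in_bbox(tweet))
        # Assumption: keyword, user and geo criteria are alternatives (OR).
        # With none of them set, the language check alone decides.
        return any(criteria) if criteria else True

    def _in_bbox(self, tweet: dict) -> bool:
        # Tweets without an exact location carry no usable coordinates.
        point = (tweet.get("coordinates") or {}).get("coordinates")
        if not point:
            return False
        lon, lat = point
        west, south, east, north = self.bbox
        return west <= lon <= east and south <= lat <= north
```

For instance, `StreamFilter(languages={"it"}, keywords={"terremoto"})` would keep an Italian tweet mentioning "terremoto" and discard an English one with the same text.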

Thanks to the seamless integration in the SoBigData RI, the Twitter Monitor allows researchers and stakeholders, belonging to different areas and having different backgrounds, to exploit the crowdsensing paradigm for enriching the SoBigData Digital Library. In this way, crowdsensing acquires the key features of openness, accessibility, interoperability and interdisciplinarity that characterize the Digital Libraries framework.

Notes

  1. http://www.sobigdata.eu.

  2. https://twitter.com/.

  3. https://developer.twitter.com/en/docs.html.

  4. https://developer.twitter.com/en/docs/basics/rate-limiting.html.

  5. The documentation for all the platform libraries, functions and methods mentioned in this subsection can be found at https://gcube.wiki.gcube-system.org/gcube/GCube_Documentation.

  6. http://www.json.org/.

  7. https://oauth.net/.

  8. https://developer.twitter.com/en/docs/basics/response-codes.html.

  9. https://zenodo.org/.

  10. https://figshare.com/.

  11. https://sobigdata.d4science.org/group/sobigdatalab/method-engine.

  12. https://sobigdata.d4science.org/group/sobigdata-gateway/workspace.

  13. https://apps.twitter.com/.

Acknowledgements

This research is supported in part by the EU H2020 Program under the scheme INFRAIA-1-2014-2015: Research Infrastructures, grant agreement #654024 SoBigData: Social Mining & Big Data Ecosystem.

Author information

Corresponding author

Correspondence to Serena Tardelli.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Cresci, S., Minutoli, S., Nizzoli, L., Tardelli, S., Tesconi, M. (2019). Enriching Digital Libraries with Crowdsensed Data. In: Manghi, P., Candela, L., Silvello, G. (eds) Digital Libraries: Supporting Open Science. IRCDL 2019. Communications in Computer and Information Science, vol 988. Springer, Cham. https://doi.org/10.1007/978-3-030-11226-4_12

  • DOI: https://doi.org/10.1007/978-3-030-11226-4_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-11225-7

  • Online ISBN: 978-3-030-11226-4

  • eBook Packages: Computer Science, Computer Science (R0)
