Networking Big Data: Definition, Key Technologies and Challenging Issues of Transmission

Hou, Weigang; Guo, Pengxing; Guo, Lei

doi:10.1007/978-3-319-22047-5_9

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9196)

Included in the following conference series:

International Conference on Big Data Computing and Communications

Abstract

Big data has been touted as the new oil and is expected to transform our society. In particular, data from the networking domain (networking big data) has higher volume, velocity, and variety than data from other sources. In this article, we therefore give a short survey of existing work on the key technologies of networking big data and identify challenging issues in transmission, the most important stage for networking big data.

Author information

Authors and Affiliations

College of Information Science and Engineering, Northeastern University, Shenyang, 110819, China
Weigang Hou, Pengxing Guo & Lei Guo
State Key Laboratory of Networking and Switching Technology, Beijing, 100876, China
Weigang Hou
State Key Laboratory of Information Photonics and Optical Communications, Beijing, 100876, China
Weigang Hou

Corresponding author

Correspondence to Weigang Hou.

Editor information

Editors and Affiliations

University of North Carolina at Charlotte, Charlotte, North Carolina, USA
Yu Wang
Rutgers Business School, Newark, New Jersey, USA
Hui Xiong
Illinois Institute of Technology, Chicago, Illinois, USA
Shlomo Argamon
Illinois Institute of Technology, Chicago, Illinois, USA
XiangYang Li
Harbin Institute of Technology, Harbin, China
JianZhong Li

About this paper

Cite this paper

Hou, W., Guo, P., Guo, L. (2015). Networking Big Data: Definition, Key Technologies and Challenging Issues of Transmission. In: Wang, Y., Xiong, H., Argamon, S., Li, X., Li, J. (eds) Big Data Computing and Communications. BigCom 2015. Lecture Notes in Computer Science, vol 9196. Springer, Cham. https://doi.org/10.1007/978-3-319-22047-5_9

DOI: https://doi.org/10.1007/978-3-319-22047-5_9
Published: 24 July 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22046-8
Online ISBN: 978-3-319-22047-5
eBook Packages: Computer Science, Computer Science (R0)
