Abstract
Big data has been touted as the new oil and is expected to transform our society. In particular, data sourced from the networking domain (networking big data) has higher volume, velocity, and variety than data from other domains. In this article, we therefore give a short survey of existing work on the key technologies of networking big data, and we identify challenging issues in transmission, which is the most important stage for networking big data.
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Hou, W., Guo, P., Guo, L. (2015). Networking Big Data: Definition, Key Technologies and Challenging Issues of Transmission. In: Wang, Y., Xiong, H., Argamon, S., Li, X., Li, J. (eds) Big Data Computing and Communications. BigCom 2015. Lecture Notes in Computer Science, vol 9196. Springer, Cham. https://doi.org/10.1007/978-3-319-22047-5_9
DOI: https://doi.org/10.1007/978-3-319-22047-5_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22046-8
Online ISBN: 978-3-319-22047-5
eBook Packages: Computer Science, Computer Science (R0)