NOSQL Databases

Data-intensive Systems

Part of the book series: Advanced Information and Knowledge Processing (BRIEFSAIKP)

Abstract

In the chapters so far, you have relied on HDFS as your storage medium. It has two major advantages for the type of processing we wanted to do: it excels at storing large files, and it enables distributed processing of these files with the help of MapReduce. HDFS is most efficient for tasks that require a pass through all the data in a file (or a set of files). If you only need to access a single element in a dataset (an operation sometimes called a point query) or a contiguous range of elements (sometimes called a range query), HDFS does not provide an efficient toolkit for the task. You are forced to scan all the elements to pick out the ones you are interested in.
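
The difference between scanning and keyed access can be illustrated with a minimal in-memory sketch. It does not use HDFS or any NoSQL database; the record layout and the function names (`scan_point`, `scan_range`, `indexed_point`, `indexed_range`) are assumptions made purely for illustration. The scan versions touch every record, while the indexed versions rely on the keys being kept in sorted order, which is roughly the kind of structure a key-ordered store can exploit.

```python
from bisect import bisect_left, bisect_right

# Hypothetical dataset: (key, value) records, kept sorted by key.
records = [(k, f"value-{k}") for k in range(0, 1000, 5)]
keys = [k for k, _ in records]  # sorted key column used as a simple index


def scan_point(records, key):
    """Point query by full scan: every record is inspected."""
    return [v for k, v in records if k == key]


def scan_range(records, lo, hi):
    """Range query by full scan: every record is inspected."""
    return [v for k, v in records if lo <= k <= hi]


def indexed_point(records, keys, key):
    """Point query by binary search on the sorted keys."""
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return [records[i][1]]
    return []


def indexed_range(records, keys, lo, hi):
    """Range query: binary search for both ends, then read one slice."""
    start = bisect_left(keys, lo)
    end = bisect_right(keys, hi)
    return [v for _, v in records[start:end]]
```

Both approaches return the same results; the scan cost grows with the size of the whole dataset, whereas the indexed lookups need only a logarithmic number of key comparisons plus the records actually returned.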

Notes

  1. https://cwiki.apache.org/confluence/display/Hive/RCFile

  2. https://parquet.apache.org

Author information

Corresponding author

Correspondence to Tomasz Wiktorski.

Copyright information

© 2019 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Wiktorski, T. (2019). NOSQL Databases. In: Data-intensive Systems. Advanced Information and Knowledge Processing. Springer, Cham. https://doi.org/10.1007/978-3-030-04603-3_8

  • DOI: https://doi.org/10.1007/978-3-030-04603-3_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04602-6

  • Online ISBN: 978-3-030-04603-3

  • eBook Packages: Computer Science (R0)
