Abstract
A table’s data is stored in one or more regions, depending on how many rows the table holds and how many rows are stored in a region. A region is a sorted set consisting of a range of adjacent rows stored together. RegionServers manage the data stored in regions. When HBase starts, the Master assigns regions to RegionServers and, if required for load balancing, later reassigns regions across the RegionServers. As discussed in Chapter 9, when a region becomes too large it splits into two approximately equal halves; this is called auto-sharding. As data grows, regions can be split automatically or manually. A RegionServer does not compact and split a region in parallel. As a result of splitting, a table’s row keys are not all stored in the same region; they are distributed across the cluster, in different regions on different RegionServers.
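The splitting behaviour described in the abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions, not HBase code: the `Region` class, the `put_with_auto_split` helper, and the `max_keys` threshold are hypothetical names invented for illustration, and a simple key count stands in for whatever size limit a real cluster is configured with.

```python
from bisect import insort


class Region:
    """Toy stand-in for a region: a sorted run of adjacent row keys."""

    def __init__(self, keys=None):
        self.keys = sorted(keys or [])

    def put(self, key):
        # Keep row keys sorted within the region.
        insort(self.keys, key)

    def split(self):
        # Split into two approximately equal halves at the middle key.
        mid = len(self.keys) // 2
        return Region(self.keys[:mid]), Region(self.keys[mid:])


def put_with_auto_split(regions, key, max_keys):
    """Insert a key into the region covering it; split that region if it grows too large.

    `regions` is a list ordered by start key. `max_keys` is a hypothetical
    threshold used only in this sketch.
    """
    # Pick the last region whose first key is <= key (default: the first region).
    idx = 0
    for i, region in enumerate(regions):
        if region.keys and region.keys[0] <= key:
            idx = i
    regions[idx].put(key)
    if len(regions[idx].keys) > max_keys:
        left, right = regions[idx].split()
        regions[idx:idx + 1] = [left, right]
    return regions


# A table starts with one region and is split as rows are added.
regions = [Region()]
for row_key in ["row05", "row01", "row09", "row03", "row07", "row02"]:
    put_with_auto_split(regions, row_key, max_keys=4)

ranges = [region.keys for region in regions]
```

After the six inserts the single starting region has been split once, so the table's row keys end up in two regions, each holding a contiguous sorted range. In a real cluster those regions could then be assigned to different RegionServers by the Master.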
Copyright information
© 2016 Deepak Vohra
About this chapter
Cite this chapter
Vohra, D. (2016). Region Splitting. In: Apache HBase Primer. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-2424-3_14
DOI: https://doi.org/10.1007/978-1-4842-2424-3_14
Published:
Publisher Name: Apress, Berkeley, CA
Print ISBN: 978-1-4842-2423-6
Online ISBN: 978-1-4842-2424-3
eBook Packages: Professional and Applied Computing, Apress Access Books, Professional and Applied Computing (R0)