A Scalable, Continuously Available Database System Based on Commodity Hardware
Relational database management systems (RDBMSs) face several challenges. First, the scalability of an RDBMS depends exclusively on a single large, highly reliable server, which is disproportionately expensive. Second, a mutation of a few bytes often causes both a read and a write of a few kilobytes of data (a page or block), a problem known as write amplification, which severely curbs write transaction performance. Third, because reads and writes share the same storage infrastructure, it is difficult for an RDBMS to take advantage of modern commodity flash-based SSDs, which offer excellent random read performance but relatively poor random write performance. In this talk, we present OceanBase, an open source (https://github.com/alibaba/oceanbase) database system. OceanBase is a distributed system based on commodity hardware. It has many features of a traditional RDBMS, e.g., ACID transactions, key SQL features, and a SQL interface, as well as those of a distributed system, e.g., scalability and continuous availability. It also has a separated read and write architecture, which removes the write amplification described above and is very friendly to commodity SSDs because it eliminates random disk writes. It makes the widely used practice of database sharding obsolete. There are dozens of OceanBase instances in the production system of Alibaba, and together they serve tens of billions of read and write transactions every day. The first instance has been in service for more than 2 years and consists of 80 commodity servers today. One table in these instances contains more than 100 billion records and a few tens of terabytes of data.
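The idea of a separated read and write architecture can be illustrated with a toy model. The Python sketch below is not OceanBase code and makes no claim about its actual design; the class `ToyStore` and all of its methods are hypothetical names chosen for illustration. It assumes that recent mutations are kept in an in-memory delta, that the on-disk baseline is immutable and sorted, and that the two are combined only during an occasional bulk merge, so no page is rewritten in place for a small update.

```python
from bisect import bisect_left
from typing import Dict, List, Optional


class ToyStore:
    """Toy model (hypothetical): immutable sorted baseline plus in-memory delta."""

    def __init__(self) -> None:
        # Baseline stands in for read-only, sorted data on disk.
        self.baseline_keys: List[str] = []
        self.baseline_vals: List[str] = []
        # Delta holds recent writes in memory; None marks a deletion.
        self.delta: Dict[str, Optional[str]] = {}
        # Counts bulk rewrites of the baseline (sequential in a real system).
        self.bulk_merges = 0

    def put(self, key: str, value: str) -> None:
        # A small mutation touches only memory, never a baseline page.
        self.delta[key] = value

    def delete(self, key: str) -> None:
        self.delta[key] = None

    def get(self, key: str) -> Optional[str]:
        # Reads consult the delta first, then fall back to the baseline.
        if key in self.delta:
            return self.delta[key]
        i = bisect_left(self.baseline_keys, key)
        if i < len(self.baseline_keys) and self.baseline_keys[i] == key:
            return self.baseline_vals[i]
        return None

    def merge(self) -> None:
        # Fold the delta into a brand-new sorted baseline in one pass.
        merged = dict(zip(self.baseline_keys, self.baseline_vals))
        for key, value in self.delta.items():
            if value is None:
                merged.pop(key, None)
            else:
                merged[key] = value
        items = sorted(merged.items())
        self.baseline_keys = [k for k, _ in items]
        self.baseline_vals = [v for _, v in items]
        self.delta = {}
        self.bulk_merges += 1


store = ToyStore()
store.put("order:1", "paid")
store.put("order:2", "new")
store.merge()                      # baseline now holds both rows
store.put("order:1", "shipped")    # update lands in memory only
print(store.get("order:1"), store.baseline_vals)
```

In this sketch the update to `order:1` is visible to readers immediately, while the baseline still holds the old value until the next merge. A real system would also need logging and replication for durability, which the sketch omits.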