Intelligent Storage Systems
Keywords: Storage System, Storage Device, Host Computer, Storage Management, Active Storage
Intelligent Storage System is a general term for a storage system that can fully or partially perform functions that used to be, or usually are, implemented on host computers.
The idea behind Intelligent Storage Systems may have its origin in the early research on database machines. Similar ideas have continued to be studied in academic communities to date, and they have recently been partially applied in commercial storage systems.
The basic idea of running full or partial application code on the controller processors of disk drives may be traced back to the database machines that were actively studied in the 1970s and 1980s. The database machine was a special-purpose hardware approach. Early researchers focused on developing filter processors, which could perform selection operations close to the disk drives and thereby obtain strong performance benefits. Several prototypes such as CASSM (University of Florida) [11] and RAP (University of Toronto) [7] were implemented in the 1970s. Filter processors were at times coupled with a front-end database server; such designs, called backend processors, were attempted in industry. In the 1980s, with new algorithms such as hash-based joins, researchers proposed new database machines that pursued intensive parallel processing, such as GRACE (The University of Tokyo) and GAMMA (University of Wisconsin) [4]. Parallel machines designed specifically for database processing were released and achieved commercial success. However, dedicated hardware had lost its unique advantage by the 1990s, since powerful general-purpose machines had become easily available. Major database vendors shifted to software-level solutions that used generic hardware instead.
Intelligent Storage Systems came under the spotlight again in the late 1990s. Storage network technologies such as Fibre Channel entered the market, and enterprise systems began to adopt a storage-centric architecture, in which storage systems could be designed and managed independently of host computers at the infrastructure level. Naturally, sophisticated new functions such as virtualization were incorporated into storage processors, and such solutions were widely accepted. Around the same time, in 1998, Active Storage [8], Intelligent Disks and Active Disks [1] were published by academic researchers. These projects tried to exploit the capability of disk processors for data-intensive applications such as ad hoc query processing and image processing. The active storage efforts resembled the database machine, but they paid careful attention to software frameworks for running application code on disk processors.
Since the start of the twenty-first century, storage networking has been adopted in many systems and storage resources have become increasingly consolidated. A variety of new functions are being implemented in commercial storage systems. Functions that used to run on host computers, such as volume copy, remote replication and snapshot generation, are now usually executed on storage processors. Although only limited types of low-level applications are currently implemented in storage systems, the application domain is gradually widening through active research and development.
The main motivation behind database machines and active storage was significant performance improvement. By processing data closer to the disk platters, they tried to make efficient use of the limited bandwidth between main memory and storage systems. This "storage wall" is still seen in recent enterprise systems, so such solutions remain beneficial. At the same time, given the complexity of recent enterprise systems, Intelligent Storage Systems may have another substantial benefit: storage-level implementation could improve the function-level isolation between components. This would help system administrators design and manage complicated systems.
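The bandwidth argument above can be illustrated with a toy model. The following is a minimal sketch, not a description of any real database machine or active disk: `ToyStorage`, `read_all`, `read_filtered`, the record layout and the predicate are all hypothetical names chosen for illustration. It only counts how many records would cross the link between storage and host when a selection runs on the host versus next to the data.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical record layout: (id, region, amount).
Record = Tuple[int, str, float]


@dataclass
class ToyStorage:
    """Toy storage unit that counts records shipped over the host link."""
    records: List[Record]
    shipped: int = 0

    def read_all(self) -> List[Record]:
        # Conventional path: every record is shipped to the host.
        self.shipped += len(self.records)
        return list(self.records)

    def read_filtered(self, predicate: Callable[[Record], bool]) -> List[Record]:
        # Storage-side path: the predicate runs next to the data,
        # so only matching records are shipped.
        matches = [r for r in self.records if predicate(r)]
        self.shipped += len(matches)
        return matches


def make_records(n: int) -> List[Record]:
    # Synthetic data; amounts cycle through 0.0 .. 999.0.
    return [(i, "east" if i % 2 == 0 else "west", float(i % 1000))
            for i in range(n)]


def is_large_order(record: Record) -> bool:
    # Assumed selection predicate: keep the top tenth of amounts.
    return record[2] >= 900.0


# Host-side selection: ship everything, then filter on the host.
host_side = ToyStorage(make_records(10_000))
host_result = [r for r in host_side.read_all() if is_large_order(r)]

# Storage-side selection: filter first, ship only the matches.
storage_side = ToyStorage(make_records(10_000))
storage_result = storage_side.read_filtered(is_large_order)

print(host_side.shipped, storage_side.shipped)
```

Both paths return the same result set, but in this synthetic case the storage-side path ships one tenth of the records. The actual benefit in a real system depends on predicate selectivity and on the processing power available at the storage side, which this sketch does not model.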
Interface standards are crucial for realizing Intelligent Storage Systems. OSD (Object Storage Device) [2] evolved out of the NASD (Network-Attached Secure Disk) project, which started at Carnegie Mellon University in 1995. In contrast to traditional storage devices, where the storage space is represented as an array of fixed-size blocks, an OSD works as a container of objects, their attributes and their metadata. Specifically, an OSD can be seen as a storage device in which the lower layers of a file system are implemented. The interface protocol of OSD, designed as an extension of SCSI, has been standardized as ANSI T10 SCSI OSD. Several NAS products and distributed file systems already support OSDs as backend storage devices. SNIA, a leading industry association for storage networking, has also promoted standardization. SMI-S (Storage Management Initiative-Specification) [9] is a standard protocol for storage management that improves interoperability between storage devices, switches and management applications from different manufacturers. SNIA develops and maintains SMI-S and provides vendors with certification programs. XAM (eXtensible Access Method) [10] is another SNIA initiative. XAM, a new suite of APIs, is intended to provide an abstraction layer between storage devices that store fixed content and the management applications that access that content.
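The contrast between a block interface and an object interface can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the T10 SCSI OSD command set: the class names, method names, the 512-byte block size and the attribute keys are all hypothetical. The point is only that a block device exposes numbered fixed-size blocks, while an object device exposes variable-length objects that carry their own attributes.

```python
from dataclasses import dataclass, field
from typing import Dict, List

BLOCK_SIZE = 512  # assumed block size for this toy example


class ToyBlockDevice:
    """Storage space seen as an array of fixed-size blocks."""

    def __init__(self, num_blocks: int) -> None:
        self.blocks: List[bytes] = [bytes(BLOCK_SIZE)] * num_blocks

    def write_block(self, lba: int, data: bytes) -> None:
        if len(data) != BLOCK_SIZE:
            raise ValueError("a block device accepts whole blocks only")
        self.blocks[lba] = data

    def read_block(self, lba: int) -> bytes:
        return self.blocks[lba]


@dataclass
class StoredObject:
    data: bytes = b""
    attributes: Dict[str, str] = field(default_factory=dict)


class ToyObjectDevice:
    """Storage space seen as a container of objects with attributes."""

    def __init__(self) -> None:
        self._objects: Dict[int, StoredObject] = {}
        self._next_id = 1

    def create(self) -> int:
        # The device, not the host, decides where the object lives.
        object_id = self._next_id
        self._next_id += 1
        self._objects[object_id] = StoredObject()
        return object_id

    def write(self, object_id: int, offset: int, data: bytes) -> None:
        obj = self._objects[object_id]
        # Zero-fill any gap if the write starts past the current end.
        padded = obj.data.ljust(offset, b"\x00")
        obj.data = padded[:offset] + data + padded[offset + len(data):]

    def read(self, object_id: int, offset: int, length: int) -> bytes:
        return self._objects[object_id].data[offset:offset + length]

    def set_attribute(self, object_id: int, key: str, value: str) -> None:
        self._objects[object_id].attributes[key] = value

    def get_attribute(self, object_id: int, key: str) -> str:
        return self._objects[object_id].attributes[key]


# Usage: the host addresses an object by ID and byte range, not by block.
osd = ToyObjectDevice()
oid = osd.create()
osd.write(oid, 0, b"hello world")
osd.write(oid, 6, b"there")
osd.set_attribute(oid, "content-type", "text/plain")
```

In the block model the host file system must map files onto block addresses itself. In the object model that allocation work, which corresponds to the lower layers of a file system, moves into the device.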
Recent commercial storage systems have deployed storage-side implementations of simple functions such as data conversion between mainframes and open systems, third-party copy, remote replication and snapshot generation. Until recently these functions were implemented only in top-end storage systems, but they are now also being implemented in mid-range products and sometimes even in entry-level products.
- 1. Acharya A, Uysal M, Saltz JH. Active disks: programming model, algorithms and evaluation. In: Proceedings of 8th international conference on architectural support for programming languages and operating systems. 1998. p. 81–91.
- 2. ANSI. Information technology – SCSI object-based storage device commands (OSD). Standard ANSI/INCITS 400–2004. 2004.
- 4. DeWitt DJ, Gerber RH, Graefe G, Heytens ML, Kumar KB, Muralikrishna M. GAMMA – a high performance dataflow database machine. In: Proceedings of 12th international conference on very large data bases. 1986. p. 228–37.
- 7. Ozkarahan EA, Schuster SA, Smith KC. RAP – an associative processor for database management. In: Proceedings of national computer conference. 1975. p. 379–87.
- 8. Riedel E, Gibson GA, Faloutsos C. Active storage for large-scale data mining and multimedia. In: Proceedings of 24th international conference on very large data bases. 1998. p. 62–73.
- 9. SNIA Storage Management Initiative. Storage management technical specification, Overview Version 1.2.0, Revision 6. 2007.
- 10. SNIA XAM Initiative. XAM Initiative Overview. 2007.
- 11. Su SYW, Lipovski GJ. CASSM: a cellular system for very large data bases. In: Proceedings of 1st international conference on very large data bases. 1975. p. 456–72.