Object-Based Storage Devices

 
By Christian Bandulet, July 2007  

This article provides a basic overview of object-based storage devices (OSDs).

Introduction

There are two types of network storage systems, each distinguished by its command set:

  • Storage area networks (SANs) use the SCSI block I/O command set, which provides high random I/O and data throughput performance using direct access to the data at the level of the disk drive or fibre channel.
  • Network attached storage (NAS) systems use the Network File System (NFS) or Common Internet File System (CIFS) command sets for accessing data. Multiple nodes can access the data because the metadata on the media is shared.

In contrast, object storage is based on data objects that encapsulate user data together with its attributes and metadata. The combination of data, attributes, and metadata enables object storage to determine data layout or quality of service on a per-object basis, improving flexibility and manageability.

The unique design of object storage differs from standard storage devices with a traditional block-based interface. Object storage is an intelligent evolution of disk drives that can store and serve objects rather than simply place data on tracks and sectors. This task is accomplished by moving low-level storage functions into the storage device and accessing the device through an object interface. Systems using object storage provide the following benefits, which are highly desirable across a wide range of typical IT storage applications:

  • Intelligent space management in the storage layer
  • Data-aware prefetching and caching
  • Robust, shared access by multiple clients
  • Scalable performance using an offloaded data path
  • Reliable security

This article provides basic information about object-based storage devices (OSDs). You can also refer to additional information about OSD support in the Solaris Operating System.

OSD History

The first Small Computer System Interface (SCSI) disk drive was introduced in 1983, and the SCSI standard was ratified by the American National Standards Institute (ANSI) in 1986. In the years since then, this basic protocol has not changed significantly. The physical interface between storage devices and host computers, however, has changed massively, from wide SCSI to fast SCSI to Fibre Channel SCSI (FCP) to Serial Attached SCSI (SAS). The initial interface speed in 1983 was 5 MB/s. Today, interface speeds can reach up to 320 MB/s. The first SCSI drive had a capacity of 5 MB. Today, the SCSI logical interface is used on 300-GB disk drives. Areal density with longitudinal recording has reached approximately 200 Gb/in², targeting 1 Tb/in² with the emerging perpendicular recording technology and Heat Assisted Magnetic Recording (HAMR).

However, the logical interface, or the command set, has seen only minor additions during this time. The interface that is today standardized as ANSI T10 SCSI OSD V1 originated in research work done by Carnegie Mellon University in a government-funded research project called Network Attached Secure Disks (NASD) that began in 1994.

Over the next few years, this effort was supported and advised by a group of industry collaborators organized by the National Storage Industry Consortium (NSIC). This effort resulted in a draft interface specification submitted for standardization to ANSI T10, the committee responsible for the SCSI interface, as project T10/1355-D. Over the next several years, this interface was modified and extended by the OSD Technical Work Group of the Storage Networking Industry Association (SNIA) with varied industry and academic contributors, resulting in a draft standard to T10 in 2004. This standard was ratified in September 2004 and became the ANSI T10 SCSI OSD V1 command set, released as INCITS 400-2004. The SNIA group continues to work on further extensions to the interface, such as the ANSI T10 SCSI OSD V2 command set.

OSD: The Basic Concept

Alongside the traditional data access methods, which include block-based methods (parallel SCSI, SAS, FCP, ATA, SATA) and file-based methods (NFS, CIFS), a new technology is emerging: the object-based data access model.

An OSD is analogous to a logical unit. Unlike a traditional block-oriented device providing access to data organized as an array of unrelated blocks, an object store allows access to data by means of storage objects. A storage object is a virtual entity that groups data together that has been determined by the user to be logically related. Space for a storage object is allocated internally by the OSD itself instead of by a host-based file system. OSDs manage all necessary low-level storage, space management, and security functions. Because there is no host-based metadata for an object (such as inode information), the only way for an application to retrieve an object is by using its object identifier (OID). The following figure contrasts the data structure of a traditional block-based disk with that of an object-based disk.

 
Figure 1: Block-Based Disk Compared With Object-Based Disk

The collection of objects in an OSD forms a flat space of unique OIDs. Virtual file hierarchies can be emulated by rearranging pointers to objects, as shown here.

 
Figure 2: Traditional Hierarchical, Flat, and Virtual Data Access Models
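
The flat OID space and the emulated hierarchies can be pictured with a small sketch. This is only an illustration in Python, not an OSD implementation: the class `FlatObjectStore`, the starting OID value, and the path names are all hypothetical. It shows that a "virtual hierarchy" can be nothing more than a host-side table of pointers (names mapped to OIDs) laid over one flat object space.

```python
import itertools


class FlatObjectStore:
    """Toy stand-in for an OSD: every object lives in one flat OID space."""

    def __init__(self):
        self._objects = {}                          # OID -> object data
        self._next_oid = itertools.count(0x10000)   # arbitrary starting OID

    def create(self, data=b""):
        """Store the data and return the OID chosen by the store."""
        oid = next(self._next_oid)
        self._objects[oid] = bytes(data)
        return oid

    def read(self, oid):
        """The OID is the only handle needed to retrieve an object."""
        return self._objects[oid]


store = FlatObjectStore()
report = store.create(b"Q3 report")
song = store.create(b"mp3 bytes")

# Two different virtual hierarchies over the same objects (hypothetical paths).
by_project = {"/projects/q3/report.txt": report, "/media/song.mp3": song}
by_type = {"/text/report.txt": report, "/audio/song.mp3": song}

# Both views resolve to the same OID; rearranging pointers moves no data.
same_oid = by_project["/projects/q3/report.txt"] == by_type["/text/report.txt"]
```

Because the hierarchy exists only in the pointer tables, a new view can be added or an old one dropped without touching the stored objects.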

The object is the fundamental unit of data storage in an OSD. Each object is self-contained, consisting of user data, an OID, metadata (the physical location of blocks that constitute an object), and attributes, as shown here.

 
Figure 3: Object Containing Data, OID, Metadata, and Attributes

The ANSI T10 SCSI OSD standard defines four types of objects:

  • The root object -- The OSD itself
  • User objects -- Created by SCSI commands from the application or client
  • Collection objects -- A group of user objects, such as all .mp3 objects or all objects belonging to a project
  • Partition objects -- Containers for user objects and collections that share common security and space management characteristics, such as quotas and keys

These objects are shown in the following figure.

 
Figure 4: Object Types: Root Object, Partition Object, User Object, and Collection Object

Because each object is self-contained, diverse object migration and sharing are possible.

File systems and other host-based data management applications store both user data and metadata. OSD object attributes enable the association of applications with any OSD object, such as root objects, partition objects, collection objects, or user objects. Attributes can be used to describe specific characteristics of an OSD object, such as the total number of bytes occupied by the OSD object, the logical size of the OSD object, or when the OSD object was last modified.

The ANSI T10 SCSI OSD standard defines 2 to the 32nd power attribute pages per object and 2 to the 32nd power attributes per attribute page. Only a small range of this attribute name space is predefined by the standard. The predominant part can be defined by the application, enabling superior data services and improving Quality of Service (QoS). Attributes fall into two categories:

  • Storage attributes (similar to inodes) -- Used by the OSD to manage block allocation for data, such as the OID, block pointers, logical length, and capacity used
  • User attributes -- Used by applications and metadata managers to store higher-level information about the object, such as density, capacity, performance, cost, adaptability, capability, manageability, reliability, availability, serviceability, interoperability, security, power usage, and quotas
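
The two-level attribute name space described above (page number, then attribute number within the page) can be sketched as a table keyed by a pair of 32-bit values. The class `AttributeStore`, the page number `APP_PAGE`, and the attribute number `ATTR_RELIABILITY` below are hypothetical illustrations; they are not page assignments taken from the standard.

```python
MAX_U32 = 2**32 - 1   # page and attribute numbers are 32-bit values

# Each object can address 2^32 pages of 2^32 attributes each.
ADDRESSABLE_ATTRIBUTES = 2**32 * 2**32


class AttributeStore:
    """Per-object attributes addressed by (page number, attribute number)."""

    def __init__(self):
        self._attrs = {}   # (page, number) -> attribute value as bytes

    def set_attr(self, page, number, value):
        for field in (page, number):
            if not 0 <= field <= MAX_U32:
                raise ValueError("page and attribute numbers must fit in 32 bits")
        self._attrs[(page, number)] = bytes(value)

    def get_attr(self, page, number):
        # Returns None when the attribute was never set.
        return self._attrs.get((page, number))


APP_PAGE = 0x20000        # hypothetical application-defined page
ATTR_RELIABILITY = 0x1    # hypothetical attribute number within that page

attrs = AttributeStore()
attrs.set_attr(APP_PAGE, ATTR_RELIABILITY, b"high")
```

Only the attributes that are actually set occupy space in this sketch, which is why such a large name space can be left mostly to the application.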

Attributes and metadata are stored directly with the data object and automatically carried between layers and across devices. When objects pass through a certain system layer or device, that layer can react based on the values in the attributes that it understands. All other attributes are passed along unmodified and not acted upon. Therefore, objects marked as high-reliability can be treated differently than objects marked as temporary. Attributes stored with data can attach a service level to the data for better caching, prefetching, migration, and so on, as shown in this figure.
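
As a rough illustration of a layer that reacts only to the attributes it understands, consider the sketch below. The attribute names (`service_level`, `owner_app`), the tier names, and the function `placement_layer` are assumptions made for this example and do not come from the standard.

```python
def placement_layer(obj):
    """Choose a storage tier from an attribute this layer understands.

    All attributes, including ones this layer does not recognize, are
    passed along unmodified with the object.
    """
    level = obj["attributes"].get("service_level")   # hypothetical attribute
    if level == "high-reliability":
        tier = "mirrored"
    elif level == "temporary":
        tier = "scratch"
    else:
        tier = "default"
    return tier, obj


important = {"oid": 0x10000, "data": b"ledger",
             "attributes": {"service_level": "high-reliability",
                            "owner_app": "billing"}}   # unknown to this layer
scratch = {"oid": 0x10001, "data": b"tmp",
           "attributes": {"service_level": "temporary"}}

tier_a, passed_a = placement_layer(important)
tier_b, passed_b = placement_layer(scratch)
```

The `owner_app` attribute means nothing to this layer, so it travels with the object untouched and remains available to any later layer that does understand it.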

 
Figure 5: Attribute Layer Service Levels

The SCSI Architecture Model

The OSD extension for the SCSI Architecture Model (SAM) defines these OSD-specific commands and an extended SCSI Command Descriptor Block (CDB):

  • APPEND (write without offset)
  • CREATE (object), REMOVE (object)
  • CREATE COLLECTION, REMOVE COLLECTION, LIST COLLECTION, FLUSH COLLECTION
  • CREATE PARTITION, REMOVE PARTITION, FLUSH PARTITION
  • FLUSH (force object to media), FLUSH OSD, LIST (objects)
  • FORMAT (OSD)
  • GET ATTR (of an object), SET ATTR
  • PERFORM SCSI COMMAND (such as a SCSI INQUIRY)
  • READ (object with OID), WRITE
  • SET KEY (shared secret for a single object)
  • SET MASTER KEY (shared secret for OSD)
 
Figure 6: Chart Showing SCSI Standards Architecture
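
To give a feel for how a few of these commands behave, here is a toy in-memory model of CREATE, WRITE, APPEND, READ, LIST, and REMOVE. It is a sketch only: the class `ToyOSD` and its method signatures are hypothetical, and it ignores CDB encoding, attributes, collections, and security entirely.

```python
class ToyOSD:
    """In-memory model of a handful of OSD commands (illustrative only)."""

    def __init__(self):
        self._objects = {}   # (partition ID, OID) -> bytearray of user data

    def create(self, pid, oid):
        if (pid, oid) in self._objects:
            raise KeyError("object already exists")
        self._objects[(pid, oid)] = bytearray()

    def write(self, pid, oid, offset, data):
        buf = self._objects[(pid, oid)]
        end = offset + len(data)
        if len(buf) < end:
            buf.extend(b"\x00" * (end - len(buf)))   # grow the object as needed
        buf[offset:end] = data

    def append(self, pid, oid, data):
        # APPEND is a write without an offset: data lands at the current end.
        self._objects[(pid, oid)].extend(data)

    def read(self, pid, oid, offset, length):
        return bytes(self._objects[(pid, oid)][offset:offset + length])

    def list_objects(self, pid):
        return sorted(oid for (p, oid) in self._objects if p == pid)

    def remove(self, pid, oid):
        del self._objects[(pid, oid)]


osd = ToyOSD()
osd.create(1, 0x10000)
osd.write(1, 0x10000, 0, b"hello")
osd.append(1, 0x10000, b" world")
contents = osd.read(1, 0x10000, 0, 11)
listed = osd.list_objects(1)
```

Note that the client never supplies block addresses: it names a partition and an object, and the device decides where the bytes are placed.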

OSD Security

The ANSI T10 SCSI OSD standard defines strong security for its capability-based protocol, which enforces the integrity of the SCSI request and its legitimate use by the client. Every command must be accompanied by a hash code (an HMAC-SHA1 160-bit keyed hash message authentication code) that identifies a specific object and the list of operations that can be performed against the specified object. The process is described in detail in the following steps and figure.

  1. The security manager exchanges a shared secret key with an OSD.
  2. A client must request a capability from the security manager and specify the OSD name, partition ID, and OID to access an object.
  3. The security manager determines whether the client making the request must be authenticated using a method such as LDAP, NIS, or Kerberos. Such authentication is beyond the scope of the ANSI T10 SCSI OSD protocol.
  4. The security manager contacts the policy manager to determine whether the client is authorized to perform the requested operation on the specified object. If the operation is permitted, the security manager generates a credential, including the requested capability and a CAP_key (an integrity check value). The CAP_key is derived using a pseudo-random function with the OSD's secret key and the capability.
  5. The credential is sent from the security manager to the client.
  6. The client presents the request, the capability, and a validation tag on each OSD command. The validation tag is computed by the client using the CAP_key. Before processing the command, the OSD verifies the following:

    • The validation tag, based on its knowledge of the secret key and capability
    • That the capability has not been modified in any way
    • That the capability permits the requested operation against the specified object

    If the tests pass, the OSD allows the operation based on the rights encoded in the capability. A client can request a credential permitting multiple types of operations, such as read, write, or delete. This allows the client to aggressively cache and reuse credentials, minimizing the number of messages to the security manager. Credentials can be cached, propagated, stalled, revoked, and expired.

 
Figure 7: Flow Diagram of the OSD Security Process
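
The credential flow in the steps above can be sketched with Python's standard `hmac` and `hashlib` modules. This is a simplified illustration under stated assumptions: HMAC-SHA1 stands in for the pseudo-random function, the capability is encoded as a plain text string, and the function names, key value, and request format are all hypothetical. The wire formats defined by the standard are not reproduced here.

```python
import hashlib
import hmac


def hmac_sha1(key, message):
    """Return the 160-bit HMAC-SHA1 digest of message under key."""
    return hmac.new(key, message, hashlib.sha1).digest()


def encode_capability(osd_name, partition_id, oid, operations):
    """Hypothetical text encoding of a capability (not the real format)."""
    ops = ",".join(sorted(operations))
    return f"{osd_name}|{partition_id}|{oid}|{ops}".encode()


def issue_credential(secret_key, capability):
    """Security manager: credential = capability plus the derived CAP_key."""
    return capability, hmac_sha1(secret_key, capability)


def validation_tag(cap_key, request):
    """Client: tag the request using the CAP_key from the credential."""
    return hmac_sha1(cap_key, request)


def osd_verify(secret_key, capability, request, tag, operation):
    """OSD: recompute the CAP_key, check the tag, then check the rights."""
    cap_key = hmac_sha1(secret_key, capability)   # changes if capability was altered
    if not hmac.compare_digest(validation_tag(cap_key, request), tag):
        return False
    allowed = capability.decode().split("|")[3].split(",")
    return operation in allowed


secret = b"key-shared-by-security-manager-and-osd"   # placeholder value
capability, cap_key = issue_credential(
    secret, encode_capability("osd0", 1, 0x10000, {"read"}))
request = b"READ partition=1 oid=65536"               # hypothetical request bytes
tag = validation_tag(cap_key, request)

read_ok = osd_verify(secret, capability, request, tag, "read")
write_ok = osd_verify(secret, capability, request, tag, "write")
forged = capability.replace(b"read", b"read,write")
forged_ok = osd_verify(secret, forged, request, tag, "write")
```

The client never sees the OSD's secret key, only the CAP_key derived from it, and the OSD needs no per-client state: it can recompute the CAP_key from the capability presented with each command.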

Capabilities

A capability consists of the fields in a SCSI command descriptor block (CDB) that specify which command functions the command can request and which OSD object can be accessed. The contents of capabilities can be managed for application clients by a policy manager and secured in credentials by a security manager.

Credentials

A credential is a data structure that is prepared by the security manager and protected by an integrity check value (CAP_key). The credential is sent to an application client to define access to an OSD logical unit for specific command functions performed on specific OSD objects. The credential includes a capability that is prepared by the policy manager and copied by the application client to each CDB that requests the specified command functions.

OSD Shared Secret Key Hierarchy

The hierarchy of shared secret keys is shown in Figure 8. From highest to lowest, the keys include the following:

  • Master key -- The highest key in the hierarchy, allowing unrestricted access to the drive. Loss of the master key is considered a catastrophic event. Because of the importance of the master key, the protocol limits its use to the infrequent event of setting the root key. This master key cannot be changed unless the drive owner is changed.
  • Root key -- Similar to the master key, the root key provides unrestricted access to the drive. However, the root key cannot be used to initialize the drive or set a new master key or root key. After the root key is set, it can be used to set the partition keys. The root key can be changed as needed or as part of a scheduled update operation to maintain security.
  • Partition keys -- Used to generate working keys for each partition. An object store is divided into multiple partitions, each with a unique partition key and working keys.
  • Working keys -- Used to generate capability keys used by clients to access individual objects. Because of their frequent use, working keys should be refreshed frequently, such as on an hourly basis. Unfortunately, a key refresh immediately invalidates all credentials generated by that key, potentially causing significant performance degradation when all the clients must communicate with the security manager to obtain new credentials. The load on the OSD also increases because all new credentials must be validated before being cached by the object store. To resolve these issues, the object store can declare up to 16 refreshed versions of the working key as valid. This effectively defines multiple working keys that are valid concurrently. Therefore, a key refresh would impact a limited number of capabilities. To support this feature, the protocol requires a key version field to be incorporated in the capability that indicates which key to use in the validation process.

 
Figure 8: Hierarchy of OSD Shared Secret Keys
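
The effect of keeping up to 16 working key versions valid at once can be sketched as follows. How a working key is actually produced from its partition key is not described in this article, so the HMAC-based derivation over a refresh counter below is purely an assumption for illustration, as are the class and method names.

```python
import hashlib
import hmac

MAX_VERSIONS = 16   # number of working key versions valid concurrently


class WorkingKeys:
    """Working keys of one partition, indexed by key version (0 to 15)."""

    def __init__(self, partition_key):
        self._partition_key = partition_key
        self._keys = {}      # key version -> working key
        self._refreshes = 0

    def refresh(self):
        """Install a new working key, overwriting only the oldest version."""
        version = self._refreshes % MAX_VERSIONS
        seed = self._refreshes.to_bytes(8, "big")   # illustrative seed only
        self._keys[version] = hmac.new(
            self._partition_key, seed, hashlib.sha1).digest()
        self._refreshes += 1
        return version

    def key_for(self, version):
        """Look up the key named by a capability's key version field."""
        return self._keys.get(version)


keys = WorkingKeys(b"partition-key-placeholder")
first_version = keys.refresh()
first_key = keys.key_for(first_version)

for _ in range(MAX_VERSIONS - 1):      # 15 further refreshes
    keys.refresh()
still_valid = keys.key_for(first_version) == first_key

keys.refresh()                         # the 17th refresh reuses version 0
expired = keys.key_for(first_version) != first_key
```

With hourly refreshes, a credential issued under one version would remain usable for many hours in this sketch, and each refresh invalidates only the credentials tied to the single version being replaced.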

Summary

This document provided a brief overview of basic OSD information. For details about OSD implementation and open source objectives, look for the upcoming article to be published in late 2007.