The criteria for data storage are permanence, availability, scalability and security, which together form the acronym PASS:
- Permanence means that no data is ever lost;
- Availability means that the user/application requirements for access/performance are met;
- Scalability defines the ease of meeting changing requirements;
- Security defines the granularity and durability of access privileges.
Traditionally, storage has been either block-based, using direct-attached storage (DAS) or a SAN, or file-based, using NAS or SAN file sharing (SFS) with some kind of metadata controller. Applications can be designed to work with or without a file system. Simply put, block-based applications are faster; file-based applications are more flexible.
In the past, certain functions (multitrack audio, grading, etc.) required dedicated block storage. Increasing disk, controller and interconnect speeds have decreased the overhead that the file system adds to the aggregate access speeds; thus, block-based storage is no longer required to meet the needs of production.
The evolution of media storage has gone from achieving the required speed, to making it available to multiple users, to ensuring permanence. First we had block-based storage incorporated into applications. Then we had SAN with direct attached client shared access. Various parity systems were incorporated for redundancy, and different backup schemes were implemented.
Today, these methods have reached a limit. A major cause is that, with the current architecture, rebuild times can exceed the mean time between failures (MTBF) of the array as a whole. In other words, if you lose a disk on a petabyte storage system, the rebuild may not be completed before you lose another disk. Current methods for avoiding this require compromises in PASS. Object-based storage reduces rebuild time dramatically and negates the performance hit. (See Figure 1.)
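The rebuild problem can be put into rough numbers. The following Python sketch is illustrative only: the disk size, rebuild rate, disk count and per-disk MTBF are assumed figures, the exponential failure model is a common simplification, and the idea that an object store spreads the rebuild across many disks in parallel is an assumption about how the shorter rebuild is achieved, not a description of any specific product.

```python
import math


def rebuild_hours(capacity_tb, rate_mb_s, parallel_targets=1):
    """Hours needed to re-create the contents of one failed disk.

    parallel_targets is the number of disks that share the rebuild work
    (1 models a single hot spare in a traditional parity group).
    """
    capacity_mb = capacity_tb * 1_000_000  # decimal terabytes to megabytes
    seconds = capacity_mb / (rate_mb_s * parallel_targets)
    return seconds / 3600


def second_failure_probability(surviving_disks, disk_mtbf_hours, window_hours):
    """Chance that at least one more disk fails during the rebuild window.

    Assumes independent disks with exponentially distributed lifetimes.
    """
    return 1 - math.exp(-surviving_disks * window_hours / disk_mtbf_hours)


# Assumed example: 4 TB disks rebuilt at 50 MB/s in a 250-disk system.
single_spare = rebuild_hours(4, 50)                       # about 22 hours
distributed = rebuild_hours(4, 50, parallel_targets=50)   # under half an hour

risk_single = second_failure_probability(249, 1_000_000, single_spare)
risk_distributed = second_failure_probability(249, 1_000_000, distributed)
```

Under these assumptions, the window during which a second failure can occur shrinks in proportion to the number of disks sharing the rebuild, and the risk of a second failure shrinks with it.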
Object storage device
Replication is performed by the object storage device (OSD). Clients submit a single write operation to the first primary OSD, which is responsible for consistently and safely updating all replicas. This shifts the replication-related bandwidth to the storage cluster’s internal network. There is a lot of discussion about the net results of this architecture change. One point is that no matter how many copies of the data are distributed across how many disks, there is always a residual risk. Object-based storage can reduce that risk to less than 2 percent of the data on the failed disk instead of the 100 percent in traditional parity systems. This is because replicas are stored at the object level, allowing for two copies with the same net loss of capacity as a single parity drive.
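The write path described above can be sketched in a few lines of Python. This is a minimal in-memory model, not a real OSD protocol: the class names `OSD` and `PrimaryOSD` and the methods `apply` and `write` are hypothetical, and network transport, failure handling and ordering guarantees are left out.

```python
class OSD:
    """Hypothetical object storage device holding objects in memory."""

    def __init__(self, name):
        self.name = name
        self.objects = {}

    def apply(self, object_id, data):
        # Store the object locally and acknowledge the update.
        self.objects[object_id] = data
        return True


class PrimaryOSD(OSD):
    """Hypothetical primary that fans a client write out to its replicas."""

    def __init__(self, name, replicas):
        super().__init__(name)
        self.replicas = list(replicas)

    def write(self, object_id, data):
        # The client sends one write; replication traffic stays between OSDs.
        self.apply(object_id, data)
        acks = [replica.apply(object_id, data) for replica in self.replicas]
        # Acknowledge to the client only when every replica has confirmed.
        return all(acks)


replica_a = OSD("osd-2")
replica_b = OSD("osd-3")
primary = PrimaryOSD("osd-1", [replica_a, replica_b])
acknowledged = primary.write("clip-001", b"frame data")
```

The client issues a single call to `write`, and the copies on `osd-2` and `osd-3` are created by the primary rather than by the client, which is the bandwidth shift the text refers to.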
Systems using object storage provide the following benefits: data-aware prefetching and caching; intelligent space management in the storage layer; shared access by multiple clients; scalable performance using an offloaded data path; and reliable security.
Let’s look at some media use cases and see why an OSD is preferred.
Archiving is the easy case, as there is substantial agreement within the industry that its two main requirements, scalability and permanence, can best be met by object-based storage systems.
Acquisition can take advantage of object-based storage because the nature of the object is stored in the object metadata. This ensures that the object is always stored in a manner suited to the application. An essence requiring a continuous data rate of 150Mb/s will automatically be stored where this data rate can be provided. Yes, there are other ways to do this, but they are all add-ons that introduce administrative as well as performance overhead.
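As a rough illustration of metadata-driven placement, the Python sketch below picks a storage pool from the data rate recorded in an object's metadata. The names `Pool`, `ObjectMetadata` and `choose_pool`, the pool figures, and the "slowest pool that still meets the rate" policy are all hypothetical; a real system would apply its own placement rules.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    name: str
    sustained_mbit_s: float  # continuous data rate the pool can guarantee


@dataclass
class ObjectMetadata:
    object_id: str
    required_mbit_s: float  # continuous data rate the essence needs


def choose_pool(meta, pools):
    """Return the slowest pool that still satisfies the required data rate."""
    eligible = [p for p in pools if p.sustained_mbit_s >= meta.required_mbit_s]
    if not eligible:
        raise ValueError(f"no pool can sustain {meta.required_mbit_s} Mb/s")
    # Prefer the least capable eligible pool to keep faster pools free.
    return min(eligible, key=lambda p: p.sustained_mbit_s)


pools = [
    Pool("nearline", 100),
    Pool("production", 400),
    Pool("finishing", 1600),
]
essence = ObjectMetadata("clip-001", required_mbit_s=150)
target = choose_pool(essence, pools)  # the "production" pool in this example
```

Because the requirement travels with the object, the placement decision needs no per-application configuration, which is the point the paragraph above makes about add-on solutions.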
Post production requires real-time shared access of the assets. In the OSD model, the protocol is system-agnostic and therefore system-heterogeneous by nature. Since the OSD is the storage device, and the underlying protocol is supported on either a SAN (SCSI) or a LAN (iSCSI), device sharing is simple. (See Figure 2.) Data sharing is accomplished the same way. The objects on an OSD are available to any system that has permission to access them.
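The sharing model in the last sentence can be sketched as a per-object permission check. The Python below is a simplified, hypothetical model: the class `SharedOSD`, its methods and the client names are invented for illustration, and it uses a plain list of permitted readers rather than the credential mechanism of any actual OSD protocol.

```python
class SharedOSD:
    """Hypothetical OSD that checks permissions per object, not per volume."""

    def __init__(self):
        self._objects = {}  # object_id -> stored bytes
        self._readers = {}  # object_id -> set of client names allowed to read

    def put(self, object_id, data, readers):
        self._objects[object_id] = data
        self._readers[object_id] = set(readers)

    def get(self, client, object_id):
        # Any client, on any platform, can read if it holds permission.
        if client not in self._readers.get(object_id, set()):
            raise PermissionError(f"{client} may not read {object_id}")
        return self._objects[object_id]


osd = SharedOSD()
osd.put("clip-001", b"frame data", readers={"edit-suite", "grading-suite"})
edit_copy = osd.get("edit-suite", "clip-001")
grade_copy = osd.get("grading-suite", "clip-001")
```

Both workstations read the same stored object, and a client that is not listed for `clip-001` would receive a `PermissionError` instead of the data.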