Network Computing is part of the Informa Tech Division of Informa PLC


Self-Describing Data: Page 2 of 5






The definition of storage capacity is a no-brainer: It's the physical space available for recording bits of data on storage media. Physical capacity imposes a fixed storage limit, though the size and shape of the data being stored can vary. You can apply compression algorithms, for example, to reduce the capacity requirements of certain types of data, but the real capacity of the media doesn't change.
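To make the distinction concrete, here is a minimal Python sketch that uses the standard zlib module. The media size and the sample record are hypothetical values chosen only for illustration; the point is that compression shrinks what the data occupies while the capacity constant never changes.

```python
import zlib

# Hypothetical figure for illustration: the fixed physical capacity of the media.
MEDIA_CAPACITY_BYTES = 1_000_000


def stored_size(data: bytes, compress: bool) -> int:
    """Return the number of bytes the data would occupy on the media."""
    return len(zlib.compress(data)) if compress else len(data)


# Highly repetitive sample data (an assumption); real data may compress far less.
record = b"2004-01-15,ORDER,OK\n" * 10_000

raw = stored_size(record, compress=False)
packed = stored_size(record, compress=True)

print(f"uncompressed: {raw} bytes, compressed: {packed} bytes")
print(f"media capacity is still {MEDIA_CAPACITY_BYTES} bytes")
```

How much space compression saves depends entirely on the data; already-compressed formats such as video typically shrink very little.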

Capacity allocation refers to the provisioning of physical capacity for discrete data sets. A disk or disk array with a certain capacity may be allocated to store the elements of a database, for example, while another disk or disk array may be allocated to store the data from an engineering CAD/CAM program, a Web site or user files. An efficient capacity-allocation system adapts to changing data storage requirements it senses or even anticipates, reallocating capacity on the fly. An inefficient capacity-allocation system doesn't respond to such changes automatically: When application processes encounter "disk full" conditions, they cease operation until sufficient capacity is allocated manually, and the result is downtime.
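The idea of reallocating capacity on the fly can be sketched roughly as follows. This is an illustration under stated assumptions, not a description of any real product: the Volume class, the rebalance function, the 85 percent threshold and the 50-GB growth step are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    allocated_gb: int  # capacity currently provisioned to this data set
    used_gb: int       # capacity the data set actually consumes


def rebalance(volumes, free_pool_gb, threshold=0.85, step_gb=50):
    """Grow any volume whose usage exceeds the threshold.

    Capacity is drawn from a shared free pool in fixed steps, before the
    application reaches a "disk full" condition. Returns the pool remainder.
    Assumes every volume has a nonzero allocation.
    """
    for vol in volumes:
        while (vol.used_gb / vol.allocated_gb > threshold
               and free_pool_gb >= step_gb):
            vol.allocated_gb += step_gb
            free_pool_gb -= step_gb
    return free_pool_gb


database = Volume("database", allocated_gb=200, used_gb=190)
cad = Volume("cad-cam", allocated_gb=100, used_gb=40)

remaining = rebalance([database, cad], free_pool_gb=100)
print(database, cad, remaining)
```

An inefficient system, in these terms, is one where nothing calls rebalance until an administrator notices the problem.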

Capacity utilization, by contrast, refers to processes designed to ensure that the use of allocated capacity is optimized according to the characteristics of the data being stored and the cost and capabilities of the devices being used to store it.

An efficient capacity-utilization system considers two general categories of data characteristics: frequency of access to the data, and requirements for its storage that are derived or "inherited" from the business process and application used to create the data. Access frequency is an important characteristic because it helps identify whether data must be stored "online"--on platforms that provide instantaneous accessibility to servers and end users--or can be moved to "near-line" or "offline" platforms that provide decreasing accessibility.
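As a rough sketch of how access frequency might drive placement, the Python function below uses time since last access as a simple stand-in for frequency. The function name and the 30-day and 365-day cutoffs are assumptions made for illustration; a real policy would set its own thresholds and might count accesses over a period instead.

```python
from datetime import datetime, timedelta


def choose_tier(last_access: datetime, now: datetime,
                online_days: int = 30, nearline_days: int = 365) -> str:
    """Pick a storage tier from how recently the data was accessed."""
    age = now - last_access
    if age <= timedelta(days=online_days):
        return "online"      # instantaneous accessibility
    if age <= timedelta(days=nearline_days):
        return "near-line"   # reduced accessibility
    return "offline"         # least accessible


now = datetime(2024, 1, 1)
for days in (3, 90, 800):
    print(days, choose_tier(now - timedelta(days=days), now))
```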

An efficient capacity-utilization system also takes into account the requirements that data inherits from the application used to generate it. Data from a critical application, for example, inherits the "criticality" of the application itself. In a disaster, this data must be recovered first--together with the application software that created it. Data may also inherit retention requirements from the originating application--how many days, weeks, months or years it must be kept, and even how it must be stored--especially if the application (and the business process it supports) is subject to regulatory requirements such as the Gramm-Leach-Bliley Act or the Sarbanes-Oxley Act.
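Inheritance of this kind can be modeled by attaching a policy to each application and letting every data set look up the policy of the application that created it. The sketch below is only an illustration: the application names, criticality ranks and retention periods are hypothetical and do not reflect what any regulation actually requires.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class AppPolicy:
    criticality: int     # 1 = recover first after a disaster
    retention_days: int  # how long the application's data must be kept


@dataclass
class DataSet:
    name: str
    app: str             # originating application
    created: date


# Hypothetical policies keyed by application name.
POLICIES = {
    "order-entry": AppPolicy(criticality=1, retention_days=7 * 365),
    "intranet": AppPolicy(criticality=3, retention_days=90),
}


def recovery_order(datasets):
    """Sort data sets so those from the most critical applications come first."""
    return sorted(datasets, key=lambda d: POLICIES[d.app].criticality)


def retain_until(ds: DataSet) -> date:
    """Earliest date on which the data set may be discarded."""
    return ds.created + timedelta(days=POLICIES[ds.app].retention_days)


datasets = [
    DataSet("wiki-pages", "intranet", date(2024, 1, 1)),
    DataSet("orders-2024", "order-entry", date(2024, 1, 1)),
]
for ds in recovery_order(datasets):
    print(ds.name, "keep until", retain_until(ds))
```

Keeping the policy on the application, not on each file, means a change in requirements is made once and applies to all of that application's data.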

In other cases, data may inherit security or privacy requirements from applications that must comply with regulatory or legal mandates such as the Health Insurance Portability and Accountability Act. Other, more esoteric characteristics may be inherited from applications that vary from one company to another. For example, an application used to stream video or audio files across the Internet may impose a requirement that its data be stored to the outermost disk tracks, the longest contiguous area of storage available on the disk, to reduce jitter during playback.