Consider using multiple levels of primary storage pools to form a storage hierarchy. For example, assume that your fastest devices are disks, but space on these devices is scarce. You also have tape drives, which are slower to access, but have much greater capacity. You can define a hierarchy so that files are initially stored on the fast disk volumes in one storage pool, to provide clients with quick response to backup and recall requests. Then, as the disk storage pool becomes full, ADSM migrates, or moves, data to tape volumes in a different storage pool. Migrating files to sequential storage pool volumes is particularly useful because ADSM migrates all the files for a single node together. This is especially helpful if you have not enabled collocation.
When defining or updating a storage pool, you establish a hierarchy by identifying the next storage pool. ADSM migrates data to the next storage pool if the original storage pool is full or unavailable.
Restrictions:
Understanding how the server selects and accesses a primary storage pool can help you estimate the amount of space required for each storage pool in the hierarchy.
When a user backs up or archives files from a client node, the server may group multiple client files into an aggregate, a single physical file. The size of the aggregate depends on the sizes of the client files being stored, and the number of bytes and files allowed for a single transaction. Two options, one in the server options file and one in the client options file, affect the number of bytes and files allowed for a single transaction:
The TXNBYTELIMIT option sets a target size for the aggregate file. An aggregate file is usually smaller than the value specified by the TXNBYTELIMIT option. A logical file (a single user's file) that is larger than the value specified by the TXNBYTELIMIT option does not become part of an aggregate, but is stored as a single physical file.
Together these options allow you to control the size of aggregate files stored by the server. For more information on using options to tune performance, see the performance tuning guide on the ADSM web page (http://www.ibm.com/storage/adsm).
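The grouping behavior described above can be sketched in a few lines. This is a simplified illustration, not the server's actual algorithm: `byte_limit` stands in for the TXNBYTELIMIT value, `max_files` is a hypothetical stand-in for the per-transaction file limit, and the function and class names are invented for this example. The sketch also ignores the order in which physical files are written.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalFile:
    """One physical file: either a single client file or an aggregate."""
    logical_sizes: list = field(default_factory=list)

    @property
    def size(self):
        return sum(self.logical_sizes)

    @property
    def is_aggregate(self):
        return len(self.logical_sizes) > 1


def group_into_physical_files(file_sizes, byte_limit, max_files):
    """Group logical file sizes into physical files (assumed, simplified rules).

    byte_limit: assumed target size for an aggregate (stand-in for TXNBYTELIMIT).
    max_files:  hypothetical limit on logical files per aggregate.
    """
    physical = []
    current = PhysicalFile()
    for size in file_sizes:
        if size > byte_limit:
            # A logical file larger than the target is stored on its own.
            physical.append(PhysicalFile([size]))
            continue
        too_many = len(current.logical_sizes) >= max_files
        too_big = current.size + size > byte_limit
        if current.logical_sizes and (too_many or too_big):
            # Close the current aggregate and start a new one.
            physical.append(current)
            current = PhysicalFile()
        current.logical_sizes.append(size)
    if current.logical_sizes:
        physical.append(current)
    return physical
```

For example, with `byte_limit=1000` and `max_files=3`, the sizes `[100, 200, 5000, 300, 400]` yield one stand-alone 5000-byte file, one aggregate of three files, and one remaining single file.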
When an HSM client migrates files (space-managed files), the files are not grouped into an aggregate.
When a user backs up, archives, or migrates a file from a client node, the server looks at the management class that is bound to the file to determine in which storage pool to store the file. The server then checks the storage pool to determine the following:
Version 2 Clients: When a Version 2 client backs up or archives files, the Version 3 server must estimate the size of the aggregate file that the client will send. The server bases the estimate on earlier transactions with the client. The server uses the estimated size to check whether the storage pool has enough space to store the file. Because the server uses the estimated size rather than the actual size for Version 2 clients, the server may not always store files in the storage pool that you expect.
Based on these factors, the server determines if the file can be written to that storage pool or the next storage pool in the hierarchy.
As an example, assume a company has a storage pool hierarchy as shown in Figure 16.
Figure 16. Storage Hierarchy, Read/Write Access, and Maximum File Size
The storage pool hierarchy consists of two storage pools:
Assume a user wants to archive a 5 MB file named FileX. FileX is bound to a management class that contains an archive copy group whose storage destination is DISKPOOL (see Figure 16).
When the user archives the file, the server determines where to store the file based on the following process:
The maximum file size applies to the physical file being stored, which may be a single client file or an aggregate file.
If the DISKPOOL storage pool has no maximum file size specified, the server checks if there is enough space in the pool to store the physical file. If there is not enough space for the physical file, the server uses the next storage pool in the storage hierarchy to store the file.
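The selection process can be pictured as a walk down the hierarchy. The following is a minimal sketch under stated assumptions, not the server's implementation: it checks only the maximum file size and the free space of each pool, it ignores the access mode, and it assumes a file exactly equal to the maximum size is accepted. The class, the function, and all sizes in the usage lines are hypothetical; the values from Figure 16 are not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoragePool:
    name: str
    free_bytes: int
    max_file_size: Optional[int] = None          # None: no maximum file size set
    next_pool: Optional["StoragePool"] = None    # next pool in the hierarchy


def choose_pool(pool, physical_size):
    """Return the first pool in the hierarchy that can take the physical file.

    Returns None if no pool in the chain accepts it.
    """
    while pool is not None:
        fits_limit = pool.max_file_size is None or physical_size <= pool.max_file_size
        has_space = physical_size <= pool.free_bytes
        if fits_limit and has_space:
            return pool
        pool = pool.next_pool
    return None


# Hypothetical two-pool hierarchy and limits, for illustration only.
MB = 1024 * 1024
tapepool = StoragePool("TAPEPOOL", free_bytes=10_000 * MB)
diskpool = StoragePool("DISKPOOL", free_bytes=100 * MB,
                       max_file_size=3 * MB, next_pool=tapepool)

print(choose_pool(diskpool, 5 * MB).name)  # 5 MB exceeds the assumed 3 MB limit
```

With these assumed limits, the 5 MB physical file bypasses DISKPOOL and lands in TAPEPOOL, while a 1 MB file would stay in DISKPOOL.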
It is strongly recommended that all primary storage pools linked in a storage hierarchy use the same copy storage pool for backup. A file that has already been copied then does not need to be recopied when it migrates to another primary storage pool.
For most cases, a single copy storage pool can be used for backup of all primary storage pools. The number of copy storage pools you need depends on the hierarchies you have set up with your primary storage pools and what type of disaster recovery protection you wish to implement.
Multiple copy storage pools may be needed to handle particular situations, including:
A common way to use the storage hierarchy is to store client data initially on disk, then let ADSM migrate the data to tape. As a guideline, dedicate enough primary disk storage to this staging to hold one night's worth of the clients' incremental backups. While not always feasible, this guideline has even more value when considering storage pool backups.
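The guideline amounts to simple arithmetic: add up what each client typically sends in one night. The sketch below illustrates this; the function name, the optional headroom percentage, and the client figures are all hypothetical and are not ADSM recommendations.

```python
def nightly_disk_pool_bytes(nightly_incremental_bytes, headroom_percent=0):
    """Estimate disk pool size from per-client nightly incremental backup sizes.

    nightly_incremental_bytes: mapping of client node name -> bytes per night.
    headroom_percent: assumed extra margin (hypothetical), as a whole percent.
    """
    total = sum(nightly_incremental_bytes.values())
    # Integer arithmetic avoids floating-point rounding in the result.
    return total * (100 + headroom_percent) // 100


# Hypothetical client nodes and nightly volumes, for illustration only.
GB = 1024 ** 3
clients = {"NODE_A": 4 * GB, "NODE_B": 10 * GB, "NODE_C": 6 * GB}

estimate = nightly_disk_pool_bytes(clients, headroom_percent=50)
print(estimate // GB, "GB")
```

Measured nightly volumes from your own clients should replace the sample figures before the result is used for planning.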
For example, if you have enough disk space for nightly incremental backups for clients and have tape devices, you can set up the following pools:
Then you can schedule these steps every night:
Backing up disk storage pools before migration processing allows you to copy as many files as possible while they are still on disk. This saves mount requests while performing your storage pool backups.
When this migration completes, raise the high migration threshold back to 100%.
The tape primary storage pool must still be backed up to catch any files that might have been missed in the backup of the disk storage pools (for example, large files that went directly to sequential media).
See "Estimating Space Needs for Storage Pools" for more information about storage pool space.