Last modified on 04/01/11 17:27:50

Tape Backend System

Introduction

dCache installations that serve as a frontend to a tertiary storage system (HSM) need, at some point, to exchange data with that system: to store new, precious files and to retrieve files from the HSM that are not yet, or no longer, available on one of the dCache pools. Unfortunately, there is no well-defined interface for such HSM operations. dCache works around this by calling configurable, dCache-external shell scripts or binaries whenever an HSM store or retrieve operation becomes necessary. The local HSM administrator is responsible for providing this script or binary and for making it known to dCache.

The workflows of the Flush (put on tape) and Restore (get from tape) operations are described below.

On Flush:

  1. Pool sends PoolFileFlushedMessage which contains the pnfsid and the StorageInfo of a file. StorageInfo contains a list of HSM types, instanceNames and URIs (the URI is empty in this step).
  2. The script is invoked, and the HSM executes the put-on-tape command.
  3. A tape location path where the file has been stored is returned to the script.
  4. The script generates a URI from this path.
  5. Pool sends a PoolFileFlushedMessage which contains the pnfsid and the updated StorageInfo with the URI to inform PnfsManager about the new location of the file.
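
Step 4 can be sketched in Python as follows. This is only an illustration, not the interface dCache prescribes for the script: the function name build_hsm_uri and the idea that the HSM reports its location as key/value pairs are assumptions made for the example. The URI layout follows the "URI description" section of this page.

```python
from urllib.parse import urlencode, urlunsplit


def build_hsm_uri(hsm_type, instance_name, location):
    """Build a URI of the form <hsmType>://<hsmInstanceName>/?<query>.

    `location` is a hypothetical dict of HSM-specific key/value pairs.
    dCache does not interpret this part of the URI.
    """
    query = urlencode(location)
    return urlunsplit((hsm_type, instance_name, "/", query, ""))


if __name__ == "__main__":
    uri = build_hsm_uri(
        "osm",
        "desy-main",
        {"store": "sql", "group": "chimera", "bfid": "3434.0.994.1188400818542"},
    )
    print(uri)
```

With the values shown, the result has the same shape as the example URIs given under "URI description".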

On Restore (in case a file, not available on any pools, is requested for reading):

  1. PoolManager gets StorageInfo, flow direction, protocol, net/store unit information for pool selection.
  2. As soon as the proper link is selected and the pool group is resolved, PoolManager selects a pool that is connected to the HSM instance defined in the StorageInfo.
  3. The file is staged on a pool.
  4. If the selected pool is a write-only pool, the file will be moved to an appropriate pool with the lowest cost via Pool2Pool transfer.
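
The selection logic in steps 2 and 4 can be modelled roughly as below. This is a simplified sketch and not dCache code: the Pool class, its fields and both function names are hypothetical, and choosing the cheapest stage pool in step 2 is an assumption (the text above only states the lowest-cost rule for step 4).

```python
from dataclasses import dataclass


@dataclass
class Pool:
    name: str
    hsm_instances: frozenset  # HSM instances this pool is attached to
    cost: float               # lower is better
    write_only: bool = False


def select_stage_pool(pools, hsm_instance):
    """Step 2: pick a pool attached to the HSM instance named in StorageInfo.

    Assumption: among the candidates, the cheapest one is taken.
    """
    candidates = [p for p in pools if hsm_instance in p.hsm_instances]
    if not candidates:
        raise LookupError(f"no pool attached to HSM instance {hsm_instance!r}")
    return min(candidates, key=lambda p: p.cost)


def select_read_pool(pools, staged_on):
    """Step 4: if the file was staged on a write-only pool, return the
    lowest-cost readable pool as the Pool2Pool target; otherwise keep it."""
    if not staged_on.write_only:
        return staged_on
    readable = [p for p in pools if not p.write_only]
    if not readable:
        raise LookupError("no readable pool available")
    return min(readable, key=lambda p: p.cost)
```

For example, with one write-only pool attached to osm-tape-1 and two readable pools of cost 3.0 and 2.0, the file is staged on the write-only pool and then copied to the pool with cost 2.0.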

URI description

A URI (Uniform Resource Identifier, RFC 2396) is used to represent the location of a file within the HSM. The hierarchical syntax is used:

    [scheme:][//authority][path][?query][#fragment]

where:

    scheme       HSM type
    authority    instance identifier (hsmInstanceName)
    path+query   HSM-specific information (not interpreted by dCache)

example:

    osm://desy-main/?store=sql&group=chimera&bfid=3434.0.994.1188400818542
    osm://desy-dup/?store=sql&group=chimera&bfid=3434.0.994.1188400818542
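
To make the mapping concrete, the following Python sketch splits such a URI into the three parts described above using the standard urllib.parse module. The function name parse_hsm_uri and the keys of the returned dict are chosen for this example only.

```python
from urllib.parse import parse_qs, urlsplit


def parse_hsm_uri(uri):
    """Split an HSM location URI into HSM type, instance name and HSM info."""
    parts = urlsplit(uri)
    return {
        "hsm_type": parts.scheme,       # scheme
        "hsm_instance": parts.netloc,   # authority (hsmInstanceName)
        # query: HSM-specific key/value pairs, not interpreted by dCache
        "hsm_info": {k: v[0] for k, v in parse_qs(parts.query).items()},
    }


if __name__ == "__main__":
    info = parse_hsm_uri(
        "osm://desy-main/?store=sql&group=chimera&bfid=3434.0.994.1188400818542"
    )
    print(info["hsm_type"], info["hsm_instance"], info["hsm_info"]["bfid"])
```

Applied to the two example URIs, only the instance name differs (desy-main versus desy-dup); the HSM-specific part is identical.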

How-To Setup

To configure the pools you wish to connect to the HSM backend, remove the disk-only option "lfs=precious" from the file /opt/d-cache/etc/layouts and /opt/d-cache/etc/dcache.conf respectively. In the pool setup file (located in $poolHomeDir/$poolName/setup), define the hsmType, the hsmInstanceName and the path to the script, and set the maximum number of active flush and restore transfers as shown in the example below:

# define the path to the script executed for each flush operation
# Syntax: hsm set <hsmType> <hsmInstanceName> -<key>=<value>
hsm set osm osm-tape-1 -command=/opt/d-cache/jobs/stager.sh

# set the maximum number of active flush operations >= 1 (default: 0)
st set max active 5

# set the maximum number of active restore operations >= 1 (default: 0)
rh set max active 5

Finally, all HSM-connected pools must be restarted.

The commands described above can also be entered in the admin interface of the corresponding pools while the pools are active (don't forget to "save").

To be able to read a file from tape after the cached copy has been deleted from all pools, enable the restore option in the /opt/d-cache/config/PoolManager.conf file and restart core-dCache:

rc set stage on

How-To Call Flush-/Restore commands

To see the state of files within a pool, log in to the pool in the admin interface and run:

rep ls   

Example:

[hal9000.dcache.org] (pool_1) admin > rep ls                                      
00008F276A952099472FAD619548F47EF972 <-P---------L(0)[0]> 291910 si={dteam:STATIC}
00002A9282C2D7A147C68A327208173B81A6 <-P---------L(0)[0]> 2011264 si={dteam:STATIC}
0000EE298D5BF6BB4867968B88AE16BA86B0 <C----------L(0)[0]> 1976 si={dteam:STATIC}

Note that "C" indicates Cached, i.e. the file has already been flushed to tape. "P" indicates Precious, i.e. the file is not yet on tape and therefore cannot be deleted.
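
When many files are listed, the output can be post-processed with a small script. The Python sketch below parses lines in the format of the example above and picks out the precious entries. The field layout (C in the first flag position, P in the second) is inferred from the three example lines only, and the function names are made up for this example.

```python
import re

# One "rep ls" line: <pnfsid> <flags> <size> si={<storage info>}
REP_LS_LINE = re.compile(
    r"^(?P<pnfsid>[0-9A-Fa-f]+)\s+<(?P<flags>[^>]+)>\s+"
    r"(?P<size>\d+)\s+si=\{(?P<si>[^}]*)\}$"
)


def parse_rep_ls_line(line):
    """Return a dict for one 'rep ls' line, or None if it does not match."""
    match = REP_LS_LINE.match(line.strip())
    if match is None:
        return None
    flags = match.group("flags")
    return {
        "pnfsid": match.group("pnfsid"),
        "size": int(match.group("size")),
        "storage_info": match.group("si"),
        "cached": flags[0:1] == "C",    # assumption: first flag position
        "precious": flags[1:2] == "P",  # assumption: second flag position
    }


def precious_pnfsids(lines):
    """Collect the pnfsids of all precious files in the given output lines."""
    entries = (parse_rep_ls_line(line) for line in lines)
    return [e["pnfsid"] for e in entries if e and e["precious"]]
```

Lines that do not match, such as the admin prompt itself, are skipped.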

To flush a file to tape, run the following command:

flush pnfsid <pnfsid>

Example:

[hal9000.dcache.org] (pool_1) admin > flush pnfsid 00002A9282C2D7A147C68A327208173B81A6
Flush Initiated

To remove a cached file once it has been flushed to tape (so that it will be staged back when requested), run:

rep rm <pnfsid>

Example:

[hal9000.dcache.org] (pool_1) admin > rep rm  00002A9282C2D7A147C68A327208173B81A6
Removed 00002A9282C2D7A147C68A327208173B81A6

To restore a file from tape, run:

rh restore [-block] <pnfsId>

Example:

[hal9000.dcache.org] (pool_1) admin > rh restore 00002A9282C2D7A147C68A327208173B81A6
Fetch request queued