The dCache HSM Interface

Introduction

One of the features dCache provides is the ability to migrate files from its disk repository to one or more connected tertiary storage systems and to move them back to disk when necessary. Although the interface between dCache and the Tertiary Storage System (TSS) is kept simple, dCache assumes that it interacts with an intelligent TSS: dCache does not drive tape robots or tape drives itself. More detailed requirements for the storage system are described in one of the subsequent sections.


Scope of this document

This document describes how to enable a standard dCache installation to interact with a tertiary storage system. The description focuses on a basic configuration, assuming that

  • all dCache disk pools are connected to only one TSS instance.
  • all dCache disk pools are connected to the same TSS instance.
  • the dCache instance has not yet been populated with data, or holds only a negligible number of files.

HSM requirements (What is an appropriate tertiary storage system ?)

dCache can only drive intelligent tertiary storage systems. This essentially means that tape robot and tape drive operations must be performed by the TSS itself and that there is some simple way to abstract the file PUT, GET and REMOVE operations.

Migrating HSMs with a file system interface.

Most migrating storage systems provide a regular POSIX file system interface. Based on rules, data is migrated from primary to tertiary storage (mostly tape systems). Examples of migrating storage systems are :

  • HPSS (IBM)
  • DMF (SGI)

HSMs with a minimalistic PUT, GET and REMOVE interface.

Other tape systems provide a simple PUT, GET, REMOVE interface. Typically, a copy-like application writes a disk file into the TSS and returns an identifier which uniquely identifies the written file within the tertiary storage system. The identifier is sufficient to get the file back to disk or to remove the file from the TSS. Examples are :

  • OSM (ComputerAssociates)
  • Enstore (FERMIlab)

How dCache interacts with tertiary storage

Whenever dCache decides to copy a file from disk storage to tertiary storage, a user-provided executable, which can be either a script or a binary, is automatically started on the pool where the file is located. That executable is expected to write the file into the backend storage system and to return a URI that uniquely identifies the file within that storage system. The format of the URI, as well as the arguments to the executable, are described later in this document. The unique part of the URI can either be provided by the storage system, as the result of the 'STORE FILE' operation, or be taken from dCache.

A zero return code from the executable lets dCache assume that the file has been successfully stored. Depending on the properties of the file, dCache can then decide to remove the disk copy if space is running short on that pool. On a non-zero return code, the file does not change its state, and either the operation is retried or an error flag is set on the file, depending on the return code.

If dCache needs to restore a file to disk, the same executable is launched with a different set of arguments, including the URI that was provided when the file was written to tape. It is the responsibility of the executable to fetch the file back from tape, based on the provided URI, and to return '0' if the 'FETCH FILE' operation was successful or non-zero otherwise. In case of a failure, the pool retries the operation or dCache decides to fetch the file from tape using a different pool.

Not all pools need to be configured to interact with the same tertiary storage system, or with a storage system at all. Furthermore, pools can be configured to have more than one tertiary storage system attached. None of those cases is in the scope of this document.


Details on the executable glueing dCache and the tertiary storage system (TSS)

The executable and the STORE FILE operation

Whenever a disk file needs to be copied to a tertiary storage system, dCache automatically launches an executable on the pool containing the file to be copied. Exactly one instance of the executable is started for each file. Multiple instances of the executable may run concurrently for different files. The maximum number of concurrent instances of the executable per pool, as well as the full path of the executable, can be configured in the 'setup' file of the pool as described in the section "The pool 'setup' file".

The following arguments are given to the executable of a 'STORE FILE' operation on startup.

put <pnfsID>  <filename>  -si=<storage-information> <more options> 
  • put : The 'put' keyword indicates the 'STORE FILE' operation, meaning that the file has to be copied from the local disk to the tertiary storage system.
  • <pnfsID> : Is the internal identifier (i-node) of the file within dCache. The <pnfsID> is unique within a single dCache instance and globally unique with a very high probability.
  • <filename> : Is the full path of the local file to be copied to the tertiary storage system.
  • <storage-information> : Is the (rather long) string in the format : <key>=<value>[;<key>=<value>;[…]] of which values for the following keys are always provided :
    • hsm : The name of the tertiary storage system to be used to store the file. This will always only be 'osm' in the scope of this document.
    • group : Is the storage group of the file to be stored as specified in the ".(tag)(sGroup)" tag of the parent directory of the file to be stored.
    • store : Is the store name of the file to be stored as specified in the ".(tag)(OSMTemplate)" tag of the parent directory of the file to be stored.
  • <more-options> : Are -<key>=<value> pairs taken from the hsm configuration commands of the pool 'setup' file. One option that is always provided is -command=<full path of this executable>.
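
As an illustration of the <storage-information> format described above, the following is a minimal Python sketch of how an executable might split the string into its key/value pairs. The function name parse_storage_info and the sample values are hypothetical. The sketch also assumes that neither keys nor values contain a ';', which the description above does not state explicitly.

```python
def parse_storage_info(si: str) -> dict:
    """Split '<key>=<value>;<key>=<value>;...' into a dictionary."""
    info = {}
    for pair in si.split(";"):
        if not pair:
            continue  # tolerate a trailing or doubled ';'
        key, sep, value = pair.partition("=")  # split at the first '=' only
        if not sep:
            raise ValueError(f"malformed storage-information entry: {pair!r}")
        info[key] = value
    return info


# Made-up values for the three keys that are always provided.
info = parse_storage_info("hsm=osm;group=alpha;store=chimera;")
print(info["hsm"], info["group"], info["store"])
```

Splitting at the first '=' only keeps values intact that themselves contain an '=' character.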

With the arguments provided, the executable is supposed to copy the file into the tertiary storage system. The executable must not terminate before the transfer of the file has either succeeded or failed.

Success must be indicated by a '0' return of the executable. All non-zero values are interpreted as a failure, which means that dCache assumes that the file has not been copied to tape. Details on the meaning of certain return codes are described later in this section.

In case of a '0' return code, the executable has to return a valid URI to dCache. The URI is formatted as follows :

osm://osm/?store=<storename>&group=<groupname>&bfid=<bfid>
  • <storename> and <groupname> are the store and group name of the file as provided by the arguments to this executable.
  • The <bfid> is the unique identifier needed to restore or remove the file if necessary.

The <bfid> can either be provided by the HSM as the result of the 'STORE FILE' operation, or the pnfsID may be used. The latter assumes that the file is stored under exactly that pnfsID within the HSM. Whichever is chosen, the URI must uniquely identify the file within the tertiary storage system.
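
The URI layout can be sketched in Python as follows. This is only an illustration: build_uri is a hypothetical helper, the sample values are made up, and the pnfsID is used as the <bfid>. No escaping is applied, on the assumption that store, group and bfid contain no characters that are reserved in a URI query.

```python
def build_uri(store: str, group: str, bfid: str) -> str:
    """Assemble the URI in the format shown above."""
    # Assumption: the three values need no percent-encoding.
    return f"osm://osm/?store={store}&group={group}&bfid={bfid}"


# The pnfsID doubles as the <bfid> in this example.
uri = build_uri("chimera", "alpha", "00001BC6D76570A74534969FD72220C31D5D")
print(uri)
```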

For return codes see Summary of return codes below.

NOTE : Only the URI must be printed to stdout by the executable. Additional information printed either before or after the URI will result in an error. stderr can be used for additional information in case of success or failure.
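
The following Python sketch shows one way the 'put' side of such an executable could respect these rules. It is not a real TSS client: tss_dir is a hypothetical directory standing in for the tertiary storage system, the pnfsID is used as the <bfid>, and the return value 1 is an arbitrary choice for a generic failure. The point is only that the URI is the sole output on stdout, while diagnostics go to stderr.

```python
import shutil
import sys
from pathlib import Path


def store_file(pnfs_id, filename, store, group, tss_dir):
    """Copy `filename` into `tss_dir` (a stand-in for the TSS); return the URI."""
    target = Path(tss_dir) / pnfs_id
    shutil.copyfile(filename, target)  # returns only after the copy has finished
    print(f"stored {filename} as {target}", file=sys.stderr)  # diagnostics: stderr
    return f"osm://osm/?store={store}&group={group}&bfid={pnfs_id}"


def put(pnfs_id, filename, store, group, tss_dir):
    """Handle one 'put' request and return the exit code for dCache."""
    try:
        uri = store_file(pnfs_id, filename, store, group, tss_dir)
    except OSError as exc:
        print(f"put failed: {exc}", file=sys.stderr)
        return 1  # arbitrary non-zero value: dCache treats the file as not stored
    print(uri)  # the only line ever written to stdout
    return 0
```

A real executable would pass the value returned by put to sys.exit, after taking <pnfsID>, <filename>, store and group from its command line.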

The executable and the FETCH FILE operation

Whenever a file needs to be restored from a tertiary storage system, dCache automatically launches an executable on the pool to which the file is to be restored. Exactly one instance of the executable is started for each file. Multiple instances of the executable may run concurrently for different files. The maximum number of concurrent instances of the executable per pool, as well as the full path of the executable, can be configured in the 'setup' file of the pool as described in the section "The pool 'setup' file".

The following arguments are given to the executable of a 'FETCH FILE' operation on startup.

get <pnfsID>  <filename>  -si=<storage-information> -uri=<storage-uri> <more options> 
  • get : The 'get' keyword indicates the 'FETCH FILE' operation, meaning that the file has to be copied from the tertiary storage system onto the local file system.
  • <pnfsID> : Is the internal identifier (i-node) of the file within dCache. The <pnfsID> is unique within a single dCache instance and globally unique with a very high probability.
  • <filename> : Is the full path of the local file into which the file from the tertiary storage system should be copied.
  • <storage-information> : Is the (rather long) string in the format : <key>=<value>[;<key>=<value>;[…]] . This information is only provided for reference. For the 'FETCH FILE' operation, only the information of the <storage-uri> (see below) must be used.
  • <storage-uri> : This is the URI which was returned by the executable after the file was written to tertiary storage. In order to get the file back from tertiary storage, the information of the URI is preferred over the information in the <storage-information>.
  • <more-options> : Are -<key>=<value> pairs taken from the hsm configuration commands of the pool 'setup' file. One option that is always provided is -command=<full path of this executable>.

For return codes see Summary of return codes below.
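
Since the <storage-uri> is the authoritative input for 'FETCH FILE', an executable needs to take it apart again. A minimal Python sketch using the standard urllib.parse module is given below. The function name parse_storage_uri and the sample bfid are hypothetical, and the check for the 'osm' scheme reflects the basic setup assumed in this document.

```python
from urllib.parse import parse_qsl, urlsplit


def parse_storage_uri(uri: str) -> dict:
    """Return the query fields (store, group, bfid) of a storage URI."""
    parts = urlsplit(uri)
    if parts.scheme != "osm":
        raise ValueError(f"unexpected URI scheme: {parts.scheme!r}")
    fields = dict(parse_qsl(parts.query))
    if "bfid" not in fields:
        raise ValueError("URI carries no bfid")
    return fields


fields = parse_storage_uri("osm://osm/?store=chimera&group=alpha&bfid=0000ABCD")
print(fields["bfid"])
```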

The executable and the REMOVE FILE operation

Whenever a file is removed from the dCache name space (file system), a process inside dCache makes sure that all copies of the file are removed from all internal and external media. For TSSes, one of the pools connected to the TSS that stores the file launches the executable with the following command line options :

remove -uri=<storage-uri> <more options> 
  • remove : The 'remove' keyword indicates the 'REMOVE FILE' operation, meaning that the file has to be removed from the tertiary storage system.
  • <storage-uri> : This is the URI which was returned by the executable after the file was written to tertiary storage. It identifies the file to be removed from the TSS.
  • <more-options> : Are -<key>=<value> pairs taken from the hsm configuration commands of the pool 'setup' file. One option that is always provided is -command=<full path of this executable>.

The executable is supposed to remove the file from the TSS and report a zero return code. If a non-zero return code is reported, dCache will call the executable again at a later point in time.
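
A 'remove' handler could be sketched in Python as shown below. As before, tss_dir is a hypothetical directory standing in for the TSS, and files are assumed to be stored under their <bfid>. Treating a file that is already gone as successfully removed is a design assumption of this sketch, chosen so that dCache does not keep retrying a removal that can never succeed.

```python
import sys
from pathlib import Path
from urllib.parse import parse_qsl, urlsplit


def remove(uri, tss_dir):
    """Delete the file named by the URI's bfid; return the exit code."""
    bfid = dict(parse_qsl(urlsplit(uri).query)).get("bfid")
    if not bfid:
        print(f"no bfid in {uri}", file=sys.stderr)
        return 1
    try:
        (Path(tss_dir) / bfid).unlink()
    except FileNotFoundError:
        # Assumption: a file that is already gone counts as removed.
        return 0
    except OSError as exc:
        print(f"remove failed: {exc}", file=sys.stderr)
        return 1  # non-zero: dCache calls the executable again later
    return 0
```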

Summary of command line options

put <pnfsID>  <filename>  -si=<storage-information> <more options> 
get <pnfsID>  <filename>  -si=<storage-information> -uri=<storage-uri> <more options> 
remove -uri=<storage-uri> <more options> 
  • put/get/remove : This keyword indicates the operation to be performed.
    • put : move file from disk to TSS.
    • get : move file back from TSS to disk.
    • remove : remove the file from TSS.
  • <pnfsID> : Is the internal identifier (i-node) of the file within dCache. The <pnfsID> is unique within a single dCache instance and globally unique with a very high probability.
  • <filename> : Is the full path of the local file. For 'put' this is the file to be copied to the tertiary storage system, for 'get' it is the file into which the data from the tertiary storage system should be copied.
  • <storage-information> : Is the (rather long) string in the format : <key>=<value>[;<key>=<value>;[…]] of which values for the following keys are always provided :
    • hsm : The name of the tertiary storage system to be used to store the file. This will always only be 'osm' in the scope of this document.
    • group : Is the storage group of the file to be stored as specified in the ".(tag)(sGroup)" tag of the parent directory of the file to be stored.
    • store : Is the store name of the file to be stored as specified in the ".(tag)(OSMTemplate)" tag of the parent directory of the file to be stored.
  • <storage-uri> : This is the URI which was returned by the executable after the file was written to tertiary storage. In order to get the file back from tertiary storage, the information of the URI is preferred over the information in the <storage-information>.
  • <more-options> : Are -<key>=<value> pairs taken from the hsm configuration commands of the pool 'setup' file. One option that is always provided is -command=<full path of this executable>.
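
The three command lines summarized above share one shape: a keyword, zero or two positional arguments, and a series of -<key>=<value> options. A minimal Python sketch of a parser for this shape follows. The function name parse_command_line, the sample pnfsID, the file path and the executable path are all hypothetical. The sketch assumes that positional arguments never start with '-', which holds for full paths and pnfsIDs.

```python
def parse_command_line(args):
    """Split the arguments into (operation, positionals, options)."""
    if not args or args[0] not in ("put", "get", "remove"):
        raise ValueError("first argument must be put, get or remove")
    positionals, options = [], {}
    for arg in args[1:]:
        if arg.startswith("-"):
            # Split at the first '=' only: the values of -si and -uri
            # contain further '=' characters.
            key, _, value = arg[1:].partition("=")
            options[key] = value
        else:
            positionals.append(arg)
    return args[0], positionals, options


op, positionals, options = parse_command_line([
    "get", "0000ABCD", "/pool/data/0000ABCD",
    "-si=hsm=osm;group=alpha;store=chimera",
    "-uri=osm://osm/?store=chimera&group=alpha&bfid=0000ABCD",
    "-command=/usr/local/bin/hsm-glue",
])
print(op, positionals, sorted(options))
```

In a real executable the list passed to parse_command_line would be sys.argv[1:].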

Summary of return codes

Return Code   | Meaning                 | Behavior for PUT FILE | Behavior for GET FILE
30 <= rc < 40 | User defined            | Deactivates request   | Reports problem to PoolManager
41            | No space left on device | Pool retries          | Disables pool and reports problem to PoolManager
42            | Disk read I/O error     | Pool retries          | Disables pool and reports problem to PoolManager
43            | Disk write I/O error    | Pool retries          | Disables pool and reports problem to PoolManager
other         | -                       | Pool retries          | Reports problem to PoolManager
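
An executable written in Python might translate operating system errors into these return codes roughly as sketched below. Which errno value corresponds to which return code is an assumption of this sketch, as is the value 1 for the "other" row. The constant and function names are hypothetical.

```python
import errno

# Return codes taken from the summary above.
NO_SPACE_LEFT = 41
DISK_READ_ERROR = 42
DISK_WRITE_ERROR = 43
GENERIC_FAILURE = 1  # hypothetical choice for the "other" row


def return_code_for(exc: OSError, writing: bool) -> int:
    """Map an OSError raised on the pool's local disk to a return code."""
    if exc.errno == errno.ENOSPC:
        return NO_SPACE_LEFT
    if exc.errno == errno.EIO:
        # The same errno is reported for read and write failures, so the
        # caller has to say which direction was in progress.
        return DISK_WRITE_ERROR if writing else DISK_READ_ERROR
    return GENERIC_FAILURE
```

Because codes 41 to 43 disable the pool on a GET FILE request, a cautious executable would return them only for errors on the pool's own disk, not for failures inside the TSS.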

Configuring pools to interact with a tertiary storage system

The executable interacting with the tertiary storage system (TSS), as described in the section above, has to be provided to dCache on all pools connected to the TSS. The executable, either a script or a binary, has to be made 'executable' for the user dCache is running as on that host.

The following files have to be modified to allow dCache to interact with the TSS.

  • The PoolManager.conf file (one per system)
  • The namespaceDomain layout file (one per system)
  • The pool layout file (one per pool host)
  • The pool 'setup' file (one per pool)

After the layout files and the various 'setup' files have been corrected, the following domains have to be restarted :

  • pool services
  • dCacheDomain
  • namespaceDomain

The dCache layout files

The PoolManager.conf file

Somewhere in the PoolManager.conf file, the line

rc set stage on

has to be added and the dCacheDomain has to be restarted.

Alternatively, the following sequence may be typed into the dCache command line system :

[dcachetogo.dcache.org] (local) admin > cd PoolManager
[dcachetogo.dcache.org] (PoolManager) admin > rc set stage on
[dcachetogo.dcache.org] (PoolManager) admin > save

In this case, a restart of the dCacheDomain is not necessary.

The namespace layout

In order to allow dCache to remove files from attached TSSes, the line cleaner.hsm = enabled must be added immediately underneath the [namespaceDomain/cleaner] service declaration. The namespace part should look like this :

[namespaceDomain]
 ... other services ...
[namespaceDomain/cleaner]
cleaner.hsm = enabled
.. more ...

The pool layout

The dCache layout file must be modified for each pool node connected to a TSS. If your pool nodes have been configured correctly to work without a TSS, you will find the entry lfs=precious in the layout file for each pool service. This entry has to be removed for each pool which should be connected to a TSS. The lfs parameter then defaults to hsm, which is exactly what is needed.

The pool 'setup' file

The pool 'setup' file mainly defines three details related to TSS connectivity.

  • A pointer to the 'executable' which is launched for storing and fetching files.
  • The maximum number of concurrent 'STORE FILE' requests allowed per pool.
  • The maximum number of concurrent 'FETCH FILE' requests allowed per pool.

Defining the executable :

   hsm set <hsmType> [<hsmInstance>] -command=<full path to executable>
  • <hsmType> : is the type of the TSS. Must be set to 'osm' for basic setups.
  • <hsmInstance> : is the instance name of the TSS. Must be set to 'osm' for basic setups.
  • <full path to executable> : is the full path to the executable which should be launched for each TSS operation.

Setting the maximum number of concurrent PUT and GET operations.

#
#  PUT operations
#
st set max active <numberOfConcurrentPUTS>
#
# GET operations
#
rh set max active <numberOfConcurrentGETs>
#

Both numbers must be non-zero to allow the pool to perform transfers.

What happens next

After the necessary dCache domains have been restarted, pools that already contain files will start transferring them into the TSS, as those files so far have only a disk copy. The number of concurrent transfers is determined by the configuration in the pool 'setup' file, as described in the section "The pool 'setup' file".


How to monitor what's going on

This section briefly describes the tools and mechanisms to monitor the TSS PUT, GET and REMOVE operations. dCache provides a configurable logging facility and a command line interface to query and manipulate transfer and waiting queues.

Log files

dCache is typically configured to only log information if something unexpected happens. However, to get familiar with tertiary storage system interactions you might be interested in more details. This section provides advice on how to obtain this kind of information. The section is important if you plan to follow the 'Step by Step' section below.

The executable log file

Since you provide the executable interfacing dCache and the TSS, it is your responsibility to ensure that it logs enough information to trace possible problems with either dCache or the TSS. Each request should be logged with the full set of parameters it receives, together with a timestamp. Furthermore, the information returned to dCache should be logged.
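
A minimal Python sketch of such request logging is shown below. The helper names format_log_line and log are hypothetical, and the log line layout is just one possible choice. The sketch writes to stderr by default, since stdout has to stay reserved for the URI.

```python
import sys
from datetime import datetime


def format_log_line(args, message=""):
    """Build one log line: timestamp, full argument list, optional message."""
    stamp = datetime.now().isoformat(timespec="seconds")
    return f"{stamp} args={' '.join(args)} {message}".rstrip()


def log(args, message="", stream=sys.stderr):
    """Write the log line to `stream` (stderr unless stated otherwise)."""
    print(format_log_line(args, message), file=stream)
```

Passing a file opened in append mode as stream would collect the requests of all instances of the executable in one log file.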

dCache log files in general

In dCache, each domain (e.g. dCacheDomain, pools etc.) prints logging information into its own log file, named after the domain. The location of those log files is typically the /var/log or /var/log/dCache directory, depending on the individual configuration. In the default logging setup only errors are reported. This behavior can be changed by either modifying /opt/d-cache/etc/logback.xml or using the dCache CLI to increase the log level of particular components. See the section on the dCache command line interface below on how to increase the dCache log level.

If you intend to increase the log level of all components on a particular host you would need to change the /opt/d-cache/etc/logback.xml file as described below. dCache components need to be restarted to activate the changes.

 <threshold>
     <appender>stdout</appender>
     <logger>root</logger>
     <level>warn</level>
 </threshold>

needs to be changed to

 <threshold>
     <appender>stdout</appender>
     <logger>root</logger>
     <level>info</level>
 </threshold>

The change might result in a significant increase in log messages, so don't forget to change it back before starting production operation. The next section describes how to change the log level in a running system.

The dCache command line interface

The dCache command line interface gives access to information describing the process of storing and fetching files to and from the TSS, namely :

  • The PoolManager Restore Queue. A list of all requests which have been issued to all pools for a FETCH FILE from the TSS (rc ls).
  • The Pool Collector Queue. A list of files, per pool and storage group, which will be scheduled for STORE FILE as soon as the configured trigger criteria match.
  • The Pool STORE FILE Queue. A list of files per pool, scheduled for the STORE FILE operation. A configurable number of requests within this queue are active, which is equivalent to the number of concurrent store processes. The rest are inactive, waiting to become active.
  • The Pool FETCH FILE Queue. A list of files per pool, scheduled for the FETCH FILE operation. A configurable number of requests within this queue are active, which is equivalent to the number of concurrent fetch processes. The rest are inactive, waiting to become active.

For evaluation purposes, the pin-board of each component can be used to track down dCache behavior. The pin-board only keeps the most recent 200 lines of log information but reports not only errors but informational messages as well.

Logging in to the dCache command line interface :

[root@dcachetogo ~]# ssh -1 -l admin -c blowfish -p 22223 localhost
admin@localhost's password: 

    dCache Admin (VII) (user=admin)

[dcachetogo.dcache.org] (local) admin > 

Increasing the log level of a particular service (example : PoolManager)

[dcachetogo.dcache.org] (local) admin > cd PoolManager
[dcachetogo.dcache.org] (PoolManager) admin > log set stdout ROOT INFO
[dcachetogo.dcache.org] (PoolManager) admin > log ls
stdout:
  ROOT=INFO
  dmg.cells.nucleus=WARN*
  logger.org.dcache.cells.messages=ERROR*
.....

Checking the pin board of a service (example : PoolManager)

[dcachetogo.dcache.org] (local) admin > cd PoolManager
[dcachetogo.dcache.org] (PoolManager) admin > show pinboard 100
08.30.45  [Thread-7] [pool1 PoolManagerPoolUp] sendPoolStatusRelay : ...
08.30.59  [writeHandler] [NFSv41-dcachetogo PoolMgrSelectWritePool ...
....

Checking the PoolManager Restore Queue

cd PoolManager
rc ls

Checking the Pool Collector Queue

[dcachetogo.dcache.org] (pool1) admin > queue ls -l queue

                   Name : chimera:alpha
              Class@Hsm : chimera:alpha@osm
 Exiration rest/defined : -39 / 0   seconds
 Pending   rest/defined : 1 / 0
 Size      rest/defined : 877480 / 0
 Active Store Procs.    : 0
  00001BC6D76570A74534969FD72220C31D5D
# and 
#
flush ls

Checking the STORE FILE and FETCH FILE Queue

#
# for FETCH file
#
cd <poolName>
rh ls
#
# for STORE file
#
st ls

Checking the repository on the pools

[dcachetogo.dcache.org] (pool1) admin > rep ls

0000F1A789FE7F21411FA5628A1C70F13A82 <C----------L(0)[0]> 877480 si={chimera:alpha}
0000DF7308CF756C482794EC0EDA084AD487 <C----------L(0)[0]> 877480 si={chimera:alpha}
00001696C9FD5406499A91EB9AF1699F82F2 <C----------L(0)[0]> 877480 si={chimera:alpha}
0000C243E30800E747EE9804AE57A67703B9 <C----------L(0)[0]> 877480 si={chimera:alpha}
0000686F2E7F50BC44C69891E29E5A37E4A0 <C----------L(0)[0]> 7422470 si={chimera:chimera}
00009793BAEB17084A34A560ABBDDBB1A0DA <C----------L(0)[0]> 877480 si={chimera:chimera}
0000C367BE053A9A415F8D8304F851CC378C <C----------L(0)[0]> 7422470 si={chimera:alpha}
00001BC6D76570A74534969FD72220C31D5D <-P---------L(0)[0]> 877480 si={chimera:alpha}
00006ED4F4BE6AEC43A887D4D410A0CC5D5A <C----------L(0)[0]> 877480 si={chimera:alpha}

Where the

  • <C----------L(0)[0]> indicates that the file is on tape and only cached on disk.
  • <-P---------L(0)[0]> indicates that the file is only on disk. (but, depending on 'queue ls -l queue' scheduled for flush to tape)
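
To pick out the files that still wait for their flush to tape, the 'rep ls' listing can be post-processed with a short script. The Python sketch below is derived only from the sample lines shown above. The regular expression, the function name parse_rep_line and the assumption that 'C' and 'P' always occupy the first and second flag position are not taken from a format specification.

```python
import re

# One 'rep ls' line: <pnfsID> <flags> <size> si={<storage class>}
REP_LINE = re.compile(
    r"^(?P<pnfsid>[0-9A-F]+)\s+<(?P<flags>[^>]+)>\s+(?P<size>\d+)\s+"
    r"si=\{(?P<sclass>[^}]*)\}$"
)


def parse_rep_line(line):
    """Return a dict describing one repository entry, or None if unparsable."""
    match = REP_LINE.match(line.strip())
    if match is None:
        return None
    flags = match["flags"]
    return {
        "pnfsid": match["pnfsid"],
        "size": int(match["size"]),
        "storage_class": match["sclass"],
        "on_tape_cached": flags[0] == "C",  # file is on tape, disk copy is a cache
        "disk_only": flags[1] == "P",       # file exists only on disk so far
    }


entry = parse_rep_line(
    "00001BC6D76570A74534969FD72220C31D5D <-P---------L(0)[0]> 877480 si={chimera:alpha}"
)
print(entry["pnfsid"], entry["disk_only"])
```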

Step by Step

Basic configuration

Following a file life cycle


Last modified Wed May 23 09:52:15 2012 by patrick