wiki:manuals/MultiHsmSupport
Last modified on 03/29/11 17:02:33

Proposal for the support of multiple HSMs assigned to different pools

  • Configuring one or more HSMs for a particular pool.
    • An HSM name (not the HSM type) is configured using the pool's hsm subcommands. Multiple HSMs are allowed, though we don't know yet how the final one should be selected.
    • The names of the HSM instances are communicated to the PoolManager by means of the 'heartbeat'. We decided not to use the 'tags' because the name of a supported HSM is an integral part of dCache and should be treated as such, whereas the tags are meant to be optional.
    • The information on the attached HSMs is stored in the Pool class of the 'selectionUnit' (not persistent).
  • Communicating the finally selected HSM instance name to the Pnfs back store.

If the flush script returns successfully, the HSM instance name is added to the message which communicates the external URL to Pnfs. The information is made persistent in Pnfs.

  • Providing the selected HSM instance name to the restore mechanism.

The HSM (external URI) is part of the StorageInfo extracted from Pnfs and sent to the PoolManager when selecting an appropriate pool for refetching the file from the backend HSM.

  • Selecting pools by providing the instance name of the HSM used for this file (external URI).

The selection is done at the point where the pools are extracted from the links and the pool groups. For 'cache', only matching pools are selected and forwarded to the cost system.

Implementation details

  1. Each HSM has a type and an instance name
  2. Each pool can be connected to multiple HSMs
  3. The hsm set command is extended to also define the HSM instance name:
    hsm set <hsmType> [<hsmInstance>] [-<key>=<value>] ...
    

All other hsm subcommands use the instance name rather than the type as the identifier. If the instance name is not specified, the type is used as the instance name; this ensures compatibility with old setups.
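
The defaulting rule can be illustrated with a small parser for the extended command line. This is only a sketch under assumptions: the class HsmSetCommandSketch, its fields and the -command key in the usage example are made up for illustration and are not the pool's actual command interpreter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical parser for: hsm set <hsmType> [<hsmInstance>] [-<key>=<value>] ... */
final class HsmSetCommandSketch {
    final String type;
    final String instance;
    final Map<String, String> attributes;

    private HsmSetCommandSketch(String type, String instance, Map<String, String> attributes) {
        this.type = type;
        this.instance = instance;
        this.attributes = attributes;
    }

    static HsmSetCommandSketch parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length < 3 || !tokens[0].equals("hsm") || !tokens[1].equals("set")
                || tokens[2].startsWith("-")) {
            throw new IllegalArgumentException(
                    "Usage: hsm set <hsmType> [<hsmInstance>] [-<key>=<value>] ...");
        }
        String type = tokens[2];
        // Compatibility rule from the text: without an explicit instance name,
        // the type doubles as the instance name.
        String instance = type;
        Map<String, String> attributes = new LinkedHashMap<>();
        for (int i = 3; i < tokens.length; i++) {
            String token = tokens[i];
            if (token.startsWith("-")) {
                int eq = token.indexOf('=');
                if (eq < 2) {
                    throw new IllegalArgumentException("Expected -<key>=<value>, got: " + token);
                }
                attributes.put(token.substring(1, eq), token.substring(eq + 1));
            } else if (i == 3) {
                // The optional instance name may only directly follow the type.
                instance = token;
            } else {
                throw new IllegalArgumentException("Unexpected argument: " + token);
            }
        }
        return new HsmSetCommandSketch(type, instance, attributes);
    }

    public static void main(String[] args) {
        HsmSetCommandSketch oldStyle = parse("hsm set osm");
        HsmSetCommandSketch newStyle = parse("hsm set osm desy-main -command=/usr/bin/flush.sh");
        System.out.println(oldStyle.type + " / " + oldStyle.instance);
        System.out.println(newStyle.type + " / " + newStyle.instance + " / " + newStyle.attributes);
    }
}
```

With this rule an old-style "hsm set osm" yields an HSM whose instance name is also osm, so existing setups keep a stable identifier without any change to their configuration.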

  4. A pool sends information about its HSM types and instances to the PoolManager in the ping message
  5. The SelectionPool class contains the HSM related information
  6. On Flush:

6.1 The pool sends a PoolFileFlushedMessage, which contains the pnfsid and the storageInfo

6.2 The StorageInfo contains a list of HSM types, instances and URIs

6.3 The URI uniquely identifies the file within the HSM instance (bitfileid)

6.4 The HSM instance name is provided by the pool, not by the flush script
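
The flush steps could look roughly like the following sketch. It assumes that the flush script's reply has already been reduced to an HSM specific query string, and that the pool knows its configured type and instance name. SimpleStorageInfo, buildLocation and onFlushSucceeded are hypothetical stand-ins, not the real StorageInfo or pool code.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class FlushLocationSketch {

    /** Minimal stand-in for StorageInfo; the real interface is much richer. */
    static final class SimpleStorageInfo {
        private final List<URI> locations = new ArrayList<>();

        void addLocation(URI location) {
            locations.add(location);
        }

        List<URI> locations() {
            return Collections.unmodifiableList(locations);
        }
    }

    /** Type and instance come from the pool configuration, the query from the HSM. */
    static URI buildLocation(String hsmType, String hsmInstance, String hsmSpecificQuery) {
        try {
            // scheme = HSM type, authority = instance name, query = opaque HSM data
            return new URI(hsmType, hsmInstance, "/", hsmSpecificQuery, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot build HSM location", e);
        }
    }

    /** Called after a successful flush; the storage info then travels in the flushed message. */
    static void onFlushSucceeded(SimpleStorageInfo info, String hsmType,
                                 String hsmInstance, String hsmSpecificQuery) {
        info.addLocation(buildLocation(hsmType, hsmInstance, hsmSpecificQuery));
    }

    public static void main(String[] args) {
        SimpleStorageInfo info = new SimpleStorageInfo();
        onFlushSucceeded(info, "osm", "desy-main",
                "store=sql&group=chimera&bfid=3434.0.994.1188400818542");
        System.out.println(info.locations());
    }
}
```

Keeping the instance name on the pool side means the flush script needs no knowledge of how dCache names its HSM instances.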

  7. PnfsManager delegates to the storageInfo extractors to store and retrieve the HSM type and instance name, based on hsmType
  8. On Restore:

8.1 PoolManager gets the storage information, flow direction, protocol and net/store unit information for pool selection

8.2 PoolSelectionUnit.match() takes StorageInfo as an argument

8.3 As soon as the proper link is selected and the pool group is resolved, PoolManager selects the pools that are connected to the HSM instance defined in the StorageInfo

8.4 From the list of matching pools, the pool with the lowest cost is taken
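
Steps 8.3 and 8.4 amount to a filter followed by a minimum search. The sketch below assumes that the link and pool group resolution has already produced a candidate list and that each pool's cost is a single number. PoolInfo and selectForRestore are hypothetical names, not part of PoolSelectionUnit or the cost module.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

final class RestorePoolSelectionSketch {

    /** Hypothetical view of a pool: its name, attached HSM instances and current cost. */
    static final class PoolInfo {
        final String name;
        final double cost;
        final Set<String> hsmInstances;

        PoolInfo(String name, double cost, String... hsmInstances) {
            this.name = name;
            this.cost = cost;
            this.hsmInstances = new HashSet<>(Arrays.asList(hsmInstances));
        }
    }

    /**
     * Keeps only the pools attached to an HSM instance named in one of the
     * file's locations (URI authority), then returns the cheapest of them.
     */
    static Optional<PoolInfo> selectForRestore(Collection<PoolInfo> candidates,
                                               List<URI> locations) {
        Set<String> wanted = new HashSet<>();
        for (URI location : locations) {
            if (location.getAuthority() != null) {
                wanted.add(location.getAuthority());
            }
        }
        return candidates.stream()
                .filter(pool -> !Collections.disjoint(pool.hsmInstances, wanted))
                .min(Comparator.comparingDouble((PoolInfo pool) -> pool.cost));
    }

    public static void main(String[] args) {
        List<PoolInfo> pools = Arrays.asList(
                new PoolInfo("pool-a", 2.0, "desy-main"),
                new PoolInfo("pool-b", 1.0, "desy-main", "desy-dup"),
                new PoolInfo("pool-c", 0.1, "desy-dup"));
        List<URI> locations = Arrays.asList(
                URI.create("osm://desy-main/?store=sql&group=chimera&bfid=3434.0.994.1188400818542"));
        System.out.println(selectForRestore(pools, locations).map(p -> p.name).orElse("none"));
    }
}
```

An empty result means that no pool in the resolved group is attached to an HSM instance holding the file; how that case should be handled is not covered by this proposal.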

URI description

A URI (Uniform Resource Identifier, RFC 2396) is used to represent the location of a file within the HSM. The hierarchical syntax is used:

    [scheme:][//authority][path][?query][#fragment]

where:

scheme        HSM type
authority     instance identifier
path+query    HSM specific information (not interpreted by dCache)

example:

    osm://desy-main/?store=sql&group=chimera&bfid=3434.0.994.1188400818542
    osm://desy-dup/?store=sql&group=chimera&bfid=3434.0.994.1188400818542
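
Such a location can be taken apart with the standard java.net.URI class. The sketch below is illustrative only: HsmLocationSketch and its methods are hypothetical names, and splitting the query into key/value pairs is something an HSM specific extractor might do, not dCache itself, which treats path and query as opaque.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

final class HsmLocationSketch {

    /** The URI scheme carries the HSM type, e.g. "osm". */
    static String hsmType(URI location) {
        return location.getScheme();
    }

    /** The URI authority carries the HSM instance name, e.g. "desy-main". */
    static String hsmInstance(URI location) {
        return location.getAuthority();
    }

    /** Splits the query into key/value pairs, as an HSM specific extractor might. */
    static Map<String, String> queryParameters(URI location) {
        Map<String, String> parameters = new LinkedHashMap<>();
        String query = location.getQuery();
        if (query == null || query.isEmpty()) {
            return parameters;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                parameters.put(pair, "");
            } else {
                parameters.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return parameters;
    }

    public static void main(String[] args) {
        URI location = URI.create(
                "osm://desy-main/?store=sql&group=chimera&bfid=3434.0.994.1188400818542");
        System.out.println(hsmType(location));
        System.out.println(hsmInstance(location));
        System.out.println(queryParameters(location));
    }
}
```

The two example locations differ only in the authority, so the same bitfile id can be addressed in either HSM instance.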

On the way

(Do we need a blog?)

After long discussions with Martin, we have the feeling that the different types of StorageInfo (OsmStorageInfo, EnstoreStorageInfo and HpssStorageInfo) are not necessary, since all dCache components use only the StorageInfo interface and the HSM specific part is stored in the HSM location, which is opaque to dCache and is used by the corresponding extractors. A single class, GenericStorageInfo, is sufficient to handle all types of StorageInfo. Nevertheless, millions of serialized instances of the different types are already stored on pools.


Last Modified Wed Apr 25 10:33:59 2018 by tigran