Last modified on 10/31/07 17:57:51

dCache developers phone conference Oct 31

Participants: Timur, Gerd, Alex, Gene, Tigran and Patrick

SRM deployment

  • How is NDGF doing?


  • In order to do implicit space allocation, the space manager needs to know the VO of the request sent by the door. The alternatives are to put the VO information into one of:
    • the Pool Manager Select Pool message
    • the StorageInfo
    • the ProtocolInfo
    This will be introduced as soon as we have some confidence that NDGF is OK.
  • With the new way of selecting space by tokens instead of directories, the actual information for the tape backend is no longer available, because we are sending the storageClass of the directory, which is not necessarily the right information for the tape. A temporary solution the Tier 1's could live with for now is to provide the space token within the tape migration script. We could do this stepwise:
    • First, we put the information (spaceToken=XXXXX) into the hash table of the StorageInfo, which is stored on the pools and is therefore available during the 'flush' process.
    • Later we should possibly put this into pnfs as well. Maybe we can just merge it somehow with the storage URI.
    Timur agreed to do that.
  • Request for comments: We currently allow associating an AL and RP with a directory, which are chosen if nothing is specified with the request. For larger sites this is not exactly what they need. It would be desirable to have something people call a 'default space token'. There is of course no such thing as a default space token, but we could provide one. This would be a token associated with a directory subtree, as we do with the AL and RP.
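The spaceToken-in-StorageInfo step above can be sketched as follows. This is a minimal illustration, assuming a generic key/value table on the StorageInfo; the class and method names here are hypothetical stand-ins, not the real diskCacheV111 API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the StorageInfo key/value hash table
// mentioned above; the real diskCacheV111 class differs.
class StorageInfoSketch {
    private final Map<String, String> keys = new HashMap<>();
    void setKey(String name, String value) { keys.put(name, value); }
    String getKey(String name) { return keys.get(name); }
}

public class SpaceTokenStash {
    public static void main(String[] args) {
        StorageInfoSketch info = new StorageInfoSketch();
        // Step one: record the space token when the file is written, so it
        // travels with the StorageInfo stored on the pool.
        info.setKey("spaceToken", "XXXXX");
        // During 'flush', the tape migration script can read it back.
        System.out.println("spaceToken=" + info.getKey("spaceToken"));
    }
}
```

The later step of putting the same pair into pnfs (or merging it into the storage URI) would reuse the same key name, so the flush process does not care where the value came from.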

New Issues

  • Gerd observes a message in the cleaner log :
    10/31 16:34:00 Cell(cleaner@pnfsDomain) : Unexpected message arrived from : [>SrmSpaceManager@srm-srmDomain:*@srm-srmDomain:broadcast@dCacheDomain:*@dCacheDomain] diskCacheV111.vehicles.PoolRemoveFilesMessage Pool=broadcast;RemoveFiles=,000100000000000006682DB0
  • Alex requires some minor changes in the PoolManager in order to support the ReplicaManager functionality. The requirements are appended.
  • Gerd would like to have a message sent from the PoolManager back to the Door in case the PoolManager decides to 'suspend' a request. For now, Gerd will try setting "rc onerror suspend|fail".
  • Doris reported a problem with transfers which return an error but succeed after a while. Timur's analysis:
    • The PoolManager reports that it doesn't find an appropriate pool.
    • Something (which I can't remember) removes the file from the pool.
    • The pool sends the 'fileRemoved' message to the broadcaster, which forwards it to the SpaceManager.
    • The SpaceManager removes the entry from its table.
    • The next retry won't find the pnfsid entry in the SpaceManager table, treats the request differently, and succeeds.
    • Timur is considering checking the return from the PoolManager and letting the whole operation fail if the PoolManager returns an error.
  • Tanya reminded us of ticket 2107. Martin checked and found the reason for the NullPointerException. It will be fixed with the next release.
  • CDF still observes a slow memory leak on the non-HSM pools. These pools actually only do dcap and p2p.
    • Tigran will check with dtrace whether dcap leaks memory
    • Alex will further investigate.
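Timur's proposed fix for the retry-masked race above amounts to a simple check on the PoolManager reply. A sketch, with hypothetical message and field names (the real reply objects live in diskCacheV111.vehicles and differ):

```java
// Hypothetical reply object; real PoolManager messages carry a
// return code and an error message in a similar way.
class PoolMgrReplySketch {
    final int returnCode;        // 0 = success
    final String errorMessage;
    PoolMgrReplySketch(int rc, String msg) { returnCode = rc; errorMessage = msg; }
}

public class FailFastOnPoolManagerError {
    // Instead of silently retrying (which only succeeded because the
    // SpaceManager entry had been removed in the meantime), fail the
    // whole operation as soon as the PoolManager reports an error.
    static String handle(PoolMgrReplySketch reply) {
        if (reply.returnCode != 0) {
            return "FAILED: " + reply.errorMessage;
        }
        return "OK";
    }

    public static void main(String[] args) {
        System.out.println(handle(new PoolMgrReplySketch(0, null)));
        System.out.println(handle(new PoolMgrReplySketch(1, "No appropriate pool found")));
    }
}
```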

Large Installations

  • BNL got a new Prestager (with 1.7.0 47)
  • BNL needs to halt analysis jobs if production gets reduced performance.
  • Pnfs/dCache performance for USCMS (separate phone conference)

Alex's request on how to change the functionality of the PoolManager in order to support the ResilientManager functionality

The replica manager shall get updates from the PoolManager on the following events:

  • get the list of pools in a ResilientPoolGroup (this could be a LinkGroup, etc.)
  • get updates when the set of pools in a ResilientPoolGroup changes (group reconfiguration), to avoid constant polling.
  • get reliable updates on pool status changes (pool goes up or down); currently the pool-down message is not reliable.
  • we may specify several ResilientPoolGroups to satisfy the T2 requirement of having a separate group of RAID pools (one reliable copy) and "other" disks holding a few unreliable copies.
  • we may have two or more independent resilient managers (as the US CMS T1 already does).
  • introduce the concept of an operator-managed pool state in the PoolManager (move this functionality from the replica manager to the pool manager). This will help to drain a pool of "unique files", but also to wait for transfers to tape to finish, flush precious files if needed, etc. Proposed pool states are (I'm not fixed on the names below):
  • online (fully operational), drainoff, drained,
  • suspended, prepare-to-suspend [when an operator wants to reboot a pool node without triggering replications]. Currently these are called "offline" and "offline-prepare".
  • read-only [exists in the PoolManager; keep it if we need it]

Implement messages informing about the pool state changes described above.

  • a message about each incremental change
  • get the complete list of pool states for a ResilientPoolGroup
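The proposed states and the two message kinds above can be sketched together. This is an illustration only; the state names are the ones proposed above (explicitly not fixed), and the message shape is hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

public class PoolStateSketch {
    // Proposed operator-managed pool states (the names are not fixed).
    enum PoolState { ONLINE, DRAINOFF, DRAINED, PREPARE_TO_SUSPEND, SUSPENDED, READ_ONLY }

    // Incremental change message: which pool moved from which state to which.
    record StateChange(String pool, PoolState from, PoolState to) {}

    public static void main(String[] args) {
        // Complete list of pool states for a ResilientPoolGroup ...
        Map<String, PoolState> group = new HashMap<>();
        group.put("pool1", PoolState.ONLINE);
        group.put("pool2", PoolState.ONLINE);

        // ... kept current by incremental change messages instead of polling.
        StateChange msg = new StateChange("pool2", PoolState.ONLINE, PoolState.DRAINOFF);
        group.put(msg.pool(), msg.to());
        System.out.println("pool2 is now " + group.get("pool2"));
    }
}
```

The replica manager would fetch the complete list once at startup (or after reconfiguration) and then apply only the incremental messages.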

Pool Selection: Files cannot be copied to/from a pool when it is in certain states:

  • e.g. a file shall not be copied to/from a "suspended" pool [now known as "offline"]
  • some files can be copied from "drainoff" pools to copy out "unique files", but all other copies from this pool are forbidden.
  • rules may change without restarting the PoolManager; thus the selection module shall be pluggable and reloadable.
  • on a "replicate pnfsid" message to the PoolManager, it replies with the selected source and destination pools; when the replication is done, there is another, separate message confirming the end of the task.
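The copy-permission rules above could sit behind a small pluggable interface, so the rule set can be swapped or reloaded without restarting the PoolManager. A sketch with hypothetical names; the default rules below only encode the examples given above:

```java
public class SelectionRulesSketch {
    enum PoolState { ONLINE, DRAINOFF, DRAINED, PREPARE_TO_SUSPEND, SUSPENDED, READ_ONLY }

    // Pluggable rule module: the PoolManager asks it whether a copy
    // to/from a pool in a given state is allowed. A reloadable
    // implementation could be swapped in at runtime.
    interface SelectionRules {
        boolean mayCopyTo(PoolState state);
        boolean mayCopyFrom(PoolState state, boolean uniqueFile);
    }

    // Default rules matching the examples in the list above.
    static class DefaultRules implements SelectionRules {
        public boolean mayCopyTo(PoolState s) {
            return s == PoolState.ONLINE;   // never to suspended/drainoff/... pools
        }
        public boolean mayCopyFrom(PoolState s, boolean uniqueFile) {
            if (s == PoolState.SUSPENDED) return false;      // never from 'suspended'
            if (s == PoolState.DRAINOFF) return uniqueFile;  // only unique files
            return s == PoolState.ONLINE || s == PoolState.READ_ONLY;
        }
    }

    public static void main(String[] args) {
        SelectionRules rules = new DefaultRules();
        System.out.println(rules.mayCopyFrom(PoolState.DRAINOFF, true));   // unique file: allowed
        System.out.println(rules.mayCopyFrom(PoolState.DRAINOFF, false));  // other copies: forbidden
    }
}
```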

Updates from PnfsManager:

  • a file arrived at the pool / was deleted from the pool - add information regarding the file's AL / RP
  • same message - add info on the file's storage class - Nebraska suggests having a different replication policy for different pnfs subtrees; would it be better to use the storage class?


  • get the list of files (we have it). Extend the message to include replica status flags (E, B, ...)
  • perform a pool inventory and update the PnfsManager properly regarding the pool content. Right now users use the pool's "register" command to correct inconsistencies. This causes a storm of "file arrived at the pool" messages, one message per file. The PnfsManager shall send a bulk update in the case of the register command, or just inform the replica manager that it shall update the replica list from the pool.

  • to fix rogue files: introduce a pool command "challenge pnfsId". The pool will consult the PnfsManager and remove the file replica if the file does not exist in pnfs.
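The bulk update suggested for the "register" case could look like this sketch: one message carrying the whole pool inventory instead of one message per file. The message shape and the example pnfsids are hypothetical:

```java
import java.util.List;

public class BulkRegisterSketch {
    // Hypothetical bulk message: the full pool inventory in one message,
    // replacing the storm of per-file "file arrived" messages.
    record PoolContentUpdate(String pool, List<String> pnfsIds) {}

    public static void main(String[] args) {
        // Example (made-up) pnfsids collected during a pool inventory.
        List<String> inventory = List.of("id-0001", "id-0002", "id-0003");
        PoolContentUpdate update = new PoolContentUpdate("pool1", inventory);
        System.out.println("one message, " + update.pnfsIds().size() + " files");
    }
}
```

Alternatively, per the note above, the message could carry no list at all and merely tell the replica manager to re-read the replica list from the pool itself.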

Could we filter messages from the PnfsManager to cut off messages from non-resilient pools? Probably in the Broadcast cell.

At some point the replica manager may have tens of thousands of files requiring replication. I do not think it would be a good idea to ask the PoolManager to replicate all of them in one bunch of requests. It would be preferable to keep the list of file replications (the replication plan) outside of the PoolManager. The replica manager then needs regular updates on p2p queue sizes, so it can submit more replication tasks to the pools which have free p2p slots.
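The idea of keeping the replication plan outside the PoolManager and feeding it according to free p2p slots can be sketched as follows (all names hypothetical; the actual queue feedback would come via the p2p queue-size updates mentioned above):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ReplicationPlanSketch {
    // The plan lives in the replica manager, not in the PoolManager.
    private final Deque<String> plan = new ArrayDeque<>();

    void add(String pnfsId) { plan.add(pnfsId); }

    // On each p2p queue-size update for a pool, submit only as many
    // replication tasks as that pool has free p2p slots.
    int submit(String pool, int freeP2pSlots) {
        int submitted = 0;
        while (submitted < freeP2pSlots && !plan.isEmpty()) {
            String pnfsId = plan.poll();
            // here: send a "replicate <pnfsId>" message for this pool (omitted)
            submitted++;
        }
        return submitted;
    }

    public static void main(String[] args) {
        ReplicationPlanSketch planner = new ReplicationPlanSketch();
        for (int i = 0; i < 10; i++) planner.add("id-" + i);
        // Only 4 free slots on pool1, so only 4 of the 10 tasks go out now.
        System.out.println(planner.submit("pool1", 4));
    }
}
```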

I'm afraid of overloading the PoolManager with extra functionality. I would tend to move most of the functionality into a separate module and ask the PoolManager to perform the update & coordination functions. Another argument for moving the "logic" module out of the PoolManager: users may want to change the operation/selection logic, and this may require reloading the selection component.

Last modified by patrick @ Sun Mar 7 00:05:16 2021