Last modified on 04/22/08 17:46:52

Writing A Stager Script

When testing a dCache stager script it is useful to be able to trigger a stage to and from dCache.

It's useful to limit the number of available pools, as each pool can stage. We do this first to reduce the complexity on our test node.

[root@hal9000 ~]# grep "psu create pool"  /opt/d-cache/config/PoolManager.conf 
psu create pool hal9000_4 -disabled
psu create pool hal9000_1 -disabled
psu create pool hal9000_3 -disabled
psu create pool hal9000_2
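
To check which pools are still enabled without reading the file by eye, something like the following sketch could be used. It assumes only the line format shown above (psu create pool <name> [-disabled]); the function name list_pools is hypothetical and not part of dCache.

```python
def list_pools(conf_text):
    """Return {pool_name: enabled} for every 'psu create pool' line."""
    pools = {}
    for line in conf_text.splitlines():
        words = line.split()
        # Expected shape: psu create pool <name> [-disabled]
        if words[:3] != ["psu", "create", "pool"] or len(words) < 4:
            continue
        pools[words[3]] = "-disabled" not in words[4:]
    return pools


# Sample input copied from the grep output above.
sample = """\
psu create pool hal9000_4 -disabled
psu create pool hal9000_1 -disabled
psu create pool hal9000_3 -disabled
psu create pool hal9000_2
"""

enabled = sorted(name for name, on in list_pools(sample).items() if on)
print(enabled)  # prints ['hal9000_2']
```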

Changing into a Pool via the admin interface.

[hal9000.dcache.org] (local) admin > cd hal9000_2
[hal9000.dcache.org] (hal9000_2) admin > 

Seeing the state of files within a pool.

[hal9000.dcache.org] (hal9000_2) admin > rep ls                                      
000063FB04A30ECD4B3DAD1946789AB2DA34 <C----------(0)[0]> 1467 si={sql:chimera}
0000DFEF57D0328D43DE89910CB75E170610 <C----------(0)[0]> 1467 si={sql:chimera}
000009FF46C3164941728606D169C67E6E2A <C----------(0)[0]> 1467 si={sql:chimera}
00007ECBAC45B58A48FFB54AEF54759E89BE <C----------(0)[0]> 1467 si={sql:chimera}

Note that "C" indicates Cached, i.e. the file has already been written to tape. "P" indicates Precious: the file can't be deleted from the pool because it is not yet on tape.
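
When watching many files during a test, it can help to parse the rep ls lines rather than read them. The sketch below is based only on the four lines shown above. The regular expression, the function name parse_rep_ls and the dictionary keys are assumptions made for illustration. It takes the cached marker to be the first flag character, as in the listing, and looks for "P" anywhere in the flags because the listing does not show where it appears.

```python
import re

# Layout inferred from the listing: <pnfsid> <flags(n)[m]> <size> si={<class>}
REP_LS = re.compile(
    r"^(\w+)\s+<([^>(]+)\(\d+\)\[\d+\]>\s+(\d+)\s+si=\{([^}]*)\}"
)


def parse_rep_ls(line):
    """Return a dict describing one 'rep ls' line, or None if it does not match."""
    match = REP_LS.match(line.strip())
    if match is None:
        return None
    pnfsid, flags, size, storage_class = match.groups()
    return {
        "pnfsid": pnfsid,
        "cached": flags.startswith("C"),   # assumption: first flag is the cached marker
        "precious": "P" in flags,          # assumption: position of P is not shown above
        "size": int(size),
        "storage_class": storage_class,
    }


entry = parse_rep_ls(
    "00007ECBAC45B58A48FFB54AEF54759E89BE <C----------(0)[0]> 1467 si={sql:chimera}"
)
print(entry["cached"], entry["size"], entry["storage_class"])
```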

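The PoolManager defines the caching policy; the per-pool options are listed in the poollist file. Note that hal9000_2, the one pool left enabled above, is the only pool without lfs=precious.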

[root@hal9000 ~]# cat  /opt/d-cache/config/hal9000.poollist 
hal9000_1  /pools/1/pool  sticky=allowed recover-space recover-control recover-anyway lfs=precious tag.hostname=hal9000
hal9000_2  /pools/2/pool  sticky=allowed recover-space recover-control recover-anyway tag.hostname=hal9000
hal9000_3  /pools/3/pool  sticky=allowed recover-space recover-control recover-anyway lfs=precious tag.hostname=hal9000
hal9000_4  /pools/4/pool  sticky=allowed recover-space recover-control recover-anyway lfs=precious tag.hostname=hal9000
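
The poollist lines can be split into a pool name, a base directory and a set of options, which makes it easy to see which pools carry lfs=precious. This is a minimal sketch that assumes the whitespace-separated layout shown above; parse_poollist_line is a hypothetical helper, not a dCache tool.

```python
def parse_poollist_line(line):
    """Split one poollist line into (name, base_dir, options)."""
    fields = line.split()
    name, base_dir = fields[0], fields[1]
    options = {}
    for field in fields[2:]:
        key, sep, value = field.partition("=")
        # Bare words such as recover-space are stored as True.
        options[key] = value if sep else True
    return name, base_dir, options


# Two lines copied from the poollist file above.
sample = [
    "hal9000_1  /pools/1/pool  sticky=allowed recover-space recover-control recover-anyway lfs=precious tag.hostname=hal9000",
    "hal9000_2  /pools/2/pool  sticky=allowed recover-space recover-control recover-anyway tag.hostname=hal9000",
]

for line in sample:
    name, base_dir, options = parse_poollist_line(line)
    print("%s lfs=%s" % (name, options.get("lfs", "(not set)")))
```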

Flushing a file from a pool to your stager script.

[hal9000.dcache.org] (hal9000_2) admin > flush pnfsid  00007ECBAC45B58A48FFB54AEF54759E89BE
Flush Initiated

Removing a cached file once it has been staged (so it will be staged back when requested.)

[hal9000.dcache.org] (hal9000_2) admin > rep rm  000009FF46C3164941728606D169C67E6E2A
Removed 000009FF46C3164941728606D169C67E6E2A

Finding the path of a file in the pool from the pool's admin interface.

[hal9000.dcache.org] (hal9000_2) admin > pf 0000617949EF0C7D462FAF9547B107964CA4      
/pnfs/dcache.org/data/martin02

Getting the flushing policy of the pool. The relevant settings are listed under "Flushing Thread" in the output of info.

[hal9000.dcache.org] (hal9000_2) admin > info                                        
Base directory    : /pools/2/pool
Revision          : [$Revision: 8763 $]
Version           : production-1-8-0-14(8763) (Sub=4)
StickyFiles       : allowed
Gap               : 52428800
Report remove     : on
Recovery          : CONTROL SPACE ANYWAY 
Pool Mode         : enabled
Clean prec. files : off
Hsm Load Suppr.   : off
Ping Heartbeat    : 30 seconds
Storage Mode      : Dynamic
ReplicationMgr    : Disabled
Check Repository  : true
LargeFileStore    : None
DuplicateRequests : None
P2P Mode          : Separated
P2P File Mode     : Cached
Diskspace usage   : 
    Total    : 500M
    Used     : 4401    [8.394241E-6]
    Free     : 524283599
    Precious : 0    [0.0]
    Removable: 4401    [8.394241E-6]
    Reserved : 0
Flushing Thread
   Flushing Interval /seconds   : 5
   Maximum classes flushing     : 1000
   Minimum flush delay on error : 60
  Remote controlled (hold until) : Locally Controlled
Storage Queue     : 
   Classes  : 0
   Requests : 0
Mover Queue Manager : Not Configured
Mover Queue (regular) 0(100)/0
P2P   Queue 0(10)/0
StorageHandler [diskCacheV111.pools.HsmStorageHandler2]
  Version         : [$Id: HsmStorageHandler2.java,v 1.47 2007-10-26 11:17:06 behrmann Exp $]
 Sticky allowed   : true
 Restore Timeout  : 14400
   Store Timeout  : 14400
  Remove Timeout  : 14400
  Job Queues 
    to store   0(5)/0
    from store 0(5)/0
    delete     (1)/
 Pool to Pool (P2P) [$Id: P2PClient.java,v 1.21 2007-10-31 17:27:11 radicke Exp $]
  Listener   : Listen port (recommended=0) Inactive
  Max Active : 10
Pnfs Timeout : 300 seconds 
Job Timeout Manager
  regular (lastAccess=0;total=0)
  p2p (lastAccess=0;total=0)
  io (lastAccess=0;total=0)
ChecksumModuleV1 : $Id: ChecksumModuleV1.java,v 1.11 2007-08-30 21:11:02 abaranov Exp $
          Checksum type : ADLER32
                   Fake : ftp=false error=false
 Checkum calculation on : write enforceCRC 
  FullScan Idle  0 checked; 0 errors detected
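
Since the info output is long, the flushing settings can be pulled out with a few lines of code. The sketch below assumes the layout shown above, where the entries of the "Flushing Thread" section are indented and the next unindented line ends the section. The function name flushing_settings is made up for this example.

```python
def flushing_settings(info_text):
    """Return the 'key : value' pairs listed under 'Flushing Thread'."""
    settings = {}
    in_section = False
    for line in info_text.splitlines():
        if line.strip() == "Flushing Thread":
            in_section = True
            continue
        if in_section:
            # Entries are indented; the first unindented line closes the section.
            if not line.startswith(" "):
                break
            key, sep, value = line.partition(":")
            if sep:
                settings[key.strip()] = value.strip()
    return settings


# Excerpt copied from the info output above.
sample = """\
    Reserved : 0
Flushing Thread
   Flushing Interval /seconds   : 5
   Maximum classes flushing     : 1000
   Minimum flush delay on error : 60
  Remote controlled (hold until) : Locally Controlled
Storage Queue     :
   Classes  : 0
"""

settings = flushing_settings(sample)
print(settings["Flushing Interval /seconds"])  # prints 5
```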

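An example script that fakes staging. It does not copy any data: put prints a made-up URI, and get writes a dummy file of the requested size.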

#!/usr/bin/python

# Only sys and os are used below; the unused imports have been dropped.
import sys
import os



def usage():
  # Parenthesised single-argument print works as a statement and as a function.
  print("Usage : put <pnfsId> <filePath> -si=<storageInfo> [-key[=value] ...]")
  print("        get <pnfsId> <filePath> -uri=<uri> [-key[=value] ...]")
  print("        remove -uri=<uri> [-key[=value] ...]")
  
def logout(fileName,commands):
  # Debug helper: append each entry to fileName, one per line.
  fp = open(fileName,'a')
  for i in commands:
    fp.write(i + '\n')
  fp.close()


def hsmPut(commands):
  # Arguments after "put": <pnfsId> <filePath> -si=<storageInfo> ...
  pnfsId = commands[0]
  path = commands[1]
  miscStuff = commands[2]
  #logout('/tmp/stagein',commands)
  # Collect the key=value pairs of the storage info (store, group, ...).
  miskdic = {}
  for i in miscStuff.split(";"):
    fred = i.split('=')
    if len(fred) == 2:
      miskdic[fred[0]] = fred[1]
  #logout('/tmp/stagein',['Extra details=%s\n' % (os.path.getsize(path) )])
  # Fake the store: nothing is copied, the file size stands in for the bfid.
  retinfo = "osm://osm?store=%s;group=%s&bfid=%s" % (miskdic['store'], miskdic['group'],os.path.getsize(path))
  print(retinfo)
  #logout('/tmp/stagein',[retinfo])
  
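def hsmGet(commands):
  # The third argument is expected to carry the storage info (-si=...).
  #logout('/tmp/stageout',commands)
  path=commands[1]
  misk=commands[2]
  size = 0
  # The first item looks like "-si=size=<bytes>", which splits into three parts.
  for i in misk.split(";"):
    fred = i.split('=')
    if len(fred) == 3:
      if '-si' == fred[0]:
        if 'size' == fred[1]:
          size = int(fred[2])

  #logout('/tmp/stageoutcode',["size=%i" % size, "path=%s" % path])
  # Fake the restore: fill the file with 'a' up to the expected size.
  fp = open(path,'w')
  fp.write('a' * size)
  fp.close()
  #logout('/tmp/stageoutcode',["done=%i" % size])

def hsmRemove(commands):
  # Nothing to remove: the fake store keeps no data.
  pass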

if __name__ == "__main__":
  commandline = sys.argv[1:]
  # Guard against a missing action as well as an unknown one.
  if not commandline or commandline[0] not in ['put', 'get', 'remove']:
    usage()
    sys.exit(1)
  if commandline[0] == "put":
    hsmPut(commandline[1:])
  elif commandline[0] == "get":
    hsmGet(commandline[1:])
  elif commandline[0] == "remove":
    hsmRemove(commandline[1:])
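
The argument handling in the script can be tried out on its own, without a running pool. The following sketch restates the parsing as two small functions. The names parse_storage_info and fake_uri are hypothetical, and the sample argument is an illustrative value built only from the fields the script reads (size, store, group); a real pool may pass more fields. The URI layout is copied from the script, and the size stands in for the bfid just as the script uses the file size.

```python
def parse_storage_info(arg):
    """Split '-si=key=value;key=value;...' into a dict of strings."""
    prefix = "-si="
    if not arg.startswith(prefix):
        raise ValueError("not a storage-info argument: %r" % arg)
    info = {}
    for item in arg[len(prefix):].split(";"):
        key, sep, value = item.partition("=")
        if sep:
            info[key] = value
    return info


def fake_uri(info, bfid):
    # Same layout as the URI printed by hsmPut in the script above.
    return "osm://osm?store=%s;group=%s&bfid=%s" % (
        info["store"], info["group"], bfid)


# Illustrative argument, not captured from a real pool.
si = parse_storage_info("-si=size=1467;store=sql;group=chimera")
uri = fake_uri(si, si["size"])
print(uri)  # prints osm://osm?store=sql;group=chimera&bfid=1467
```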