wiki:developers-meeting-20070711

dCache / GSSD Meeting

Time

Wednesday, July 11

  16:30 Europe
   9:30 Chicago
  10:30 New York

Dial-in coordinates

  Please dial :

     +49 40 8998 1390 or
                -1391 or
                -1392

  You will be asked for a PIN (in German).

  PIN:  04474 followed by the # sign

  The PIN will be repeated (in German) and
  you are connected.


Minutes

  • Formal Stuff
    • Participants : Flavia, Doris, Greig, Gerd, Mathias, Lionel, Jonathan, Dmitry, Gene, Alex, Ted, Carlos, Tigran, Patrick (BNL,CERN,DESY,GridKa,IN2P3,NDGF)
    • Current Status : See 'Experiences' below. Most sites are OK. gPlazma is in use by ED and IN2P3. GridKa is currently not able to read files back; they are stuck in the 'restore system'. We will help Doris.
    • Greig and Jonathan will share their setup/knowledge with us (on the dcache.org wiki).
    • Participants will get advice on how to access the wiki (today or tomorrow). (Flavia and Greig already have access.) Please read it to learn how to contribute, and send e-mail to wiki-admin@dcache.org for further questions and requests.
    • Flavia will be on vacation from the end of this week until the end of July. She will provide us with the names of the people watching the tests while she is not available.
  • Technical SRM 2.2 issues :
    • Handling T1D1 space tokens (Flavia) : There is still a bug which makes release 1.8.0-7 non-WLCG-compliant: the space occupied by data is not freed after that data has been moved (actually copied) to tape. This is already fixed in the CVS head and deployed at srm-devel.desy.de. We are expecting a new release tomorrow including this fix. Dmitry will provide the next update. :-)
    • Return code on write, if the required space is not available (Flavia) : We would expect the system to return an error rather than queue the request until the space becomes available. Needs to be checked.
    • Information System : Endpoints have to be published in the PPS and in the production system. Make sure the service type is published in capital letters (SRM). The closest CE should be your own CE; Greig will use the CERN CE. In order not to make a mistake here, I would like to quote Flavia (a sketch of such a GLUE record follows her quote):
      I proposed to publish the test endpoints in
      production but only as SRM v2 endpoints (therefore publishing
      GlueServiceType=SRM) and declaring them close to the local (at the site)
      production CE. This will allow the experiments to use production
      computing resources during their SRM v2.2 test exercise. This was
      actually requested by the experiments.
      At the moment at CERN we are trying to understand all implications of
      this action. We will keep you posted.
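
      As an illustration only, such a service record might look roughly like the following GLUE LDIF (hypothetical host and site names; the exact attribute set depends on the GLUE schema version and the information provider used at your site):

        dn: GlueServiceUniqueID=httpg://srm.example.org:8443/srm/managerv2,mds-vo-name=resource,o=grid
        objectClass: GlueService
        GlueServiceUniqueID: httpg://srm.example.org:8443/srm/managerv2
        GlueServiceType: SRM
        GlueServiceVersion: 2.2.0
        GlueServiceEndpoint: httpg://srm.example.org:8443/srm/managerv2
        GlueForeignKey: GlueSiteUniqueID=EXAMPLE-SITE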
      
    • Stability, performance, etc. :
      • Limiting the number of threads in the Tomcat server stabilized the system (see the sketch after this list).
      • The system recovers by itself after being overloaded.
      • After a series of requests, the system slows down. Flavia will add more information to the log and Dmitry will check the dCache log files.
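      Regarding the thread limit mentioned above: in a stock Tomcat installation this is the maxThreads attribute on the Connector element in conf/server.xml. The snippet below is only a generic sketch; the actual connector and values used by the dCache SRM server will differ:

        <!-- conf/server.xml: cap the number of concurrent request-processing threads -->
        <Connector port="8443" maxThreads="150" acceptCount="100"
                   scheme="https" secure="true" />
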
  • Next Steps :
    • NDGF will be the next site to go green.
    • BNL will soon move from its current (rather small) system to a larger one which has HSM (HPSS) access.
    • We all need to move to gPlazma.
    • Sites need to configure the correct roles to allow Flavia to reserve spaces for the experiments, as well as some (small) space for dteam, in order to run the functionality and stress tests.
    • Moreover, sites need to configure the requested spaces and VO/group/role settings for the experiments (ATLAS and LHCb). It would be nice if you shared your setup (see the sketch below).
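      As far as we understand it, permission to reserve space in a link group is granted per FQAN via the SpaceManager link group authorization file referenced from dCacheSetup (the variable is, if we recall correctly, SpaceManagerLinkGroupAuthorizationFileName). A rough sketch with hypothetical link group names and FQANs; please check the exact syntax against the dCache documentation for your release:

        # LinkGroupAuthorization.conf (hypothetical example)
        # FQANs listed under a LinkGroup entry may reserve space in that link group
        LinkGroup atlas-hsm-linkGroup
        /atlas/Role=production
        /dteam

        LinkGroup dteam-disk-linkGroup
        /dteam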

Experiences so far

  • In order to write into dCache without space reservation (e.g. Flavia's 'avail' test) you need to have a link which is not part of any link group. This is not a bug, but by design: it separates 'space reserved' from 'non space reserved' writes. Another way to handle this is to configure all access to use space reservation, via a variable in the dCacheSetup file (SpaceManagerReserveSpaceForNonSRMTransfers), though this hasn't been tested yet (see the snippet below).
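    If you want to try that (so far untested) route, the relevant line in dCacheSetup would be uncommented and set like this:

      # let the SpaceManager reserve space also for transfers that do not come through SRM (untested, see above)
      SpaceManagerReserveSpaceForNonSRMTransfers=true
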
  • If gPlazma is not in use (a kpwd file instead), the VO entry in the linkGroup attribute has to be the username that the incoming DN is mapped to (e.g. dteam001):
    psu set linkGroup attribute non-hsmGroup VO=dteam001
    
  • With gPlazma in use, the entry should be set to VO=/dteam. We need some more input from the gPlazma developers on how groups and roles are defined in relation to the gPlazma configuration files.
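    For example, assuming the same link group name as in the kpwd example above:

    psu set linkGroup attribute non-hsmGroup VO=/dteam
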
  • There is an issue Lionel detected: when writing with SRM and the retention policy and access latency are not specified, the default values are taken. They are defined in the dCacheSetup file as:
    # ----if space reservation request does not specify retention policy
    #     we will assign this retention policy by default
    # SpaceManagerDefaultRetentionPolicy=CUSTODIAL
    #
    # ----if space reservation request does not specify access latency
    #     we will assign this access latency by default
    # SpaceManagerDefaultAccessLatency=NEARLINE
    
    Based on these values and on the attributes of the defined link groups, a link group is selected. All links within this link group are then handled exactly as they were in the past; all other links are no longer considered. That means that if the storage group, taken from the 'tag' in pnfs, cannot be resolved within this link group, the request will fail, most likely with
    No write pools configured for <disk:dteam@osm>
    
    This is actually the correct behaviour, except that one piece of functionality is still missing:
    the default retentionPolicy/accessLatency should not be static, but should depend on the directory the file will be written into. Those values are of course only used if the SRM request doesn't specify them. We plan to provide this missing part within a month from now.
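    To illustrate the selection step: a link group can only match a request whose (defaulted) values are CUSTODIAL/NEARLINE if its attributes allow those values. A rough PoolManager setup sketch with hypothetical link group and link names; the exact admin-shell syntax may differ slightly between releases:

      # create a link group that accepts custodial/nearline data and add the tape write link to it
      psu create linkGroup hsm-linkGroup
      psu addto linkGroup hsm-linkGroup write-tape-link
      psu set linkGroup custodialAllowed hsm-linkGroup true
      psu set linkGroup nearlineAllowed hsm-linkGroup true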

Last Modified : Fri Mar 5 11:52:21 2021 by Patrick