Last modified on 11/05/08 10:06:36

dCache/SRM deployment phone conference Nov 4, 2008

Participants: TRIUMF, BNL (Iris), GridKa (Doris, Silke), PIC (Pacco), dCache (Gerd, Timur, Dmitry, Patrick)

Support issues

Iris: For some directories, a pnfs database other than the expected one is populated. We suggest getting the pnfs IDs of the various root directories to find where the wrong pnfs ID was assigned during setup.

Doris: log4j seems to have changed between 1.9.0-2 and 3. She will provide her log4j configuration so that we can check.

Comments on high pnfs load

Why

Nearly all sites observe that pnfs tends to be stressed, with an asymmetric load across the different queues of the PnfsManager. Gerd explained that this is because FTS does a lot of consistency checking of the file system before and after each transfer. Because the files of a data-set mostly share a common parent directory, the pnfs ID of that directory is checked more often than other pnfs IDs, which would explain the asymmetry in the PnfsManager queue usage.
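The effect can be illustrated with a small simulation. This is only a sketch under stated assumptions: it assumes requests are assigned to a queue by hashing the pnfs ID, and the queue count, the hash function, the ID format and the number of checks per transfer are all made up for illustration. It is not the actual PnfsManager dispatch code.

```python
import zlib
from collections import Counter

N_QUEUES = 4                            # hypothetical number of queues
PARENT_ID = "000100000000000000001060"  # made-up pnfs ID of the shared parent directory


def queue_for(pnfs_id, n_queues=N_QUEUES):
    """Map a pnfs ID to a queue index (assumed rule: hash modulo queue count)."""
    return zlib.crc32(pnfs_id.encode("ascii")) % n_queues


def simulate(n_files, checks_per_transfer=2):
    """Count requests per queue for n_files transfers into one directory.

    Each transfer touches its own file ID once and the parent directory
    checks_per_transfer times (e.g. once before and once after the transfer).
    """
    load = Counter()
    for i in range(n_files):
        file_id = f"0001000000000000{i:08X}"  # made-up per-file pnfs ID
        load[queue_for(file_id)] += 1
        load[queue_for(PARENT_ID)] += checks_per_transfer
    return load


if __name__ == "__main__":
    load = simulate(1000)
    for queue in range(N_QUEUES):
        marker = "  <- parent directory" if queue == queue_for(PARENT_ID) else ""
        print(f"queue {queue}: {load[queue]} requests{marker}")
```

The per-file requests spread over all queues, while every request for the parent directory lands in the same one, so that queue carries most of the load.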

What to do

  • Make sure there is no other activity on the 'disk' used by the postgres database. In particular, do not keep pnfsd.log or other log files on the same disk.
  • dcache.org will talk to the FTS developers to negotiate a less aggressive way of talking to the SE when transferring data-sets.
  • dcache.org will try to reduce the communication between SRM and pnfs to a minimum.
  • Sites may get a faster pnfs from here. Please note that the faster version does not prevent a user from becoming root on pnfs, even if the client host is not listed in the 'trusted' directory.
  • Be prepared to upgrade to Chimera.

Remarks on postgres databases

  • postgres 8.3 is recommended.
  • set the max_fsm_pages parameter in the postgres configuration file to a few million.
  • configure autovacuum to run at least once per day.
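As a rough aid, the recommendations above could be checked against a configuration file with a few lines of Python. This is a minimal sketch: the sample values, the one-million threshold used to interpret "a few million", and the helper names `parse_conf` and `check` are assumptions made for illustration, and the parser only handles simple `name = value` lines.

```python
SAMPLE_CONF = """
# excerpt in the style of postgresql.conf (values are illustrative)
max_fsm_pages = 2000000       # "a few million"
autovacuum = on               # let the server vacuum automatically
autovacuum_naptime = 1min     # pause between autovacuum runs
"""


def parse_conf(text):
    """Parse simple 'name = value' lines, ignoring comments and blank lines."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip("'")
    return settings


def check(settings):
    """Return a list of deviations from the recommendations in these minutes."""
    problems = []
    # A missing max_fsm_pages is treated as too low in this sketch.
    if int(settings.get("max_fsm_pages", "0")) < 1_000_000:
        problems.append("max_fsm_pages is below one million")
    # Only an explicit setting other than 'on' is flagged here.
    if settings.get("autovacuum", "on") != "on":
        problems.append("autovacuum is disabled")
    return problems


if __name__ == "__main__":
    for problem in check(parse_conf(SAMPLE_CONF)) or ["no deviations found"]:
        print(problem)
```

Note that max_fsm_pages only exists up to postgres 8.3; later releases manage the free space map automatically.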

Misc

Gerd or Mathias will provide information during the dCache workshop at CERN on how NDGF does pnfs database backups to keep downtime to a minimum in case of an incident. Gerd explained (in detail) what vacuuming means for databases and how the different parameters influence performance. He may add this here himself :-)


Last modified by patrick @ Wed Mar 3 06:11:55 2021