dCache Tier I meeting MONTH DATE, 2010

Present: dCache.org(Patrick), IN2P3(), SARA(Onno), Triumf(), BNL(), NDGF(), PIC(), GridKa(Doris), Fermi(), CERN()


  • Release process delayed due to holiday season.

Site reports

Onno : ATLAS is testing staging at SARA with several 100,000 stage (SRM_BRING_ONLINE) requests, organized in SRM requests of 30 files each. The ATLAS system times out after 6 hours. This is ticket 5769. After 1 1/2 minutes the job (bundle) expires. After that, files fail individually, though some are OK. Patrick will contact Timur. Onno will send the result of 'info'.

GridKa : Breakdown of SRM. Postgres for SRM was using 100% CPU; only a reboot helped. PnfsManager was shown as OFFLINE on the web page but still printed something into the log. The mounted pnfs was OK. Removing the temporary stage files and restarting PnfsManager fixed it; restarting PnfsManager without removing the files did not. Our guess for now is that PnfsManager gets stuck for some (yet unknown) reason, which causes all the other problems. Doris observed several 100,000 requests in the PnfsManager queue.

GridKa : Hardware failure of the CMS head node.

PIC : Gerard won't be able to join but says: "Everything's OK at PIC."

Support tickets for discussion

SARA : question on progress of 5769: Problem with file staging

RT 5753: dCache 1.9.5-21@PIC: PoolManager cost issue (ATLAS impacted)


Proposed: same time, next week.