Last modified on 03/10/09 18:09:10

dCache Tier I meeting Mar 10, 2009

Present: dCache.org(Timur, Gerd, Patrick), IN2P3(), Sara(), Triumf(), BNL(), NDGF(Gerd), PIC(Paco), GridKa(Doris, Silke), Fermi(Timur), CERN()



Site reports

  • GridKa
    • The split-off of the Atlas instance has gone well so far. The remaining system has been up and running again since yesterday, and the new Atlas instance has been in test mode since today. The first tests have been OK so far. Tomorrow Atlas will be encouraged to do more testing.
    • Incident : On some of the larger pools, 350 concurrent 'restore' operations had been allowed. Some of those pools share a single machine with 5 more pools, so the total number of restores per machine could be as high as 6 x 350 = 2100. This caused some of those hosts to run out of memory. As a result, restore requests queued up in the PoolManager, which after some time ran into OutOfMemoryExceptions. The system has been reconfigured to 50 concurrent restores per pool. It seems to be behaving fine now.
    • Question : When will the information provider correctly publish space tokens? Answer : The information is already available in the new information system. However, it may take until after CHEP before we are able to publish it in the information system.
  • PIC : There is an open low-priority ticket (Ticket 4362) : It seems that if files are removed from pools using 'rep rm', the associated pin is not removed. This will be discussed in tomorrow's developers' phone conference.
  • NDGF : Question : Is there a way to 'rc retry' only those requests which are in SUSPEND mode? Answer : It seems not, unless you use the GUI (as Doris correctly remarked).
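
The arithmetic behind the GridKa incident above can be checked with a minimal sketch. It is only an illustration in Python: the function names and the idea of a per-host restore budget are assumptions made here, not part of dCache or of the GridKa configuration.

```python
def restores_per_host(pools_per_host: int, max_restores_per_pool: int) -> int:
    """Worst-case number of concurrent restores on one machine."""
    return pools_per_host * max_restores_per_pool


def per_pool_limit(host_budget: int, pools_per_host: int) -> int:
    """Largest per-pool limit that keeps a host within a given restore budget.

    'host_budget' is a hypothetical figure an operator would choose from the
    memory available on the machine; the minutes do not state one.
    """
    return host_budget // pools_per_host


# One large pool sharing a machine with 5 more pools, as in the report.
pools_per_host = 1 + 5

before = restores_per_host(pools_per_host, 350)  # limit before the incident
after = restores_per_host(pools_per_host, 50)    # limit after reconfiguration

print(f"worst case before: {before} restores per host")
print(f"worst case after:  {after} restores per host")
```

Used the other way round, per_pool_limit(300, pools_per_host) gives 50, the per-pool value chosen after the incident. The point of the sketch is that the limit is set per pool, so the load on a host scales with the number of pools it carries.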


Tuesday 17 March 2009

Last modified by patrick @ Sun Mar 7 01:11:30 2021