Last modified on 09/29/09 16:43:52

dCache Tier I meeting September 29, 2009

[part of a series of meetings]

Present: IN2P3(), Sara(Onno), Triumf(Simon), BNL(Pedro), NDGF(Gerd), PIC(Gerard), GridKa(Doris), Fermi(), CERN()



Site reports


  • nothing to report


Things are OK.

Two tickets are ongoing.


We have the same problem with stage tests: almost every time, the test collapses our SRM. This happened this afternoon. We started the stage test and the SRM immediately collapsed again. We stopped the stage test and restarted the SRM, and it is fine now.

The test is a reasonable load on dcache.

When the SRM seems to hang, we see:

SEVERE: increase .. max threads servlet status


All internal message threads are busy.

Do you have 500 or 1,000 concurrent connections?

Onno has the numbers from a dump of the netstat command.

If you have 500 concurrent client connections, then this explains the problem.

If not, then the question is what those worker threads are doing.
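The check discussed here can be sketched as a small script: count the established connections to the SRM door in a netstat dump and compare the count with the thread limit. This is only an illustration. The port 8443, the sample dump, and the function names are assumptions and do not come from the ticket; the limit of 500 is the figure mentioned in this meeting.

```python
# Hypothetical helper: count ESTABLISHED TCP connections to one local port
# in the text output of `netstat -tn` (Linux column layout assumed).

def count_established(netstat_output: str, port: int) -> int:
    """Return the number of ESTABLISHED connections whose local port is `port`."""
    count = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows look like: proto recv-q send-q local foreign state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skip headers and non-TCP rows
        local_address, state = fields[3], fields[5]
        local_port = local_address.rsplit(":", 1)[-1]
        if state == "ESTABLISHED" and local_port == str(port):
            count += 1
    return count


def threads_exhausted(connections: int, max_threads: int = 500) -> bool:
    """True when every worker thread could be held by an open connection."""
    return connections >= max_threads


# Made-up sample dump, for illustration only.
SAMPLE = """Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.1:8443           10.0.0.2:51234          ESTABLISHED
tcp        0      0 10.0.0.1:8443           10.0.0.3:51300          TIME_WAIT
tcp        0      0 10.0.0.1:22             10.0.0.4:40000          ESTABLISHED
tcp6       0      0 :::8443                 :::40212                ESTABLISHED
"""

if __name__ == "__main__":
    n = count_established(SAMPLE, port=8443)  # 8443 is an assumed SRM port
    print(n, "established connections; exhausted:", threads_exhausted(n))
```

If the count stays well below the limit while the SRM still hangs, the thread dump is the next place to look.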

The thread dump should be in the ticket.

Ticket #5112

Which client?

We use gfal prestage.

Pedro: we also had this problem.

Increase the queue in front of tomcat.

We only have 500 threads.

DQ2 issues srm bringOn requests and then polls with srm ls; that would kill the SRM with load.


Things seem to be fine.

PoolManager was under high load (50%).

25,000 pin manager requests.

PoolManager "rc ls" showed high numbers of requests.

After restart everything is fine.

Which version? 1.9.2-7 on servers and 1.9.2-5 on pools.

More recent versions of dCache drop requests that have timed out.


Everything fine: nothing to report.

CMS migration from the old to the new: the new one is in production.


Everything seems OK.

We are upgrading to 1.9.5-preview tomorrow.


Owen asked what version of Java people have installed on their worker nodes.


Proposed: same time, next week.