Ticket #166 (new bugs)

Opened 12 years ago

Last modified 12 years ago

'rc onerror fail' doesn't work properly.

Reported by: patrick Owned by: tigran
Priority: major Milestone: 1.8.0-15p8
Component: core Keywords: PoolManager, rc onerror fail
Cc: Sub Version:


The list keeps growing. I also suspect that new requests for these files do not get served. Since the files are still listed in the pool manager, new requests just seem to be added to the failed entries (the client counter increases).

I have not yet checked whether the pool manager sends a failure back to the door or whether the door simply hangs, but the ever-growing list in the pool manager does look like a bug.




Following Patrick's suggestion, we have configured the pool manager with 'rc onerror fail'. This, however, does not seem to do what I expected.

I expected the pool manager to fail a request when there is a repeatable problem with resolving a file. By failing, I mean that an error is sent back to the door and the request is dequeued from the pool manager.

What we observe is that the request is marked as failed and shows up as such on the poolInfo/restoreHandler/lazy pages in the dCache web monitor. That is nice, but will failed requests ever be removed from this page, or will they keep accumulating?

Also, from looking at the code, I am not 100% convinced that the pool manager actually sends an error reply back to the door: the request is set to the ST_DONE state but then flagged with WAIT, which, as far as I can see, means that answerRequest is never called.
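
To make the difference concrete, here is a rough sketch of the behaviour I suspect versus the behaviour I expected. This is a toy model, not the actual PoolManager code: only the names ST_DONE, WAIT and answerRequest are taken from the source as I read it, and the class, fields and other methods are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, NOT the dCache PoolManager source. Only ST_DONE, WAIT
// and answerRequest are names from the ticket; everything else is invented.
class OnErrorFailSketch {
    enum State { ST_ACTIVE, ST_DONE }
    enum Next { CONTINUE, WAIT }

    static final class Request {
        final String fileId;
        State state = State.ST_ACTIVE;
        int clientCount = 1;
        Request(String fileId) { this.fileId = fileId; }
    }

    // Requests currently listed by the (model) pool manager, keyed by file.
    final Map<String, Request> queue = new LinkedHashMap<>();
    // Replies that would have been sent back to the door.
    final List<String> repliesToDoor = new ArrayList<>();

    // A new request for a file that is already listed only bumps the counter.
    void submit(String fileId) {
        Request existing = queue.get(fileId);
        if (existing != null) {
            existing.clientCount++;
        } else {
            queue.put(fileId, new Request(fileId));
        }
    }

    // Suspected behaviour: state becomes ST_DONE, but the handler returns
    // WAIT, so answerRequest is never reached and the entry stays listed.
    Next failSuspected(String fileId) {
        Request r = queue.get(fileId);
        r.state = State.ST_DONE;
        return Next.WAIT;
    }

    // Expected behaviour: reply to the door with an error, then dequeue.
    Next failExpected(String fileId) {
        Request r = queue.get(fileId);
        r.state = State.ST_DONE;
        answerRequest(r);
        queue.remove(fileId);
        return Next.CONTINUE;
    }

    void answerRequest(Request r) {
        repliesToDoor.add(r.fileId + ": failed (" + r.clientCount + " client(s))");
    }
}
```

In this model, calling failSuspected and then submitting the same file again leaves one entry with a client count of 2 and no reply to the door, which matches what we see on the web monitor. With failExpected the entry is gone and the door has received exactly one error.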

Could you please clarify what the intended semantics of 'rc onerror fail' are?


Change History

comment:1 Changed 12 years ago by patrick

  • Keywords onerror fail added