Last modified on 10/18/16 15:01:47

dCache Tier I meeting October 18, 2016

[part of a series of meetings]

Present

dCache.org(Paul), PIC(Marc), KIT(Xavier)

Agenda

(see box on the other side)

Site reports

PIC

Marc reported that everything is OK at PIC.

Dead pool

One pool died about two days ago. The cause was a BerkeleyDB problem that required a restart. The concern is that PIC's monitoring scripts did not detect the dead pool, so they are investigating how to adjust the scripts to monitor pool status better.

Paul pointed out that this is somewhat redundant, given the alarms service -- dCache knew about the problem; it just needs a convenient way of letting PIC know about it.

Marc said he was going to investigate the alarms service.

KIT

Xavier reported that things are running fine; no issues to report.

Stalled tickets

Xavier mentioned that there has been no progress on either the ATLAS deletions problem or the SQL query.

On the ATLAS deletions problem, Paul asked whether they are using the replica manager with the ATLAS instance (no) and whether there were any problems with pin-manager (not that they know of).

transfers

Xavier provided us with the transfers.txt parsing script so we can validate the output for future releases. This isn't the actual monitoring script (since that is KIT specific), but something similar.

The script has already found one problem (entries without a PNFS-ID), which Al already knows about.
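
For illustration only, a check for entries without a PNFS-ID could look roughly like the sketch below. This is not Xavier's script. The layout of transfers.txt is not recorded in these minutes, so the sketch assumes whitespace-separated fields and that a PNFS-ID shows up as a standalone token of 24 or 36 hexadecimal digits. The function name find_entries_without_pnfsid is hypothetical.

```python
import re

# Assumption: a PNFS-ID is a standalone token of 24 or 36 hex digits.
# Adjust the pattern if the real transfers.txt uses a different form.
PNFSID_PATTERN = re.compile(r"^(?:[0-9A-Fa-f]{24}|[0-9A-Fa-f]{36})$")


def find_entries_without_pnfsid(lines):
    """Return (line_number, line) pairs for entries with no PNFS-ID token.

    Blank lines and lines starting with '#' are skipped. Line numbers
    are 1-based and count every input line, including skipped ones.
    """
    missing = []
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Assumption: fields are separated by whitespace.
        tokens = line.split()
        if not any(PNFSID_PATTERN.match(token) for token in tokens):
            missing.append((number, line))
    return missing
```

Matching on the shape of the ID rather than on a fixed column avoids guessing the column order, at the cost of possibly missing IDs in an unexpected format.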

HSM

Xavier is continuing to work on the HSM integration.

He ran a stress test with some 100,000 files. The test completed, but after initially promising results it showed a lot of delay. The plugin was configured with 200 concurrent threads. Further investigation is needed to find out what exactly went wrong.
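
One possible way to see where the delay sets in is to time each operation and compare averages over successive batches of completions. The sketch below is a generic harness under stated assumptions, not the test KIT ran: store_file stands in for whatever call hands a file to the HSM plugin, and both function names are hypothetical. Only the figure of 200 workers comes from the report above.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_stress_test(store_file, file_ids, workers=200):
    """Call store_file(file_id) concurrently for every file ID.

    Returns a list of (finish_offset_s, duration_s) tuples sorted by
    finish time. The duration covers only the call itself, not the
    time a task spent waiting in the executor queue.
    """
    start = time.monotonic()

    def timed(file_id):
        begin = time.monotonic()
        store_file(file_id)  # hypothetical stand-in for the real operation
        end = time.monotonic()
        return end - start, end - begin

    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(timed, fid) for fid in file_ids]
        return sorted(f.result() for f in as_completed(futures))


def mean_duration_by_chunk(results, chunk_size):
    """Average per-file duration over consecutive chunks of completions."""
    means = []
    for i in range(0, len(results), chunk_size):
        chunk = results[i:i + chunk_size]
        means.append(sum(duration for _, duration in chunk) / len(chunk))
    return means
```

If the per-chunk averages climb steadily after the first few chunks, that would point to a backlog building up behind the worker threads rather than a uniformly slow backend.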

Support tickets for discussion

[Items are added here automagically]

DTNM

Same time, next week.