Last modified on 12/04/14 17:05:13

dCache Tier I meeting 4 December, 2014

[part of a series of meetings]

Present

dCache.org(Paul), IN2P3(), Sara(), Triumf(), BNL(), NDGF(), PIC(), KIT(Xavier), Fermi(Natalia), CERN()

Agenda

(see box on the other side)

Site reports

Fermi

Natalia reported that they are working through the issues in setting up their testbed instance, making sure the puppet configuration inherits correctly from the Tier-1 configuration.

As both dCache and puppet expert, Gerard is being very helpful.

KIT

Xavier reported that dCache is running well for them.

They have upgraded one production instance from 2.6 --> 2.10 --> 2.11.

This basically went fine. There was an issue with the nfs door starting, but this was only because it was waiting for pnfsmanager to finish applying the schema changes. This took a while, but once the update was complete, the nfs door started OK.

So far, the monitoring of this instance shows it working fine.

It is used mainly by students to store their theses on secure storage. It is tape-attached, but the instance, in general, is not heavily loaded.

xrootd plugins

Xavier asked about the xrootd plugins: Tigran had promised to send an email about them.

Paul now promised to do this after the meeting.

This is becoming more urgent as Xavier plans to upgrade the ATLAS and CMS Tier-1 instances to 2.11 on Tuesday and Thursday next week.

xrootd

Xavier noticed a problem with the web monitoring of transfers. He saw some xrootd transfers that were apparently stuck in pool-selection, but neither the xrootd door nor poolmanager seemed to be aware of them.

The problem is (likely) not an artefact of httpd itself as Xavier restarted the domain hosting httpd and these stuck transfers came back.

This was with 2.6.23; the instance has since been upgraded to 2.11. The upgrade cleared the problem, but this was likely due to the complete restart that the upgrade required.

Xavier will open a ticket describing this problem.

access log files

Xavier expressed concern that the new access log files may grow rather large, rather quickly.

Paul said that, for SRM, we log non-error responses at INFO and error responses at WARN; he would need to check what is done for FTP logging.

Paul also suggested that it may be worth keeping the logging threshold at INFO and being more aggressive in rotating the log files. This way, if a user reports a problem "quickly enough", it would be possible to copy the corresponding log file before it is rotated away, allowing more detailed analysis. It's possible that some action, logged at INFO level, becomes significant in understanding the user's problem.
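As an illustration only, the policy discussed above (non-error responses at INFO, error responses at WARN, with the threshold kept at INFO and frequent rotation bounding disk use) can be sketched with Python's standard logging module. This is not dCache's own logging configuration; the helper names, file name, size limit and backup count below are all assumptions chosen for the example.

```python
import logging
import logging.handlers


def make_access_logger(path, max_bytes=1024, backups=3):
    """Hypothetical helper: INFO threshold with small, size-based rotation.

    max_bytes and backups are illustrative values, not recommendations.
    """
    logger = logging.getLogger("access-sketch")
    logger.setLevel(logging.INFO)   # keep INFO entries, as suggested
    logger.propagate = False        # do not duplicate into the root logger
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def log_response(logger, request, ok):
    """Log non-error responses at INFO and error responses at WARN."""
    if ok:
        logger.info("%s succeeded", request)
    else:
        logger.warning("%s failed", request)
```

With these settings the total space used stays bounded at roughly (backups + 1) × max_bytes, so a recent file can still be copied for analysis before it is rotated away.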

Support tickets for discussion

[Items are added here automagically]

DTNM

Meetings continue as normal: the next "European" meeting is Tuesday 14:00; the next "Americas" meeting is Thursday 16:00