dCache Tier I meeting December 05, 2017

[part of a series of meetings]

Present, IN2P3(), Sara(), Triumf(), BNL(), NDGF(Ulf), PIC(), KIT(Xavier), Fermi(), CERN()


Site reports


Generally everything is working fine.

One strange thing going on: on an ALICE tape pool, a permanent 'migration move' job moves files smaller than 20 MiB away to disk.

The job

`migration move -id=move-small-files -permanent -storage=alice:tape -state=precious -size=..20000000 -tmode=cached+system -target=pgroup -- alice_disk`
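A quick check of the size bound used in the job, as a minimal sketch. It assumes that `-size` takes a range in bytes; the constant names are illustrative only. Under that assumption the bound is 20 MB (decimal), which is slightly below 20 MiB.

```python
# Compare the -size upper bound from the job with 20 MiB.
# Assumption: the migration module interprets the -size range in bytes.
SIZE_LIMIT_BYTES = 20000000      # upper bound given in the job definition
MIB = 1024 * 1024                # one mebibyte in bytes

twenty_mib = 20 * MIB            # 20 MiB expressed in bytes
limit_in_mib = SIZE_LIMIT_BYTES / MIB
gap_bytes = twenty_mib - SIZE_LIMIT_BYTES

print(f"20 MiB    = {twenty_mib} bytes")
print(f"job limit = {limit_in_mib:.2f} MiB")
print(f"gap       = {gap_bytes} bytes")
```

Files between the job limit and 20 MiB would not be selected by the job as written.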

Very occasionally, files are still precious on disk after the move.

Paul suggested setting the pool to LFS mode, just to work around the pool making lots of noise about this.

Ulf currently uses the migration copy command to copy the file back; the permanent job then re-copies it, which fixes the issue.

Xavier suggested using the 'rep set cached' command to fix the file's state.

Ulf to open an RT ticket.


Running fine right now.

Updated all dCache to v2.16.54. The update went fine; still need to get used to upgrading dCache with Puppet.

Problem with getFileMetaData

SRM getFileMetaData seems to be broken -- CMS reported the problem. The upgrade was from v2.16.48. Nothing is logged in the dCache log file.

The problem seems to be the client using SRM v1 instead of SRM v2.2.

dcap problem

Latest dcap client version. Xavier tested the dcap binary from EPEL testing. This resolved the problem.

RT ticket 9307

Problem where the public interface was chosen by ZK / dCache for cell communication.

Problem is resolved (temporarily) by moving the core domains to a node that only has an interface on the private address.


Deploying IPv6 addresses on test setup. Everything seems to work fine, until SRM redirects client to

For IPv6, want to have only one address per host.

Multiple hostnames that resolve to the same IPv6: internal and external.

Xavier to explore whether DNS can be configured to support a single hostname that resolves to different IPv4 addresses, depending on whether the client is internal or external.

Problem with failing uploads

Upload failures for WebDAV: RT ticket 9295.

Made sure that the slow-activity logging threshold in PnfsManager is set to 500 ms. Slow-query logging in PostgreSQL is set to 400 ms.

Almost nothing logged for PnfsManager (only 5 in past ..; these seem to coincide with database backups).

For the database, every 2--3 .. the space-manager looks for files in state=1; e.g.,

`2017-12-02 23:53:47 CET LOG: duration: 663.809 ms execute <unnamed>: SELECT * FROM srmspacefile WHERE state IN (1) AND creationtime < $1 LIMIT 1000`

Query duration seems to be fairly consistent.
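To check how consistent the durations are, the values can be pulled out of the PostgreSQL log. A minimal sketch, assuming log lines in the format shown above; the function name `slow_query_durations` and its `needle` parameter are hypothetical, and the only sample line is the one quoted in these notes.

```python
import re

# Matches the duration field PostgreSQL writes for slow statements.
DURATION_RE = re.compile(r"duration: (\d+(?:\.\d+)?) ms")


def slow_query_durations(lines, needle="FROM srmspacefile"):
    """Return the durations (in ms) of logged statements containing `needle`."""
    durations = []
    for line in lines:
        if needle not in line:
            continue  # a different query, not the space-manager scan
        match = DURATION_RE.search(line)
        if match:
            durations.append(float(match.group(1)))
    return durations


# The single log line quoted in the notes; in practice, pass the open log file.
sample = [
    "2017-12-02 23:53:47 CET LOG: duration: 663.809 ms execute <unnamed>: "
    "SELECT * FROM srmspacefile WHERE state IN (1) AND creationtime < $1 LIMIT 1000",
]

durations = slow_query_durations(sample)
print(f"count={len(durations)} min={min(durations)} max={max(durations)}")
```

Run over the full log, the spread between min and max gives a rough measure of how stable the query time is.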

They have "only" 3.2 million files.

One other query, from StAR, took 14 seconds.

RT ticket 9277

Using automatic leader election.

Creating an account for the KIT mailing list

RT management account.

Paul to ask Tigran.

Support tickets for discussion

Proposed: same time, next week.