Last modified on 09/19/17 14:51:22

dCache Tier I meeting September 19, 2017

[part of a series of meetings]

Present, Paul), IN2P3(), Sara(), Triumf(), BNL(), NDGF(Ulf), PIC(Marc, Elena), KIT(Xavier), Fermi(), CERN()


(see box on the other side)

Site reports


Going quite well.

The longest-running problem is with a pool that was corrupting data. 700,000 to 800,000 files checked without

Querying PNFSID given file size and ADLER32 checksum.
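
The two lookup keys mentioned above (file size and ADLER32 checksum) could be computed for a local copy of a file roughly as follows. This is only a minimal sketch: the function name `size_and_adler32` and the 8-digit lowercase hex formatting are assumptions made for illustration, and the actual PNFSID query is not shown because these minutes do not describe how it was done.

```python
import zlib


def size_and_adler32(path, chunk_size=1 << 20):
    """Return (size_in_bytes, adler32_hex) for the file at `path`.

    Hypothetical helper: reads the file in chunks so that large
    files do not have to fit in memory.
    """
    size = 0
    checksum = 1  # Adler-32 is defined to start from 1
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            size += len(chunk)
            # Feed the running value back in to continue the checksum
            checksum = zlib.adler32(chunk, checksum)
    # Mask to 32 bits and format as 8 hex digits (assumed convention)
    return size, format(checksum & 0xFFFFFFFF, "08x")
```

The resulting pair can then be used as the search criteria when looking for the matching PNFSID.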


Migrated the production instance to 3.1.15 today -- everything looks OK on the dCache side. There are some issues, but none are dCache-related.

Running smoothly for about one hour. Some activity involving tape.

Downtime until 4PM.

BDII part -- for some reason, it seems some incorrect information is being published.


Core and satellite domains -- opened the ticket.


Database crash

On Saturday the SRM database for ATLAS crashed.

Restored the database from backup and rolled forward using the WAL files.

The cause was a bug in the RAID controller firmware, which resulted in the filesystem becoming unavailable to PostgreSQL.

Plan to change the database deployment: put both SRM and Chimera into the same database, but with a master-slave setup.

Upgrade plans

Upgraded the smallest production instance to 2.16.

The next upgrade is tomorrow: the ATLAS instance will be upgraded to 2.16.

dCache view

Need to run the new 'frontend' service.

Support tickets for discussion

[Items are added here automagically]


Proposed: same time, next week.