Last modified on 11/10/09 17:50:54

dCache Tier I meeting November 10, 2009

[part of a series of meetings]

Present

dCache.org(Patrick,Paul,Tigran,Timur,Gene), IN2P3(), Sara(Onno), Triumf(Simon), BNL(Pedro), NDGF(), PIC(Gerard), GridKa(), Fermi(Jon), CERN()

Agenda

(see box on the other side)

Site reports

BNL

PIC

Things are OK.

PIC currently has dCache version 1.9.5-7 on servers and 1.9.5-3 on the poolmanager and pools.

PIC has two issues with the info-provider: how to publish multiple SRM end-points and how to handle tape accounting. PIC is in communication with Paul about how to fix these issues.

Triumf

Things are OK. Nothing to report.

Patrick asked which version of dCache Triumf is using.

We currently have dCache v1.9.3 deployed.

Testing of 1.9.5 (currently 1.9.5-6) is still underway, using a fresh Chimera-based system with 100 TB of storage. The first phase completed and revealed a number of issues. These were fixed and a second phase of testing started. This second phase is still going on, but might be finished tomorrow. Once the testing is complete, Triumf will make a decision, possibly by next week.

Fermi

Jon reported that things remain fine, no current issues.

FZK

Doris reported via email:

We can report that we have all our dCache instances now on a 1.9.5 release. (Atlas
1.9.5-4, CMS 1.9.5-6, and the old instance (for LHCb and Alice on 1.9.5-5) except
for the PNFS server which has on all instances minimum 1.9.5-6rc and Info Provider
with the "newest" xml file)

IN2P3

Lionel reported via email:

we have upgraded to 1.9.5-6 yesterday, no major issue so far, except that our monitoring and
operations scripts cannot use anymore the adminDoor. We will eventually open a ticket if
we cannot fix this ourselves.

SARA

Two small issues:

SARA is having problems with the replication manager for ATLASHOTDATA files: some files are not being replicated. Ron is going to send an email to support@….

Asked whether SARA is really using the "replica manager", Onno wasn't sure. After talking to Ron, it was confirmed that it is the replica manager. The error is "Connection refused." This could be a networking issue, but open ports etc. have been checked, so SARA does not believe that this is the cause.
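As a rough illustration of the open-port check mentioned above, the sketch below classifies a TCP connection attempt. It is not part of dCache or of SARA's tooling. The function name `probe_tcp` is hypothetical, and the host and port to test would have to be taken from the site's own configuration. A "refused" result usually means the host is reachable but nothing is listening on that port (or a firewall actively rejects the connection), whereas a "timeout" points more towards packets being dropped.

```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Classify a TCP connection attempt.

    Returns 'open', 'refused', 'timeout' or 'error'.
    (Hypothetical helper for illustration only.)
    """
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Peer answered with a reset: host reachable, port not accepting
        return "refused"
    except socket.timeout:
        # No answer within the timeout: packets possibly dropped
        return "timeout"
    except OSError:
        # Anything else, e.g. name resolution failure or unreachable network
        return "error"
```

Calling `probe_tcp(host, port)` from the node that reports the error, against the address it is trying to reach, would show whether the refusal can be reproduced outside dCache.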

The other issue is with tape protection. This was introduced in 1.9.4 but seems to be broken. We believe Irina has fixed the problem. Is this fix going to be released for 1.9.4? If so, when will it be released?

The fix will also be for 1.9.4. There will be a 1.9.4-5 release by the end of the week.

Tigran asked whether the hanging dcap movers are still a problem. This should have been fixed with 1.9.4-4.

Onno reported that SARA doesn't see this problem at present, but that they have configured a time-out on the pool, so hanging movers are killed.

Patrick asked whether they see the hanging movers being killed. This should be logged in the pool log file.

NDGF

Gerd reported via email:

- We upgraded our dCache instance to 1.9.5-7 + the latest patches + a 
backport of the WebDAV door.

- We deployed the latest version of the info provider.

- We replaced our HTTP door with a WebDAV door (see 
http://ftp1.ndgf.org:2880/biogrid/; it is read-only; you can view it in 
a web browser or mount it through webdav, aka web drive or web share in 
some OSes).

DTNM

Same time, next week.